mm-commits.vger.kernel.org archive mirror
* incoming
@ 2021-05-07  1:01 Andrew Morton
From: Andrew Morton @ 2021-05-07  1:01 UTC
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


This is everything else from -mm for this merge window, with the
possible exception of Mike Rapoport's "secretmem" syscall patch series
(https://lkml.kernel.org/r/20210303162209.8609-1-rppt@kernel.org).

I've been wobbly about the secretmem patches due to doubts about
whether the feature is sufficiently useful to justify inclusion, but
developers are now weighing in with helpful information and I've asked Mike
for an extensively updated [0/n] changelog.  This will take a few days
to play out so it is possible that I will prevail upon you for a post-rc1
merge.  If that's a problem, there's always 5.14-rc1.

91 patches, based on 8ca5297e7e38f2dc8c753d33a5092e7be181fff0, plus
previously sent patches.

Thanks.



Subsystems affected by this patch series:

  alpha
  procfs
  sysctl
  misc
  core-kernel
  bitmap
  lib
  compat
  checkpatch
  epoll
  isofs
  nilfs2
  hpfs
  exit
  fork
  kexec
  gcov
  panic
  delayacct
  gdb
  resource
  selftests
  async
  initramfs
  ipc
  mm/cleanups
  drivers/char
  mm/slub
  spelling

Subsystem: alpha

    Randy Dunlap <rdunlap@infradead.org>:
      alpha: eliminate old-style function definitions
      alpha: csum_partial_copy.c: add function prototypes from <net/checksum.h>

Subsystem: procfs

    Colin Ian King <colin.king@canonical.com>:
      fs/proc/generic.c: fix incorrect pde_is_permanent check

    Alexey Dobriyan <adobriyan@gmail.com>:
      proc: save LOC in __xlate_proc_name()
      proc: mandate ->proc_lseek in "struct proc_ops"
      proc: delete redundant subset=pid check
      selftests: proc: test subset=pid

Subsystem: sysctl

    zhouchuangao <zhouchuangao@vivo.com>:
      proc/sysctl: fix function name error in comments

Subsystem: misc

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      include: remove pagemap.h from blkdev.h

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      kernel.h: drop inclusion in bitmap.h

    Wan Jiabing <wanjiabing@vivo.com>:
      linux/profile.h: remove unnecessary declaration

Subsystem: core-kernel

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      kernel/async.c: fix pr_debug statement
      kernel/cred.c: make init_groups static

Subsystem: bitmap

    Yury Norov <yury.norov@gmail.com>:
    Patch series "lib/find_bit: fast path for small bitmaps", v6:
      tools: disable -Wno-type-limits
      tools: bitmap: sync function declarations with the kernel
      tools: sync BITMAP_LAST_WORD_MASK() macro with the kernel
      arch: rearrange headers inclusion order in asm/bitops for m68k, sh and h8300
      lib: extend the scope of small_const_nbits() macro
      tools: sync small_const_nbits() macro with the kernel
      lib: inline _find_next_bit() wrappers
      tools: sync find_next_bit implementation
      lib: add fast path for find_next_*_bit()
      lib: add fast path for find_first_*_bit() and find_last_bit()
      tools: sync lib/find_bit implementation
      MAINTAINERS: add entry for the bitmap API

Subsystem: lib

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      lib/bch.c: fix a typo in the file bch.c

    Wang Qing <wangqing@vivo.com>:
      lib: fix inconsistent indenting in process_bit1()

    ToastC <mrtoastcheng@gmail.com>:
      lib/list_sort.c: fix typo in function description

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      lib/genalloc.c: Fix a typo

    Richard Fitzgerald <rf@opensource.cirrus.com>:
      lib: crc8: pointer to data block should be const

    Zqiang <qiang.zhang@windriver.com>:
      lib: stackdepot: turn depot_lock spinlock to raw_spinlock

    Alex Shi <alexs@kernel.org>:
      lib/percpu_counter: tame kernel-doc compile warning
      lib/genalloc: add parameter description to fix doc compile warning

    Randy Dunlap <rdunlap@infradead.org>:
      lib: parser: clean up kernel-doc

Subsystem: compat

    Masahiro Yamada <masahiroy@kernel.org>:
      include/linux/compat.h: remove unneeded declaration from COMPAT_SYSCALL_DEFINEx()

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: warn when missing newline in return sysfs_emit() formats

    Vincent Mailhol <mailhol.vincent@wanadoo.fr>:
      checkpatch: exclude four preprocessor sub-expressions from MACRO_ARG_REUSE

    Christophe JAILLET <christophe.jaillet@wanadoo.fr>:
      checkpatch: improve ALLOC_ARRAY_ARGS test

Subsystem: epoll

    Davidlohr Bueso <dave@stgolabs.net>:
    Patch series "fs/epoll: restore user-visible behavior upon event ready":
      kselftest: introduce new epoll test case
      fs/epoll: restore waking from ep_done_scan()

Subsystem: isofs

    "Gustavo A. R. Silva" <gustavoars@kernel.org>:
      isofs: fix fall-through warnings for Clang

Subsystem: nilfs2

    Liu xuzhi <liu.xuzhi@zte.com.cn>:
      fs/nilfs2: fix misspellings using codespell tool

    Lu Jialin <lujialin4@huawei.com>:
      nilfs2: fix typos in comments

Subsystem: hpfs

    "Gustavo A. R. Silva" <gustavoars@kernel.org>:
      hpfs: replace one-element array with flexible-array member

Subsystem: exit

    Jim Newsome <jnewsome@torproject.org>:
      do_wait: make PIDTYPE_PID case O(1) instead of O(n)

Subsystem: fork

    Rolf Eike Beer <eb@emlix.com>:
      kernel/fork.c: simplify copy_mm()

    Xiaofeng Cao <cxfcosmos@gmail.com>:
      kernel/fork.c: fix typos

Subsystem: kexec

    Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>:
      kernel/crash_core: add crashkernel=auto for vmcore creation

    Joe LeVeque <jolevequ@microsoft.com>:
      kexec: Add kexec reboot string

    Jia-Ju Bai <baijiaju1990@gmail.com>:
      kernel: kexec_file: fix error return code of kexec_calculate_store_digests()

    Pavel Tatashin <pasha.tatashin@soleen.com>:
      kexec: dump kmessage before machine_kexec

Subsystem: gcov

    Johannes Berg <johannes.berg@intel.com>:
      gcov: combine common code
      gcov: simplify buffer allocation
      gcov: use kvmalloc()

    Nick Desaulniers <ndesaulniers@google.com>:
      gcov: clang: drop support for clang-10 and older

Subsystem: panic

    He Ying <heying24@huawei.com>:
      smp: kernel/panic.c - silence warnings

Subsystem: delayacct

    Yafang Shao <laoar.shao@gmail.com>:
      delayacct: clear right task's flag after blkio completes

Subsystem: gdb

    Johannes Berg <johannes.berg@intel.com>:
      gdb: lx-symbols: store the abspath()

    Barry Song <song.bao.hua@hisilicon.com>:
    Patch series "scripts/gdb: clarify the platforms supporting lx_current and add arm64 support", v2:
      scripts/gdb: document lx_current is only supported by x86
      scripts/gdb: add lx_current support for arm64

Subsystem: resource

    David Hildenbrand <david@redhat.com>:
    Patch series "kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree", v2:
      kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
      kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources
      kernel/resource: remove first_lvl / siblings_only logic

    Alistair Popple <apopple@nvidia.com>:
      kernel/resource: allow region_intersects users to hold resource_lock
      kernel/resource: refactor __request_region to allow external locking
      kernel/resource: fix locking in request_free_mem_region

Subsystem: selftests

    Zhang Yunkai <zhang.yunkai@zte.com.cn>:
      selftests: remove duplicate include

Subsystem: async

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      kernel/async.c: stop guarding pr_debug() statements
      kernel/async.c: remove async_unregister_domain()

Subsystem: initramfs

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
    Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3:
      init/initramfs.c: do unpacking asynchronously
      modules: add CONFIG_MODPROBE_PATH

Subsystem: ipc

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      ipc/sem.c: mundane typo fixes

Subsystem: mm/cleanups

    Shijie Luo <luoshijie1@huawei.com>:
      mm: fix some typos and code style problems

Subsystem: drivers/char

    David Hildenbrand <david@redhat.com>:
    Patch series "drivers/char: remove /dev/kmem for good":
      drivers/char: remove /dev/kmem for good
      mm: remove xlate_dev_kmem_ptr()
      mm/vmalloc: remove vwrite()

Subsystem: mm/slub

    Maninder Singh <maninder1.s@samsung.com>:
      arm: print alloc free paths for address in registers

Subsystem: spelling

    Drew Fustini <drew@beagleboard.org>:
      scripts/spelling.txt: add "overlfow"

    zuoqilin <zuoqilin@yulong.com>:
      scripts/spelling.txt: Add "diabled" typo

    Drew Fustini <drew@beagleboard.org>:
      scripts/spelling.txt: add "overflw"

    Colin Ian King <colin.king@canonical.com>:
      mm/slab.c: fix spelling mistake "disired" -> "desired"

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      include/linux/pgtable.h: few spelling fixes

    zhouchuangao <zhouchuangao@vivo.com>:
      kernel/umh.c: fix some spelling mistakes

    Xiaofeng Cao <cxfcosmos@gmail.com>:
      kernel/user_namespace.c: fix typos

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      kernel/up.c: fix typo

    Xiaofeng Cao <caoxiaofeng@yulong.com>:
      kernel/sys.c: fix typo

    dingsenjie <dingsenjie@yulong.com>:
      fs: fat: fix spelling typo of values

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      ipc/sem.c: spelling fix

    Masahiro Yamada <masahiroy@kernel.org>:
      treewide: remove editor modelines and cruft

    Ingo Molnar <mingo@kernel.org>:
      mm: fix typos in comments

    Lu Jialin <lujialin4@huawei.com>:
      mm: fix typos in comments

 Documentation/admin-guide/devices.txt                         |    2 
 Documentation/admin-guide/kdump/kdump.rst                     |    3 
 Documentation/admin-guide/kernel-parameters.txt               |   18 
 Documentation/dev-tools/gdb-kernel-debugging.rst              |    4 
 MAINTAINERS                                                   |   16 
 arch/Kconfig                                                  |   20 
 arch/alpha/include/asm/io.h                                   |    5 
 arch/alpha/kernel/pc873xx.c                                   |    4 
 arch/alpha/lib/csum_partial_copy.c                            |    1 
 arch/arm/configs/dove_defconfig                               |    1 
 arch/arm/configs/magician_defconfig                           |    1 
 arch/arm/configs/moxart_defconfig                             |    1 
 arch/arm/configs/mps2_defconfig                               |    1 
 arch/arm/configs/mvebu_v5_defconfig                           |    1 
 arch/arm/configs/xcep_defconfig                               |    1 
 arch/arm/include/asm/bug.h                                    |    1 
 arch/arm/include/asm/io.h                                     |    5 
 arch/arm/kernel/process.c                                     |   11 
 arch/arm/kernel/traps.c                                       |    1 
 arch/h8300/include/asm/bitops.h                               |    8 
 arch/hexagon/configs/comet_defconfig                          |    1 
 arch/hexagon/include/asm/io.h                                 |    1 
 arch/ia64/include/asm/io.h                                    |    1 
 arch/ia64/include/asm/uaccess.h                               |   18 
 arch/m68k/atari/time.c                                        |    7 
 arch/m68k/configs/amcore_defconfig                            |    1 
 arch/m68k/include/asm/bitops.h                                |    6 
 arch/m68k/include/asm/io_mm.h                                 |    5 
 arch/mips/include/asm/io.h                                    |    5 
 arch/openrisc/configs/or1ksim_defconfig                       |    1 
 arch/parisc/include/asm/io.h                                  |    5 
 arch/parisc/include/asm/pdc_chassis.h                         |    1 
 arch/powerpc/include/asm/io.h                                 |    5 
 arch/s390/include/asm/io.h                                    |    5 
 arch/sh/configs/edosk7705_defconfig                           |    1 
 arch/sh/configs/se7206_defconfig                              |    1 
 arch/sh/configs/sh2007_defconfig                              |    1 
 arch/sh/configs/sh7724_generic_defconfig                      |    1 
 arch/sh/configs/sh7770_generic_defconfig                      |    1 
 arch/sh/configs/sh7785lcr_32bit_defconfig                     |    1 
 arch/sh/include/asm/bitops.h                                  |    5 
 arch/sh/include/asm/io.h                                      |    5 
 arch/sparc/configs/sparc64_defconfig                          |    1 
 arch/sparc/include/asm/io_64.h                                |    5 
 arch/um/drivers/cow.h                                         |    7 
 arch/xtensa/configs/xip_kc705_defconfig                       |    1 
 block/blk-settings.c                                          |    1 
 drivers/auxdisplay/panel.c                                    |    7 
 drivers/base/firmware_loader/main.c                           |    2 
 drivers/block/brd.c                                           |    1 
 drivers/block/loop.c                                          |    1 
 drivers/char/Kconfig                                          |   10 
 drivers/char/mem.c                                            |  231 --------
 drivers/gpu/drm/qxl/qxl_drv.c                                 |    1 
 drivers/isdn/capi/kcapi_proc.c                                |    1 
 drivers/md/bcache/super.c                                     |    1 
 drivers/media/usb/pwc/pwc-uncompress.c                        |    3 
 drivers/net/ethernet/adaptec/starfire.c                       |    8 
 drivers/net/ethernet/amd/atarilance.c                         |    8 
 drivers/net/ethernet/amd/pcnet32.c                            |    7 
 drivers/net/wireless/intersil/hostap/hostap_proc.c            |    1 
 drivers/net/wireless/intersil/orinoco/orinoco_nortel.c        |    8 
 drivers/net/wireless/intersil/orinoco/orinoco_pci.c           |    8 
 drivers/net/wireless/intersil/orinoco/orinoco_plx.c           |    8 
 drivers/net/wireless/intersil/orinoco/orinoco_tmd.c           |    8 
 drivers/nvdimm/btt.c                                          |    1 
 drivers/nvdimm/pmem.c                                         |    1 
 drivers/parport/parport_ip32.c                                |   12 
 drivers/platform/x86/dell/dell_rbu.c                          |    3 
 drivers/scsi/53c700.c                                         |    1 
 drivers/scsi/53c700.h                                         |    1 
 drivers/scsi/ch.c                                             |    6 
 drivers/scsi/esas2r/esas2r_main.c                             |    1 
 drivers/scsi/ips.c                                            |   20 
 drivers/scsi/ips.h                                            |   20 
 drivers/scsi/lasi700.c                                        |    1 
 drivers/scsi/megaraid/mbox_defs.h                             |    2 
 drivers/scsi/megaraid/mega_common.h                           |    2 
 drivers/scsi/megaraid/megaraid_mbox.c                         |    2 
 drivers/scsi/megaraid/megaraid_mbox.h                         |    2 
 drivers/scsi/qla1280.c                                        |   12 
 drivers/scsi/scsicam.c                                        |    1 
 drivers/scsi/sni_53c710.c                                     |    1 
 drivers/video/fbdev/matrox/matroxfb_base.c                    |    9 
 drivers/video/fbdev/vga16fb.c                                 |   10 
 fs/configfs/configfs_internal.h                               |    4 
 fs/configfs/dir.c                                             |    4 
 fs/configfs/file.c                                            |    4 
 fs/configfs/inode.c                                           |    4 
 fs/configfs/item.c                                            |    4 
 fs/configfs/mount.c                                           |    4 
 fs/configfs/symlink.c                                         |    4 
 fs/eventpoll.c                                                |    6 
 fs/fat/fatent.c                                               |    2 
 fs/hpfs/hpfs.h                                                |    3 
 fs/isofs/rock.c                                               |    1 
 fs/nfs/dir.c                                                  |    7 
 fs/nfs/nfs4proc.c                                             |    6 
 fs/nfs/nfs4renewd.c                                           |    6 
 fs/nfs/nfs4state.c                                            |    6 
 fs/nfs/nfs4xdr.c                                              |    6 
 fs/nfsd/nfs4proc.c                                            |    6 
 fs/nfsd/nfs4xdr.c                                             |    6 
 fs/nfsd/xdr4.h                                                |    6 
 fs/nilfs2/cpfile.c                                            |    2 
 fs/nilfs2/ioctl.c                                             |    4 
 fs/nilfs2/segment.c                                           |    4 
 fs/nilfs2/the_nilfs.c                                         |    2 
 fs/ocfs2/acl.c                                                |    4 
 fs/ocfs2/acl.h                                                |    4 
 fs/ocfs2/alloc.c                                              |    4 
 fs/ocfs2/alloc.h                                              |    4 
 fs/ocfs2/aops.c                                               |    4 
 fs/ocfs2/aops.h                                               |    4 
 fs/ocfs2/blockcheck.c                                         |    4 
 fs/ocfs2/blockcheck.h                                         |    4 
 fs/ocfs2/buffer_head_io.c                                     |    4 
 fs/ocfs2/buffer_head_io.h                                     |    4 
 fs/ocfs2/cluster/heartbeat.c                                  |    4 
 fs/ocfs2/cluster/heartbeat.h                                  |    4 
 fs/ocfs2/cluster/masklog.c                                    |    4 
 fs/ocfs2/cluster/masklog.h                                    |    4 
 fs/ocfs2/cluster/netdebug.c                                   |    4 
 fs/ocfs2/cluster/nodemanager.c                                |    4 
 fs/ocfs2/cluster/nodemanager.h                                |    4 
 fs/ocfs2/cluster/ocfs2_heartbeat.h                            |    4 
 fs/ocfs2/cluster/ocfs2_nodemanager.h                          |    4 
 fs/ocfs2/cluster/quorum.c                                     |    4 
 fs/ocfs2/cluster/quorum.h                                     |    4 
 fs/ocfs2/cluster/sys.c                                        |    4 
 fs/ocfs2/cluster/sys.h                                        |    4 
 fs/ocfs2/cluster/tcp.c                                        |    4 
 fs/ocfs2/cluster/tcp.h                                        |    4 
 fs/ocfs2/cluster/tcp_internal.h                               |    4 
 fs/ocfs2/dcache.c                                             |    4 
 fs/ocfs2/dcache.h                                             |    4 
 fs/ocfs2/dir.c                                                |    4 
 fs/ocfs2/dir.h                                                |    4 
 fs/ocfs2/dlm/dlmapi.h                                         |    4 
 fs/ocfs2/dlm/dlmast.c                                         |    4 
 fs/ocfs2/dlm/dlmcommon.h                                      |    4 
 fs/ocfs2/dlm/dlmconvert.c                                     |    4 
 fs/ocfs2/dlm/dlmconvert.h                                     |    4 
 fs/ocfs2/dlm/dlmdebug.c                                       |    4 
 fs/ocfs2/dlm/dlmdebug.h                                       |    4 
 fs/ocfs2/dlm/dlmdomain.c                                      |    4 
 fs/ocfs2/dlm/dlmdomain.h                                      |    4 
 fs/ocfs2/dlm/dlmlock.c                                        |    4 
 fs/ocfs2/dlm/dlmmaster.c                                      |    4 
 fs/ocfs2/dlm/dlmrecovery.c                                    |    4 
 fs/ocfs2/dlm/dlmthread.c                                      |    4 
 fs/ocfs2/dlm/dlmunlock.c                                      |    4 
 fs/ocfs2/dlmfs/dlmfs.c                                        |    4 
 fs/ocfs2/dlmfs/userdlm.c                                      |    4 
 fs/ocfs2/dlmfs/userdlm.h                                      |    4 
 fs/ocfs2/dlmglue.c                                            |    4 
 fs/ocfs2/dlmglue.h                                            |    4 
 fs/ocfs2/export.c                                             |    4 
 fs/ocfs2/export.h                                             |    4 
 fs/ocfs2/extent_map.c                                         |    4 
 fs/ocfs2/extent_map.h                                         |    4 
 fs/ocfs2/file.c                                               |    4 
 fs/ocfs2/file.h                                               |    4 
 fs/ocfs2/filecheck.c                                          |    4 
 fs/ocfs2/filecheck.h                                          |    4 
 fs/ocfs2/heartbeat.c                                          |    4 
 fs/ocfs2/heartbeat.h                                          |    4 
 fs/ocfs2/inode.c                                              |    4 
 fs/ocfs2/inode.h                                              |    4 
 fs/ocfs2/journal.c                                            |    4 
 fs/ocfs2/journal.h                                            |    4 
 fs/ocfs2/localalloc.c                                         |    4 
 fs/ocfs2/localalloc.h                                         |    4 
 fs/ocfs2/locks.c                                              |    4 
 fs/ocfs2/locks.h                                              |    4 
 fs/ocfs2/mmap.c                                               |    4 
 fs/ocfs2/move_extents.c                                       |    4 
 fs/ocfs2/move_extents.h                                       |    4 
 fs/ocfs2/namei.c                                              |    4 
 fs/ocfs2/namei.h                                              |    4 
 fs/ocfs2/ocfs1_fs_compat.h                                    |    4 
 fs/ocfs2/ocfs2.h                                              |    4 
 fs/ocfs2/ocfs2_fs.h                                           |    4 
 fs/ocfs2/ocfs2_ioctl.h                                        |    4 
 fs/ocfs2/ocfs2_lockid.h                                       |    4 
 fs/ocfs2/ocfs2_lockingver.h                                   |    4 
 fs/ocfs2/refcounttree.c                                       |    4 
 fs/ocfs2/refcounttree.h                                       |    4 
 fs/ocfs2/reservations.c                                       |    4 
 fs/ocfs2/reservations.h                                       |    4 
 fs/ocfs2/resize.c                                             |    4 
 fs/ocfs2/resize.h                                             |    4 
 fs/ocfs2/slot_map.c                                           |    4 
 fs/ocfs2/slot_map.h                                           |    4 
 fs/ocfs2/stack_o2cb.c                                         |    4 
 fs/ocfs2/stack_user.c                                         |    4 
 fs/ocfs2/stackglue.c                                          |    4 
 fs/ocfs2/stackglue.h                                          |    4 
 fs/ocfs2/suballoc.c                                           |    4 
 fs/ocfs2/suballoc.h                                           |    4 
 fs/ocfs2/super.c                                              |    4 
 fs/ocfs2/super.h                                              |    4 
 fs/ocfs2/symlink.c                                            |    4 
 fs/ocfs2/symlink.h                                            |    4 
 fs/ocfs2/sysfile.c                                            |    4 
 fs/ocfs2/sysfile.h                                            |    4 
 fs/ocfs2/uptodate.c                                           |    4 
 fs/ocfs2/uptodate.h                                           |    4 
 fs/ocfs2/xattr.c                                              |    4 
 fs/ocfs2/xattr.h                                              |    4 
 fs/proc/generic.c                                             |   13 
 fs/proc/inode.c                                               |   18 
 fs/proc/proc_sysctl.c                                         |    2 
 fs/reiserfs/procfs.c                                          |   10 
 include/asm-generic/bitops/find.h                             |  108 +++
 include/asm-generic/bitops/le.h                               |   38 +
 include/asm-generic/bitsperlong.h                             |   12 
 include/asm-generic/io.h                                      |   11 
 include/linux/align.h                                         |   15 
 include/linux/async.h                                         |    1 
 include/linux/bitmap.h                                        |   11 
 include/linux/bitops.h                                        |   12 
 include/linux/blkdev.h                                        |    1 
 include/linux/compat.h                                        |    1 
 include/linux/configfs.h                                      |    4 
 include/linux/crc8.h                                          |    2 
 include/linux/cred.h                                          |    1 
 include/linux/delayacct.h                                     |   20 
 include/linux/fs.h                                            |    2 
 include/linux/genl_magic_func.h                               |    1 
 include/linux/genl_magic_struct.h                             |    1 
 include/linux/gfp.h                                           |    2 
 include/linux/init_task.h                                     |    1 
 include/linux/initrd.h                                        |    2 
 include/linux/kernel.h                                        |    9 
 include/linux/mm.h                                            |    2 
 include/linux/mmzone.h                                        |    2 
 include/linux/pgtable.h                                       |   10 
 include/linux/proc_fs.h                                       |    1 
 include/linux/profile.h                                       |    3 
 include/linux/smp.h                                           |    8 
 include/linux/swap.h                                          |    1 
 include/linux/vmalloc.h                                       |    7 
 include/uapi/linux/if_bonding.h                               |   11 
 include/uapi/linux/nfs4.h                                     |    6 
 include/xen/interface/elfnote.h                               |   10 
 include/xen/interface/hvm/hvm_vcpu.h                          |   10 
 include/xen/interface/io/xenbus.h                             |   10 
 init/Kconfig                                                  |   12 
 init/initramfs.c                                              |   38 +
 init/main.c                                                   |    1 
 ipc/sem.c                                                     |   12 
 kernel/async.c                                                |   68 --
 kernel/configs/android-base.config                            |    1 
 kernel/crash_core.c                                           |    7 
 kernel/cred.c                                                 |    2 
 kernel/exit.c                                                 |   67 ++
 kernel/fork.c                                                 |   23 
 kernel/gcov/Kconfig                                           |    1 
 kernel/gcov/base.c                                            |   49 +
 kernel/gcov/clang.c                                           |  282 ----------
 kernel/gcov/fs.c                                              |  146 ++++-
 kernel/gcov/gcc_4_7.c                                         |  173 ------
 kernel/gcov/gcov.h                                            |   14 
 kernel/kexec_core.c                                           |    4 
 kernel/kexec_file.c                                           |    4 
 kernel/kmod.c                                                 |    2 
 kernel/resource.c                                             |  198 ++++---
 kernel/sys.c                                                  |   14 
 kernel/umh.c                                                  |    8 
 kernel/up.c                                                   |    2 
 kernel/user_namespace.c                                       |    6 
 lib/bch.c                                                     |    2 
 lib/crc8.c                                                    |    2 
 lib/decompress_unlzma.c                                       |    2 
 lib/find_bit.c                                                |   68 --
 lib/genalloc.c                                                |    7 
 lib/list_sort.c                                               |    2 
 lib/parser.c                                                  |   61 +-
 lib/percpu_counter.c                                          |    2 
 lib/stackdepot.c                                              |    6 
 mm/balloon_compaction.c                                       |    4 
 mm/compaction.c                                               |    4 
 mm/filemap.c                                                  |    2 
 mm/gup.c                                                      |    2 
 mm/highmem.c                                                  |    2 
 mm/huge_memory.c                                              |    6 
 mm/hugetlb.c                                                  |    6 
 mm/internal.h                                                 |    2 
 mm/kasan/kasan.h                                              |    8 
 mm/kasan/quarantine.c                                         |    4 
 mm/kasan/shadow.c                                             |    4 
 mm/kfence/report.c                                            |    2 
 mm/khugepaged.c                                               |    2 
 mm/ksm.c                                                      |    6 
 mm/madvise.c                                                  |    4 
 mm/memcontrol.c                                               |   18 
 mm/memory-failure.c                                           |    2 
 mm/memory.c                                                   |   18 
 mm/mempolicy.c                                                |    6 
 mm/migrate.c                                                  |    8 
 mm/mmap.c                                                     |    4 
 mm/mprotect.c                                                 |    2 
 mm/mremap.c                                                   |    2 
 mm/nommu.c                                                    |   10 
 mm/oom_kill.c                                                 |    2 
 mm/page-writeback.c                                           |    4 
 mm/page_alloc.c                                               |   16 
 mm/page_owner.c                                               |    2 
 mm/page_vma_mapped.c                                          |    2 
 mm/percpu-internal.h                                          |    2 
 mm/percpu.c                                                   |    2 
 mm/pgalloc-track.h                                            |    6 
 mm/rmap.c                                                     |    2 
 mm/slab.c                                                     |    8 
 mm/slub.c                                                     |    2 
 mm/swap.c                                                     |    4 
 mm/swap_slots.c                                               |    2 
 mm/swap_state.c                                               |    2 
 mm/vmalloc.c                                                  |  124 ----
 mm/vmstat.c                                                   |    2 
 mm/z3fold.c                                                   |    2 
 mm/zpool.c                                                    |    2 
 mm/zsmalloc.c                                                 |    6 
 samples/configfs/configfs_sample.c                            |    2 
 scripts/checkpatch.pl                                         |   15 
 scripts/gdb/linux/cpus.py                                     |   23 
 scripts/gdb/linux/symbols.py                                  |    3 
 scripts/spelling.txt                                          |    3 
 tools/include/asm-generic/bitops/find.h                       |   85 ++-
 tools/include/asm-generic/bitsperlong.h                       |    3 
 tools/include/linux/bitmap.h                                  |   18 
 tools/lib/bitmap.c                                            |    4 
 tools/lib/find_bit.c                                          |   56 -
 tools/scripts/Makefile.include                                |    1 
 tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c |   44 +
 tools/testing/selftests/kvm/lib/sparsebit.c                   |    1 
 tools/testing/selftests/mincore/mincore_selftest.c            |    1 
 tools/testing/selftests/powerpc/mm/tlbie_test.c               |    1 
 tools/testing/selftests/proc/Makefile                         |    1 
 tools/testing/selftests/proc/proc-subset-pid.c                |  121 ++++
 tools/testing/selftests/proc/read.c                           |    4 
 tools/usb/hcd-tests.sh                                        |    2 
 343 files changed, 1383 insertions(+), 2119 deletions(-)



* [patch 01/91] alpha: eliminate old-style function definitions
  2021-05-07  1:01 incoming Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 02/91] alpha: csum_partial_copy.c: add function prototypes from <net/checksum.h> Andrew Morton
                   ` (90 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: akpm, ink, linux-mm, mattst88, mm-commits, rdunlap, rth, torvalds

From: Randy Dunlap <rdunlap@infradead.org>
Subject: alpha: eliminate old-style function definitions

'make ARCH=alpha W=1' reports two old-style function definitions that
lack a parameter list.  Fix them by declaring the parameters as (void).

../arch/alpha/kernel/pc873xx.c: In function 'pc873xx_get_base':
../arch/alpha/kernel/pc873xx.c:16:21: warning: old-style function definition [-Wold-style-definition]
   16 | unsigned int __init pc873xx_get_base()

../arch/alpha/kernel/pc873xx.c: In function 'pc873xx_get_model':
../arch/alpha/kernel/pc873xx.c:21:14: warning: old-style function definition [-Wold-style-definition]
   21 | char *__init pc873xx_get_model()
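
To illustrate what the warning is about, here is a minimal user-space
sketch.  The names sketch_base and sketch_get_base are hypothetical and
are not taken from pc873xx.c.  It assumes a compiler in a pre-C23 mode,
where a definition written with empty parentheses provides no prototype,
so arguments at call sites are not checked; "(void)" does provide one.

```c
#include <assert.h>

/* Hypothetical stand-in for a file-local value such as "base". */
static unsigned int sketch_base = 0x398;

/*
 * Prototype-style definition: "(void)" declares that the function takes
 * no arguments, so a call like sketch_get_base(1) is rejected at compile
 * time.  With "()" (before C23) such a call would not be checked.
 */
unsigned int sketch_get_base(void)
{
	return sketch_base;
}
```

The behaviour of the function is unchanged; only the declaration becomes
stricter, which is why the patch is a pure warning fix.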

Link: https://lkml.kernel.org/r/20210421061312.30097-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/kernel/pc873xx.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/alpha/kernel/pc873xx.c~alpha-eliminate-old-style-function-definitions
+++ a/arch/alpha/kernel/pc873xx.c
@@ -13,12 +13,12 @@ static char *pc873xx_names[] = {
 static unsigned int base, model;
 
 
-unsigned int __init pc873xx_get_base()
+unsigned int __init pc873xx_get_base(void)
 {
 	return base;
 }
 
-char *__init pc873xx_get_model()
+char *__init pc873xx_get_model(void)
 {
 	return pc873xx_names[model];
 }
_


* [patch 02/91] alpha: csum_partial_copy.c: add function prototypes from <net/checksum.h>
  2021-05-07  1:01 incoming Andrew Morton
  2021-05-07  1:02 ` [patch 01/91] alpha: eliminate old-style function definitions Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 03/91] fs/proc/generic.c: fix incorrect pde_is_permanent check Andrew Morton
                   ` (89 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: akpm, ink, linux-mm, lkp, mattst88, mm-commits, rdunlap, rth,
	torvalds, viro

From: Randy Dunlap <rdunlap@infradead.org>
Subject: alpha: csum_partial_copy.c: add function prototypes from <net/checksum.h>

Fix "no previous prototype" W=1 warnings from the kernel test robot:

../arch/alpha/lib/csum_partial_copy.c:349:1: error: no previous prototype for 'csum_and_copy_from_user' [-Werror=missing-prototypes]
  349 | csum_and_copy_from_user(const void __user *src, void *dst, int len)
      | ^~~~~~~~~~~~~~~~~~~~~~~
../arch/alpha/lib/csum_partial_copy.c:358:1: error: no previous prototype for 'csum_partial_copy_nocheck' [-Werror=missing-prototypes]
  358 | csum_partial_copy_nocheck(const void *src, void *dst, int len)
      | ^~~~~~~~~~~~~~~~~~~~~~~~~

Link: https://lkml.kernel.org/r/20210425235749.19113-1-rdunlap@infradead.org
Fixes: 808b49da54e6 ("alpha: turn csum_partial_copy_from_user() into csum_and_copy_from_user()")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/lib/csum_partial_copy.c |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/alpha/lib/csum_partial_copy.c~alpha-csum_partial_copyc-add-function-prototypes-from-net-checksumh
+++ a/arch/alpha/lib/csum_partial_copy.c
@@ -13,6 +13,7 @@
 #include <linux/types.h>
 #include <linux/string.h>
 #include <linux/uaccess.h>
+#include <net/checksum.h>
 
 
 #define ldq_u(x,y) \
_


* [patch 03/91] fs/proc/generic.c: fix incorrect pde_is_permanent check
  2021-05-07  1:01 incoming Andrew Morton
  2021-05-07  1:02 ` [patch 01/91] alpha: eliminate old-style function definitions Andrew Morton
  2021-05-07  1:02 ` [patch 02/91] alpha: csum_partial_copy.c: add function prototypes from <net/checksum.h> Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 04/91] proc: save LOC in __xlate_proc_name() Andrew Morton
                   ` (88 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: adobriyan, akpm, christian.brauner, colin.king, gregkh, linux-mm,
	mm-commits, torvalds

From: Colin Ian King <colin.king@canonical.com>
Subject: fs/proc/generic.c: fix incorrect pde_is_permanent check

Currently the pde_is_permanent() check is being run on root multiple times
rather than on the next proc directory entry.  This looks like a
copy-paste error.  Fix this by replacing root with next.
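
The effect of the wrong variable can be shown with a small user-space
sketch.  struct sketch_entry and sketch_find_permanent() are hypothetical
and much simpler than the real proc_dir_entry walk; the only point is
that the test inside the loop has to look at the entry being visited
(next), not at the fixed starting point (root).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified directory entry. */
struct sketch_entry {
	const char *name;
	bool permanent;
	struct sketch_entry *first_child;
};

/*
 * Walk down the first-child chain below root and return the first
 * permanent descendant, or NULL if there is none.
 */
const struct sketch_entry *sketch_find_permanent(const struct sketch_entry *root)
{
	const struct sketch_entry *de = root;
	const struct sketch_entry *next;

	while ((next = de->first_child) != NULL) {
		/* Testing root->permanent here would miss this entry. */
		if (next->permanent)
			return next;
		de = next;
	}
	return NULL;
}
```

With a non-permanent root and a permanent leaf, a check against root
never fires, which matches the description of the bug above.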

Addresses-Coverity: ("Copy-paste error")
Link: https://lkml.kernel.org/r/20210318122633.14222-1-colin.king@canonical.com
Fixes: d919b33dafb3 ("proc: faster open/read/close with "permanent" files")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/generic.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/proc/generic.c~proc-fix-incorrect-pde_is_permanent-check
+++ a/fs/proc/generic.c
@@ -756,7 +756,7 @@ int remove_proc_subtree(const char *name
 	while (1) {
 		next = pde_subdir_first(de);
 		if (next) {
-			if (unlikely(pde_is_permanent(root))) {
+			if (unlikely(pde_is_permanent(next))) {
 				write_unlock(&proc_subdir_lock);
 				WARN(1, "removing permanent /proc entry '%s/%s'",
 					next->parent->name, next->name);
_


* [patch 04/91] proc: save LOC in __xlate_proc_name()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2021-05-07  1:02 ` [patch 03/91] fs/proc/generic.c: fix incorrect pde_is_permanent check Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  2:24   ` Linus Torvalds
  2021-05-07  1:02 ` [patch 05/91] proc: mandate ->proc_lseek in "struct proc_ops" Andrew Morton
                   ` (87 subsequent siblings)
  91 siblings, 1 reply; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: adobriyan, akpm, linux-mm, mm-commits, torvalds

From: Alexey Dobriyan <adobriyan@gmail.com>
Subject: proc: save LOC in __xlate_proc_name()

Can't look at this verbosity anymore.

Link: https://lkml.kernel.org/r/YFYXAp/fgq405qcy@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/generic.c |   11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

--- a/fs/proc/generic.c~proc-save-loc-in-__xlate_proc_name
+++ a/fs/proc/generic.c
@@ -166,15 +166,8 @@ static int __xlate_proc_name(const char
 	const char     		*cp = name, *next;
 	struct proc_dir_entry	*de;
 
-	de = *ret;
-	if (!de)
-		de = &proc_root;
-
-	while (1) {
-		next = strchr(cp, '/');
-		if (!next)
-			break;
-
+	de = *ret ?: &proc_root;
+	while ((next = strchr(cp, '/'))) {
 		de = pde_subdir_find(de, cp, next - cp);
 		if (!de) {
 			WARN(1, "name '%s'\n", name);
_


* [patch 05/91] proc: mandate ->proc_lseek in "struct proc_ops"
  2021-05-07  1:01 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2021-05-07  1:02 ` [patch 04/91] proc: save LOC in __xlate_proc_name() Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 06/91] proc: delete redundant subset=pid check Andrew Morton
                   ` (86 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: adobriyan, akpm, linux-mm, mm-commits, torvalds

From: Alexey Dobriyan <adobriyan@gmail.com>
Subject: proc: mandate ->proc_lseek in "struct proc_ops"

Now that proc_ops are separate from file_operations and other operations,
it is easy to check that all instances have a ->proc_lseek hook and to
remove the check in the main code.

Note:
nonseekable_open() files naturally don't require ->proc_lseek.

Garbage collect pde_lseek() function.
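
The change in dispatch can be sketched in user space as follows.  All
names here (struct sketch_ops, sketch_default_seek and the two dispatch
helpers) are hypothetical; the assert() stands in for the contract that
every seekable instance supplies the hook and is not something the patch
itself adds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations table; only the seek hook matters here. */
struct sketch_ops {
	long (*seek)(long pos, long offset);
};

/* Hypothetical default: move the position by offset. */
long sketch_default_seek(long pos, long offset)
{
	return pos + offset;
}

/* Old shape: every call does a NULL check and falls back to a default. */
long sketch_seek_with_fallback(const struct sketch_ops *ops, long pos, long offset)
{
	long (*seek)(long, long) = ops->seek;

	if (!seek)
		seek = sketch_default_seek;
	return seek(pos, offset);
}

/* New shape: the hook is mandatory, so the dispatch is a direct call. */
long sketch_seek_direct(const struct sketch_ops *ops, long pos, long offset)
{
	assert(ops->seek != NULL);	/* contract assumed by this sketch */
	return ops->seek(pos, offset);
}
```

Instances that used to rely on the fallback have to name the default
explicitly in their table, which is what the driver hunks below do with
default_llseek.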

[adobriyan@gmail.com: smoke test lseek()]
  Link: https://lkml.kernel.org/r/YG4OIhChOrVTPgdN@localhost.localdomain
Link: https://lkml.kernel.org/r/YFYX0Bzwxlc7aBa/@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/isdn/capi/kcapi_proc.c                     |    1 
 drivers/net/wireless/intersil/hostap/hostap_proc.c |    1 
 drivers/scsi/esas2r/esas2r_main.c                  |    1 
 fs/proc/inode.c                                    |   14 +----------
 include/linux/proc_fs.h                            |    1 
 tools/testing/selftests/proc/read.c                |    4 ++-
 6 files changed, 9 insertions(+), 13 deletions(-)

--- a/drivers/isdn/capi/kcapi_proc.c~proc-mandate-proc_lseek-in-struct-proc_ops
+++ a/drivers/isdn/capi/kcapi_proc.c
@@ -201,6 +201,7 @@ static ssize_t empty_read(struct file *f
 
 static const struct proc_ops empty_proc_ops = {
 	.proc_read	= empty_read,
+	.proc_lseek	= default_llseek,
 };
 
 // ---------------------------------------------------------------------------
--- a/drivers/net/wireless/intersil/hostap/hostap_proc.c~proc-mandate-proc_lseek-in-struct-proc_ops
+++ a/drivers/net/wireless/intersil/hostap/hostap_proc.c
@@ -227,6 +227,7 @@ static ssize_t prism2_aux_dump_proc_no_r
 
 static const struct proc_ops prism2_aux_dump_proc_ops = {
 	.proc_read	= prism2_aux_dump_proc_no_read,
+	.proc_lseek	= default_llseek,
 };
 
 
--- a/drivers/scsi/esas2r/esas2r_main.c~proc-mandate-proc_lseek-in-struct-proc_ops
+++ a/drivers/scsi/esas2r/esas2r_main.c
@@ -616,6 +616,7 @@ static const struct file_operations esas
 };
 
 static const struct proc_ops esas2r_proc_ops = {
+	.proc_lseek		= default_llseek,
 	.proc_ioctl		= esas2r_proc_ioctl,
 #ifdef CONFIG_COMPAT
 	.proc_compat_ioctl	= compat_ptr_ioctl,
--- a/fs/proc/inode.c~proc-mandate-proc_lseek-in-struct-proc_ops
+++ a/fs/proc/inode.c
@@ -273,25 +273,15 @@ void proc_entry_rundown(struct proc_dir_
 	spin_unlock(&de->pde_unload_lock);
 }
 
-static loff_t pde_lseek(struct proc_dir_entry *pde, struct file *file, loff_t offset, int whence)
-{
-	typeof_member(struct proc_ops, proc_lseek) lseek;
-
-	lseek = pde->proc_ops->proc_lseek;
-	if (!lseek)
-		lseek = default_llseek;
-	return lseek(file, offset, whence);
-}
-
 static loff_t proc_reg_llseek(struct file *file, loff_t offset, int whence)
 {
 	struct proc_dir_entry *pde = PDE(file_inode(file));
 	loff_t rv = -EINVAL;
 
 	if (pde_is_permanent(pde)) {
-		return pde_lseek(pde, file, offset, whence);
+		return pde->proc_ops->proc_lseek(file, offset, whence);
 	} else if (use_pde(pde)) {
-		rv = pde_lseek(pde, file, offset, whence);
+		rv = pde->proc_ops->proc_lseek(file, offset, whence);
 		unuse_pde(pde);
 	}
 	return rv;
--- a/include/linux/proc_fs.h~proc-mandate-proc_lseek-in-struct-proc_ops
+++ a/include/linux/proc_fs.h
@@ -32,6 +32,7 @@ struct proc_ops {
 	ssize_t	(*proc_read)(struct file *, char __user *, size_t, loff_t *);
 	ssize_t (*proc_read_iter)(struct kiocb *, struct iov_iter *);
 	ssize_t	(*proc_write)(struct file *, const char __user *, size_t, loff_t *);
+	/* mandatory unless nonseekable_open() or equivalent is used */
 	loff_t	(*proc_lseek)(struct file *, loff_t, int);
 	int	(*proc_release)(struct inode *, struct file *);
 	__poll_t (*proc_poll)(struct file *, struct poll_table_struct *);
--- a/tools/testing/selftests/proc/read.c~proc-mandate-proc_lseek-in-struct-proc_ops
+++ a/tools/testing/selftests/proc/read.c
@@ -14,7 +14,7 @@
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
 // Test
-// 1) read of every file in /proc
+// 1) read and lseek on every file in /proc
 // 2) readlink of every symlink in /proc
 // 3) recursively (1) + (2) for every directory in /proc
 // 4) write to /proc/*/clear_refs and /proc/*/task/*/clear_refs
@@ -45,6 +45,8 @@ static void f_reg(DIR *d, const char *fi
 	fd = openat(dirfd(d), filename, O_RDONLY|O_NONBLOCK);
 	if (fd == -1)
 		return;
+	/* struct proc_ops::proc_lseek is mandatory if file is seekable. */
+	(void)lseek(fd, 0, SEEK_SET);
 	rv = read(fd, buf, sizeof(buf));
 	assert((0 <= rv && rv <= sizeof(buf)) || rv == -1);
 	close(fd);
_


* [patch 06/91] proc: delete redundant subset=pid check
  2021-05-07  1:01 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2021-05-07  1:02 ` [patch 05/91] proc: mandate ->proc_lseek in "struct proc_ops" Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 07/91] selftests: proc: test subset=pid Andrew Morton
                   ` (85 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: adobriyan, akpm, gladkov.alexey, linux-mm, mm-commits, torvalds

From: Alexey Dobriyan <adobriyan@gmail.com>
Subject: proc: delete redundant subset=pid check

The two checks in the lookup and readdir code should be enough; a third
check in the open code is not needed.

Can't open what can't be looked up?

Link: https://lkml.kernel.org/r/YFYYwIBIkytqnkxP@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Alexey Gladkov <gladkov.alexey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/inode.c |    4 ----
 1 file changed, 4 deletions(-)

--- a/fs/proc/inode.c~proc-delete-redundant-subset=pid-check
+++ a/fs/proc/inode.c
@@ -483,7 +483,6 @@ proc_reg_get_unmapped_area(struct file *
 
 static int proc_reg_open(struct inode *inode, struct file *file)
 {
-	struct proc_fs_info *fs_info = proc_sb_info(inode->i_sb);
 	struct proc_dir_entry *pde = PDE(inode);
 	int rv = 0;
 	typeof_member(struct proc_ops, proc_open) open;
@@ -497,9 +496,6 @@ static int proc_reg_open(struct inode *i
 		return rv;
 	}
 
-	if (fs_info->pidonly == PROC_PIDONLY_ON)
-		return -ENOENT;
-
 	/*
 	 * Ensure that
 	 * 1) PDE's ->release hook will be called no matter what
_


* [patch 07/91] selftests: proc: test subset=pid
  2021-05-07  1:01 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2021-05-07  1:02 ` [patch 06/91] proc: delete redundant subset=pid check Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 08/91] proc/sysctl: fix function name error in comments Andrew Morton
                   ` (84 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: adobriyan, akpm, gladkov.alexey, linux-mm, mm-commits, torvalds

From: Alexey Dobriyan <adobriyan@gmail.com>
Subject: selftests: proc: test subset=pid

Test that /proc instance mounted with

	mount -t proc -o subset=pid

contains only ".", "..", "self", "thread-self" and pid directories.

Note:
Currently "subset=pid" doesn't return "." and ".." via readdir.
This must be a bug.

Link: https://lkml.kernel.org/r/YFYZZ7WGaZlsnChS@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Alexey Gladkov <gladkov.alexey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/proc/Makefile          |    1 
 tools/testing/selftests/proc/proc-subset-pid.c |  121 +++++++++++++++
 2 files changed, 122 insertions(+)

--- a/tools/testing/selftests/proc/Makefile~proc-test-subset=pid
+++ a/tools/testing/selftests/proc/Makefile
@@ -12,6 +12,7 @@ TEST_GEN_PROGS += proc-self-map-files-00
 TEST_GEN_PROGS += proc-self-map-files-002
 TEST_GEN_PROGS += proc-self-syscall
 TEST_GEN_PROGS += proc-self-wchan
+TEST_GEN_PROGS += proc-subset-pid
 TEST_GEN_PROGS += proc-uptime-001
 TEST_GEN_PROGS += proc-uptime-002
 TEST_GEN_PROGS += read
--- /dev/null
+++ a/tools/testing/selftests/proc/proc-subset-pid.c
@@ -0,0 +1,121 @@
+/*
+ * Copyright (c) 2021 Alexey Dobriyan <adobriyan@gmail.com>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+/*
+ * Test that "mount -t proc -o subset=pid" hides everything but pids,
+ * /proc/self and /proc/thread-self.
+ */
+#undef NDEBUG
+#include <assert.h>
+#include <errno.h>
+#include <sched.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mount.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <dirent.h>
+#include <unistd.h>
+#include <stdio.h>
+
+static inline bool streq(const char *a, const char *b)
+{
+	return strcmp(a, b) == 0;
+}
+
+static void make_private_proc(void)
+{
+	if (unshare(CLONE_NEWNS) == -1) {
+		if (errno == ENOSYS || errno == EPERM) {
+			exit(4);
+		}
+		exit(1);
+	}
+	if (mount(NULL, "/", NULL, MS_PRIVATE|MS_REC, NULL) == -1) {
+		exit(1);
+	}
+	if (mount(NULL, "/proc", "proc", 0, "subset=pid") == -1) {
+		exit(1);
+	}
+}
+
+static bool string_is_pid(const char *s)
+{
+	while (1) {
+		switch (*s++) {
+		case '0':case '1':case '2':case '3':case '4':
+		case '5':case '6':case '7':case '8':case '9':
+			continue;
+
+		case '\0':
+			return true;
+
+		default:
+			return false;
+		}
+	}
+}
+
+int main(void)
+{
+	make_private_proc();
+
+	DIR *d = opendir("/proc");
+	assert(d);
+
+	struct dirent *de;
+
+	bool dot = false;
+	bool dot_dot = false;
+	bool self = false;
+	bool thread_self = false;
+
+	while ((de = readdir(d))) {
+		if (streq(de->d_name, ".")) {
+			assert(!dot);
+			dot = true;
+			assert(de->d_type == DT_DIR);
+		} else if (streq(de->d_name, "..")) {
+			assert(!dot_dot);
+			dot_dot = true;
+			assert(de->d_type == DT_DIR);
+		} else if (streq(de->d_name, "self")) {
+			assert(!self);
+			self = true;
+			assert(de->d_type == DT_LNK);
+		} else if (streq(de->d_name, "thread-self")) {
+			assert(!thread_self);
+			thread_self = true;
+			assert(de->d_type == DT_LNK);
+		} else {
+			if (!string_is_pid(de->d_name)) {
+				fprintf(stderr, "d_name '%s'\n", de->d_name);
+				assert(0);
+			}
+			assert(de->d_type == DT_DIR);
+		}
+	}
+
+	char c;
+	int rv = readlink("/proc/cpuinfo", &c, 1);
+	assert(rv == -1 && errno == ENOENT);
+
+	int fd = open("/proc/cpuinfo", O_RDONLY);
+	assert(fd == -1 && errno == ENOENT);
+
+	return 0;
+}
_


* [patch 08/91] proc/sysctl: fix function name error in comments
  2021-05-07  1:01 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2021-05-07  1:02 ` [patch 07/91] selftests: proc: test subset=pid Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 09/91] include: remove pagemap.h from blkdev.h Andrew Morton
                   ` (83 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, zhouchuangao

From: zhouchuangao <zhouchuangao@vivo.com>
Subject: proc/sysctl: fix function name error in comments

The kernel-doc comment names the function register_sysctl_table_path, but
the function is actually called register_sysctl_paths.  Fix the comment.

Link: https://lkml.kernel.org/r/1615807194-79646-1-git-send-email-zhouchuangao@vivo.com
Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/proc_sysctl.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/proc/proc_sysctl.c~proc-sysctl-fix-function-name-error-in-comments
+++ a/fs/proc/proc_sysctl.c
@@ -1563,7 +1563,7 @@ err_register_leaves:
 }
 
 /**
- * register_sysctl_table_path - register a sysctl table hierarchy
+ * register_sysctl_paths - register a sysctl table hierarchy
  * @path: The path to the directory the sysctl table is in.
  * @table: the top-level table structure
  *
_


* [patch 09/91] include: remove pagemap.h from blkdev.h
  2021-05-07  1:01 incoming Andrew Morton
                   ` (7 preceding siblings ...)
  2021-05-07  1:02 ` [patch 08/91] proc/sysctl: fix function name error in comments Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 10/91] kernel.h: drop inclusion in bitmap.h Andrew Morton
                   ` (82 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: akpm, axboe, colyli, dan.j.williams, hch, linux-mm,
	martin.petersen, mm-commits, torvalds, william.kucharski, willy

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: include: remove pagemap.h from blkdev.h

My UEK-derived config has 1030 files depending on pagemap.h before this
change.  Afterwards, just 326 files need to be rebuilt when I touch
pagemap.h.  I think blkdev.h is probably included too widely, but
untangling that dependency is harder and this solves my problem.  x86
allmodconfig builds, but there may be implicit include problems on other
architectures.

Link: https://lkml.kernel.org/r/20210309195747.283796-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>		[nvdimm]
Acked-by: Jens Axboe <axboe@kernel.dk>				[block]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Coly Li <colyli@suse.de>				[bcache]
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>	[scsi]
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 block/blk-settings.c      |    1 +
 drivers/block/brd.c       |    1 +
 drivers/block/loop.c      |    1 +
 drivers/md/bcache/super.c |    1 +
 drivers/nvdimm/btt.c      |    1 +
 drivers/nvdimm/pmem.c     |    1 +
 drivers/scsi/scsicam.c    |    1 +
 include/linux/blkdev.h    |    1 -
 include/linux/swap.h      |    1 +
 9 files changed, 8 insertions(+), 1 deletion(-)

--- a/block/blk-settings.c~include-remove-pagemaph-from-blkdevh
+++ a/block/blk-settings.c
@@ -7,6 +7,7 @@
 #include <linux/init.h>
 #include <linux/bio.h>
 #include <linux/blkdev.h>
+#include <linux/pagemap.h>
 #include <linux/gcd.h>
 #include <linux/lcm.h>
 #include <linux/jiffies.h>
--- a/drivers/block/brd.c~include-remove-pagemaph-from-blkdevh
+++ a/drivers/block/brd.c
@@ -18,6 +18,7 @@
 #include <linux/bio.h>
 #include <linux/highmem.h>
 #include <linux/mutex.h>
+#include <linux/pagemap.h>
 #include <linux/radix-tree.h>
 #include <linux/fs.h>
 #include <linux/slab.h>
--- a/drivers/block/loop.c~include-remove-pagemaph-from-blkdevh
+++ a/drivers/block/loop.c
@@ -53,6 +53,7 @@
 #include <linux/moduleparam.h>
 #include <linux/sched.h>
 #include <linux/fs.h>
+#include <linux/pagemap.h>
 #include <linux/file.h>
 #include <linux/stat.h>
 #include <linux/errno.h>
--- a/drivers/md/bcache/super.c~include-remove-pagemaph-from-blkdevh
+++ a/drivers/md/bcache/super.c
@@ -16,6 +16,7 @@
 #include "features.h"
 
 #include <linux/blkdev.h>
+#include <linux/pagemap.h>
 #include <linux/debugfs.h>
 #include <linux/genhd.h>
 #include <linux/idr.h>
--- a/drivers/nvdimm/btt.c~include-remove-pagemaph-from-blkdevh
+++ a/drivers/nvdimm/btt.c
@@ -6,6 +6,7 @@
 #include <linux/highmem.h>
 #include <linux/debugfs.h>
 #include <linux/blkdev.h>
+#include <linux/pagemap.h>
 #include <linux/module.h>
 #include <linux/device.h>
 #include <linux/mutex.h>
--- a/drivers/nvdimm/pmem.c~include-remove-pagemaph-from-blkdevh
+++ a/drivers/nvdimm/pmem.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/blkdev.h>
+#include <linux/pagemap.h>
 #include <linux/hdreg.h>
 #include <linux/init.h>
 #include <linux/platform_device.h>
--- a/drivers/scsi/scsicam.c~include-remove-pagemaph-from-blkdevh
+++ a/drivers/scsi/scsicam.c
@@ -17,6 +17,7 @@
 #include <linux/genhd.h>
 #include <linux/kernel.h>
 #include <linux/blkdev.h>
+#include <linux/pagemap.h>
 #include <linux/msdos_partition.h>
 #include <asm/unaligned.h>
 
--- a/include/linux/blkdev.h~include-remove-pagemaph-from-blkdevh
+++ a/include/linux/blkdev.h
@@ -11,7 +11,6 @@
 #include <linux/minmax.h>
 #include <linux/timer.h>
 #include <linux/workqueue.h>
-#include <linux/pagemap.h>
 #include <linux/backing-dev-defs.h>
 #include <linux/wait.h>
 #include <linux/mempool.h>
--- a/include/linux/swap.h~include-remove-pagemaph-from-blkdevh
+++ a/include/linux/swap.h
@@ -10,6 +10,7 @@
 #include <linux/sched.h>
 #include <linux/node.h>
 #include <linux/fs.h>
+#include <linux/pagemap.h>
 #include <linux/atomic.h>
 #include <linux/page-flags.h>
 #include <uapi/linux/mempolicy.h>
_


* [patch 10/91] kernel.h: drop inclusion in bitmap.h
  2021-05-07  1:01 incoming Andrew Morton
                   ` (8 preceding siblings ...)
  2021-05-07  1:02 ` [patch 09/91] include: remove pagemap.h from blkdev.h Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 11/91] linux/profile.h: remove unnecessary declaration Andrew Morton
                   ` (81 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, linux-mm, linux, mm-commits, torvalds,
	viro, yury.norov

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Subject: kernel.h: drop inclusion in bitmap.h

The bitmap.h header is used in a lot of code around the kernel.  Besides
that, it includes kernel.h, which sometimes creates an include loop.

The problem here is the many unneeded loops that turn header dependencies
into hell.  For example, how would you move bitmap_zalloc() from the C
file to the header?  Currently it's impossible.  And bitmap.h here is
only the tip of the iceberg.

kernel.h is a dumping ground for things that have nothing in common at
all.  We may still have it, but in my new code I prefer to include only
the headers that I want to use, without the bulk of unneeded kernel code.

Break the loop by introducing align.h, including it in kernel.h and
bitmap.h followed by replacing kernel.h with limits.h.
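
For readers unfamiliar with the macros that move into align.h, the
arithmetic behind them can be sketched with plain functions.  This is a
user-space illustration with hypothetical names (sketch_align_up and
friends), assuming the usual mask-based definition behind
__ALIGN_KERNEL(); as the comment in the header says, the alignment must
be a power of 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Round x up to the next multiple of a; a must be a power of 2. */
unsigned long sketch_align_up(unsigned long x, unsigned long a)
{
	return (x + (a - 1)) & ~(a - 1);
}

/* Round x down to a multiple of a; a must be a power of 2. */
unsigned long sketch_align_down(unsigned long x, unsigned long a)
{
	return x & ~(a - 1);
}

/* True if x is already a multiple of a; a must be a power of 2. */
bool sketch_is_aligned(unsigned long x, unsigned long a)
{
	return (x & (a - 1)) == 0;
}
```

For example, with an alignment of 8 the value 13 rounds up to 16 and
down to 8.  None of this needs anything else from kernel.h, which is why
the macros can live in a small header of their own.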

Link: https://lkml.kernel.org/r/20210326170347.37441-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/align.h  |   15 +++++++++++++++
 include/linux/bitmap.h |    3 ++-
 include/linux/kernel.h |    9 +--------
 3 files changed, 18 insertions(+), 9 deletions(-)

--- /dev/null
+++ a/include/linux/align.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_ALIGN_H
+#define _LINUX_ALIGN_H
+
+#include <linux/const.h>
+
+/* @a is a power of 2 value */
+#define ALIGN(x, a)		__ALIGN_KERNEL((x), (a))
+#define ALIGN_DOWN(x, a)	__ALIGN_KERNEL((x) - ((a) - 1), (a))
+#define __ALIGN_MASK(x, mask)	__ALIGN_KERNEL_MASK((x), (mask))
+#define PTR_ALIGN(p, a)		((typeof(p))ALIGN((unsigned long)(p), (a)))
+#define PTR_ALIGN_DOWN(p, a)	((typeof(p))ALIGN_DOWN((unsigned long)(p), (a)))
+#define IS_ALIGNED(x, a)		(((x) & ((typeof(x))(a) - 1)) == 0)
+
+#endif	/* _LINUX_ALIGN_H */
--- a/include/linux/bitmap.h~kernelh-drop-inclusion-in-bitmaph
+++ a/include/linux/bitmap.h
@@ -4,10 +4,11 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/align.h>
 #include <linux/types.h>
 #include <linux/bitops.h>
+#include <linux/limits.h>
 #include <linux/string.h>
-#include <linux/kernel.h>
 
 /*
  * bitmaps provide bit arrays that consume one or more unsigned
--- a/include/linux/kernel.h~kernelh-drop-inclusion-in-bitmaph
+++ a/include/linux/kernel.h
@@ -3,6 +3,7 @@
 #define _LINUX_KERNEL_H
 
 #include <stdarg.h>
+#include <linux/align.h>
 #include <linux/limits.h>
 #include <linux/linkage.h>
 #include <linux/stddef.h>
@@ -30,14 +31,6 @@
  */
 #define REPEAT_BYTE(x)	((~0ul / 0xff) * (x))
 
-/* @a is a power of 2 value */
-#define ALIGN(x, a)		__ALIGN_KERNEL((x), (a))
-#define ALIGN_DOWN(x, a)	__ALIGN_KERNEL((x) - ((a) - 1), (a))
-#define __ALIGN_MASK(x, mask)	__ALIGN_KERNEL_MASK((x), (mask))
-#define PTR_ALIGN(p, a)		((typeof(p))ALIGN((unsigned long)(p), (a)))
-#define PTR_ALIGN_DOWN(p, a)	((typeof(p))ALIGN_DOWN((unsigned long)(p), (a)))
-#define IS_ALIGNED(x, a)		(((x) & ((typeof(x))(a) - 1)) == 0)
-
 /* generic data direction definitions */
 #define READ			0
 #define WRITE			1
_


* [patch 11/91] linux/profile.h: remove unnecessary declaration
  2021-05-07  1:01 incoming Andrew Morton
                   ` (9 preceding siblings ...)
  2021-05-07  1:02 ` [patch 10/91] kernel.h: drop inclusion in bitmap.h Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 12/91] kernel/async.c: fix pr_debug statement Andrew Morton
                   ` (80 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, wanjiabing

From: Wan Jiabing <wanjiabing@vivo.com>
Subject: linux/profile.h: remove unnecessary declaration

Declaring struct pt_regs in linux/profile.h is unnecessary.  No function
there uses it, and struct pt_regs is already declared in linux/kernel.h.
Remove both declarations.

Link: https://lkml.kernel.org/r/20210401104834.1009157-1-wanjiabing@vivo.com
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/profile.h |    3 ---
 1 file changed, 3 deletions(-)

--- a/include/linux/profile.h~linux-profileh-remove-unnecessary-declaration
+++ a/include/linux/profile.h
@@ -15,7 +15,6 @@
 #define KVM_PROFILING	4
 
 struct proc_dir_entry;
-struct pt_regs;
 struct notifier_block;
 
 #if defined(CONFIG_PROFILING) && defined(CONFIG_PROC_FS)
@@ -84,8 +83,6 @@ int task_handoff_unregister(struct notif
 int profile_event_register(enum profile_type, struct notifier_block * n);
 int profile_event_unregister(enum profile_type, struct notifier_block * n);
 
-struct pt_regs;
-
 #else
 
 #define prof_on 0
_


* [patch 12/91] kernel/async.c: fix pr_debug statement
  2021-05-07  1:01 incoming Andrew Morton
                   ` (10 preceding siblings ...)
  2021-05-07  1:02 ` [patch 11/91] linux/profile.h: remove unnecessary declaration Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 13/91] kernel/cred.c: make init_groups static Andrew Morton
                   ` (79 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: akpm, linux-mm, linux, mm-commits, tj, torvalds

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Subject: kernel/async.c: fix pr_debug statement

An async_func_t returns void; any errors it encounters have to be stashed
somewhere for consumers to discover later.  The pr_debug message should
therefore not claim that the function "returned 0".
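
As a minimal userspace sketch of this point (not kernel code; the names
demo_async_func_t, struct demo_ctx, demo_probe and demo_run are
hypothetical, and the callback signature is simplified), a function that
returns void can report failure only by storing it where a consumer will
look later:

```c
#include <errno.h>

/* Hypothetical, simplified stand-in for a void-returning async callback. */
typedef void (*demo_async_func_t)(void *data);

/* Hypothetical per-call context; the callee stashes its outcome here. */
struct demo_ctx {
	int input;
	int err;	/* 0 on success, negative errno on failure */
};

static void demo_probe(void *data)
{
	struct demo_ctx *ctx = data;

	/* There is no return value, so record the outcome in the context. */
	ctx->err = (ctx->input < 0) ? -EINVAL : 0;
}

/* The runner can only say that the function returned, not what it returned. */
static void demo_run(demo_async_func_t func, void *data)
{
	func(data);
}
```

A caller would inspect ctx.err after demo_run() completes, which is why a
log line claiming a return value of 0 carries no information.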

Link: https://lkml.kernel.org/r/20210226124355.2503524-1-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/async.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/async.c~kernel-asyncc-fix-pr_debug-statement
+++ a/kernel/async.c
@@ -124,7 +124,7 @@ static void async_run_entry_fn(struct wo
 	if (initcall_debug && system_state < SYSTEM_RUNNING) {
 		rettime = ktime_get();
 		delta = ktime_sub(rettime, calltime);
-		pr_debug("initcall %lli_%pS returned 0 after %lld usecs\n",
+		pr_debug("initcall %lli_%pS returned after %lld usecs\n",
 			(long long)entry->cookie,
 			entry->func,
 			(long long)ktime_to_ns(delta) >> 10);
_


* [patch 13/91] kernel/cred.c: make init_groups static
  2021-05-07  1:01 incoming Andrew Morton
                   ` (11 preceding siblings ...)
  2021-05-07  1:02 ` [patch 12/91] kernel/async.c: fix pr_debug statement Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 14/91] tools: disable -Wno-type-limits Andrew Morton
                   ` (78 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: akpm, linux-mm, linux, mm-commits, torvalds

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Subject: kernel/cred.c: make init_groups static

init_groups is declared in both cred.h and init_task.h, but it is not
actually referenced anywhere outside of cred.c where it is defined.  So
make it static and remove the declarations.

Link: https://lkml.kernel.org/r/20210310220102.2484201-1-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/cred.h      |    1 -
 include/linux/init_task.h |    1 -
 kernel/cred.c             |    2 +-
 3 files changed, 1 insertion(+), 3 deletions(-)

--- a/include/linux/cred.h~kernel-credc-make-init_groups-static
+++ a/include/linux/cred.h
@@ -53,7 +53,6 @@ do {							\
 		groups_free(group_info);		\
 } while (0)
 
-extern struct group_info init_groups;
 #ifdef CONFIG_MULTIUSER
 extern struct group_info *groups_alloc(int);
 extern void groups_free(struct group_info *);
--- a/include/linux/init_task.h~kernel-credc-make-init_groups-static
+++ a/include/linux/init_task.h
@@ -25,7 +25,6 @@
 extern struct files_struct init_files;
 extern struct fs_struct init_fs;
 extern struct nsproxy init_nsproxy;
-extern struct group_info init_groups;
 extern struct cred init_cred;
 
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
--- a/kernel/cred.c~kernel-credc-make-init_groups-static
+++ a/kernel/cred.c
@@ -33,7 +33,7 @@ do {									\
 static struct kmem_cache *cred_jar;
 
 /* init to 2 - one for init_task, one to ensure it is never freed */
-struct group_info init_groups = { .usage = ATOMIC_INIT(2) };
+static struct group_info init_groups = { .usage = ATOMIC_INIT(2) };
 
 /*
  * The initial credentials for the initial task
_


* [patch 14/91] tools: disable -Wno-type-limits
  2021-05-07  1:01 incoming Andrew Morton
                   ` (12 preceding siblings ...)
  2021-05-07  1:02 ` [patch 13/91] kernel/cred.c: make init_groups static Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 15/91] tools: bitmap: sync function declarations with the kernel Andrew Morton
                   ` (77 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: aklimov, akpm, andriy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: tools: disable -Wno-type-limits

Patch series "lib/find_bit: fast path for small bitmaps", v6.

Bitmap operations are much simpler and faster for small bitmaps that fit
into a single word.  In linux/bitmap.h we have machinery that allows the
compiler to replace an actual function call with a few instructions if the
bitmaps passed into the function are small and their size is known at
compile time.

The find_*_bit() API lacks this functionality, but its users would benefit
from it a lot.  One important example is the cpumask subsystem when
NR_CPUS <= BITS_PER_LONG.


This patch (of 12):

GENMASK(h, l) may be passed arguments of unsigned types.  In that case a
type-limits warning is generated, for example for GENMASK(h, 0).
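
The following sketch illustrates how such a warning can arise, assuming
the macro validates its arguments by comparing l against h.  The DEMO_*
macros and demo_* functions are hypothetical simplifications, not the
kernel's definitions:

```c
/*
 * Hypothetical, simplified range check of the kind a GENMASK()-style
 * macro may apply to its arguments; not the kernel's actual macro.
 */
#define DEMO_RANGE_IS_BAD(h, l)	((l) > (h))

#define DEMO_BITS_PER_LONG	(8 * (int)sizeof(unsigned long))

/* Mask with bits l..h set, both inclusive (h < DEMO_BITS_PER_LONG). */
#define DEMO_GENMASK(h, l) \
	((~0UL << (l)) & (~0UL >> (DEMO_BITS_PER_LONG - 1 - (h))))

static int demo_low_range_is_bad(unsigned int h)
{
	/*
	 * With an unsigned h and l == 0 this expands to "0 > h", which can
	 * never be true.  GCC's -Wtype-limits reports this kind of
	 * always-false comparison.
	 */
	return DEMO_RANGE_IS_BAD(h, 0);
}

static unsigned long demo_low_mask(unsigned int h)
{
	return DEMO_GENMASK(h, 0);
}
```

The comparison is harmless here, so suppressing the warning for the tools
build loses nothing in this case.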

Link: https://lkml.kernel.org/r/20210401003153.97325-1-yury.norov@gmail.com
Link: https://lkml.kernel.org/r/20210401003153.97325-2-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/scripts/Makefile.include |    1 +
 1 file changed, 1 insertion(+)

--- a/tools/scripts/Makefile.include~tools-disable-wno-type-limits
+++ a/tools/scripts/Makefile.include
@@ -38,6 +38,7 @@ EXTRA_WARNINGS += -Wswitch-enum
 EXTRA_WARNINGS += -Wundef
 EXTRA_WARNINGS += -Wwrite-strings
 EXTRA_WARNINGS += -Wformat
+EXTRA_WARNINGS += -Wno-type-limits
 
 # Makefiles suck: This macro sets a default value of $(2) for the
 # variable named by $(1), unless the variable has been set by
_


* [patch 15/91] tools: bitmap: sync function declarations with the kernel
  2021-05-07  1:01 incoming Andrew Morton
                   ` (13 preceding siblings ...)
  2021-05-07  1:02 ` [patch 14/91] tools: disable -Wno-type-limits Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 16/91] tools: sync BITMAP_LAST_WORD_MASK() macro " Andrew Morton
                   ` (76 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: aklimov, akpm, andriy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: tools: bitmap: sync function declarations with the kernel

Some functions in tools/include/linux/bitmap.h declare nbits as int.  In
the kernel nbits is declared as unsigned int.

Link: https://lkml.kernel.org/r/20210401003153.97325-3-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/include/linux/bitmap.h |    8 ++++----
 tools/lib/bitmap.c           |    4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

--- a/tools/include/linux/bitmap.h~tools-bitmap-sync-function-declarations-with-the-kernel
+++ a/tools/include/linux/bitmap.h
@@ -30,7 +30,7 @@ void bitmap_clear(unsigned long *map, un
 #define small_const_nbits(nbits) \
 	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
 
-static inline void bitmap_zero(unsigned long *dst, int nbits)
+static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
 		*dst = 0UL;
@@ -66,7 +66,7 @@ static inline int bitmap_full(const unsi
 	return find_first_zero_bit(src, nbits) == nbits;
 }
 
-static inline int bitmap_weight(const unsigned long *src, int nbits)
+static inline int bitmap_weight(const unsigned long *src, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
 		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
@@ -74,7 +74,7 @@ static inline int bitmap_weight(const un
 }
 
 static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
-			     const unsigned long *src2, int nbits)
+			     const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
 		*dst = *src1 | *src2;
@@ -141,7 +141,7 @@ static inline void bitmap_free(unsigned
  * @buf: buffer to store output
  * @size: size of @buf
  */
-size_t bitmap_scnprintf(unsigned long *bitmap, int nbits,
+size_t bitmap_scnprintf(unsigned long *bitmap, unsigned int nbits,
 			char *buf, size_t size);
 
 /**
--- a/tools/lib/bitmap.c~tools-bitmap-sync-function-declarations-with-the-kernel
+++ a/tools/lib/bitmap.c
@@ -28,11 +28,11 @@ void __bitmap_or(unsigned long *dst, con
 		dst[k] = bitmap1[k] | bitmap2[k];
 }
 
-size_t bitmap_scnprintf(unsigned long *bitmap, int nbits,
+size_t bitmap_scnprintf(unsigned long *bitmap, unsigned int nbits,
 			char *buf, size_t size)
 {
 	/* current bit is 'cur', most recently seen range is [rbot, rtop] */
-	int cur, rbot, rtop;
+	unsigned int cur, rbot, rtop;
 	bool first = true;
 	size_t ret = 0;
 
_


* [patch 16/91] tools: sync BITMAP_LAST_WORD_MASK() macro with the kernel
  2021-05-07  1:01 incoming Andrew Morton
                   ` (14 preceding siblings ...)
  2021-05-07  1:02 ` [patch 15/91] tools: bitmap: sync function declarations with the kernel Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 17/91] arch: rearrange headers inclusion order in asm/bitops for m68k, sh and h8300 Andrew Morton
                   ` (75 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: aklimov, akpm, andriy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: tools: sync BITMAP_LAST_WORD_MASK() macro with the kernel

The kernel version generates better code.
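
As a quick sanity check that the two definitions are interchangeable, the
sketch below compares them over a range of sizes.  The OLD_/NEW_ macro
names and last_word_masks_agree() are hypothetical; the macro bodies are
copied from the diff below:

```c
#include <limits.h>

#define BITS_PER_LONG	((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Old tools/ definition: computes a remainder, then picks one of two values. */
#define OLD_LAST_WORD_MASK(nbits)					\
(									\
	((nbits) % BITS_PER_LONG) ?					\
		(1UL << ((nbits) % BITS_PER_LONG)) - 1 : ~0UL		\
)

/* Kernel definition: a negation, an AND and a shift, with no conditional. */
#define NEW_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

/* Return 1 if both macros agree for every nbits in [0, limit]. */
static int last_word_masks_agree(unsigned int limit)
{
	unsigned int nbits;

	for (nbits = 0; nbits <= limit; nbits++)
		if (OLD_LAST_WORD_MASK(nbits) != NEW_LAST_WORD_MASK(nbits))
			return 0;
	return 1;
}
```

Both forms yield ~0UL when nbits is a multiple of BITS_PER_LONG; the
kernel form gets there without a conditional expression.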

Link: https://lkml.kernel.org/r/20210401003153.97325-4-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/include/linux/bitmap.h |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

--- a/tools/include/linux/bitmap.h~tools-sync-bitmap_last_word_mask-macro-with-the-kernel
+++ a/tools/include/linux/bitmap.h
@@ -20,12 +20,7 @@ int __bitmap_equal(const unsigned long *
 void bitmap_clear(unsigned long *map, unsigned int start, int len);
 
 #define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
-
-#define BITMAP_LAST_WORD_MASK(nbits)					\
-(									\
-	((nbits) % BITS_PER_LONG) ?					\
-		(1UL<<((nbits) % BITS_PER_LONG))-1 : ~0UL		\
-)
+#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
 
 #define small_const_nbits(nbits) \
 	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
_


* [patch 17/91] arch: rearrange headers inclusion order in asm/bitops for m68k, sh and h8300
  2021-05-07  1:01 incoming Andrew Morton
                   ` (15 preceding siblings ...)
  2021-05-07  1:02 ` [patch 16/91] tools: sync BITMAP_LAST_WORD_MASK() macro " Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:02 ` [patch 18/91] lib: extend the scope of small_const_nbits() macro Andrew Morton
                   ` (74 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: aklimov, akpm, andriy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	linux, mm-commits, richard.weiyang, sbrivio, torvalds,
	wsa+renesas, ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: arch: rearrange headers inclusion order in asm/bitops for m68k, sh and h8300

m68k and sh include bitops/{find,le}.h before the ffs/fls headers.  The
new fast-path implementation in find.h requires ffs/fls, so reorder the
header inclusion sequence to prevent a compile-time implicit function
declaration error.

[yury.norov@gmail.com: h8300: rearrange headers inclusion order in asm/bitops]
  Link: https://lkml.kernel.org/r/20210406183625.794227-1-yury.norov@gmail.com
Link: https://lkml.kernel.org/r/20210401003153.97325-5-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/h8300/include/asm/bitops.h |    8 ++++----
 arch/m68k/include/asm/bitops.h  |    6 +++---
 arch/sh/include/asm/bitops.h    |    5 +++--
 3 files changed, 10 insertions(+), 9 deletions(-)

--- a/arch/h8300/include/asm/bitops.h~arch-rearrange-headers-inclusion-order-in-asm-bitops-for-m68k-and-sh
+++ a/arch/h8300/include/asm/bitops.h
@@ -9,6 +9,10 @@
 
 #include <linux/compiler.h>
 
+#include <asm-generic/bitops/fls.h>
+#include <asm-generic/bitops/__fls.h>
+#include <asm-generic/bitops/fls64.h>
+
 #ifdef __KERNEL__
 
 #ifndef _LINUX_BITOPS_H
@@ -173,8 +177,4 @@ static inline unsigned long __ffs(unsign
 
 #endif /* __KERNEL__ */
 
-#include <asm-generic/bitops/fls.h>
-#include <asm-generic/bitops/__fls.h>
-#include <asm-generic/bitops/fls64.h>
-
 #endif /* _H8300_BITOPS_H */
--- a/arch/m68k/include/asm/bitops.h~arch-rearrange-headers-inclusion-order-in-asm-bitops-for-m68k-and-sh
+++ a/arch/m68k/include/asm/bitops.h
@@ -440,8 +440,6 @@ static inline unsigned long ffz(unsigned
 
 #endif
 
-#include <asm-generic/bitops/find.h>
-
 #ifdef __KERNEL__
 
 #if defined(CONFIG_CPU_HAS_NO_BITFIELDS)
@@ -525,10 +523,12 @@ static inline int __fls(int x)
 #define __clear_bit_unlock	clear_bit_unlock
 
 #include <asm-generic/bitops/ext2-atomic.h>
-#include <asm-generic/bitops/le.h>
 #include <asm-generic/bitops/fls64.h>
 #include <asm-generic/bitops/sched.h>
 #include <asm-generic/bitops/hweight.h>
+#include <asm-generic/bitops/le.h>
 #endif /* __KERNEL__ */
 
+#include <asm-generic/bitops/find.h>
+
 #endif /* _M68K_BITOPS_H */
--- a/arch/sh/include/asm/bitops.h~arch-rearrange-headers-inclusion-order-in-asm-bitops-for-m68k-and-sh
+++ a/arch/sh/include/asm/bitops.h
@@ -58,15 +58,16 @@ static inline unsigned long __ffs(unsign
 	return result;
 }
 
-#include <asm-generic/bitops/find.h>
 #include <asm-generic/bitops/ffs.h>
 #include <asm-generic/bitops/hweight.h>
 #include <asm-generic/bitops/lock.h>
 #include <asm-generic/bitops/sched.h>
-#include <asm-generic/bitops/le.h>
 #include <asm-generic/bitops/ext2-atomic.h>
 #include <asm-generic/bitops/fls.h>
 #include <asm-generic/bitops/__fls.h>
 #include <asm-generic/bitops/fls64.h>
 
+#include <asm-generic/bitops/le.h>
+#include <asm-generic/bitops/find.h>
+
 #endif /* __ASM_SH_BITOPS_H */
_


* [patch 18/91] lib: extend the scope of small_const_nbits() macro
  2021-05-07  1:01 incoming Andrew Morton
                   ` (16 preceding siblings ...)
  2021-05-07  1:02 ` [patch 17/91] arch: rearrange headers inclusion order in asm/bitops for m68k, sh and h8300 Andrew Morton
@ 2021-05-07  1:02 ` Andrew Morton
  2021-05-07  1:03 ` [patch 19/91] tools: sync small_const_nbits() macro with the kernel Andrew Morton
                   ` (73 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:02 UTC (permalink / raw)
  To: aklimov, akpm, andy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: lib: extend the scope of small_const_nbits() macro

find_bit would also benefit from small_const_nbits() optimizations.  The
detailed comment is provided by Rasmus Villemoes.
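
A minimal userspace sketch of the pattern follows.  The two macros are
copied from the patches in this series; demo_weight() and
demo_weight_slow() are hypothetical helpers, and the code assumes a
GCC- or Clang-compatible compiler for the builtins:

```c
#include <limits.h>

#define BITS_PER_LONG	((int)(sizeof(unsigned long) * CHAR_BIT))

/* Same shape as the macro in the patch; relies on a GCC/Clang builtin. */
#define small_const_nbits(nbits) \
	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)

#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

/* Hypothetical out-of-line fallback that walks the bitmap bit by bit. */
static unsigned int demo_weight_slow(const unsigned long *src,
				     unsigned int nbits)
{
	unsigned int i, w = 0;

	for (i = 0; i < nbits; i++)
		w += (src[i / BITS_PER_LONG] >> (i % BITS_PER_LONG)) & 1;
	return w;
}

/*
 * Hypothetical inline front end.  When the compiler can prove that nbits
 * is a constant in [1, BITS_PER_LONG], only the first word is read and no
 * call is emitted.  In unoptimized builds the check may evaluate to false,
 * in which case the generic path is used and the result is the same.
 */
static inline unsigned int demo_weight(const unsigned long *src,
				       unsigned int nbits)
{
	if (small_const_nbits(nbits))
		return __builtin_popcountl(*src & BITMAP_LAST_WORD_MASK(nbits));
	return demo_weight_slow(src, nbits);
}
```

Because nbits == 0 is excluded, the fast path can dereference src
unconditionally, as the comment added by this patch explains.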

Link: https://lkml.kernel.org/r/20210401003153.97325-6-yury.norov@gmail.com
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/asm-generic/bitsperlong.h |   12 ++++++++++++
 include/linux/bitmap.h            |    8 --------
 2 files changed, 12 insertions(+), 8 deletions(-)

--- a/include/asm-generic/bitsperlong.h~lib-extend-the-scope-of-small_const_nbits-macro
+++ a/include/asm-generic/bitsperlong.h
@@ -23,4 +23,16 @@
 #define BITS_PER_LONG_LONG 64
 #endif
 
+/*
+ * small_const_nbits(n) is true precisely when it is known at compile-time
+ * that BITMAP_SIZE(n) is 1, i.e. 1 <= n <= BITS_PER_LONG. This allows
+ * various bit/bitmap APIs to provide a fast inline implementation. Bitmaps
+ * of size 0 are very rare, and a compile-time-known-size 0 is most likely
+ * a sign of error. They will be handled correctly by the bit/bitmap APIs,
+ * but using the out-of-line functions, so that the inline implementations
+ * can unconditionally dereference the pointer(s).
+ */
+#define small_const_nbits(nbits) \
+	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
+
 #endif /* __ASM_GENERIC_BITS_PER_LONG */
--- a/include/linux/bitmap.h~lib-extend-the-scope-of-small_const_nbits-macro
+++ a/include/linux/bitmap.h
@@ -223,14 +223,6 @@ extern int bitmap_print_to_pagebuf(bool
 #define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
 #define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
 
-/*
- * The static inlines below do not handle constant nbits==0 correctly,
- * so make such users (should any ever turn up) call the out-of-line
- * versions.
- */
-#define small_const_nbits(nbits) \
-	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
-
 static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
 {
 	unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
_


* [patch 19/91] tools: sync small_const_nbits() macro with the kernel
  2021-05-07  1:01 incoming Andrew Morton
                   ` (17 preceding siblings ...)
  2021-05-07  1:02 ` [patch 18/91] lib: extend the scope of small_const_nbits() macro Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 20/91] lib: inline _find_next_bit() wrappers Andrew Morton
                   ` (72 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: aklimov, akpm, andriy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: tools: sync small_const_nbits() macro with the kernel

Sync the implementation with the kernel and move the macro from
tools/include/linux/bitmap.h to tools/include/asm-generic/bitsperlong.h.

Link: https://lkml.kernel.org/r/20210401003153.97325-7-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/include/asm-generic/bitsperlong.h |    3 +++
 tools/include/linux/bitmap.h            |    3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

--- a/tools/include/asm-generic/bitsperlong.h~tools-sync-small_const_nbits-macro-with-the-kernel
+++ a/tools/include/asm-generic/bitsperlong.h
@@ -18,4 +18,7 @@
 #define BITS_PER_LONG_LONG 64
 #endif
 
+#define small_const_nbits(nbits) \
+	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
+
 #endif /* __ASM_GENERIC_BITS_PER_LONG */
--- a/tools/include/linux/bitmap.h~tools-sync-small_const_nbits-macro-with-the-kernel
+++ a/tools/include/linux/bitmap.h
@@ -22,9 +22,6 @@ void bitmap_clear(unsigned long *map, un
 #define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
 #define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
 
-#define small_const_nbits(nbits) \
-	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
-
 static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
_


* [patch 20/91] lib: inline _find_next_bit() wrappers
  2021-05-07  1:01 incoming Andrew Morton
                   ` (18 preceding siblings ...)
  2021-05-07  1:03 ` [patch 19/91] tools: sync small_const_nbits() macro with the kernel Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 21/91] tools: sync find_next_bit implementation Andrew Morton
                   ` (71 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: aklimov, akpm, andy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: lib: inline _find_next_bit() wrappers

lib/find_bit.c defines five single-line wrappers for _find_next_bit().
Turning those wrappers into inline functions eliminates unneeded function
calls and opens room for compile-time optimizations.
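
The shape of this change can be sketched in userspace as follows.  This is
a simplified model, not the kernel implementation: the demo_* names are
hypothetical, little-endian handling is omitted, and __builtin_ctzl
assumes a GCC- or Clang-compatible compiler.  The wrapper arguments (NULL
or a second bitmap, 0UL or ~0UL for invert) mirror the ones in the diff
below:

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/*
 * Simplified model of the shared search helper.  "invert" is XORed into
 * each word, so ~0UL searches for zero bits; "addr2", if non-NULL, is
 * ANDed with "addr1" first.
 */
static unsigned long demo_find_next(const unsigned long *addr1,
				    const unsigned long *addr2,
				    unsigned long nbits, unsigned long start,
				    unsigned long invert)
{
	unsigned long tmp;

	if (start >= nbits)
		return nbits;

	tmp = addr1[start / BITS_PER_LONG];
	if (addr2)
		tmp &= addr2[start / BITS_PER_LONG];
	tmp ^= invert;
	/* Ignore bits below "start" in the first word. */
	tmp &= ~0UL << (start % BITS_PER_LONG);
	start -= start % BITS_PER_LONG;

	while (!tmp) {
		start += BITS_PER_LONG;
		if (start >= nbits)
			return nbits;
		tmp = addr1[start / BITS_PER_LONG];
		if (addr2)
			tmp &= addr2[start / BITS_PER_LONG];
		tmp ^= invert;
	}

	start += (unsigned long)__builtin_ctzl(tmp);
	return start < nbits ? start : nbits;
}

/* Single-line wrappers, visible to the compiler at every call site. */
static inline unsigned long demo_find_next_bit(const unsigned long *addr,
		unsigned long size, unsigned long offset)
{
	return demo_find_next(addr, NULL, size, offset, 0UL);
}

static inline unsigned long demo_find_next_zero_bit(const unsigned long *addr,
		unsigned long size, unsigned long offset)
{
	return demo_find_next(addr, NULL, size, offset, ~0UL);
}

static inline unsigned long demo_find_next_and_bit(const unsigned long *addr1,
		const unsigned long *addr2, unsigned long size,
		unsigned long offset)
{
	return demo_find_next(addr1, addr2, size, offset, 0UL);
}
```

With the wrappers inline, constant arguments such as NULL and 0UL are
known at the call site, which is what lets later patches in the series add
fast paths in the header.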

Link: https://lkml.kernel.org/r/20210401003153.97325-8-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/asm-generic/bitops/find.h |   28 +++++++++++---
 include/asm-generic/bitops/le.h   |   17 ++++++--
 lib/find_bit.c                    |   56 +---------------------------
 3 files changed, 37 insertions(+), 64 deletions(-)

--- a/include/asm-generic/bitops/find.h~lib-inline-_find_next_bit-wrappers
+++ a/include/asm-generic/bitops/find.h
@@ -2,6 +2,10 @@
 #ifndef _ASM_GENERIC_BITOPS_FIND_H_
 #define _ASM_GENERIC_BITOPS_FIND_H_
 
+extern unsigned long _find_next_bit(const unsigned long *addr1,
+		const unsigned long *addr2, unsigned long nbits,
+		unsigned long start, unsigned long invert, unsigned long le);
+
 #ifndef find_next_bit
 /**
  * find_next_bit - find the next set bit in a memory region
@@ -12,8 +16,12 @@
  * Returns the bit number for the next set bit
  * If no bits are set, returns @size.
  */
-extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
-		size, unsigned long offset);
+static inline
+unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
+			    unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
+}
 #endif
 
 #ifndef find_next_and_bit
@@ -27,9 +35,13 @@ extern unsigned long find_next_bit(const
  * Returns the bit number for the next set bit
  * If no bits are set, returns @size.
  */
-extern unsigned long find_next_and_bit(const unsigned long *addr1,
+static inline
+unsigned long find_next_and_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long size,
-		unsigned long offset);
+		unsigned long offset)
+{
+	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
+}
 #endif
 
 #ifndef find_next_zero_bit
@@ -42,8 +54,12 @@ extern unsigned long find_next_and_bit(c
  * Returns the bit number of the next zero bit
  * If no bits are zero, returns @size.
  */
-extern unsigned long find_next_zero_bit(const unsigned long *addr, unsigned
-		long size, unsigned long offset);
+static inline
+unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
+				 unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
+}
 #endif
 
 #ifdef CONFIG_GENERIC_FIND_FIRST_BIT
--- a/include/asm-generic/bitops/le.h~lib-inline-_find_next_bit-wrappers
+++ a/include/asm-generic/bitops/le.h
@@ -2,6 +2,7 @@
 #ifndef _ASM_GENERIC_BITOPS_LE_H_
 #define _ASM_GENERIC_BITOPS_LE_H_
 
+#include <asm-generic/bitops/find.h>
 #include <asm/types.h>
 #include <asm/byteorder.h>
 
@@ -32,13 +33,21 @@ static inline unsigned long find_first_z
 #define BITOP_LE_SWIZZLE	((BITS_PER_LONG-1) & ~0x7)
 
 #ifndef find_next_zero_bit_le
-extern unsigned long find_next_zero_bit_le(const void *addr,
-		unsigned long size, unsigned long offset);
+static inline
+unsigned long find_next_zero_bit_le(const void *addr, unsigned
+		long size, unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, ~0UL, 1);
+}
 #endif
 
 #ifndef find_next_bit_le
-extern unsigned long find_next_bit_le(const void *addr,
-		unsigned long size, unsigned long offset);
+static inline
+unsigned long find_next_bit_le(const void *addr, unsigned
+		long size, unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, 0UL, 1);
+}
 #endif
 
 #ifndef find_first_zero_bit_le
--- a/lib/find_bit.c~lib-inline-_find_next_bit-wrappers
+++ a/lib/find_bit.c
@@ -29,7 +29,7 @@
  *    searching it for one bits.
  *  - The optional "addr2", which is anded with "addr1" if present.
  */
-static unsigned long _find_next_bit(const unsigned long *addr1,
+unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
 		unsigned long start, unsigned long invert, unsigned long le)
 {
@@ -68,37 +68,7 @@ static unsigned long _find_next_bit(cons
 
 	return min(start + __ffs(tmp), nbits);
 }
-#endif
-
-#ifndef find_next_bit
-/*
- * Find the next set bit in a memory region.
- */
-unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
-			    unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
-}
-EXPORT_SYMBOL(find_next_bit);
-#endif
-
-#ifndef find_next_zero_bit
-unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
-				 unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
-}
-EXPORT_SYMBOL(find_next_zero_bit);
-#endif
-
-#if !defined(find_next_and_bit)
-unsigned long find_next_and_bit(const unsigned long *addr1,
-		const unsigned long *addr2, unsigned long size,
-		unsigned long offset)
-{
-	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
-}
-EXPORT_SYMBOL(find_next_and_bit);
+EXPORT_SYMBOL(_find_next_bit);
 #endif
 
 #ifndef find_first_bit
@@ -157,28 +127,6 @@ unsigned long find_last_bit(const unsign
 EXPORT_SYMBOL(find_last_bit);
 #endif
 
-#ifdef __BIG_ENDIAN
-
-#ifndef find_next_zero_bit_le
-unsigned long find_next_zero_bit_le(const void *addr, unsigned
-		long size, unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, ~0UL, 1);
-}
-EXPORT_SYMBOL(find_next_zero_bit_le);
-#endif
-
-#ifndef find_next_bit_le
-unsigned long find_next_bit_le(const void *addr, unsigned
-		long size, unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, 0UL, 1);
-}
-EXPORT_SYMBOL(find_next_bit_le);
-#endif
-
-#endif /* __BIG_ENDIAN */
-
 unsigned long find_next_clump8(unsigned long *clump, const unsigned long *addr,
 			       unsigned long size, unsigned long offset)
 {
_

* [patch 21/91] tools: sync find_next_bit implementation
  2021-05-07  1:01 incoming Andrew Morton
                   ` (19 preceding siblings ...)
  2021-05-07  1:03 ` [patch 20/91] lib: inline _find_next_bit() wrappers Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 22/91] lib: add fast path for find_next_*_bit() Andrew Morton
                   ` (70 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: aklimov, akpm, andriy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: tools: sync find_next_bit implementation

Sync the implementation with recent kernel changes.
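
For readers without the kernel tree at hand, here is a minimal userspace
sketch of how one shared helper can serve find_next_bit(),
find_next_zero_bit() and find_next_and_bit() through its "addr2" and
"invert" arguments.  All sk_* names are hypothetical, the "le" argument is
left out (the tools copy ignores it anyway), and the loop body is an
assumption based on the hunks shown in this series, not a copy of the
kernel source.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SK_BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Index of the lowest set bit; the caller guarantees w != 0. */
static unsigned long sk_ffs(unsigned long w)
{
	unsigned long n = 0;

	assert(w != 0);
	while (!(w & 1UL)) {
		w >>= 1;
		n++;
	}
	return n;
}

/* Load one word, AND in the optional second bitmap, then apply invert. */
static unsigned long sk_load(const unsigned long *addr1,
		const unsigned long *addr2, unsigned long idx,
		unsigned long invert)
{
	unsigned long tmp = addr1[idx];

	if (addr2)
		tmp &= addr2[idx];
	return tmp ^ invert;
}

/* Hypothetical stand-in for the shared helper (no "le" handling). */
static unsigned long sk_find_next(const unsigned long *addr1,
		const unsigned long *addr2, unsigned long nbits,
		unsigned long start, unsigned long invert)
{
	unsigned long tmp, pos;

	if (start >= nbits)
		return nbits;

	/* Handle the first word: drop the bits below start. */
	tmp = sk_load(addr1, addr2, start / SK_BITS_PER_LONG, invert);
	tmp &= ~0UL << (start % SK_BITS_PER_LONG);
	start -= start % SK_BITS_PER_LONG;

	while (!tmp) {
		start += SK_BITS_PER_LONG;
		if (start >= nbits)
			return nbits;
		tmp = sk_load(addr1, addr2, start / SK_BITS_PER_LONG, invert);
	}

	pos = start + sk_ffs(tmp);
	return pos < nbits ? pos : nbits;
}

/* Thin wrappers, mirroring the shape of the inline wrappers in find.h. */
static unsigned long sk_find_next_bit(const unsigned long *addr,
		unsigned long size, unsigned long offset)
{
	return sk_find_next(addr, NULL, size, offset, 0UL);
}

static unsigned long sk_find_next_zero_bit(const unsigned long *addr,
		unsigned long size, unsigned long offset)
{
	return sk_find_next(addr, NULL, size, offset, ~0UL);
}

static unsigned long sk_find_next_and_bit(const unsigned long *addr1,
		const unsigned long *addr2, unsigned long size,
		unsigned long offset)
{
	return sk_find_next(addr1, addr2, size, offset, 0UL);
}
```

Searching for zero bits is the same walk with every word XORed against
~0UL, which is why one exported helper is enough for all three wrappers.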

Link: https://lkml.kernel.org/r/20210401003153.97325-9-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/include/asm-generic/bitops/find.h |   27 ++++++++---
 tools/lib/find_bit.c                    |   52 ++++++++--------------
 2 files changed, 42 insertions(+), 37 deletions(-)

--- a/tools/include/asm-generic/bitops/find.h~tools-sync-find_next_bit-implementation
+++ a/tools/include/asm-generic/bitops/find.h
@@ -2,6 +2,10 @@
 #ifndef _TOOLS_LINUX_ASM_GENERIC_BITOPS_FIND_H_
 #define _TOOLS_LINUX_ASM_GENERIC_BITOPS_FIND_H_
 
+extern unsigned long _find_next_bit(const unsigned long *addr1,
+		const unsigned long *addr2, unsigned long nbits,
+		unsigned long start, unsigned long invert, unsigned long le);
+
 #ifndef find_next_bit
 /**
  * find_next_bit - find the next set bit in a memory region
@@ -12,8 +16,12 @@
  * Returns the bit number for the next set bit
  * If no bits are set, returns @size.
  */
-extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
-		size, unsigned long offset);
+static inline
+unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
+			    unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
+}
 #endif
 
 #ifndef find_next_and_bit
@@ -27,13 +35,16 @@ extern unsigned long find_next_bit(const
  * Returns the bit number for the next set bit
  * If no bits are set, returns @size.
  */
-extern unsigned long find_next_and_bit(const unsigned long *addr1,
+static inline
+unsigned long find_next_and_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long size,
-		unsigned long offset);
+		unsigned long offset)
+{
+	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
+}
 #endif
 
 #ifndef find_next_zero_bit
-
 /**
  * find_next_zero_bit - find the next cleared bit in a memory region
  * @addr: The address to base the search on
@@ -43,8 +54,12 @@ extern unsigned long find_next_and_bit(c
  * Returns the bit number of the next zero bit
  * If no bits are zero, returns @size.
  */
+static inline
 unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
-				 unsigned long offset);
+				 unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
+}
 #endif
 
 #ifndef find_first_bit
--- a/tools/lib/find_bit.c~tools-sync-find_next_bit-implementation
+++ a/tools/lib/find_bit.c
@@ -28,11 +28,12 @@
  *    searching it for one bits.
  *  - The optional "addr2", which is anded with "addr1" if present.
  */
-static inline unsigned long _find_next_bit(const unsigned long *addr1,
+unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
-		unsigned long start, unsigned long invert)
+		unsigned long start, unsigned long invert, unsigned long le)
 {
-	unsigned long tmp;
+	unsigned long tmp, mask;
+	(void) le;
 
 	if (unlikely(start >= nbits))
 		return nbits;
@@ -43,7 +44,19 @@ static inline unsigned long _find_next_b
 	tmp ^= invert;
 
 	/* Handle 1st word. */
-	tmp &= BITMAP_FIRST_WORD_MASK(start);
+	mask = BITMAP_FIRST_WORD_MASK(start);
+
+	/*
+	 * Due to the lack of swab() in tools, and the fact that it doesn't
+	 * need little-endian support, just comment it out
+	 */
+#if (0)
+	if (le)
+		mask = swab(mask);
+#endif
+
+	tmp &= mask;
+
 	start = round_down(start, BITS_PER_LONG);
 
 	while (!tmp) {
@@ -57,18 +70,12 @@ static inline unsigned long _find_next_b
 		tmp ^= invert;
 	}
 
-	return min(start + __ffs(tmp), nbits);
-}
+#if (0)
+	if (le)
+		tmp = swab(tmp);
 #endif
 
-#ifndef find_next_bit
-/*
- * Find the next set bit in a memory region.
- */
-unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
-			    unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, 0UL);
+	return min(start + __ffs(tmp), nbits);
 }
 #endif
 
@@ -105,20 +112,3 @@ unsigned long find_first_zero_bit(const
 	return size;
 }
 #endif
-
-#ifndef find_next_zero_bit
-unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
-				 unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, ~0UL);
-}
-#endif
-
-#ifndef find_next_and_bit
-unsigned long find_next_and_bit(const unsigned long *addr1,
-		const unsigned long *addr2, unsigned long size,
-		unsigned long offset)
-{
-	return _find_next_bit(addr1, addr2, size, offset, 0UL);
-}
-#endif
_

* [patch 22/91] lib: add fast path for find_next_*_bit()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (20 preceding siblings ...)
  2021-05-07  1:03 ` [patch 21/91] tools: sync find_next_bit implementation Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 23/91] lib: add fast path for find_first_*_bit() and find_last_bit() Andrew Morton
                   ` (69 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: aklimov, akpm, andy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: lib: add fast path for find_next_*_bit()

As with the bitmap functions, find_next_*_bit() users benefit if bitmaps
that fit into a single word are handled inline.  In the best case, the
compiler can replace the function call with a few instructions.

This is a quite typical find_next_bit() user:

	unsigned int cpumask_next(int n, const struct cpumask *srcp)
	{
		/* -1 is a legal arg here. */
		if (n != -1)
			cpumask_check(n);
		return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
	}
	EXPORT_SYMBOL(cpumask_next);

Currently, on ARM64 the generated code looks like this:
	0000000000000000 <cpumask_next>:
	   0:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
	   4:   11000402        add     w2, w0, #0x1
	   8:   aa0103e0        mov     x0, x1
	   c:   d2800401        mov     x1, #0x40                       // #64
	  10:   910003fd        mov     x29, sp
	  14:   93407c42        sxtw    x2, w2
	  18:   94000000        bl      0 <find_next_bit>
	  1c:   a8c17bfd        ldp     x29, x30, [sp], #16
	  20:   d65f03c0        ret
	  24:   d503201f        nop

After applying this patch:
	0000000000000140 <cpumask_next>:
	 140:   11000400        add     w0, w0, #0x1
	 144:   93407c00        sxtw    x0, w0
	 148:   f100fc1f        cmp     x0, #0x3f
	 14c:   54000168        b.hi    178 <cpumask_next+0x38>  // b.pmore
	 150:   f9400023        ldr     x3, [x1]
	 154:   92800001        mov     x1, #0xffffffffffffffff         // #-1
	 158:   9ac02020        lsl     x0, x1, x0
	 15c:   52800802        mov     w2, #0x40                       // #64
	 160:   8a030001        and     x1, x0, x3
	 164:   dac00020        rbit    x0, x1
	 168:   f100003f        cmp     x1, #0x0
	 16c:   dac01000        clz     x0, x0
	 170:   1a800040        csel    w0, w2, w0, eq  // eq = none
	 174:   d65f03c0        ret
	 178:   52800800        mov     w0, #0x40                       // #64
	 17c:   d65f03c0        ret

The find_next_bit() call is replaced with 6 instructions.  find_next_bit()
itself is 41 instructions plus function call overhead.

Despite the inlining, scripts/bloat-o-meter reports a smaller .text size
after applying the series:
	add/remove: 11/9 grow/shrink: 233/176 up/down: 5780/-6768 (-988)
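
The single-word fast path described above can be sketched in plain
userspace C as follows.  The sk_* names are hypothetical, sk_genmask() is
an assumed equivalent of the kernel's GENMASK(), and the assert() stands
in for the small_const_nbits() check, which is assumed here to require a
size between 1 and BITS_PER_LONG.

```c
#include <assert.h>
#include <limits.h>

#define SK_BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Mask with bits l..h set (h >= l, h < SK_BITS_PER_LONG). */
static unsigned long sk_genmask(unsigned long h, unsigned long l)
{
	return (~0UL << l) & (~0UL >> (SK_BITS_PER_LONG - 1 - h));
}

/* Index of the lowest set bit; the caller guarantees w != 0. */
static unsigned long sk_ffs(unsigned long w)
{
	unsigned long n = 0;

	while (!(w & 1UL)) {
		w >>= 1;
		n++;
	}
	return n;
}

/* Next set bit at or after offset in a bitmap of one word. */
static unsigned long sk_next_bit_small(unsigned long word,
		unsigned long size, unsigned long offset)
{
	unsigned long val;

	assert(size >= 1 && size <= SK_BITS_PER_LONG);
	if (offset >= size)
		return size;

	val = word & sk_genmask(size - 1, offset);
	return val ? sk_ffs(val) : size;
}

/* Next zero bit at or after offset in a bitmap of one word. */
static unsigned long sk_next_zero_bit_small(unsigned long word,
		unsigned long size, unsigned long offset)
{
	unsigned long val;

	assert(size >= 1 && size <= SK_BITS_PER_LONG);
	if (offset >= size)
		return size;

	/* Force everything outside offset..size-1 to one, then find a zero. */
	val = word | ~sk_genmask(size - 1, offset);
	return val == ~0UL ? size : sk_ffs(~val);
}
```

With a compile-time constant size, the range check and the mask reduce to
a compare, a shift and an AND, which is in line with the short ARM64
listing above.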

Link: https://lkml.kernel.org/r/20210401003153.97325-10-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/asm-generic/bitops/find.h |   30 ++++++++++++++++++++++++++++
 include/asm-generic/bitops/le.h   |   21 +++++++++++++++++++
 2 files changed, 51 insertions(+)

--- a/include/asm-generic/bitops/find.h~lib-add-fast-path-for-find_next__bit
+++ a/include/asm-generic/bitops/find.h
@@ -20,6 +20,16 @@ static inline
 unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
 			    unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
 }
 #endif
@@ -40,6 +50,16 @@ unsigned long find_next_and_bit(const un
 		const unsigned long *addr2, unsigned long size,
 		unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr1 & *addr2 & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
 }
 #endif
@@ -58,6 +78,16 @@ static inline
 unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
 				 unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr | ~GENMASK(size - 1, offset);
+		return val == ~0UL ? size : ffz(val);
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
 }
 #endif
--- a/include/asm-generic/bitops/le.h~lib-add-fast-path-for-find_next__bit
+++ a/include/asm-generic/bitops/le.h
@@ -5,6 +5,7 @@
 #include <asm-generic/bitops/find.h>
 #include <asm/types.h>
 #include <asm/byteorder.h>
+#include <linux/swab.h>
 
 #if defined(__LITTLE_ENDIAN)
 
@@ -37,6 +38,16 @@ static inline
 unsigned long find_next_zero_bit_le(const void *addr, unsigned
 		long size, unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val = *(const unsigned long *)addr;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = swab(val) | ~GENMASK(size - 1, offset);
+		return val == ~0UL ? size : ffz(val);
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, ~0UL, 1);
 }
 #endif
@@ -46,6 +57,16 @@ static inline
 unsigned long find_next_bit_le(const void *addr, unsigned
 		long size, unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val = *(const unsigned long *)addr;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = swab(val) & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, 0UL, 1);
 }
 #endif
_

* [patch 23/91] lib: add fast path for find_first_*_bit() and find_last_bit()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (21 preceding siblings ...)
  2021-05-07  1:03 ` [patch 22/91] lib: add fast path for find_next_*_bit() Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 24/91] tools: sync lib/find_bit implementation Andrew Morton
                   ` (68 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: aklimov, akpm, andy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: lib: add fast path for find_first_*_bit() and find_last_bit()

As with the bitmap functions, users benefit if small bitmaps that fit into
a single word are handled inline.

While here, move the find_last_bit() declaration to bitops/find.h, where
the other find_*_bit() functions sit.
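
A minimal userspace sketch of the three single-word fast paths follows.
The sk_* names are hypothetical; sk_genmask(), sk_ffs() and sk_fls() are
assumed stand-ins for the kernel's GENMASK(), __ffs() and __fls(), and the
assert() replaces the small_const_nbits() check.

```c
#include <assert.h>
#include <limits.h>

#define SK_BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Mask with bits l..h set (h >= l, h < SK_BITS_PER_LONG). */
static unsigned long sk_genmask(unsigned long h, unsigned long l)
{
	return (~0UL << l) & (~0UL >> (SK_BITS_PER_LONG - 1 - h));
}

/* Index of the lowest set bit; the caller guarantees w != 0. */
static unsigned long sk_ffs(unsigned long w)
{
	unsigned long n = 0;

	while (!(w & 1UL)) {
		w >>= 1;
		n++;
	}
	return n;
}

/* Index of the highest set bit; the caller guarantees w != 0. */
static unsigned long sk_fls(unsigned long w)
{
	unsigned long n = 0;

	while (w >>= 1)
		n++;
	return n;
}

static unsigned long sk_first_bit_small(unsigned long word, unsigned long size)
{
	unsigned long val;

	assert(size >= 1 && size <= SK_BITS_PER_LONG);
	val = word & sk_genmask(size - 1, 0);
	return val ? sk_ffs(val) : size;
}

static unsigned long sk_first_zero_bit_small(unsigned long word,
		unsigned long size)
{
	unsigned long val;

	assert(size >= 1 && size <= SK_BITS_PER_LONG);
	/* Bits at or above size are forced to one so they never match. */
	val = word | ~sk_genmask(size - 1, 0);
	return val == ~0UL ? size : sk_ffs(~val);
}

static unsigned long sk_last_bit_small(unsigned long word, unsigned long size)
{
	unsigned long val;

	assert(size >= 1 && size <= SK_BITS_PER_LONG);
	val = word & sk_genmask(size - 1, 0);
	return val ? sk_fls(val) : size;
}
```

All three return size when nothing is found, matching the kernel-doc
comments for the out-of-line versions.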

Link: https://lkml.kernel.org/r/20210401003153.97325-11-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/asm-generic/bitops/find.h |   50 +++++++++++++++++++++++++---
 include/linux/bitops.h            |   12 ------
 lib/find_bit.c                    |   12 +++---
 3 files changed, 52 insertions(+), 22 deletions(-)

--- a/include/asm-generic/bitops/find.h~lib-add-fast-path-for-find_first__bit-and-find_last_bit
+++ a/include/asm-generic/bitops/find.h
@@ -5,6 +5,9 @@
 extern unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
 		unsigned long start, unsigned long invert, unsigned long le);
+extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
+extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
+extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
 
 #ifndef find_next_bit
 /**
@@ -102,8 +105,17 @@ unsigned long find_next_zero_bit(const u
  * Returns the bit number of the first set bit.
  * If no bits are set, returns @size.
  */
-extern unsigned long find_first_bit(const unsigned long *addr,
-				    unsigned long size);
+static inline
+unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr & GENMASK(size - 1, 0);
+
+		return val ? __ffs(val) : size;
+	}
+
+	return _find_first_bit(addr, size);
+}
 
 /**
  * find_first_zero_bit - find the first cleared bit in a memory region
@@ -113,8 +125,17 @@ extern unsigned long find_first_bit(cons
  * Returns the bit number of the first cleared bit.
  * If no bits are zero, returns @size.
  */
-extern unsigned long find_first_zero_bit(const unsigned long *addr,
-					 unsigned long size);
+static inline
+unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr | ~GENMASK(size - 1, 0);
+
+		return val == ~0UL ? size : ffz(val);
+	}
+
+	return _find_first_zero_bit(addr, size);
+}
 #else /* CONFIG_GENERIC_FIND_FIRST_BIT */
 
 #ifndef find_first_bit
@@ -126,6 +147,27 @@ extern unsigned long find_first_zero_bit
 
 #endif /* CONFIG_GENERIC_FIND_FIRST_BIT */
 
+#ifndef find_last_bit
+/**
+ * find_last_bit - find the last set bit in a memory region
+ * @addr: The address to start the search at
+ * @size: The number of bits to search
+ *
+ * Returns the bit number of the last set bit, or size.
+ */
+static inline
+unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr & GENMASK(size - 1, 0);
+
+		return val ? __fls(val) : size;
+	}
+
+	return _find_last_bit(addr, size);
+}
+#endif
+
 /**
  * find_next_clump8 - find next 8-bit clump with set bits in a memory region
  * @clump: location to store copy of found clump
--- a/include/linux/bitops.h~lib-add-fast-path-for-find_first__bit-and-find_last_bit
+++ a/include/linux/bitops.h
@@ -286,17 +286,5 @@ static __always_inline void __assign_bit
 })
 #endif
 
-#ifndef find_last_bit
-/**
- * find_last_bit - find the last set bit in a memory region
- * @addr: The address to start the search at
- * @size: The number of bits to search
- *
- * Returns the bit number of the last set bit, or size.
- */
-extern unsigned long find_last_bit(const unsigned long *addr,
-				   unsigned long size);
-#endif
-
 #endif /* __KERNEL__ */
 #endif
--- a/lib/find_bit.c~lib-add-fast-path-for-find_first__bit-and-find_last_bit
+++ a/lib/find_bit.c
@@ -75,7 +75,7 @@ EXPORT_SYMBOL(_find_next_bit);
 /*
  * Find the first set bit in a memory region.
  */
-unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_first_bit(const unsigned long *addr, unsigned long size)
 {
 	unsigned long idx;
 
@@ -86,14 +86,14 @@ unsigned long find_first_bit(const unsig
 
 	return size;
 }
-EXPORT_SYMBOL(find_first_bit);
+EXPORT_SYMBOL(_find_first_bit);
 #endif
 
 #ifndef find_first_zero_bit
 /*
  * Find the first cleared bit in a memory region.
  */
-unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size)
 {
 	unsigned long idx;
 
@@ -104,11 +104,11 @@ unsigned long find_first_zero_bit(const
 
 	return size;
 }
-EXPORT_SYMBOL(find_first_zero_bit);
+EXPORT_SYMBOL(_find_first_zero_bit);
 #endif
 
 #ifndef find_last_bit
-unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_last_bit(const unsigned long *addr, unsigned long size)
 {
 	if (size) {
 		unsigned long val = BITMAP_LAST_WORD_MASK(size);
@@ -124,7 +124,7 @@ unsigned long find_last_bit(const unsign
 	}
 	return size;
 }
-EXPORT_SYMBOL(find_last_bit);
+EXPORT_SYMBOL(_find_last_bit);
 #endif
 
 unsigned long find_next_clump8(unsigned long *clump, const unsigned long *addr,
_

* [patch 24/91] tools: sync lib/find_bit implementation
  2021-05-07  1:01 incoming Andrew Morton
                   ` (22 preceding siblings ...)
  2021-05-07  1:03 ` [patch 23/91] lib: add fast path for find_first_*_bit() and find_last_bit() Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 25/91] MAINTAINERS: add entry for the bitmap API Andrew Morton
                   ` (67 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: aklimov, akpm, andriy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: tools: sync lib/find_bit implementation

Add fast paths to the find_*_bit() functions, as in the kernel implementation.

Link: https://lkml.kernel.org/r/20210401003153.97325-12-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/include/asm-generic/bitops/find.h |   58 ++++++++++++++++++++--
 tools/lib/find_bit.c                    |    4 -
 2 files changed, 57 insertions(+), 5 deletions(-)

--- a/tools/include/asm-generic/bitops/find.h~tools-sync-lib-find_bit-implementation
+++ a/tools/include/asm-generic/bitops/find.h
@@ -5,6 +5,9 @@
 extern unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
 		unsigned long start, unsigned long invert, unsigned long le);
+extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
+extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
+extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
 
 #ifndef find_next_bit
 /**
@@ -20,6 +23,16 @@ static inline
 unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
 			    unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
 }
 #endif
@@ -40,6 +53,16 @@ unsigned long find_next_and_bit(const un
 		const unsigned long *addr2, unsigned long size,
 		unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr1 & *addr2 & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
 }
 #endif
@@ -58,6 +81,16 @@ static inline
 unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
 				 unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr | ~GENMASK(size - 1, offset);
+		return val == ~0UL ? size : ffz(val);
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
 }
 #endif
@@ -72,8 +105,17 @@ unsigned long find_next_zero_bit(const u
  * Returns the bit number of the first set bit.
  * If no bits are set, returns @size.
  */
-extern unsigned long find_first_bit(const unsigned long *addr,
-				    unsigned long size);
+static inline
+unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr & GENMASK(size - 1, 0);
+
+		return val ? __ffs(val) : size;
+	}
+
+	return _find_first_bit(addr, size);
+}
 
 #endif /* find_first_bit */
 
@@ -87,7 +129,17 @@ extern unsigned long find_first_bit(cons
  * Returns the bit number of the first cleared bit.
  * If no bits are zero, returns @size.
  */
-unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size);
+static inline
+unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr | ~GENMASK(size - 1, 0);
+
+		return val == ~0UL ? size : ffz(val);
+	}
+
+	return _find_first_zero_bit(addr, size);
+}
 #endif
 
 #endif /*_TOOLS_LINUX_ASM_GENERIC_BITOPS_FIND_H_ */
--- a/tools/lib/find_bit.c~tools-sync-lib-find_bit-implementation
+++ a/tools/lib/find_bit.c
@@ -83,7 +83,7 @@ unsigned long _find_next_bit(const unsig
 /*
  * Find the first set bit in a memory region.
  */
-unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_first_bit(const unsigned long *addr, unsigned long size)
 {
 	unsigned long idx;
 
@@ -100,7 +100,7 @@ unsigned long find_first_bit(const unsig
 /*
  * Find the first cleared bit in a memory region.
  */
-unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size)
 {
 	unsigned long idx;
 
_

* [patch 25/91] MAINTAINERS: add entry for the bitmap API
  2021-05-07  1:01 incoming Andrew Morton
                   ` (23 preceding siblings ...)
  2021-05-07  1:03 ` [patch 24/91] tools: sync lib/find_bit implementation Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 26/91] lib/bch.c: fix a typo in the file bch.c Andrew Morton
                   ` (66 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: aklimov, akpm, andriy.shevchenko, arnd, dalias, dennis, dsterba,
	geert, glaubitz, jianpeng.ma, joe, jpoimboe, linux-mm, linux,
	mm-commits, richard.weiyang, sbrivio, torvalds, wsa+renesas,
	ysato, yury.norov

From: Yury Norov <yury.norov@gmail.com>
Subject: MAINTAINERS: add entry for the bitmap API

Add myself as maintainer for the bitmap API, and Andy and Rasmus as reviewers.

Link: https://lkml.kernel.org/r/20210401003153.97325-13-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 MAINTAINERS |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- a/MAINTAINERS~maintainers-add-entry-for-the-bitmap-api
+++ a/MAINTAINERS
@@ -3205,6 +3205,22 @@ F:	Documentation/filesystems/bfs.rst
 F:	fs/bfs/
 F:	include/uapi/linux/bfs_fs.h
 
+BITMAP API
+M:	Yury Norov <yury.norov@gmail.com>
+R:	Andy Shevchenko <andriy.shevchenko@linux.intel.com>
+R:	Rasmus Villemoes <linux@rasmusvillemoes.dk>
+S:	Maintained
+F:	include/asm-generic/bitops/find.h
+F:	include/linux/bitmap.h
+F:	lib/bitmap.c
+F:	lib/find_bit.c
+F:	lib/find_bit_benchmark.c
+F:	lib/test_bitmap.c
+F:	tools/include/asm-generic/bitops/find.h
+F:	tools/include/linux/bitmap.h
+F:	tools/lib/bitmap.c
+F:	tools/lib/find_bit.c
+
 BLINKM RGB LED DRIVER
 M:	Jan-Simon Moeller <jansimon.moeller@gmx.de>
 S:	Maintained
_

* [patch 26/91] lib/bch.c: fix a typo in the file bch.c
  2021-05-07  1:01 incoming Andrew Morton
                   ` (24 preceding siblings ...)
  2021-05-07  1:03 ` [patch 25/91] MAINTAINERS: add entry for the bitmap API Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 27/91] lib: fix inconsistent indenting in process_bit1() Andrew Morton
                   ` (65 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, rdunlap, torvalds, unixbhaskar

From: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Subject: lib/bch.c: fix a typo in the file bch.c

s/buid/build/

Link: https://lkml.kernel.org/r/20210301123129.18754-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/bch.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/bch.c~lib-fix-a-typo-in-the-file-bchc
+++ a/lib/bch.c
@@ -584,7 +584,7 @@ static int find_affine4_roots(struct bch
 	k = a_log(bch, a);
 	rows[0] = c;
 
-	/* buid linear system to solve X^4+aX^2+bX+c = 0 */
+	/* build linear system to solve X^4+aX^2+bX+c = 0 */
 	for (i = 0; i < m; i++) {
 		rows[i+1] = bch->a_pow_tab[4*i]^
 			(a ? bch->a_pow_tab[mod_s(bch, k)] : 0)^
_

* [patch 27/91] lib: fix inconsistent indenting in process_bit1()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (25 preceding siblings ...)
  2021-05-07  1:03 ` [patch 26/91] lib/bch.c: fix a typo in the file bch.c Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 28/91] lib/list_sort.c: fix typo in function description Andrew Morton
                   ` (64 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, wangqing

From: Wang Qing <wangqing@vivo.com>
Subject: lib: fix inconsistent indenting in process_bit1()

Smatch gives the warning:
	lib/decompress_unlzma.c:395 process_bit1() warn: inconsistent indenting

Link: https://lkml.kernel.org/r/1614567775-4478-1-git-send-email-wangqing@vivo.com
Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/decompress_unlzma.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/decompress_unlzma.c~lib-fix-inconsistent-indenting-in-process_bit1
+++ a/lib/decompress_unlzma.c
@@ -391,7 +391,7 @@ static inline int INIT process_bit0(stru
 static inline int INIT process_bit1(struct writer *wr, struct rc *rc,
 					    struct cstate *cst, uint16_t *p,
 					    int pos_state, uint16_t *prob) {
-  int offset;
+	int offset;
 	uint16_t *prob_len;
 	int num_bits;
 	int len;
_


* [patch 28/91] lib/list_sort.c: fix typo in function description
  2021-05-07  1:01 incoming Andrew Morton
                   ` (26 preceding siblings ...)
  2021-05-07  1:03 ` [patch 27/91] lib: fix inconsistent indenting in process_bit1() Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 29/91] lib/genalloc.c: fix a typo Andrew Morton
                   ` (63 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, mrtoastcheng, torvalds

From: ToastC <mrtoastcheng@gmail.com>
Subject: lib/list_sort.c: fix typo in function description

Replace beautiully with beautifully

Link: https://lkml.kernel.org/r/20210315090633.9759-1-mrtoastcheng@gmail.com
Signed-off-by: ShihCheng Tu <mrtoastcheng@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/list_sort.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/list_sort.c~lib-fix-typo-in-function-description
+++ a/lib/list_sort.c
@@ -137,7 +137,7 @@ static void merge_final(void *priv, list
  *
  *
  * The merging is controlled by "count", the number of elements in the
- * pending lists.  This is beautiully simple code, but rather subtle.
+ * pending lists.  This is beautifully simple code, but rather subtle.
  *
  * Each time we increment "count", we set one bit (bit k) and clear
  * bits k-1 .. 0.  Each time this happens (except the very first time
_


* [patch 29/91] lib/genalloc.c: fix a typo
  2021-05-07  1:01 incoming Andrew Morton
                   ` (27 preceding siblings ...)
  2021-05-07  1:03 ` [patch 28/91] lib/list_sort.c: fix typo in function description Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 30/91] lib: crc8: pointer to data block should be const Andrew Morton
                   ` (62 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, unixbhaskar

From: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Subject: lib/genalloc.c: Fix a typo

s/macthing/matching/

Link: https://lkml.kernel.org/r/20210326131530.30481-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/genalloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/genalloc.c~lib-genallocc-fix-a-typo
+++ a/lib/genalloc.c
@@ -735,7 +735,7 @@ EXPORT_SYMBOL(gen_pool_first_fit_order_a
 
 /**
  * gen_pool_best_fit - find the best fitting region of memory
- * macthing the size requirement (no alignment constraint)
+ * matching the size requirement (no alignment constraint)
  * @map: The address to base the search on
  * @size: The bitmap size in bits
  * @start: The bitnumber to start searching at
_


* [patch 30/91] lib: crc8: pointer to data block should be const
  2021-05-07  1:01 incoming Andrew Morton
                   ` (28 preceding siblings ...)
  2021-05-07  1:03 ` [patch 29/91] lib/genalloc.c: fix a typo Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 31/91] lib: stackdepot: turn depot_lock spinlock to raw_spinlock Andrew Morton
                   ` (61 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, rdunlap, rf, torvalds

From: Richard Fitzgerald <rf@opensource.cirrus.com>
Subject: lib: crc8: pointer to data block should be const

crc8() does not change the data passed to it, so the pointer argument
should be declared const.  This avoids callers that receive const data
having to cast it to a non-const pointer to call crc8().

Link: https://lkml.kernel.org/r/20210329122409.3291-1-rf@opensource.cirrus.com
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/crc8.h |    2 +-
 lib/crc8.c           |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/crc8.h~lib-crc8-pointer-to-data-block-should-be-const
+++ a/include/linux/crc8.h
@@ -96,6 +96,6 @@ void crc8_populate_msb(u8 table[CRC8_TAB
  * Williams, Ross N., ross<at>ross.net
  * (see URL http://www.ross.net/crc/download/crc_v3.txt).
  */
-u8 crc8(const u8 table[CRC8_TABLE_SIZE], u8 *pdata, size_t nbytes, u8 crc);
+u8 crc8(const u8 table[CRC8_TABLE_SIZE], const u8 *pdata, size_t nbytes, u8 crc);
 
 #endif /* __CRC8_H_ */
--- a/lib/crc8.c~lib-crc8-pointer-to-data-block-should-be-const
+++ a/lib/crc8.c
@@ -71,7 +71,7 @@ EXPORT_SYMBOL(crc8_populate_lsb);
  * @nbytes: number of bytes in data buffer.
  * @crc: previous returned crc8 value.
  */
-u8 crc8(const u8 table[CRC8_TABLE_SIZE], u8 *pdata, size_t nbytes, u8 crc)
+u8 crc8(const u8 table[CRC8_TABLE_SIZE], const u8 *pdata, size_t nbytes, u8 crc)
 {
 	/* loop over the buffer data */
 	while (nbytes-- > 0)
_
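
For readers who want to see why the const qualifier matters to callers, here is a minimal userspace sketch, not the kernel code. The names crc8_populate_sketch() and crc8_sketch() are hypothetical, and the table is built bit by bit (MSB first) rather than with the kernel's crc8_populate_msb(). The point is only that a caller holding const data can pass it without a cast.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC8_TABLE_SIZE 256

/* Build an MSB-first CRC-8 lookup table, one entry per byte value. */
static void crc8_populate_sketch(uint8_t table[CRC8_TABLE_SIZE],
				 uint8_t polynomial)
{
	int i, bit;

	for (i = 0; i < CRC8_TABLE_SIZE; i++) {
		uint8_t c = (uint8_t)i;

		for (bit = 0; bit < 8; bit++)
			c = (c & 0x80) ? (uint8_t)((c << 1) ^ polynomial)
				       : (uint8_t)(c << 1);
		table[i] = c;
	}
}

/*
 * Table-driven CRC-8 over a buffer. pdata is const because the
 * function only reads the buffer, mirroring the prototype change
 * in the patch above.
 */
static uint8_t crc8_sketch(const uint8_t table[CRC8_TABLE_SIZE],
			   const uint8_t *pdata, size_t nbytes, uint8_t crc)
{
	/* loop over the buffer data */
	while (nbytes-- > 0)
		crc = table[(crc ^ *pdata++) & 0xff];

	return crc;
}
```

With a table built for polynomial 0x07, a caller can pass a `static const uint8_t` message buffer straight to crc8_sketch(); with the old non-const prototype the same call would need a cast to discard the qualifier.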


* [patch 31/91] lib: stackdepot: turn depot_lock spinlock to raw_spinlock
  2021-05-07  1:01 incoming Andrew Morton
                   ` (29 preceding siblings ...)
  2021-05-07  1:03 ` [patch 30/91] lib: crc8: pointer to data block should be const Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 32/91] lib/percpu_counter: tame kernel-doc compile warning Andrew Morton
                   ` (60 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: ahalaney, akpm, glider, gustavoars, linux-mm, mm-commits,
	qiang.zhang, torvalds, vinmenon, vjitta, ylal

From: Zqiang <qiang.zhang@windriver.com>
Subject: lib: stackdepot: turn depot_lock spinlock to raw_spinlock

[    2.670635] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:951
[    2.670638] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 19, name: pgdatinit0
[    2.670768] Call Trace:
[    2.670800]  dump_stack+0x93/0xc2
[    2.670826]  ___might_sleep.cold+0x1b2/0x1f1
[    2.670838]  rt_spin_lock+0x3b/0xb0
[    2.670838]  stack_depot_save+0x1b9/0x440
[    2.670838]  kasan_save_stack+0x32/0x40
[    2.670838]  kasan_record_aux_stack+0xa5/0xb0
[    2.670838]  __call_rcu+0x117/0x880
[    2.670838]  __exit_signal+0xafb/0x1180
[    2.670838]  release_task+0x1d6/0x480
[    2.670838]  exit_notify+0x303/0x750
[    2.670838]  do_exit+0x678/0xcf0
[    2.670838]  kthread+0x364/0x4f0
[    2.670838]  ret_from_fork+0x22/0x30

On an RT system, spin_lock is replaced by a sleepable rt_mutex lock.
__call_rcu() disables interrupts before calling kasan_record_aux_stack(),
so taking depot_lock there triggers the call trace above.  Replace the
spinlock with a raw_spinlock.

Link: https://lkml.kernel.org/r/20210329084009.27013-1-qiang.zhang@windriver.com
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Reported-by: Andrew Halaney <ahalaney@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Yogesh Lal <ylal@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/stackdepot.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/lib/stackdepot.c~lib-stackdepot-turn-depot_lock-spinlock-to-raw_spinlock
+++ a/lib/stackdepot.c
@@ -71,7 +71,7 @@ static void *stack_slabs[STACK_ALLOC_MAX
 static int depot_index;
 static int next_slab_inited;
 static size_t depot_offset;
-static DEFINE_SPINLOCK(depot_lock);
+static DEFINE_RAW_SPINLOCK(depot_lock);
 
 static bool init_stack_slab(void **prealloc)
 {
@@ -305,7 +305,7 @@ depot_stack_handle_t stack_depot_save(un
 			prealloc = page_address(page);
 	}
 
-	spin_lock_irqsave(&depot_lock, flags);
+	raw_spin_lock_irqsave(&depot_lock, flags);
 
 	found = find_stack(*bucket, entries, nr_entries, hash);
 	if (!found) {
@@ -329,7 +329,7 @@ depot_stack_handle_t stack_depot_save(un
 		WARN_ON(!init_stack_slab(&prealloc));
 	}
 
-	spin_unlock_irqrestore(&depot_lock, flags);
+	raw_spin_unlock_irqrestore(&depot_lock, flags);
 exit:
 	if (prealloc) {
 		/* Nobody used this memory, ok to free it. */
_


* [patch 32/91] lib/percpu_counter: tame kernel-doc compile warning
  2021-05-07  1:01 incoming Andrew Morton
                   ` (30 preceding siblings ...)
  2021-05-07  1:03 ` [patch 31/91] lib: stackdepot: turn depot_lock spinlock to raw_spinlock Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 33/91] lib/genalloc: add parameter description to fix doc " Andrew Morton
                   ` (59 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, alexs, linux-mm, mm-commits, nborisov, swboyd, torvalds

From: Alex Shi <alexs@kernel.org>
Subject: lib/percpu_counter: tame kernel-doc compile warning

Commit 3e8f399da490 ("writeback: rework wb_[dec|inc]_stat family of
functions") added a description of percpu_counter_add_batch(), but the
comment opens with a double '*', which marks it as a kernel-doc comment
even though it does not follow the kernel-doc format.

Since lib/percpu_counter.c contains no other kernel-doc comments, it is
simplest to demote this incomplete one to a plain comment, which tames
the kernel-doc warning:

lib/percpu_counter.c:83: warning: Function parameter or member 'fbc' not described in 'percpu_counter_add_batch'
lib/percpu_counter.c:83: warning: Function parameter or member 'amount' not described in 'percpu_counter_add_batch'
lib/percpu_counter.c:83: warning: Function parameter or member 'batch' not described in 'percpu_counter_add_batch'

Link: https://lkml.kernel.org/r/20210405135505.132446-1-alexs@kernel.org
Signed-off-by: Alex Shi <alexs@kernel.org>
Cc: Nikolay Borisov <nborisov@suse.com>
Cc: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/percpu_counter.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/lib/percpu_counter.c~lib-percpu_counter-tame-kernel-doc-compile-warning
+++ a/lib/percpu_counter.c
@@ -72,7 +72,7 @@ void percpu_counter_set(struct percpu_co
 }
 EXPORT_SYMBOL(percpu_counter_set);
 
-/**
+/*
  * This function is both preempt and irq safe. The former is due to explicit
  * preemption disable. The latter is guaranteed by the fact that the slow path
  * is explicitly protected by an irq-safe spinlock whereas the fast patch uses
_


* [patch 33/91] lib/genalloc: add parameter description to fix doc compile warning
  2021-05-07  1:01 incoming Andrew Morton
                   ` (31 preceding siblings ...)
  2021-05-07  1:03 ` [patch 32/91] lib/percpu_counter: tame kernel-doc compile warning Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 34/91] lib: parser: clean up kernel-doc Andrew Morton
                   ` (58 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, alexey.skidanov, alexs, linux-mm, mm-commits, sfr, sjhuang,
	torvalds, unixbhaskar

From: Alex Shi <alexs@kernel.org>
Subject: lib/genalloc: add parameter description to fix doc compile warning

Commit 52fbf1134d47 ("lib/genalloc.c: fix allocation of aligned buffer
from non-aligned chunk") added a new parameter 'start_addr' without
describing it.  That causes these doc compile warnings:

lib/genalloc.c:649: warning: Function parameter or member 'start_addr' not described in 'gen_pool_first_fit'
lib/genalloc.c:667: warning: Function parameter or member 'start_addr' not described in 'gen_pool_first_fit_align'
lib/genalloc.c:694: warning: Function parameter or member 'start_addr' not described in 'gen_pool_fixed_alloc'
lib/genalloc.c:729: warning: Function parameter or member 'start_addr' not described in 'gen_pool_first_fit_order_align'
lib/genalloc.c:752: warning: Function parameter or member 'start_addr' not described in 'gen_pool_best_fit'

This patch fixes this by adding parameter descriptions.

Link: https://lkml.kernel.org/r/20210405132021.131231-1-alexs@kernel.org
Signed-off-by: Alex Shi <alexs@kernel.org>
Cc: Alexey Skidanov <alexey.skidanov@intel.com>
Cc: Huang Shijie <sjhuang@iluvatar.ai>
Cc: Alex Shi <alexs@kernel.org>
Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/genalloc.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/lib/genalloc.c~lib-genalloc-add-parameter-description-to-fix-doc-compile-warning
+++ a/lib/genalloc.c
@@ -642,6 +642,7 @@ EXPORT_SYMBOL(gen_pool_set_algo);
  * @nr: The number of zeroed bits we're looking for
  * @data: additional data - unused
  * @pool: pool to find the fit region memory from
+ * @start_addr: not used in this function
  */
 unsigned long gen_pool_first_fit(unsigned long *map, unsigned long size,
 		unsigned long start, unsigned int nr, void *data,
@@ -660,6 +661,7 @@ EXPORT_SYMBOL(gen_pool_first_fit);
  * @nr: The number of zeroed bits we're looking for
  * @data: data for alignment
  * @pool: pool to get order from
+ * @start_addr: start addr of alloction chunk
  */
 unsigned long gen_pool_first_fit_align(unsigned long *map, unsigned long size,
 		unsigned long start, unsigned int nr, void *data,
@@ -687,6 +689,7 @@ EXPORT_SYMBOL(gen_pool_first_fit_align);
  * @nr: The number of zeroed bits we're looking for
  * @data: data for alignment
  * @pool: pool to get order from
+ * @start_addr: not used in this function
  */
 unsigned long gen_pool_fixed_alloc(unsigned long *map, unsigned long size,
 		unsigned long start, unsigned int nr, void *data,
@@ -721,6 +724,7 @@ EXPORT_SYMBOL(gen_pool_fixed_alloc);
  * @nr: The number of zeroed bits we're looking for
  * @data: additional data - unused
  * @pool: pool to find the fit region memory from
+ * @start_addr: not used in this function
  */
 unsigned long gen_pool_first_fit_order_align(unsigned long *map,
 		unsigned long size, unsigned long start,
@@ -742,6 +746,7 @@ EXPORT_SYMBOL(gen_pool_first_fit_order_a
  * @nr: The number of zeroed bits we're looking for
  * @data: additional data - unused
  * @pool: pool to find the fit region memory from
+ * @start_addr: not used in this function
  *
  * Iterate over the bitmap to find the smallest free region
  * which we can allocate the memory.
_


* [patch 34/91] lib: parser: clean up kernel-doc
  2021-05-07  1:01 incoming Andrew Morton
                   ` (32 preceding siblings ...)
  2021-05-07  1:03 ` [patch 33/91] lib/genalloc: add parameter description to fix doc " Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 35/91] include/linux/compat.h: remove unneeded declaration from COMPAT_SYSCALL_DEFINEx() Andrew Morton
                   ` (57 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, dhowells, linux-mm, mm-commits, rdunlap, torvalds, viro

From: Randy Dunlap <rdunlap@infradead.org>
Subject: lib: parser: clean up kernel-doc

Mark match_uint() as kernel-doc notation since it is already fully
annotated as such.  Use % prefix on constants in kernel-doc comments. 
Convert function return descriptions to use the "Return:" kernel-doc
notation.

Link: https://lkml.kernel.org/r/20210407034514.5651-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/parser.c |   61 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 38 insertions(+), 23 deletions(-)

--- a/lib/parser.c~lib-parser-clean-up-kernel-doc
+++ a/lib/parser.c
@@ -98,7 +98,7 @@ static int match_one(char *s, const char
  * locations.
  *
  * Description: Detects which if any of a set of token strings has been passed
- * to it. Tokens can include up to MAX_OPT_ARGS instances of basic c-style
+ * to it. Tokens can include up to %MAX_OPT_ARGS instances of basic c-style
  * format identifiers which will be taken into account when matching the
  * tokens, and whose locations will be returned in the @args array.
  */
@@ -120,8 +120,10 @@ EXPORT_SYMBOL(match_token);
  * @base: base to use when converting string
  *
  * Description: Given a &substring_t and a base, attempts to parse the substring
- * as a number in that base. On success, sets @result to the integer represented
- * by the string and returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
+ * as a number in that base.
+ *
+ * Return: On success, sets @result to the integer represented by the
+ * string and returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
  */
 static int match_number(substring_t *s, int *result, int base)
 {
@@ -153,8 +155,10 @@ static int match_number(substring_t *s,
  * @base: base to use when converting string
  *
  * Description: Given a &substring_t and a base, attempts to parse the substring
- * as a number in that base. On success, sets @result to the integer represented
- * by the string and returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
+ * as a number in that base.
+ *
+ * Return: On success, sets @result to the integer represented by the
+ * string and returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
  */
 static int match_u64int(substring_t *s, u64 *result, int base)
 {
@@ -178,9 +182,10 @@ static int match_u64int(substring_t *s,
  * @s: substring_t to be scanned
  * @result: resulting integer on success
  *
- * Description: Attempts to parse the &substring_t @s as a decimal integer. On
- * success, sets @result to the integer represented by the string and returns 0.
- * Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
+ * Description: Attempts to parse the &substring_t @s as a decimal integer.
+ *
+ * Return: On success, sets @result to the integer represented by the string
+ * and returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
  */
 int match_int(substring_t *s, int *result)
 {
@@ -188,14 +193,15 @@ int match_int(substring_t *s, int *resul
 }
 EXPORT_SYMBOL(match_int);
 
-/*
+/**
  * match_uint - scan a decimal representation of an integer from a substring_t
  * @s: substring_t to be scanned
  * @result: resulting integer on success
  *
- * Description: Attempts to parse the &substring_t @s as a decimal integer. On
- * success, sets @result to the integer represented by the string and returns 0.
- * Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
+ * Description: Attempts to parse the &substring_t @s as a decimal integer.
+ *
+ * Return: On success, sets @result to the integer represented by the string
+ * and returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
  */
 int match_uint(substring_t *s, unsigned int *result)
 {
@@ -217,9 +223,10 @@ EXPORT_SYMBOL(match_uint);
  * @result: resulting unsigned long long on success
  *
  * Description: Attempts to parse the &substring_t @s as a long decimal
- * integer. On success, sets @result to the integer represented by the
- * string and returns 0.
- * Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
+ * integer.
+ *
+ * Return: On success, sets @result to the integer represented by the string
+ * and returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
  */
 int match_u64(substring_t *s, u64 *result)
 {
@@ -232,9 +239,10 @@ EXPORT_SYMBOL(match_u64);
  * @s: substring_t to be scanned
  * @result: resulting integer on success
  *
- * Description: Attempts to parse the &substring_t @s as an octal integer. On
- * success, sets @result to the integer represented by the string and returns
- * 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
+ * Description: Attempts to parse the &substring_t @s as an octal integer.
+ *
+ * Return: On success, sets @result to the integer represented by the string
+ * and returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
  */
 int match_octal(substring_t *s, int *result)
 {
@@ -248,8 +256,9 @@ EXPORT_SYMBOL(match_octal);
  * @result: resulting integer on success
  *
  * Description: Attempts to parse the &substring_t @s as a hexadecimal integer.
- * On success, sets @result to the integer represented by the string and
- * returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
+ *
+ * Return: On success, sets @result to the integer represented by the string
+ * and returns 0. Returns -ENOMEM, -EINVAL, or -ERANGE on failure.
  */
 int match_hex(substring_t *s, int *result)
 {
@@ -263,10 +272,11 @@ EXPORT_SYMBOL(match_hex);
  * @str: the string to be parsed
  *
  * Description: Parse the string @str to check if matches wildcard
- * pattern @pattern. The pattern may contain two type wildcardes:
+ * pattern @pattern. The pattern may contain two types of wildcards:
  *   '*' - matches zero or more characters
  *   '?' - matches one character
- * If it's matched, return true, else return false.
+ *
+ * Return: If the @str matches the @pattern, return true, else return false.
  */
 bool match_wildcard(const char *pattern, const char *str)
 {
@@ -316,7 +326,9 @@ EXPORT_SYMBOL(match_wildcard);
  *
  * Description: Copy the characters in &substring_t @src to the
  * c-style string @dest.  Copy no more than @size - 1 characters, plus
- * the terminating NUL.  Return length of @src.
+ * the terminating NUL.
+ *
+ * Return: length of @src.
  */
 size_t match_strlcpy(char *dest, const substring_t *src, size_t size)
 {
@@ -338,6 +350,9 @@ EXPORT_SYMBOL(match_strlcpy);
  * Description: Allocates and returns a string filled with the contents of
  * the &substring_t @s. The caller is responsible for freeing the returned
  * string with kfree().
+ *
+ * Return: the address of the newly allocated NUL-terminated string or
+ * %NULL on error.
  */
 char *match_strdup(const substring_t *s)
 {
_
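
As an illustration of the comment layout this patch converges on (a short description, @param lines, then a separate "Return:" section), here is a small userspace sketch. wildcard_match_sketch() is a hypothetical stand-in written for this note; it follows the '*' and '?' semantics documented for match_wildcard() but is not the code in lib/parser.c.

```c
#include <stdbool.h>
#include <stddef.h>

/**
 * wildcard_match_sketch - check whether a string matches a wildcard pattern
 * @pattern: pattern to match against
 * @str: the string to be tested
 *
 * Description: The pattern may contain two types of wildcards:
 *   '*' - matches zero or more characters
 *   '?' - matches one character
 *
 * Return: true if @str matches @pattern, false otherwise.
 */
static bool wildcard_match_sketch(const char *pattern, const char *str)
{
	const char *star_p = NULL;	/* pattern position just after last '*' */
	const char *star_s = NULL;	/* string position that '*' resumes from */

	while (*str) {
		if (*pattern == '*') {
			/* remember the '*' and first try matching zero chars */
			star_p = ++pattern;
			star_s = str;
		} else if (*pattern == '?' || *pattern == *str) {
			pattern++;
			str++;
		} else if (star_p) {
			/* backtrack: let the last '*' absorb one more char */
			pattern = star_p;
			str = ++star_s;
		} else {
			return false;
		}
	}

	/* trailing '*' may match the empty remainder */
	while (*pattern == '*')
		pattern++;

	return *pattern == '\0';
}
```

Keeping the return contract in its own "Return:" paragraph lets the kernel-doc tooling render it as a distinct section instead of burying it in the description.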


* [patch 35/91] include/linux/compat.h: remove unneeded declaration from COMPAT_SYSCALL_DEFINEx()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (33 preceding siblings ...)
  2021-05-07  1:03 ` [patch 34/91] lib: parser: clean up kernel-doc Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 36/91] checkpatch: warn when missing newline in return sysfs_emit() formats Andrew Morton
                   ` (56 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, linux-mm, masahiroy, mm-commits, torvalds

From: Masahiro Yamada <masahiroy@kernel.org>
Subject: include/linux/compat.h: remove unneeded declaration from COMPAT_SYSCALL_DEFINEx()

compat_sys##name is declared twice, just one line below.

With this removal SYSCALL_DEFINEx() (defined in <linux/syscalls.h>)
and COMPAT_SYSCALL_DEFINEx() look symmetrical.

Link: https://lkml.kernel.org/r/20210223114924.854794-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/compat.h |    1 -
 1 file changed, 1 deletion(-)

--- a/include/linux/compat.h~compat-remove-unneeded-declaration-from-compat_syscall_definex
+++ a/include/linux/compat.h
@@ -75,7 +75,6 @@
 	__diag_push();								\
 	__diag_ignore(GCC, 8, "-Wattribute-alias",				\
 		      "Type aliasing is used to sanitize syscall arguments");\
-	asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));	\
 	asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))	\
 		__attribute__((alias(__stringify(__se_compat_sys##name))));	\
 	ALLOW_ERROR_INJECTION(compat_sys##name, ERRNO);				\
_


* [patch 36/91] checkpatch: warn when missing newline in return sysfs_emit() formats
  2021-05-07  1:01 incoming Andrew Morton
                   ` (34 preceding siblings ...)
  2021-05-07  1:03 ` [patch 35/91] include/linux/compat.h: remove unneeded declaration from COMPAT_SYSCALL_DEFINEx() Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:03 ` [patch 37/91] checkpatch: exclude four preprocessor sub-expressions from MACRO_ARG_REUSE Andrew Morton
                   ` (55 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, joe, linux-mm, mm-commits, torvalds

From: Joe Perches <joe@perches.com>
Subject: checkpatch: warn when missing newline in return sysfs_emit() formats

return sysfs_emit() uses should include a newline.

Suggest adding a newline when one is missing.
Add one using --fix too.

Link: https://lkml.kernel.org/r/aa1819fa5faf786573df298e5e2e7d357ba7d4ad.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/checkpatch.pl |   11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/scripts/checkpatch.pl~checkpatch-warn-when-missing-newline-in-return-sysfs_emit-formats
+++ a/scripts/checkpatch.pl
@@ -7198,6 +7198,17 @@ sub process {
 			     "Using $1 should generally have parentheses around the comparison\n" . $herecurr);
 		}
 
+# return sysfs_emit(foo, fmt, ...) fmt without newline
+		if ($line =~ /\breturn\s+sysfs_emit\s*\(\s*$FuncArg\s*,\s*($String)/ &&
+		    substr($rawline, $-[6], $+[6] - $-[6]) !~ /\\n"$/) {
+			my $offset = $+[6] - 1;
+			if (WARN("SYSFS_EMIT",
+				 "return sysfs_emit(...) formats should include a terminating newline\n" . $herecurr) &&
+			    $fix) {
+				substr($fixed[$fixlinenr], $offset, 0) = '\\n';
+			}
+		}
+
 # nested likely/unlikely calls
 		if ($line =~ /\b(?:(?:un)?likely)\s*\(\s*!?\s*(IS_ERR(?:_OR_NULL|_VALUE)?|WARN)/) {
 			WARN("LIKELY_MISUSE",
_


* [patch 37/91] checkpatch: exclude four preprocessor sub-expressions from MACRO_ARG_REUSE
  2021-05-07  1:01 incoming Andrew Morton
                   ` (35 preceding siblings ...)
  2021-05-07  1:03 ` [patch 36/91] checkpatch: warn when missing newline in return sysfs_emit() formats Andrew Morton
@ 2021-05-07  1:03 ` Andrew Morton
  2021-05-07  1:04 ` [patch 38/91] checkpatch: improve ALLOC_ARRAY_ARGS test Andrew Morton
                   ` (54 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:03 UTC (permalink / raw)
  To: akpm, joe, linux-mm, mailhol.vincent, mm-commits, torvalds

From: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Subject: checkpatch: exclude four preprocessor sub-expressions from MACRO_ARG_REUSE

__must_be_array, offsetof, sizeof_field and __stringify are all
preprocessor macros and do not evaluate their arguments.  As such, it is
safe not to warn when arguments are being reused in those four
sub-expressions.

Exclude those so that they can pass checkpatch.

Link: https://lkml.kernel.org/r/20210407105042.25380-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/checkpatch.pl |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/scripts/checkpatch.pl~checkpatch-exclude-four-preprocessor-sub-expressions-from-macro_arg_reuse
+++ a/scripts/checkpatch.pl
@@ -5829,7 +5829,7 @@ sub process {
 			        next if ($arg =~ /\.\.\./);
 			        next if ($arg =~ /^type$/i);
 				my $tmp_stmt = $define_stmt;
-				$tmp_stmt =~ s/\b(sizeof|typeof|__typeof__|__builtin\w+|typecheck\s*\(\s*$Type\s*,|\#+)\s*\(*\s*$arg\s*\)*\b//g;
+				$tmp_stmt =~ s/\b(__must_be_array|offsetof|sizeof|sizeof_field|__stringify|typeof|__typeof__|__builtin\w+|typecheck\s*\(\s*$Type\s*,|\#+)\s*\(*\s*$arg\s*\)*\b//g;
 				$tmp_stmt =~ s/\#+\s*$arg\b//g;
 				$tmp_stmt =~ s/\b$arg\s*\#\#//g;
 				my $use_cnt = () = $tmp_stmt =~ /\b$arg\b/g;
_
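
The reasoning in this changelog can be checked with a small userspace sketch. All names ending in _sketch, and next_value(), are hypothetical and exist only for this note. An argument that appears twice inside sizeof is never evaluated, whereas an argument that appears twice in ordinary expression context is evaluated twice, which is the side-effect hazard MACRO_ARG_REUSE is meant to catch.

```c
#include <stddef.h>

/* 'x' is used twice, but only as a sizeof operand: never evaluated. */
#define SIZE_TWICE_SKETCH(x)	(sizeof(x) + sizeof(x))

/* 'x' is used twice in value context: evaluated twice. */
#define SQUARE_SKETCH(x)	((x) * (x))

/* Helper with a visible side effect: bumps *counter on every call. */
static int next_value(int *counter)
{
	return ++*counter;
}

/* Returns how many times the macro argument was actually evaluated. */
static int calls_under_sizeof_sketch(void)
{
	int calls = 0;
	size_t n = SIZE_TWICE_SKETCH(next_value(&calls));

	(void)n;
	return calls;	/* 0: sizeof does not evaluate its operand */
}

static int calls_under_square_sketch(void)
{
	int calls = 0;
	int r = SQUARE_SKETCH(next_value(&calls));

	(void)r;
	return calls;	/* 2: the argument expression ran twice */
}
```

offsetof behaves like the sizeof case in this respect, which is why reuse inside such sub-expressions is safe to exempt from the warning.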


* [patch 38/91] checkpatch: improve ALLOC_ARRAY_ARGS test
  2021-05-07  1:01 incoming Andrew Morton
                   ` (36 preceding siblings ...)
  2021-05-07  1:03 ` [patch 37/91] checkpatch: exclude four preprocessor sub-expressions from MACRO_ARG_REUSE Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 39/91] kselftest: introduce new epoll test case Andrew Morton
                   ` (53 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, christophe.jaillet, joe, linux-mm, mm-commits, torvalds

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Subject: checkpatch: improve ALLOC_ARRAY_ARGS test

The devm_ variants of 'kcalloc()' and 'kmalloc_array()' are not tested.
Add the corresponding check.

Link: https://lkml.kernel.org/r/205fc4847972fb6779abcc8818f39c14d1b45af1.1618595794.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/checkpatch.pl |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/scripts/checkpatch.pl~checkpatch-improve-alloc_array_args-test
+++ a/scripts/checkpatch.pl
@@ -7006,7 +7006,7 @@ sub process {
 		}
 
 # check for alloc argument mismatch
-		if ($line =~ /\b(kcalloc|kmalloc_array)\s*\(\s*sizeof\b/) {
+		if ($line =~ /\b((?:devm_)?(?:kcalloc|kmalloc_array))\s*\(\s*sizeof\b/) {
 			WARN("ALLOC_ARRAY_ARGS",
 			     "$1 uses number as first arg, sizeof is generally wrong\n" . $herecurr);
 		}
_
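
For context on what ALLOC_ARRAY_ARGS is looking for: the array allocators take the element count before the element size, so a sizeof in the leading position usually means the two were swapped. The sketch below uses userspace calloc() as a stand-in, on the assumption that its (count, size) order mirrors kcalloc(n, size, flags); struct item_sketch and alloc_items_sketch() are hypothetical names.

```c
#include <stdlib.h>

struct item_sketch {
	int id;
	double weight;
};

/*
 * Allocate a zeroed array of n items. The count comes first and the
 * element size second; writing calloc(sizeof(...), n) would still
 * compile, which is why a style checker is useful here.
 */
static struct item_sketch *alloc_items_sketch(size_t n)
{
	return calloc(n, sizeof(struct item_sketch));
}
```

The caller owns the returned memory and releases it with free().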


* [patch 39/91] kselftest: introduce new epoll test case
  2021-05-07  1:01 incoming Andrew Morton
                   ` (37 preceding siblings ...)
  2021-05-07  1:04 ` [patch 38/91] checkpatch: improve ALLOC_ARRAY_ARGS test Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 40/91] fs/epoll: restore waking from ep_done_scan() Andrew Morton
                   ` (52 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, dave, dbueso, jbaron, linux-mm, mm-commits, rpenyaev,
	torvalds, viro

From: Davidlohr Bueso <dave@stgolabs.net>
Subject: kselftest: introduce new epoll test case

Patch series "fs/epoll: restore user-visible behavior upon event ready".

This series tries to address a change in user visible behavior, reported
in https://bugzilla.kernel.org/show_bug.cgi?id=208943.

Epoll does not report an event to all the threads running epoll_wait()
on the same epoll descriptor. Unsurprisingly, this was bisected back to
339ddb53d373 (fs/epoll: remove unnecessary wakeups of nested epoll), which
has had various problems in the past, beyond only nested epoll usage.


This patch (of 2):

This incorporates the testcase originally reported in:

     https://bugzilla.kernel.org/show_bug.cgi?id=208943

The test checks that an event is reported to all threads blocked on the
same epoll descriptor; otherwise only a single thread will receive the
wakeup once the event becomes ready.

Link: https://lkml.kernel.org/r/20210405231025.33829-1-dave@stgolabs.net
Link: https://lkml.kernel.org/r/20210405231025.33829-2-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c |   44 ++++++++++
 1 file changed, 44 insertions(+)

--- a/tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c~kselftest-introduce-new-epoll-test-case
+++ a/tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
@@ -3449,4 +3449,48 @@ TEST(epoll63)
 	close(sfd[1]);
 }
 
+/*
+ *        t0    t1
+ *     (ew) \  / (ew)
+ *           e0
+ *            | (lt)
+ *           s0
+ */
+TEST(epoll64)
+{
+	pthread_t waiter[2];
+	struct epoll_event e;
+	struct epoll_mtcontext ctx = { 0 };
+
+	signal(SIGUSR1, signal_handler);
+
+	ASSERT_EQ(socketpair(AF_UNIX, SOCK_STREAM, 0, ctx.sfd), 0);
+
+	ctx.efd[0] = epoll_create(1);
+	ASSERT_GE(ctx.efd[0], 0);
+
+	e.events = EPOLLIN;
+	ASSERT_EQ(epoll_ctl(ctx.efd[0], EPOLL_CTL_ADD, ctx.sfd[0], &e), 0);
+
+	/*
+	 * main will act as the emitter once both waiter threads are
+	 * blocked and expects to both be awoken upon the ready event.
+	 */
+	ctx.main = pthread_self();
+	ASSERT_EQ(pthread_create(&waiter[0], NULL, waiter_entry1a, &ctx), 0);
+	ASSERT_EQ(pthread_create(&waiter[1], NULL, waiter_entry1a, &ctx), 0);
+
+	usleep(100000);
+	ASSERT_EQ(write(ctx.sfd[1], "w", 1), 1);
+
+	ASSERT_EQ(pthread_join(waiter[0], NULL), 0);
+	ASSERT_EQ(pthread_join(waiter[1], NULL), 0);
+
+	EXPECT_EQ(ctx.count, 2);
+
+	close(ctx.efd[0]);
+	close(ctx.sfd[0]);
+	close(ctx.sfd[1]);
+}
+
 TEST_HARNESS_MAIN
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 40/91] fs/epoll: restore waking from ep_done_scan()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (38 preceding siblings ...)
  2021-05-07  1:04 ` [patch 39/91] kselftest: introduce new epoll test case Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 41/91] isofs: fix fall-through warnings for Clang Andrew Morton
                   ` (51 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, dave, dbueso, jbaron, linux-mm, mm-commits, rpenyaev,
	stable, torvalds, viro

From: Davidlohr Bueso <dave@stgolabs.net>
Subject: fs/epoll: restore waking from ep_done_scan()

339ddb53d373 (fs/epoll: remove unnecessary wakeups of nested epoll)
changed the userspace visible behavior of exclusive waiters blocked on a
common epoll descriptor upon a single event becoming ready.  Previously,
all tasks doing epoll_wait would awake, and now only one is awoken,
potentially causing missed wakeups on applications that rely on this
behavior, such as Apache Qpid.

While the aforementioned commit aims at having only a single wakeup path
in ep_poll_callback (with the exceptions of epoll_ctl cases), we need to
restore the wakeup in what was the old ep_scan_ready_list() such that the
next thread can be awoken, in a cascading style, after the waker's
corresponding ep_send_events().

Link: https://lkml.kernel.org/r/20210405231025.33829-3-dave@stgolabs.net
Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll")
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/eventpoll.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/fs/eventpoll.c~fs-epoll-restore-waking-from-ep_done_scan
+++ a/fs/eventpoll.c
@@ -657,6 +657,12 @@ static void ep_done_scan(struct eventpol
 	 */
 	list_splice(txlist, &ep->rdllist);
 	__pm_relax(ep->ws);
+
+	if (!list_empty(&ep->rdllist)) {
+		if (waitqueue_active(&ep->wq))
+			wake_up(&ep->wq);
+	}
+
 	write_unlock_irq(&ep->lock);
 }
 
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 41/91] isofs: fix fall-through warnings for Clang
  2021-05-07  1:01 incoming Andrew Morton
                   ` (39 preceding siblings ...)
  2021-05-07  1:04 ` [patch 40/91] fs/epoll: restore waking from ep_done_scan() Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 42/91] fs/nilfs2: fix misspellings using codespell tool Andrew Morton
                   ` (50 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, gustavoars, linux-mm, mm-commits, torvalds

From: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Subject: isofs: fix fall-through warnings for Clang

In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding a break statement instead of just letting the code
fall through to the next case.
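
A small sketch of the pattern, using made-up names (enum demo_rec,
demo_cont_size()) rather than the rock.c code: the last case before
default ends in an explicit break, so nothing falls through implicitly.

```c
enum demo_rec { DEMO_REC_CE, DEMO_REC_OTHER };

/* Hypothetical helper: return a continuation size for a record type. */
static int demo_cont_size(enum demo_rec rec, int size)
{
	int cont_size = 0;

	switch (rec) {
	case DEMO_REC_CE:
		cont_size = size;
		break;		/* explicit, instead of falling into default */
	default:
		break;
	}
	return cont_size;
}
```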

Link: https://github.com/KSPP/linux/issues/115
Link: https://lkml.kernel.org/r/5b7caa73958588065fabc59032c340179b409ef5.1605896059.git.gustavoars@kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/isofs/rock.c |    1 +
 1 file changed, 1 insertion(+)

--- a/fs/isofs/rock.c~isofs-fix-fall-through-warnings-for-clang
+++ a/fs/isofs/rock.c
@@ -767,6 +767,7 @@ repeat:
 			rs.cont_extent = isonum_733(rr->u.CE.extent);
 			rs.cont_offset = isonum_733(rr->u.CE.offset);
 			rs.cont_size = isonum_733(rr->u.CE.size);
+			break;
 		default:
 			break;
 		}
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 42/91] fs/nilfs2: fix misspellings using codespell tool
  2021-05-07  1:01 incoming Andrew Morton
                   ` (40 preceding siblings ...)
  2021-05-07  1:04 ` [patch 41/91] isofs: fix fall-through warnings for Clang Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 43/91] nilfs2: fix typos in comments Andrew Morton
                   ` (49 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, konishi.ryusuke, linux-mm, liu.xuzhi, mm-commits, torvalds

From: Liu xuzhi <liu.xuzhi@zte.com.cn>
Subject: fs/nilfs2: fix misspellings using codespell tool

Two typos were found by the codespell tool on lines 2217 and 2254 of
segment.c:

$ codespell ./fs/nilfs2/
./segment.c:2217: retured  ==> returned
./segment.c:2254: retured  ==> returned

Fix both typos.

Link: https://lkml.kernel.org/r/1617864087-8198-1-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Liu xuzhi <liu.xuzhi@zte.com.cn>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/nilfs2/segment.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/nilfs2/segment.c~fs-nilfs2-fix-misspellings-using-codespell-tool
+++ a/fs/nilfs2/segment.c
@@ -2214,7 +2214,7 @@ static void nilfs_segctor_wakeup(struct
  * nilfs_construct_segment - construct a logical segment
  * @sb: super block
  *
- * Return Value: On success, 0 is retured. On errors, one of the following
+ * Return Value: On success, 0 is returned. On errors, one of the following
  * negative error code is returned.
  *
  * %-EROFS - Read only filesystem.
@@ -2251,7 +2251,7 @@ int nilfs_construct_segment(struct super
  * @start: start byte offset
  * @end: end byte offset (inclusive)
  *
- * Return Value: On success, 0 is retured. On errors, one of the following
+ * Return Value: On success, 0 is returned. On errors, one of the following
  * negative error code is returned.
  *
  * %-EROFS - Read only filesystem.
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 43/91] nilfs2: fix typos in comments
  2021-05-07  1:01 incoming Andrew Morton
                   ` (41 preceding siblings ...)
  2021-05-07  1:04 ` [patch 42/91] fs/nilfs2: fix misspellings using codespell tool Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 44/91] hpfs: replace one-element array with flexible-array member Andrew Morton
                   ` (48 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, konishi.ryusuke, linux-mm, lujialin4, mm-commits, torvalds

From: Lu Jialin <lujialin4@huawei.com>
Subject: nilfs2: fix typos in comments

numer -> number in fs/nilfs2/cpfile.c
Decription -> Description in fs/nilfs2/ioctl.c
isntance -> instance in fs/nilfs2/the_nilfs.c

Link: https://lkml.kernel.org/r/1617942951-14631-1-git-send-email-konishi.ryusuke@gmail.com
Link: https://lore.kernel.org/r/20210409022519.176988-1-lujialin4@huawei.com
Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/nilfs2/cpfile.c    |    2 +-
 fs/nilfs2/ioctl.c     |    4 ++--
 fs/nilfs2/the_nilfs.c |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

--- a/fs/nilfs2/cpfile.c~nilfs2-fix-typos-in-comments
+++ a/fs/nilfs2/cpfile.c
@@ -293,7 +293,7 @@ void nilfs_cpfile_put_checkpoint(struct
  * nilfs_cpfile_delete_checkpoints - delete checkpoints
  * @cpfile: inode of checkpoint file
  * @start: start checkpoint number
- * @end: end checkpoint numer
+ * @end: end checkpoint number
  *
  * Description: nilfs_cpfile_delete_checkpoints() deletes the checkpoints in
  * the period from @start to @end, excluding @end itself. The checkpoints
--- a/fs/nilfs2/ioctl.c~nilfs2-fix-typos-in-comments
+++ a/fs/nilfs2/ioctl.c
@@ -1043,7 +1043,7 @@ out:
  * @inode: inode object
  * @argp: pointer on argument from userspace
  *
- * Decription: nilfs_ioctl_trim_fs is the FITRIM ioctl handle function. It
+ * Description: nilfs_ioctl_trim_fs is the FITRIM ioctl handle function. It
  * checks the arguments from userspace and calls nilfs_sufile_trim_fs, which
  * performs the actual trim operation.
  *
@@ -1085,7 +1085,7 @@ static int nilfs_ioctl_trim_fs(struct in
  * @inode: inode object
  * @argp: pointer on argument from userspace
  *
- * Decription: nilfs_ioctl_set_alloc_range() function defines lower limit
+ * Description: nilfs_ioctl_set_alloc_range() function defines lower limit
  * of segments in bytes and upper limit of segments in bytes.
  * The NILFS_IOCTL_SET_ALLOC_RANGE is used by nilfs_resize utility.
  *
--- a/fs/nilfs2/the_nilfs.c~nilfs2-fix-typos-in-comments
+++ a/fs/nilfs2/the_nilfs.c
@@ -195,7 +195,7 @@ static int nilfs_store_log_cursor(struct
 /**
  * load_nilfs - load and recover the nilfs
  * @nilfs: the_nilfs structure to be released
- * @sb: super block isntance used to recover past segment
+ * @sb: super block instance used to recover past segment
  *
  * load_nilfs() searches and load the latest super root,
  * attaches the last segment, and does recovery if needed.
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 44/91] hpfs: replace one-element array with flexible-array member
  2021-05-07  1:01 incoming Andrew Morton
                   ` (42 preceding siblings ...)
  2021-05-07  1:04 ` [patch 43/91] nilfs2: fix typos in comments Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 45/91] do_wait: make PIDTYPE_PID case O(1) instead of O(n) Andrew Morton
                   ` (47 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, gustavoars, linux-mm, mikulas, mm-commits, torvalds

From: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Subject: hpfs: replace one-element array with flexible-array member

There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Also, this helps with the ongoing efforts to enable -Warray-bounds by
fixing the following warning:

  CC [M]  fs/hpfs/dir.o
fs/hpfs/dir.c: In function `hpfs_readdir':
fs/hpfs/dir.c:163:41: warning: array subscript 1 is above array bounds of `u8[1]' {aka `unsigned char[1]'} [-Warray-bounds]
  163 |         || de ->name[0] != 1 || de->name[1] != 1))
      |                                 ~~~~~~~~^~~

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays
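
As an illustration of the idiom (a userspace sketch with hypothetical
names, not the hpfs on-disk structure), a trailing flexible array member
is declared without a size and the storage for it is added at
allocation time:

```c
#include <stdlib.h>
#include <string.h>

struct demo_dirent {
	unsigned char namelen;	/* number of bytes in name[] */
	unsigned char name[];	/* flexible array member, must be last */
};

/* Allocate a demo_dirent with room for the name (not NUL-terminated). */
static struct demo_dirent *demo_dirent_new(const char *name)
{
	size_t len = strlen(name);
	struct demo_dirent *de;

	if (len > 255)
		return NULL;	/* namelen is a single byte */

	/* sizeof(*de) does not include name[]; add its length explicitly. */
	de = malloc(sizeof(*de) + len);
	if (!de)
		return NULL;

	de->namelen = (unsigned char)len;
	memcpy(de->name, name, len);
	return de;
}
```

With name[] declared this way, an access such as de->name[1] no longer
indexes past a declared one-element bound, which is what the
-Warray-bounds warning above complains about.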

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Link: https://lkml.kernel.org/r/20210326173510.GA81212@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/hpfs/hpfs.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/hpfs/hpfs.h~hpfs-replace-one-element-array-with-flexible-array-member
+++ a/fs/hpfs/hpfs.h
@@ -356,7 +356,8 @@ struct hpfs_dirent {
   u8 no_of_acls;			/* number of ACL's (low 3 bits) */
   u8 ix;				/* code page index (of filename), see
 					   struct code_page_data */
-  u8 namelen, name[1];			/* file name */
+  u8 namelen;				/* file name length */
+  u8 name[];				/* file name */
   /* dnode_secno down;	  btree down pointer, if present,
      			  follows name on next word boundary, or maybe it
 			  precedes next dirent, which is on a word boundary. */
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 45/91] do_wait: make PIDTYPE_PID case O(1) instead of O(n)
  2021-05-07  1:01 incoming Andrew Morton
                   ` (43 preceding siblings ...)
  2021-05-07  1:04 ` [patch 44/91] hpfs: replace one-element array with flexible-array member Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 46/91] kernel/fork.c: simplify copy_mm() Andrew Morton
                   ` (46 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, christian, ebiederm, jnewsome, linux-mm, mm-commits, oleg,
	torvalds

From: Jim Newsome <jnewsome@torproject.org>
Subject: do_wait: make PIDTYPE_PID case O(1) instead of O(n)

Add a special case when waiting on a pid (via waitpid, waitid, wait4, etc)
to avoid doing an O(n) scan of children and tracees, and instead do an
O(1) lookup.  This improves performance when waiting on a pid from a
thread group with many children and/or tracees.

Time to fork and then call waitpid on the child, from a task that already
has N children [1]:

N    | Before  | After
-----|---------|------
1    | 74 us   | 74 us
20   | 72 us   | 75 us
100  | 83 us   | 77 us
500  | 99 us   | 74 us
1000 | 179 us  | 75 us
5000 | 804 us  | 79 us
8000 | 1268 us | 78 us

[1]: https://lkml.org/lkml/2021/3/12/1567

This can make a substantial performance improvement for applications with
a thread that has many children or tracees and frequently needs to wait on
them.  Tools that use ptrace to intercept syscalls for a large number of
processes are likely to fall into this category.  In particular this patch
was developed while building a ptrace-based second generation of the
Shadow emulator [2], for which it allows us to avoid quadratic scaling
(without having to use a workaround that introduces a ~40% performance
penalty) [3].  Other examples of tools that fall into this category which
this patch may help include User Mode Linux [4] and DetTrace [5].

[2]: https://shadow.github.io/
[3]: https://github.com/shadow/shadow/issues/1134#issuecomment-798992292
[4]: https://en.wikipedia.org/wiki/User-mode_Linux
[5]: https://github.com/dettrace/dettrace

Link: https://lkml.kernel.org/r/20210314231544.9379-1-jnewsome@torproject.org
Signed-off-by: James Newsome <jnewsome@torproject.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Christian Brauner <christian@brauner.io>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/exit.c |   67 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 57 insertions(+), 10 deletions(-)

--- a/kernel/exit.c~do_wait-make-pidtype_pid-case-o1-instead-of-on
+++ a/kernel/exit.c
@@ -1440,9 +1440,48 @@ void __wake_up_parent(struct task_struct
 			   TASK_INTERRUPTIBLE, p);
 }
 
+static bool is_effectively_child(struct wait_opts *wo, bool ptrace,
+				 struct task_struct *target)
+{
+	struct task_struct *parent =
+		!ptrace ? target->real_parent : target->parent;
+
+	return current == parent || (!(wo->wo_flags & __WNOTHREAD) &&
+				     same_thread_group(current, parent));
+}
+
+/*
+ * Optimization for waiting on PIDTYPE_PID. No need to iterate through child
+ * and tracee lists to find the target task.
+ */
+static int do_wait_pid(struct wait_opts *wo)
+{
+	bool ptrace;
+	struct task_struct *target;
+	int retval;
+
+	ptrace = false;
+	target = pid_task(wo->wo_pid, PIDTYPE_TGID);
+	if (target && is_effectively_child(wo, ptrace, target)) {
+		retval = wait_consider_task(wo, ptrace, target);
+		if (retval)
+			return retval;
+	}
+
+	ptrace = true;
+	target = pid_task(wo->wo_pid, PIDTYPE_PID);
+	if (target && target->ptrace &&
+	    is_effectively_child(wo, ptrace, target)) {
+		retval = wait_consider_task(wo, ptrace, target);
+		if (retval)
+			return retval;
+	}
+
+	return 0;
+}
+
 static long do_wait(struct wait_opts *wo)
 {
-	struct task_struct *tsk;
 	int retval;
 
 	trace_sched_process_wait(wo->wo_pid);
@@ -1464,19 +1503,27 @@ repeat:
 
 	set_current_state(TASK_INTERRUPTIBLE);
 	read_lock(&tasklist_lock);
-	tsk = current;
-	do {
-		retval = do_wait_thread(wo, tsk);
-		if (retval)
-			goto end;
 
-		retval = ptrace_do_wait(wo, tsk);
+	if (wo->wo_type == PIDTYPE_PID) {
+		retval = do_wait_pid(wo);
 		if (retval)
 			goto end;
+	} else {
+		struct task_struct *tsk = current;
 
-		if (wo->wo_flags & __WNOTHREAD)
-			break;
-	} while_each_thread(current, tsk);
+		do {
+			retval = do_wait_thread(wo, tsk);
+			if (retval)
+				goto end;
+
+			retval = ptrace_do_wait(wo, tsk);
+			if (retval)
+				goto end;
+
+			if (wo->wo_flags & __WNOTHREAD)
+				break;
+		} while_each_thread(current, tsk);
+	}
 	read_unlock(&tasklist_lock);
 
 notask:
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 46/91] kernel/fork.c: simplify copy_mm()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (44 preceding siblings ...)
  2021-05-07  1:04 ` [patch 45/91] do_wait: make PIDTYPE_PID case O(1) instead of O(n) Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 47/91] kernel/fork.c: fix typos Andrew Morton
                   ` (45 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, eb, linux-mm, mm-commits, torvalds

From: Rolf Eike Beer <eb@emlix.com>
Subject: kernel/fork.c: simplify copy_mm()

All this can happen without a single goto.

Link: https://lkml.kernel.org/r/2072685.XptgVkyDqn@devpool47
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/fork.c |   15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

--- a/kernel/fork.c~simplify-copy_mm
+++ a/kernel/fork.c
@@ -1396,7 +1396,6 @@ fail_nomem:
 static int copy_mm(unsigned long clone_flags, struct task_struct *tsk)
 {
 	struct mm_struct *mm, *oldmm;
-	int retval;
 
 	tsk->min_flt = tsk->maj_flt = 0;
 	tsk->nvcsw = tsk->nivcsw = 0;
@@ -1423,21 +1422,15 @@ static int copy_mm(unsigned long clone_f
 	if (clone_flags & CLONE_VM) {
 		mmget(oldmm);
 		mm = oldmm;
-		goto good_mm;
+	} else {
+		mm = dup_mm(tsk, current->mm);
+		if (!mm)
+			return -ENOMEM;
 	}
 
-	retval = -ENOMEM;
-	mm = dup_mm(tsk, current->mm);
-	if (!mm)
-		goto fail_nomem;
-
-good_mm:
 	tsk->mm = mm;
 	tsk->active_mm = mm;
 	return 0;
-
-fail_nomem:
-	return retval;
 }
 
 static int copy_fs(unsigned long clone_flags, struct task_struct *tsk)
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 47/91] kernel/fork.c: fix typos
  2021-05-07  1:01 incoming Andrew Morton
                   ` (45 preceding siblings ...)
  2021-05-07  1:04 ` [patch 46/91] kernel/fork.c: simplify copy_mm() Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation Andrew Morton
                   ` (44 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, axboe, caoxiaofeng, christian.brauner, cxfcosmos, linux-mm,
	mm-commits, torvalds

From: Xiaofeng Cao <cxfcosmos@gmail.com>
Subject: kernel/fork.c: fix typos

Change 'ancestoral' to 'ancestral'.
Change 'reuseable' to 'reusable'.
Delete a grammatically superfluous 'do'.

Link: https://lkml.kernel.org/r/20210317082031.11692-1-caoxiaofeng@yulong.com
Signed-off-by: Xiaofeng Cao <caoxiaofeng@yulong.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/fork.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/kernel/fork.c~kernel-fork-fix-typo-issue
+++ a/kernel/fork.c
@@ -1145,7 +1145,7 @@ void mmput_async(struct mm_struct *mm)
  * invocations: in mmput() nobody alive left, in execve task is single
  * threaded. sys_prctl(PR_SET_MM_MAP/EXE_FILE) also needs to set the
  * mm->exe_file, but does so without using set_mm_exe_file() in order
- * to do avoid the need for any locks.
+ * to avoid the need for any locks.
  */
 void set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
 {
@@ -1736,7 +1736,7 @@ static int pidfd_release(struct inode *i
  * /proc/<pid>/status where Pid and NSpid are always shown relative to
  * the  pid namespace of the procfs instance. The difference becomes
  * obvious when sending around a pidfd between pid namespaces from a
- * different branch of the tree, i.e. where no ancestoral relation is
+ * different branch of the tree, i.e. where no ancestral relation is
  * present between the pid namespaces:
  * - create two new pid namespaces ns1 and ns2 in the initial pid
  *   namespace (also take care to create new mount namespaces in the
@@ -2728,8 +2728,8 @@ static bool clone3_args_valid(struct ker
 		return false;
 
 	/*
-	 * - make the CLONE_DETACHED bit reuseable for clone3
-	 * - make the CSIGNAL bits reuseable for clone3
+	 * - make the CLONE_DETACHED bit reusable for clone3
+	 * - make the CSIGNAL bits reusable for clone3
 	 */
 	if (kargs->flags & (CLONE_DETACHED | CSIGNAL))
 		return false;
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-07  1:01 incoming Andrew Morton
                   ` (46 preceding siblings ...)
  2021-05-07  1:04 ` [patch 47/91] kernel/fork.c: fix typos Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  7:25   ` Linus Torvalds
  2021-05-07  8:16   ` David Hildenbrand
  2021-05-07  1:04 ` [patch 49/91] kexec: add kexec reboot string Andrew Morton
                   ` (43 subsequent siblings)
  91 siblings, 2 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, andreyknvl, bhe, christian.brauner, colin.king, corbet,
	dyoung, frederic, gpiccoli, john.p.donnelly, jpoimboe, keescook,
	linux-mm, masahiroy, mchehab+huawei, mike.kravetz, mingo,
	mm-commits, paulmck, peterz, rdunlap, rostedt, rppt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2

From: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>
Subject: kernel/crash_core: add crashkernel=auto for vmcore creation

This adds crashkernel=auto feature to configure reserved memory for vmcore
creation.  CONFIG_CRASH_AUTO_STR is defined to be set for different kernel
distributions and different archs based on their needs.
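
The range syntax used by CONFIG_CRASH_AUTO_STR can be illustrated with a
small userspace sketch.  demo_memparse() and demo_pick_crash_size() are
hypothetical names, a simplified stand-in for the kernel's own parsing
(no @offset handling, minimal error checking), and the half-open match
(start <= RAM < end) is an assumption based on the "exclusive" wording
in the help text:

```c
#include <limits.h>
#include <stdlib.h>

/* Parse "<number>[KMGT]"; *end is set past the consumed characters. */
static unsigned long long demo_memparse(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);
	unsigned int shift;

	switch (**end) {
	case 'K': shift = 10; break;
	case 'M': shift = 20; break;
	case 'G': shift = 30; break;
	case 'T': shift = 40; break;
	default:
		return v;	/* no unit suffix */
	}
	(*end)++;
	return v << shift;
}

/*
 * Walk "<start>-[<end>]:<size>[,...]" and return the size of the first
 * range containing system_ram, or 0 if none matches or on a syntax error.
 */
static unsigned long long demo_pick_crash_size(const char *spec,
						unsigned long long system_ram)
{
	const char *cur = spec;
	char *tmp;

	while (*cur) {
		unsigned long long start, end = ULLONG_MAX, size;

		start = demo_memparse(cur, &tmp);
		if (tmp == cur || *tmp != '-')
			return 0;
		cur = tmp + 1;

		if (*cur != ':') {	/* an open-ended range has no end */
			end = demo_memparse(cur, &tmp);
			if (tmp == cur)
				return 0;
			cur = tmp;
		}
		if (*cur != ':')
			return 0;
		cur++;

		size = demo_memparse(cur, &tmp);
		if (tmp == cur)
			return 0;
		cur = tmp;

		if (system_ram >= start && system_ram < end)
			return size;

		if (*cur != ',')
			break;
		cur++;
	}
	return 0;
}
```

Under these assumptions, the default string
"1G-64G:128M,64G-1T:256M,1T-:512M" selects 128M for a machine with 8G
of RAM and nothing for a machine with less than 1G.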

Link: https://lkml.kernel.org/r/20210223174153.72802-1-saeed.mirzamohammadi@oracle.com
Signed-off-by: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>
Signed-off-by: John Donnelly <john.p.donnelly@oracle.com>
Tested-by: John Donnelly <john.p.donnelly@oracle.com>
ed-by: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Cc: YiFei Zhu <yifeifz2@illinois.edu>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/kdump/kdump.rst       |    3 +-
 Documentation/admin-guide/kernel-parameters.txt |    6 ++++
 arch/Kconfig                                    |   20 ++++++++++++++
 kernel/crash_core.c                             |    7 ++++
 4 files changed, 35 insertions(+), 1 deletion(-)

--- a/arch/Kconfig~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
+++ a/arch/Kconfig
@@ -14,6 +14,26 @@ menu "General architecture-dependent opt
 config CRASH_CORE
 	bool
 
+config CRASH_AUTO_STR
+	string "Memory reserved for crash kernel"
+	depends on CRASH_CORE
+	default "1G-64G:128M,64G-1T:256M,1T-:512M"
+	help
+	  This configures the reserved memory dependent
+	  on the value of System RAM. The syntax is:
+	  crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
+	              range=start-[end]
+
+	  For example:
+	      crashkernel=512M-2G:64M,2G-:128M
+
+	  This would mean:
+
+	      1) if the RAM is smaller than 512M, then don't reserve anything
+	         (this is the "rescue" case)
+	      2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
+	      3) if the RAM size is larger than 2G, then reserve 128M
+
 config KEXEC_CORE
 	select CRASH_CORE
 	bool
--- a/Documentation/admin-guide/kdump/kdump.rst~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
+++ a/Documentation/admin-guide/kdump/kdump.rst
@@ -285,7 +285,8 @@ This would mean:
     2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
     3) if the RAM size is larger than 2G, then reserve 128M
 
-
+Or you can use crashkernel=auto to choose the crash kernel memory size
+based on the recommended configuration set for each arch.
 
 Boot into System Kernel
 =======================
--- a/Documentation/admin-guide/kernel-parameters.txt~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
+++ a/Documentation/admin-guide/kernel-parameters.txt
@@ -751,6 +751,12 @@
 			a memory unit (amount[KMG]). See also
 			Documentation/admin-guide/kdump/kdump.rst for an example.
 
+	crashkernel=auto
+			[KNL] This parameter will set the reserved memory for
+			the crash kernel based on the value of the CRASH_AUTO_STR
+			that is the best effort estimation for each arch. See also
+			arch/Kconfig for further details.
+
 	crashkernel=size[KMG],high
 			[KNL, X86-64] range could be above 4G. Allow kernel
 			to allocate physical memory region from top, so could
--- a/kernel/crash_core.c~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
+++ a/kernel/crash_core.c
@@ -7,6 +7,7 @@
 #include <linux/crash_core.h>
 #include <linux/utsname.h>
 #include <linux/vmalloc.h>
+#include <linux/kexec.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -250,6 +251,12 @@ static int __init __parse_crashkernel(ch
 	if (suffix)
 		return parse_crashkernel_suffix(ck_cmdline, crash_size,
 				suffix);
+#ifdef CONFIG_CRASH_AUTO_STR
+	if (strncmp(ck_cmdline, "auto", 4) == 0) {
+		ck_cmdline = CONFIG_CRASH_AUTO_STR;
+		pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
+	}
+#endif
 	/*
 	 * if the commandline contains a ':', then that's the extended
 	 * syntax -- if not, it must be the classic syntax
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 49/91] kexec: add kexec reboot string
  2021-05-07  1:01 incoming Andrew Morton
                   ` (47 preceding siblings ...)
  2021-05-07  1:04 ` [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 50/91] kernel: kexec_file: fix error return code of kexec_calculate_store_digests() Andrew Morton
                   ` (42 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, bhe, jolevequ, lguohan, linux-mm, mm-commits, pmenzel, torvalds

From: Joe LeVeque <jolevequ@microsoft.com>
Subject: kexec: Add kexec reboot string

Pass "kexec reboot" instead of NULL to kernel_restart_prepare().  The
purpose is to notify the kernel module for fast reboot.

Upstream a patch from the SONiC network operating system [1].

[1]: https://github.com/Azure/sonic-linux-kernel/pull/46

Link: https://lkml.kernel.org/r/20210304124626.13927-1-pmenzel@molgen.mpg.de
Signed-off-by: Joe LeVeque <jolevequ@microsoft.com>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Guohan Lu <lguohan@gmail.com>
Cc: Joe LeVeque <jolevequ@microsoft.com>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kexec_core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/kexec_core.c~kexec-add-kexec-reboot-string
+++ a/kernel/kexec_core.c
@@ -1165,7 +1165,7 @@ int kernel_kexec(void)
 #endif
 	{
 		kexec_in_progress = true;
-		kernel_restart_prepare(NULL);
+		kernel_restart_prepare("kexec reboot");
 		migrate_to_reboot_cpu();
 
 		/*
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 50/91] kernel: kexec_file: fix error return code of kexec_calculate_store_digests()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (48 preceding siblings ...)
  2021-05-07  1:04 ` [patch 49/91] kexec: add kexec reboot string Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 51/91] kexec: dump kmessage before machine_kexec Andrew Morton
                   ` (41 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, baijiaju1990, bhe, linux-mm, mm-commits, oslab, torvalds

From: Jia-Ju Bai <baijiaju1990@gmail.com>
Subject: kernel: kexec_file: fix error return code of kexec_calculate_store_digests()

When vzalloc() fails and sha_regions is NULL,
kexec_calculate_store_digests() jumps to its cleanup path without
assigning an error return code.  Fix this by assigning -ENOMEM to ret in
that case.
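
The bug class can be shown with a small userspace sketch of the
goto-cleanup pattern.  Everything here is hypothetical (compute_digests(),
test_alloc(), fail_at); it only assumes that ret still holds 0 when the
second allocation fails, which is what makes the missing assignment
harmful.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Test hook: the allocation with this index fails; -1 means never. */
static int fail_at = -1;
static int alloc_count;

static void *test_alloc(size_t size)
{
	if (alloc_count++ == fail_at)
		return NULL;
	return malloc(size);
}

static int compute_digests(size_t nregions)
{
	int ret = 0;
	char *desc, *regions;

	desc = test_alloc(64);
	if (!desc) {
		ret = -ENOMEM;
		goto out;
	}

	regions = test_alloc(nregions * 16);
	if (!regions) {
		/* Without this line the caller would see 0, i.e. success. */
		ret = -ENOMEM;
		goto out_free_desc;
	}

	/* ... the real work would go here ... */

	free(regions);
out_free_desc:
	free(desc);
out:
	return ret;
}
```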

Link: https://lkml.kernel.org/r/20210309083904.24321-1-baijiaju1990@gmail.com
Fixes: a43cac0d9dc2 ("kexec: split kexec_file syscall code to kexec_file.c")
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kexec_file.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/kernel/kexec_file.c~kernel-kexec_file-fix-error-return-code-of-kexec_calculate_store_digests
+++ a/kernel/kexec_file.c
@@ -740,8 +740,10 @@ static int kexec_calculate_store_digests
 
 	sha_region_sz = KEXEC_SEGMENT_MAX * sizeof(struct kexec_sha_region);
 	sha_regions = vzalloc(sha_region_sz);
-	if (!sha_regions)
+	if (!sha_regions) {
+		ret = -ENOMEM;
 		goto out_free_desc;
+	}
 
 	desc->tfm   = tfm;
 
_

* [patch 51/91] kexec: dump kmessage before machine_kexec
  2021-05-07  1:01 incoming Andrew Morton
                   ` (49 preceding siblings ...)
  2021-05-07  1:04 ` [patch 50/91] kernel: kexec_file: fix error return code of kexec_calculate_store_digests() Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 52/91] gcov: combine common code Andrew Morton
                   ` (40 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, anton, bhe, bhsharma, ccross, ebiederm, jmorris, keescook,
	linux-mm, mm-commits, pasha.tatashin, pmladek, sashal, tony.luck,
	torvalds, tyhicks

From: Pavel Tatashin <pasha.tatashin@soleen.com>
Subject: kexec: dump kmessage before machine_kexec

kmsg_dump(KMSG_DUMP_SHUTDOWN) is called before machine_restart(),
machine_halt(), and machine_power_off(); the only one that is missing is
machine_kexec().

The dmesg output that it contains can be used to study the shutdown
performance of both kernel and systemd during kexec reboot.

Here is an example of dmesg data collected after kexec:

root@dplat-cp22:~# cat /sys/fs/pstore/dmesg-ramoops-0 | tail
...
<6>[   70.914592] psci: CPU3 killed (polled 0 ms)
<5>[   70.915705] CPU4: shutdown
<6>[   70.916643] psci: CPU4 killed (polled 4 ms)
<5>[   70.917715] CPU5: shutdown
<6>[   70.918725] psci: CPU5 killed (polled 0 ms)
<5>[   70.919704] CPU6: shutdown
<6>[   70.920726] psci: CPU6 killed (polled 4 ms)
<5>[   70.921642] CPU7: shutdown
<6>[   70.922650] psci: CPU7 killed (polled 0 ms)
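
To study shutdown timing from such a dump, the timestamps can be pulled
out of each line.  The following is a rough userspace sketch, not part of
the patch; parse_timestamp_us() is a hypothetical helper and it assumes
the "<level>[seconds.microseconds]" prefix with a six-digit fraction, as
in the lines above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse "<level>[ seconds.microseconds] text" and return the timestamp
 * in microseconds, or -1 if the line does not start with that prefix.
 * Assumes the fractional part always has six digits.
 */
static long long parse_timestamp_us(const char *line)
{
	int level;
	long long sec, usec;

	if (sscanf(line, "<%d>[%lld.%6lld]", &level, &sec, &usec) != 3)
		return -1;

	return sec * 1000000LL + usec;
}
```

Subtracting the value for the first line from the value for the last
line gives the time spent taking the CPUs down.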

Link: https://lkml.kernel.org/r/20210319192326.146000-2-pasha.tatashin@soleen.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: James Morris <jmorris@namei.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kexec_core.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/kernel/kexec_core.c~kexec-dump-kmessage-before-machine_kexec
+++ a/kernel/kexec_core.c
@@ -37,6 +37,7 @@
 #include <linux/compiler.h>
 #include <linux/hugetlb.h>
 #include <linux/objtool.h>
+#include <linux/kmsg_dump.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -1179,6 +1180,7 @@ int kernel_kexec(void)
 		machine_shutdown();
 	}
 
+	kmsg_dump(KMSG_DUMP_SHUTDOWN);
 	machine_kexec(kexec_image);
 
 #ifdef CONFIG_KEXEC_JUMP
_

* [patch 52/91] gcov: combine common code
  2021-05-07  1:01 incoming Andrew Morton
                   ` (50 preceding siblings ...)
  2021-05-07  1:04 ` [patch 51/91] kexec: dump kmessage before machine_kexec Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 53/91] gcov: simplify buffer allocation Andrew Morton
                   ` (39 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, johannes.berg, linux-mm, mm-commits, oberpar, torvalds

From: Johannes Berg <johannes.berg@intel.com>
Subject: gcov: combine common code

There's a lot of duplicated code between the gcc and clang
implementations.  Move it over to fs.c (and the store helpers to base.c)
to simplify the code.  There's no reason to believe that, for small data
like this, one would not just implement the simple convert_to_gcda()
function.
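
The shared helpers rely on a two-pass idiom: call the converter with a
NULL buffer to learn the size, then call it again to fill the buffer.
Below is a userspace sketch of that idiom under stated assumptions:
store_u32(), store_u64() and encode_record() are hypothetical stand-ins,
and memcpy() replaces the kernel's direct pointer store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for store_gcov_u32(): write only if buffer is non-NULL. */
static size_t store_u32(void *buffer, size_t off, uint32_t v)
{
	if (buffer)
		memcpy((char *)buffer + off, &v, sizeof(v));

	return sizeof(v);
}

/* Stand-in for store_gcov_u64(): two 32 bit words, low part first. */
static size_t store_u64(void *buffer, size_t off, uint64_t v)
{
	size_t pos = off;

	pos += store_u32(buffer, pos, (uint32_t)(v & 0xffffffffUL));
	pos += store_u32(buffer, pos, (uint32_t)(v >> 32));

	return pos - off;
}

/* Hypothetical record: one tag word followed by one 64 bit counter. */
static size_t encode_record(void *buffer, uint32_t tag, uint64_t counter)
{
	size_t pos = 0;

	pos += store_u32(buffer, pos, tag);
	pos += store_u64(buffer, pos, counter);

	return pos;
}
```

Because both passes run the same code, the size reported by the dry run
cannot drift from what the fill pass writes.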

Link: https://lkml.kernel.org/r/20210315235453.e3fbb86e99a0.I08a3ee6dbe47ea3e8024956083f162884a958e40@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/gcov/base.c    |   49 +++++++++++
 kernel/gcov/clang.c   |  167 ----------------------------------------
 kernel/gcov/fs.c      |  116 +++++++++++++++++++++++++++
 kernel/gcov/gcc_4_7.c |  167 ----------------------------------------
 kernel/gcov/gcov.h    |   14 ---
 5 files changed, 171 insertions(+), 342 deletions(-)

--- a/kernel/gcov/base.c~gcov-combine-common-code
+++ a/kernel/gcov/base.c
@@ -49,6 +49,55 @@ void gcov_enable_events(void)
 	mutex_unlock(&gcov_lock);
 }
 
+/**
+ * store_gcov_u32 - store 32 bit number in gcov format to buffer
+ * @buffer: target buffer or NULL
+ * @off: offset into the buffer
+ * @v: value to be stored
+ *
+ * Number format defined by gcc: numbers are recorded in the 32 bit
+ * unsigned binary form of the endianness of the machine generating the
+ * file. Returns the number of bytes stored. If @buffer is %NULL, doesn't
+ * store anything.
+ */
+size_t store_gcov_u32(void *buffer, size_t off, u32 v)
+{
+	u32 *data;
+
+	if (buffer) {
+		data = buffer + off;
+		*data = v;
+	}
+
+	return sizeof(*data);
+}
+
+/**
+ * store_gcov_u64 - store 64 bit number in gcov format to buffer
+ * @buffer: target buffer or NULL
+ * @off: offset into the buffer
+ * @v: value to be stored
+ *
+ * Number format defined by gcc: numbers are recorded in the 32 bit
+ * unsigned binary form of the endianness of the machine generating the
+ * file. 64 bit numbers are stored as two 32 bit numbers, the low part
+ * first. Returns the number of bytes stored. If @buffer is %NULL, doesn't store
+ * anything.
+ */
+size_t store_gcov_u64(void *buffer, size_t off, u64 v)
+{
+	u32 *data;
+
+	if (buffer) {
+		data = buffer + off;
+
+		data[0] = (v & 0xffffffffUL);
+		data[1] = (v >> 32);
+	}
+
+	return sizeof(*data) * 2;
+}
+
 #ifdef CONFIG_MODULES
 /* Update list and generate events when modules are unloaded. */
 static int gcov_module_notifier(struct notifier_block *nb, unsigned long event,
--- a/kernel/gcov/clang.c~gcov-combine-common-code
+++ a/kernel/gcov/clang.c
@@ -48,7 +48,6 @@
 #include <linux/list.h>
 #include <linux/printk.h>
 #include <linux/ratelimit.h>
-#include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 #include "gcov.h"
@@ -449,71 +448,6 @@ void gcov_info_free(struct gcov_info *in
 }
 #endif
 
-#define ITER_STRIDE	PAGE_SIZE
-
-/**
- * struct gcov_iterator - specifies current file position in logical records
- * @info: associated profiling data
- * @buffer: buffer containing file data
- * @size: size of buffer
- * @pos: current position in file
- */
-struct gcov_iterator {
-	struct gcov_info *info;
-	void *buffer;
-	size_t size;
-	loff_t pos;
-};
-
-/**
- * store_gcov_u32 - store 32 bit number in gcov format to buffer
- * @buffer: target buffer or NULL
- * @off: offset into the buffer
- * @v: value to be stored
- *
- * Number format defined by gcc: numbers are recorded in the 32 bit
- * unsigned binary form of the endianness of the machine generating the
- * file. Returns the number of bytes stored. If @buffer is %NULL, doesn't
- * store anything.
- */
-static size_t store_gcov_u32(void *buffer, size_t off, u32 v)
-{
-	u32 *data;
-
-	if (buffer) {
-		data = buffer + off;
-		*data = v;
-	}
-
-	return sizeof(*data);
-}
-
-/**
- * store_gcov_u64 - store 64 bit number in gcov format to buffer
- * @buffer: target buffer or NULL
- * @off: offset into the buffer
- * @v: value to be stored
- *
- * Number format defined by gcc: numbers are recorded in the 32 bit
- * unsigned binary form of the endianness of the machine generating the
- * file. 64 bit numbers are stored as two 32 bit numbers, the low part
- * first. Returns the number of bytes stored. If @buffer is %NULL, doesn't store
- * anything.
- */
-static size_t store_gcov_u64(void *buffer, size_t off, u64 v)
-{
-	u32 *data;
-
-	if (buffer) {
-		data = buffer + off;
-
-		data[0] = (v & 0xffffffffUL);
-		data[1] = (v >> 32);
-	}
-
-	return sizeof(*data) * 2;
-}
-
 /**
  * convert_to_gcda - convert profiling data set to gcda file format
  * @buffer: the buffer to store file data or %NULL if no data should be stored
@@ -521,7 +455,7 @@ static size_t store_gcov_u64(void *buffe
  *
  * Returns the number of bytes that were/would have been stored into the buffer.
  */
-static size_t convert_to_gcda(char *buffer, struct gcov_info *info)
+size_t convert_to_gcda(char *buffer, struct gcov_info *info)
 {
 	struct gcov_fn_info *fi_ptr;
 	size_t pos = 0;
@@ -558,102 +492,3 @@ static size_t convert_to_gcda(char *buff
 
 	return pos;
 }
-
-/**
- * gcov_iter_new - allocate and initialize profiling data iterator
- * @info: profiling data set to be iterated
- *
- * Return file iterator on success, %NULL otherwise.
- */
-struct gcov_iterator *gcov_iter_new(struct gcov_info *info)
-{
-	struct gcov_iterator *iter;
-
-	iter = kzalloc(sizeof(struct gcov_iterator), GFP_KERNEL);
-	if (!iter)
-		goto err_free;
-
-	iter->info = info;
-	/* Dry-run to get the actual buffer size. */
-	iter->size = convert_to_gcda(NULL, info);
-	iter->buffer = vmalloc(iter->size);
-	if (!iter->buffer)
-		goto err_free;
-
-	convert_to_gcda(iter->buffer, info);
-
-	return iter;
-
-err_free:
-	kfree(iter);
-	return NULL;
-}
-
-
-/**
- * gcov_iter_get_info - return profiling data set for given file iterator
- * @iter: file iterator
- */
-void gcov_iter_free(struct gcov_iterator *iter)
-{
-	vfree(iter->buffer);
-	kfree(iter);
-}
-
-/**
- * gcov_iter_get_info - return profiling data set for given file iterator
- * @iter: file iterator
- */
-struct gcov_info *gcov_iter_get_info(struct gcov_iterator *iter)
-{
-	return iter->info;
-}
-
-/**
- * gcov_iter_start - reset file iterator to starting position
- * @iter: file iterator
- */
-void gcov_iter_start(struct gcov_iterator *iter)
-{
-	iter->pos = 0;
-}
-
-/**
- * gcov_iter_next - advance file iterator to next logical record
- * @iter: file iterator
- *
- * Return zero if new position is valid, non-zero if iterator has reached end.
- */
-int gcov_iter_next(struct gcov_iterator *iter)
-{
-	if (iter->pos < iter->size)
-		iter->pos += ITER_STRIDE;
-
-	if (iter->pos >= iter->size)
-		return -EINVAL;
-
-	return 0;
-}
-
-/**
- * gcov_iter_write - write data for current pos to seq_file
- * @iter: file iterator
- * @seq: seq_file handle
- *
- * Return zero on success, non-zero otherwise.
- */
-int gcov_iter_write(struct gcov_iterator *iter, struct seq_file *seq)
-{
-	size_t len;
-
-	if (iter->pos >= iter->size)
-		return -EINVAL;
-
-	len = ITER_STRIDE;
-	if (iter->pos + len > iter->size)
-		len = iter->size - iter->pos;
-
-	seq_write(seq, iter->buffer + iter->pos, len);
-
-	return 0;
-}
--- a/kernel/gcov/fs.c~gcov-combine-common-code
+++ a/kernel/gcov/fs.c
@@ -26,6 +26,7 @@
 #include <linux/slab.h>
 #include <linux/mutex.h>
 #include <linux/seq_file.h>
+#include <linux/vmalloc.h>
 #include "gcov.h"
 
 /**
@@ -85,6 +86,121 @@ static int __init gcov_persist_setup(cha
 }
 __setup("gcov_persist=", gcov_persist_setup);
 
+#define ITER_STRIDE	PAGE_SIZE
+
+/**
+ * struct gcov_iterator - specifies current file position in logical records
+ * @info: associated profiling data
+ * @buffer: buffer containing file data
+ * @size: size of buffer
+ * @pos: current position in file
+ */
+struct gcov_iterator {
+	struct gcov_info *info;
+	void *buffer;
+	size_t size;
+	loff_t pos;
+};
+
+/**
+ * gcov_iter_new - allocate and initialize profiling data iterator
+ * @info: profiling data set to be iterated
+ *
+ * Return file iterator on success, %NULL otherwise.
+ */
+static struct gcov_iterator *gcov_iter_new(struct gcov_info *info)
+{
+	struct gcov_iterator *iter;
+
+	iter = kzalloc(sizeof(struct gcov_iterator), GFP_KERNEL);
+	if (!iter)
+		goto err_free;
+
+	iter->info = info;
+	/* Dry-run to get the actual buffer size. */
+	iter->size = convert_to_gcda(NULL, info);
+	iter->buffer = vmalloc(iter->size);
+	if (!iter->buffer)
+		goto err_free;
+
+	convert_to_gcda(iter->buffer, info);
+
+	return iter;
+
+err_free:
+	kfree(iter);
+	return NULL;
+}
+
+
+/**
+ * gcov_iter_free - free iterator data
+ * @iter: file iterator
+ */
+static void gcov_iter_free(struct gcov_iterator *iter)
+{
+	vfree(iter->buffer);
+	kfree(iter);
+}
+
+/**
+ * gcov_iter_get_info - return profiling data set for given file iterator
+ * @iter: file iterator
+ */
+static struct gcov_info *gcov_iter_get_info(struct gcov_iterator *iter)
+{
+	return iter->info;
+}
+
+/**
+ * gcov_iter_start - reset file iterator to starting position
+ * @iter: file iterator
+ */
+static void gcov_iter_start(struct gcov_iterator *iter)
+{
+	iter->pos = 0;
+}
+
+/**
+ * gcov_iter_next - advance file iterator to next logical record
+ * @iter: file iterator
+ *
+ * Return zero if new position is valid, non-zero if iterator has reached end.
+ */
+static int gcov_iter_next(struct gcov_iterator *iter)
+{
+	if (iter->pos < iter->size)
+		iter->pos += ITER_STRIDE;
+
+	if (iter->pos >= iter->size)
+		return -EINVAL;
+
+	return 0;
+}
+
+/**
+ * gcov_iter_write - write data for current pos to seq_file
+ * @iter: file iterator
+ * @seq: seq_file handle
+ *
+ * Return zero on success, non-zero otherwise.
+ */
+static int gcov_iter_write(struct gcov_iterator *iter, struct seq_file *seq)
+{
+	size_t len;
+
+	if (iter->pos >= iter->size)
+		return -EINVAL;
+
+	len = ITER_STRIDE;
+	if (iter->pos + len > iter->size)
+		len = iter->size - iter->pos;
+
+	seq_write(seq, iter->buffer + iter->pos, len);
+
+	return 0;
+}
+
 /*
  * seq_file.start() implementation for gcov data files. Note that the
  * gcov_iterator interface is designed to be more restrictive than seq_file
--- a/kernel/gcov/gcc_4_7.c~gcov-combine-common-code
+++ a/kernel/gcov/gcc_4_7.c
@@ -15,7 +15,6 @@
 #include <linux/errno.h>
 #include <linux/slab.h>
 #include <linux/string.h>
-#include <linux/seq_file.h>
 #include <linux/vmalloc.h>
 #include "gcov.h"
 
@@ -363,71 +362,6 @@ free_info:
 	kfree(info);
 }
 
-#define ITER_STRIDE	PAGE_SIZE
-
-/**
- * struct gcov_iterator - specifies current file position in logical records
- * @info: associated profiling data
- * @buffer: buffer containing file data
- * @size: size of buffer
- * @pos: current position in file
- */
-struct gcov_iterator {
-	struct gcov_info *info;
-	void *buffer;
-	size_t size;
-	loff_t pos;
-};
-
-/**
- * store_gcov_u32 - store 32 bit number in gcov format to buffer
- * @buffer: target buffer or NULL
- * @off: offset into the buffer
- * @v: value to be stored
- *
- * Number format defined by gcc: numbers are recorded in the 32 bit
- * unsigned binary form of the endianness of the machine generating the
- * file. Returns the number of bytes stored. If @buffer is %NULL, doesn't
- * store anything.
- */
-static size_t store_gcov_u32(void *buffer, size_t off, u32 v)
-{
-	u32 *data;
-
-	if (buffer) {
-		data = buffer + off;
-		*data = v;
-	}
-
-	return sizeof(*data);
-}
-
-/**
- * store_gcov_u64 - store 64 bit number in gcov format to buffer
- * @buffer: target buffer or NULL
- * @off: offset into the buffer
- * @v: value to be stored
- *
- * Number format defined by gcc: numbers are recorded in the 32 bit
- * unsigned binary form of the endianness of the machine generating the
- * file. 64 bit numbers are stored as two 32 bit numbers, the low part
- * first. Returns the number of bytes stored. If @buffer is %NULL, doesn't store
- * anything.
- */
-static size_t store_gcov_u64(void *buffer, size_t off, u64 v)
-{
-	u32 *data;
-
-	if (buffer) {
-		data = buffer + off;
-
-		data[0] = (v & 0xffffffffUL);
-		data[1] = (v >> 32);
-	}
-
-	return sizeof(*data) * 2;
-}
-
 /**
  * convert_to_gcda - convert profiling data set to gcda file format
  * @buffer: the buffer to store file data or %NULL if no data should be stored
@@ -435,7 +369,7 @@ static size_t store_gcov_u64(void *buffe
  *
  * Returns the number of bytes that were/would have been stored into the buffer.
  */
-static size_t convert_to_gcda(char *buffer, struct gcov_info *info)
+size_t convert_to_gcda(char *buffer, struct gcov_info *info)
 {
 	struct gcov_fn_info *fi_ptr;
 	struct gcov_ctr_info *ci_ptr;
@@ -481,102 +415,3 @@ static size_t convert_to_gcda(char *buff
 
 	return pos;
 }
-
-/**
- * gcov_iter_new - allocate and initialize profiling data iterator
- * @info: profiling data set to be iterated
- *
- * Return file iterator on success, %NULL otherwise.
- */
-struct gcov_iterator *gcov_iter_new(struct gcov_info *info)
-{
-	struct gcov_iterator *iter;
-
-	iter = kzalloc(sizeof(struct gcov_iterator), GFP_KERNEL);
-	if (!iter)
-		goto err_free;
-
-	iter->info = info;
-	/* Dry-run to get the actual buffer size. */
-	iter->size = convert_to_gcda(NULL, info);
-	iter->buffer = vmalloc(iter->size);
-	if (!iter->buffer)
-		goto err_free;
-
-	convert_to_gcda(iter->buffer, info);
-
-	return iter;
-
-err_free:
-	kfree(iter);
-	return NULL;
-}
-
-
-/**
- * gcov_iter_get_info - return profiling data set for given file iterator
- * @iter: file iterator
- */
-void gcov_iter_free(struct gcov_iterator *iter)
-{
-	vfree(iter->buffer);
-	kfree(iter);
-}
-
-/**
- * gcov_iter_get_info - return profiling data set for given file iterator
- * @iter: file iterator
- */
-struct gcov_info *gcov_iter_get_info(struct gcov_iterator *iter)
-{
-	return iter->info;
-}
-
-/**
- * gcov_iter_start - reset file iterator to starting position
- * @iter: file iterator
- */
-void gcov_iter_start(struct gcov_iterator *iter)
-{
-	iter->pos = 0;
-}
-
-/**
- * gcov_iter_next - advance file iterator to next logical record
- * @iter: file iterator
- *
- * Return zero if new position is valid, non-zero if iterator has reached end.
- */
-int gcov_iter_next(struct gcov_iterator *iter)
-{
-	if (iter->pos < iter->size)
-		iter->pos += ITER_STRIDE;
-
-	if (iter->pos >= iter->size)
-		return -EINVAL;
-
-	return 0;
-}
-
-/**
- * gcov_iter_write - write data for current pos to seq_file
- * @iter: file iterator
- * @seq: seq_file handle
- *
- * Return zero on success, non-zero otherwise.
- */
-int gcov_iter_write(struct gcov_iterator *iter, struct seq_file *seq)
-{
-	size_t len;
-
-	if (iter->pos >= iter->size)
-		return -EINVAL;
-
-	len = ITER_STRIDE;
-	if (iter->pos + len > iter->size)
-		len = iter->size - iter->pos;
-
-	seq_write(seq, iter->buffer + iter->pos, len);
-
-	return 0;
-}
--- a/kernel/gcov/gcov.h~gcov-combine-common-code
+++ a/kernel/gcov/gcov.h
@@ -48,6 +48,7 @@ struct gcov_info *gcov_info_next(struct
 void gcov_info_link(struct gcov_info *info);
 void gcov_info_unlink(struct gcov_info *prev, struct gcov_info *info);
 bool gcov_info_within_module(struct gcov_info *info, struct module *mod);
+size_t convert_to_gcda(char *buffer, struct gcov_info *info);
 
 /* Base interface. */
 enum gcov_action {
@@ -58,16 +59,9 @@ enum gcov_action {
 void gcov_event(enum gcov_action action, struct gcov_info *info);
 void gcov_enable_events(void);
 
-/* Iterator control. */
-struct seq_file;
-struct gcov_iterator;
-
-struct gcov_iterator *gcov_iter_new(struct gcov_info *info);
-void gcov_iter_free(struct gcov_iterator *iter);
-void gcov_iter_start(struct gcov_iterator *iter);
-int gcov_iter_next(struct gcov_iterator *iter);
-int gcov_iter_write(struct gcov_iterator *iter, struct seq_file *seq);
-struct gcov_info *gcov_iter_get_info(struct gcov_iterator *iter);
+/* writing helpers */
+size_t store_gcov_u32(void *buffer, size_t off, u32 v);
+size_t store_gcov_u64(void *buffer, size_t off, u64 v);
 
 /* gcov_info control. */
 void gcov_info_reset(struct gcov_info *info);
_

* [patch 53/91] gcov: simplify buffer allocation
  2021-05-07  1:01 incoming Andrew Morton
                   ` (51 preceding siblings ...)
  2021-05-07  1:04 ` [patch 52/91] gcov: combine common code Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 54/91] gcov: use kvmalloc() Andrew Morton
                   ` (38 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, johannes.berg, linux-mm, mm-commits, oberpar, torvalds

From: Johannes Berg <johannes.berg@intel.com>
Subject: gcov: simplify buffer allocation

Use just a single vmalloc() with struct_size() instead of a separate
kmalloc() for the iter struct.
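
As a rough userspace analogue of the single-allocation layout, the
sketch below puts the header and the data in one block by way of a
flexible array member.  struct iter, iter_new() and iter_free() are
hypothetical names, malloc() stands in for vmalloc(), and the explicit
overflow check stands in for what struct_size() provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct iter {
	size_t size;
	long pos;
	char buffer[];	/* flexible array member, must be last */
};

static struct iter *iter_new(size_t size)
{
	struct iter *it;

	/* Refuse sizes whose total would overflow size_t. */
	if (size > SIZE_MAX - sizeof(*it))
		return NULL;

	/* One allocation covers both the header and the data. */
	it = malloc(sizeof(*it) + size);
	if (!it)
		return NULL;

	it->size = size;
	it->pos = 0;
	return it;
}

static void iter_free(struct iter *it)
{
	/* A single free releases header and buffer together. */
	free(it);
}
```

With one allocation there is only one failure point, which is why the
err_free label can go away.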

Link: https://lkml.kernel.org/r/20210315235453.b6de4a92096e.Iac40a5166589cefbff8449e466bd1b38ea7a17af@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/gcov/fs.c |   24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

--- a/kernel/gcov/fs.c~gcov-simplify-buffer-allocation
+++ a/kernel/gcov/fs.c
@@ -97,9 +97,9 @@ __setup("gcov_persist=", gcov_persist_se
  */
 struct gcov_iterator {
 	struct gcov_info *info;
-	void *buffer;
 	size_t size;
 	loff_t pos;
+	char buffer[];
 };
 
 /**
@@ -111,25 +111,20 @@ struct gcov_iterator {
 static struct gcov_iterator *gcov_iter_new(struct gcov_info *info)
 {
 	struct gcov_iterator *iter;
+	size_t size;
+
+	/* Dry-run to get the actual buffer size. */
+	size = convert_to_gcda(NULL, info);
 
-	iter = kzalloc(sizeof(struct gcov_iterator), GFP_KERNEL);
+	iter = vmalloc(struct_size(iter, buffer, size));
 	if (!iter)
-		goto err_free;
+		return NULL;
 
 	iter->info = info;
-	/* Dry-run to get the actual buffer size. */
-	iter->size = convert_to_gcda(NULL, info);
-	iter->buffer = vmalloc(iter->size);
-	if (!iter->buffer)
-		goto err_free;
-
+	iter->size = size;
 	convert_to_gcda(iter->buffer, info);
 
 	return iter;
-
-err_free:
-	kfree(iter);
-	return NULL;
 }
 
 
@@ -139,8 +134,7 @@ err_free:
  */
 static void gcov_iter_free(struct gcov_iterator *iter)
 {
-	vfree(iter->buffer);
-	kfree(iter);
+	vfree(iter);
 }
 
 /**
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 54/91] gcov: use kvmalloc()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (52 preceding siblings ...)
  2021-05-07  1:04 ` [patch 53/91] gcov: simplify buffer allocation Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 55/91] gcov: clang: drop support for clang-10 and older Andrew Morton
                   ` (37 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, johannes.berg, linux-mm, mm-commits, ndesaulniers, oberpar,
	torvalds

From: Johannes Berg <johannes.berg@intel.com>
Subject: gcov: use kvmalloc()

Using vmalloc() in gcov is really quite wasteful: many of the objects
allocated are really small (e.g. I've seen 24 bytes).  Use kvmalloc() to
automatically pick the better of kmalloc() or vmalloc() depending on the
size.

[johannes.berg@intel.com: fix clang-11+ build]
  Link: https://lkml.kernel.org/r/20210412214210.6e1ecca9cdc5.I24459763acf0591d5e6b31c7e3a59890d802f79c@changeid
Link: https://lkml.kernel.org/r/20210315235453.799e7a9d627d.I741d0db096c6f312910f7f1bcdfde0fda20801a4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/gcov/clang.c   |   12 ++++++------
 kernel/gcov/fs.c      |    6 +++---
 kernel/gcov/gcc_4_7.c |    6 +++---
 3 files changed, 12 insertions(+), 12 deletions(-)

--- a/kernel/gcov/clang.c~gcov-use-kvmalloc
+++ a/kernel/gcov/clang.c
@@ -49,7 +49,7 @@
 #include <linux/printk.h>
 #include <linux/ratelimit.h>
 #include <linux/slab.h>
-#include <linux/vmalloc.h>
+#include <linux/mm.h>
 #include "gcov.h"
 
 typedef void (*llvm_gcov_callback)(void);
@@ -333,8 +333,8 @@ void gcov_info_add(struct gcov_info *dst
 static struct gcov_fn_info *gcov_fn_info_dup(struct gcov_fn_info *fn)
 {
 	size_t cv_size; /* counter values size */
-	struct gcov_fn_info *fn_dup = kmemdup(fn, sizeof(*fn),
-			GFP_KERNEL);
+	struct gcov_fn_info *fn_dup = kmemdup(fn, sizeof(*fn), GFP_KERNEL);
+
 	if (!fn_dup)
 		return NULL;
 	INIT_LIST_HEAD(&fn_dup->head);
@@ -344,7 +344,7 @@ static struct gcov_fn_info *gcov_fn_info
 		goto err_name;
 
 	cv_size = fn->num_counters * sizeof(fn->counters[0]);
-	fn_dup->counters = vmalloc(cv_size);
+	fn_dup->counters = kvmalloc(cv_size, GFP_KERNEL);
 	if (!fn_dup->counters)
 		goto err_counters;
 	memcpy(fn_dup->counters, fn->counters, cv_size);
@@ -368,7 +368,7 @@ static struct gcov_fn_info *gcov_fn_info
 	INIT_LIST_HEAD(&fn_dup->head);
 
 	cv_size = fn->num_counters * sizeof(fn->counters[0]);
-	fn_dup->counters = vmalloc(cv_size);
+	fn_dup->counters = kvmalloc(cv_size, GFP_KERNEL);
 	if (!fn_dup->counters) {
 		kfree(fn_dup);
 		return NULL;
@@ -439,7 +439,7 @@ void gcov_info_free(struct gcov_info *in
 	struct gcov_fn_info *fn, *tmp;
 
 	list_for_each_entry_safe(fn, tmp, &info->functions, head) {
-		vfree(fn->counters);
+		kvfree(fn->counters);
 		list_del(&fn->head);
 		kfree(fn);
 	}
--- a/kernel/gcov/fs.c~gcov-use-kvmalloc
+++ a/kernel/gcov/fs.c
@@ -26,7 +26,7 @@
 #include <linux/slab.h>
 #include <linux/mutex.h>
 #include <linux/seq_file.h>
-#include <linux/vmalloc.h>
+#include <linux/mm.h>
 #include "gcov.h"
 
 /**
@@ -116,7 +116,7 @@ static struct gcov_iterator *gcov_iter_n
 	/* Dry-run to get the actual buffer size. */
 	size = convert_to_gcda(NULL, info);
 
-	iter = vmalloc(struct_size(iter, buffer, size));
+	iter = kvmalloc(struct_size(iter, buffer, size), GFP_KERNEL);
 	if (!iter)
 		return NULL;
 
@@ -134,7 +134,7 @@ static struct gcov_iterator *gcov_iter_n
  */
 static void gcov_iter_free(struct gcov_iterator *iter)
 {
-	vfree(iter);
+	kvfree(iter);
 }
 
 /**
--- a/kernel/gcov/gcc_4_7.c~gcov-use-kvmalloc
+++ a/kernel/gcov/gcc_4_7.c
@@ -15,7 +15,7 @@
 #include <linux/errno.h>
 #include <linux/slab.h>
 #include <linux/string.h>
-#include <linux/vmalloc.h>
+#include <linux/mm.h>
 #include "gcov.h"
 
 #if (__GNUC__ >= 10)
@@ -309,7 +309,7 @@ struct gcov_info *gcov_info_dup(struct g
 
 			cv_size = sizeof(gcov_type) * sci_ptr->num;
 
-			dci_ptr->values = vmalloc(cv_size);
+			dci_ptr->values = kvmalloc(cv_size, GFP_KERNEL);
 
 			if (!dci_ptr->values)
 				goto err_free;
@@ -351,7 +351,7 @@ void gcov_info_free(struct gcov_info *in
 		ci_ptr = info->functions[fi_idx]->ctrs;
 
 		for (ct_idx = 0; ct_idx < active; ct_idx++, ci_ptr++)
-			vfree(ci_ptr->values);
+			kvfree(ci_ptr->values);
 
 		kfree(info->functions[fi_idx]);
 	}
_

* [patch 55/91] gcov: clang: drop support for clang-10 and older
  2021-05-07  1:01 incoming Andrew Morton
                   ` (53 preceding siblings ...)
  2021-05-07  1:04 ` [patch 54/91] gcov: use kvmalloc() Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:04 ` [patch 56/91] smp: kernel/panic.c - silence warnings Andrew Morton
                   ` (36 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, johannes.berg, linux-mm, maskray, mm-commits, nathan,
	ndesaulniers, oberpar, psodagud, torvalds

From: Nick Desaulniers <ndesaulniers@google.com>
Subject: gcov: clang: drop support for clang-10 and older

LLVM changed the expected function signatures for llvm_gcda_start_file()
and llvm_gcda_emit_function() in the clang-11 release.  Drop the older
implementations and require folks to upgrade their compiler if they're
interested in GCOV support.

Link: https://reviews.llvm.org/rGcdd683b516d147925212724b09ec6fb792a40041
Link: https://reviews.llvm.org/rG13a633b438b6500ecad9e4f936ebadf3411d0f44
Link: https://lkml.kernel.org/r/20210312224132.3413602-3-ndesaulniers@google.com
Link: https://lkml.kernel.org/r/20210413183113.2977432-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Cc: Prasad Sodagudi <psodagud@quicinc.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/gcov/Kconfig |    1 
 kernel/gcov/clang.c |  103 ------------------------------------------
 2 files changed, 1 insertion(+), 103 deletions(-)

--- a/kernel/gcov/clang.c~gcov-clang-drop-support-for-clang-10-and-older
+++ a/kernel/gcov/clang.c
@@ -69,16 +69,10 @@ struct gcov_fn_info {
 
 	u32 ident;
 	u32 checksum;
-#if CONFIG_CLANG_VERSION < 110000
-	u8 use_extra_checksum;
-#endif
 	u32 cfg_checksum;
 
 	u32 num_counters;
 	u64 *counters;
-#if CONFIG_CLANG_VERSION < 110000
-	const char *function_name;
-#endif
 };
 
 static struct gcov_info *current_info;
@@ -108,16 +102,6 @@ void llvm_gcov_init(llvm_gcov_callback w
 }
 EXPORT_SYMBOL(llvm_gcov_init);
 
-#if CONFIG_CLANG_VERSION < 110000
-void llvm_gcda_start_file(const char *orig_filename, const char version[4],
-		u32 checksum)
-{
-	current_info->filename = orig_filename;
-	memcpy(&current_info->version, version, sizeof(current_info->version));
-	current_info->checksum = checksum;
-}
-EXPORT_SYMBOL(llvm_gcda_start_file);
-#else
 void llvm_gcda_start_file(const char *orig_filename, u32 version, u32 checksum)
 {
 	current_info->filename = orig_filename;
@@ -125,28 +109,7 @@ void llvm_gcda_start_file(const char *or
 	current_info->checksum = checksum;
 }
 EXPORT_SYMBOL(llvm_gcda_start_file);
-#endif
 
-#if CONFIG_CLANG_VERSION < 110000
-void llvm_gcda_emit_function(u32 ident, const char *function_name,
-		u32 func_checksum, u8 use_extra_checksum, u32 cfg_checksum)
-{
-	struct gcov_fn_info *info = kzalloc(sizeof(*info), GFP_KERNEL);
-
-	if (!info)
-		return;
-
-	INIT_LIST_HEAD(&info->head);
-	info->ident = ident;
-	info->checksum = func_checksum;
-	info->use_extra_checksum = use_extra_checksum;
-	info->cfg_checksum = cfg_checksum;
-	if (function_name)
-		info->function_name = kstrdup(function_name, GFP_KERNEL);
-
-	list_add_tail(&info->head, &current_info->functions);
-}
-#else
 void llvm_gcda_emit_function(u32 ident, u32 func_checksum, u32 cfg_checksum)
 {
 	struct gcov_fn_info *info = kzalloc(sizeof(*info), GFP_KERNEL);
@@ -160,7 +123,6 @@ void llvm_gcda_emit_function(u32 ident,
 	info->cfg_checksum = cfg_checksum;
 	list_add_tail(&info->head, &current_info->functions);
 }
-#endif
 EXPORT_SYMBOL(llvm_gcda_emit_function);
 
 void llvm_gcda_emit_arcs(u32 num_counters, u64 *counters)
@@ -291,16 +253,8 @@ int gcov_info_is_compatible(struct gcov_
 		!list_is_last(&fn_ptr2->head, &info2->functions)) {
 		if (fn_ptr1->checksum != fn_ptr2->checksum)
 			return false;
-#if CONFIG_CLANG_VERSION < 110000
-		if (fn_ptr1->use_extra_checksum != fn_ptr2->use_extra_checksum)
-			return false;
-		if (fn_ptr1->use_extra_checksum &&
-			fn_ptr1->cfg_checksum != fn_ptr2->cfg_checksum)
-			return false;
-#else
 		if (fn_ptr1->cfg_checksum != fn_ptr2->cfg_checksum)
 			return false;
-#endif
 		fn_ptr1 = list_next_entry(fn_ptr1, head);
 		fn_ptr2 = list_next_entry(fn_ptr2, head);
 	}
@@ -329,35 +283,6 @@ void gcov_info_add(struct gcov_info *dst
 	}
 }
 
-#if CONFIG_CLANG_VERSION < 110000
-static struct gcov_fn_info *gcov_fn_info_dup(struct gcov_fn_info *fn)
-{
-	size_t cv_size; /* counter values size */
-	struct gcov_fn_info *fn_dup = kmemdup(fn, sizeof(*fn), GFP_KERNEL);
-
-	if (!fn_dup)
-		return NULL;
-	INIT_LIST_HEAD(&fn_dup->head);
-
-	fn_dup->function_name = kstrdup(fn->function_name, GFP_KERNEL);
-	if (!fn_dup->function_name)
-		goto err_name;
-
-	cv_size = fn->num_counters * sizeof(fn->counters[0]);
-	fn_dup->counters = kvmalloc(cv_size, GFP_KERNEL);
-	if (!fn_dup->counters)
-		goto err_counters;
-	memcpy(fn_dup->counters, fn->counters, cv_size);
-
-	return fn_dup;
-
-err_counters:
-	kfree(fn_dup->function_name);
-err_name:
-	kfree(fn_dup);
-	return NULL;
-}
-#else
 static struct gcov_fn_info *gcov_fn_info_dup(struct gcov_fn_info *fn)
 {
 	size_t cv_size; /* counter values size */
@@ -378,7 +303,6 @@ static struct gcov_fn_info *gcov_fn_info
 
 	return fn_dup;
 }
-#endif
 
 /**
  * gcov_info_dup - duplicate profiling data set
@@ -419,21 +343,6 @@ err:
  * gcov_info_free - release memory for profiling data set duplicate
  * @info: profiling data set duplicate to free
  */
-#if CONFIG_CLANG_VERSION < 110000
-void gcov_info_free(struct gcov_info *info)
-{
-	struct gcov_fn_info *fn, *tmp;
-
-	list_for_each_entry_safe(fn, tmp, &info->functions, head) {
-		kfree(fn->function_name);
-		vfree(fn->counters);
-		list_del(&fn->head);
-		kfree(fn);
-	}
-	kfree(info->filename);
-	kfree(info);
-}
-#else
 void gcov_info_free(struct gcov_info *info)
 {
 	struct gcov_fn_info *fn, *tmp;
@@ -446,7 +355,6 @@ void gcov_info_free(struct gcov_info *in
 	kfree(info->filename);
 	kfree(info);
 }
-#endif
 
 /**
  * convert_to_gcda - convert profiling data set to gcda file format
@@ -469,21 +377,10 @@ size_t convert_to_gcda(char *buffer, str
 		u32 i;
 
 		pos += store_gcov_u32(buffer, pos, GCOV_TAG_FUNCTION);
-#if CONFIG_CLANG_VERSION < 110000
-		pos += store_gcov_u32(buffer, pos,
-			fi_ptr->use_extra_checksum ? 3 : 2);
-#else
 		pos += store_gcov_u32(buffer, pos, 3);
-#endif
 		pos += store_gcov_u32(buffer, pos, fi_ptr->ident);
 		pos += store_gcov_u32(buffer, pos, fi_ptr->checksum);
-#if CONFIG_CLANG_VERSION < 110000
-		if (fi_ptr->use_extra_checksum)
-			pos += store_gcov_u32(buffer, pos, fi_ptr->cfg_checksum);
-#else
 		pos += store_gcov_u32(buffer, pos, fi_ptr->cfg_checksum);
-#endif
-
 		pos += store_gcov_u32(buffer, pos, GCOV_TAG_COUNTER_BASE);
 		pos += store_gcov_u32(buffer, pos, fi_ptr->num_counters * 2);
 		for (i = 0; i < fi_ptr->num_counters; i++)
--- a/kernel/gcov/Kconfig~gcov-clang-drop-support-for-clang-10-and-older
+++ a/kernel/gcov/Kconfig
@@ -4,6 +4,7 @@ menu "GCOV-based kernel profiling"
 config GCOV_KERNEL
 	bool "Enable gcov-based kernel profiling"
 	depends on DEBUG_FS
+	depends on !CC_IS_CLANG || CLANG_VERSION >= 110000
 	select CONSTRUCTORS
 	default n
 	help
_


* [patch 56/91] smp: kernel/panic.c - silence warnings
  2021-05-07  1:01 incoming Andrew Morton
                   ` (54 preceding siblings ...)
  2021-05-07  1:04 ` [patch 55/91] gcov: clang: drop support for clang-10 and older Andrew Morton
@ 2021-05-07  1:04 ` Andrew Morton
  2021-05-07  1:05 ` [patch 57/91] delayacct: clear right task's flag after blkio completes Andrew Morton
                   ` (35 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:04 UTC (permalink / raw)
  To: akpm, heying24, hulkci, linux-mm, mm-commits, torvalds

From: He Ying <heying24@huawei.com>
Subject: smp: kernel/panic.c - silence warnings

We found these warnings in kernel/panic.c by using the sparse tool:
warning: symbol 'panic_smp_self_stop' was not declared.
warning: symbol 'nmi_panic_self_stop' was not declared.
warning: symbol 'crash_smp_send_stop' was not declared.

To avoid them, add declarations for these three functions in
include/linux/smp.h.

Link: https://lkml.kernel.org/r/20210316084150.75201-1-heying24@huawei.com
Signed-off-by: He Ying <heying24@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/smp.h |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/include/linux/smp.h~smp-kernel-panicc-silence-warnings
+++ a/include/linux/smp.h
@@ -56,6 +56,14 @@ void on_each_cpu_cond_mask(smp_cond_func
 int smp_call_function_single_async(int cpu, call_single_data_t *csd);
 
 /*
+ * Cpus stopping functions in panic. All have default weak definitions.
+ * Architecture-dependent code may override them.
+ */
+void panic_smp_self_stop(void);
+void nmi_panic_self_stop(struct pt_regs *regs);
+void crash_smp_send_stop(void);
+
+/*
  * Call a function on all processors
  */
 static inline void on_each_cpu(smp_call_func_t func, void *info, int wait)
_


* [patch 57/91] delayacct: clear right task's flag after blkio completes
  2021-05-07  1:01 incoming Andrew Morton
                   ` (55 preceding siblings ...)
  2021-05-07  1:04 ` [patch 56/91] smp: kernel/panic.c - silence warnings Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 58/91] gdb: lx-symbols: store the abspath() Andrew Morton
                   ` (34 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, axboe, joshs, laoar.shao, linux-mm, mingo, mm-commits, tj,
	torvalds

From: Yafang Shao <laoar.shao@gmail.com>
Subject: delayacct: clear right task's flag after blkio completes

When I was implementing a latency analysis tool using task->delays and
other things, I found an issue in delayacct: delayacct_blkio_end() should
clear the target's flag instead of current's.

Running git blame on delayacct shows that similar issues have already been
fixed in delayacct_blkio_end().
Commit c96f5471ce7d ("delayacct: Account blkio completion on the correct task")
fixed the issue that it should account blkio completion on the target
task instead of current.
Commit b512719f771a ("delayacct: fix crash in delayacct_blkio_end() after delayacct init failure")
fixed the issue that it should check the target task's delays instead of
the current task's.  It seems that delayacct_blkio_{start, end} are error
prone.  So I introduce a new parameter - the target task 'p' - into these
helpers.  After that change, the callsite will specifically set the right
task, which should make it less error prone.
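
The bug can be modelled outside the kernel.  The following is only a
sketch in Python, not kernel code: the Task class, the two blkio_end_*
functions and the flag value are assumptions made for illustration.  It
shows why clearing the flag on the waker ("current") leaves the flag set on
the task that actually waited.

```python
DELAYACCT_PF_BLKIO = 0x2  # illustrative value for this model only


class Task:
    """Hypothetical stand-in for task_struct; only carries delay flags."""

    def __init__(self, name):
        self.name = name
        self.flags = 0


def set_flag(p, flag):
    # Mirrors the new delayacct_set_flag(p, flag): acts on the task passed in.
    p.flags |= flag


def clear_flag(p, flag):
    # Mirrors the new delayacct_clear_flag(p, flag).
    p.flags &= ~flag


def blkio_end_buggy(current, p):
    # Old behaviour: the flag is cleared on whoever runs the completion.
    clear_flag(current, DELAYACCT_PF_BLKIO)


def blkio_end_fixed(current, p):
    # New behaviour: the flag is cleared on the task that waited for I/O.
    clear_flag(p, DELAYACCT_PF_BLKIO)


waiter = Task("waiter")
waker = Task("waker")

set_flag(waiter, DELAYACCT_PF_BLKIO)
blkio_end_buggy(current=waker, p=waiter)
stale = bool(waiter.flags & DELAYACCT_PF_BLKIO)  # flag is still set

blkio_end_fixed(current=waker, p=waiter)
cleared = waiter.flags == 0
```

Passing the task explicitly makes the choice visible at every callsite,
which is the point of the change.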

Link: https://lkml.kernel.org/r/20210414083720.24083-1-laoar.shao@gmail.com
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Josh Snyder <joshs@netflix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/delayacct.h |   20 ++++++++++----------
 mm/memory.c               |    8 ++++----
 2 files changed, 14 insertions(+), 14 deletions(-)

--- a/include/linux/delayacct.h~delayacct-clear-right-tasks-flag-after-blkio-completes
+++ a/include/linux/delayacct.h
@@ -82,16 +82,16 @@ static inline int delayacct_is_task_wait
 		return 0;
 }
 
-static inline void delayacct_set_flag(int flag)
+static inline void delayacct_set_flag(struct task_struct *p, int flag)
 {
-	if (current->delays)
-		current->delays->flags |= flag;
+	if (p->delays)
+		p->delays->flags |= flag;
 }
 
-static inline void delayacct_clear_flag(int flag)
+static inline void delayacct_clear_flag(struct task_struct *p, int flag)
 {
-	if (current->delays)
-		current->delays->flags &= ~flag;
+	if (p->delays)
+		p->delays->flags &= ~flag;
 }
 
 static inline void delayacct_tsk_init(struct task_struct *tsk)
@@ -114,7 +114,7 @@ static inline void delayacct_tsk_free(st
 
 static inline void delayacct_blkio_start(void)
 {
-	delayacct_set_flag(DELAYACCT_PF_BLKIO);
+	delayacct_set_flag(current, DELAYACCT_PF_BLKIO);
 	if (current->delays)
 		__delayacct_blkio_start();
 }
@@ -123,7 +123,7 @@ static inline void delayacct_blkio_end(s
 {
 	if (p->delays)
 		__delayacct_blkio_end(p);
-	delayacct_clear_flag(DELAYACCT_PF_BLKIO);
+	delayacct_clear_flag(p, DELAYACCT_PF_BLKIO);
 }
 
 static inline int delayacct_add_tsk(struct taskstats *d,
@@ -166,9 +166,9 @@ static inline void delayacct_thrashing_e
 }
 
 #else
-static inline void delayacct_set_flag(int flag)
+static inline void delayacct_set_flag(struct task_struct *p, int flag)
 {}
-static inline void delayacct_clear_flag(int flag)
+static inline void delayacct_clear_flag(struct task_struct *p, int flag)
 {}
 static inline void delayacct_init(void)
 {}
--- a/mm/memory.c~delayacct-clear-right-tasks-flag-after-blkio-completes
+++ a/mm/memory.c
@@ -3339,7 +3339,7 @@ vm_fault_t do_swap_page(struct vm_fault
 	}
 
 
-	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
+	delayacct_set_flag(current, DELAYACCT_PF_SWAPIN);
 	page = lookup_swap_cache(entry, vma, vmf->address);
 	swapcache = page;
 
@@ -3388,7 +3388,7 @@ vm_fault_t do_swap_page(struct vm_fault
 					vmf->address, &vmf->ptl);
 			if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
 				ret = VM_FAULT_OOM;
-			delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
+			delayacct_clear_flag(current, DELAYACCT_PF_SWAPIN);
 			goto unlock;
 		}
 
@@ -3402,13 +3402,13 @@ vm_fault_t do_swap_page(struct vm_fault
 		 * owner processes (which may be unknown at hwpoison time)
 		 */
 		ret = VM_FAULT_HWPOISON;
-		delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
+		delayacct_clear_flag(current, DELAYACCT_PF_SWAPIN);
 		goto out_release;
 	}
 
 	locked = lock_page_or_retry(page, vma->vm_mm, vmf->flags);
 
-	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
+	delayacct_clear_flag(current, DELAYACCT_PF_SWAPIN);
 	if (!locked) {
 		ret |= VM_FAULT_RETRY;
 		goto out_release;
_


* [patch 58/91] gdb: lx-symbols: store the abspath()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (56 preceding siblings ...)
  2021-05-07  1:05 ` [patch 57/91] delayacct: clear right task's flag after blkio completes Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 59/91] scripts/gdb: document lx_current is only supported by x86 Andrew Morton
                   ` (33 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, jan.kiszka, johannes.berg, kbingham, linux-mm, mm-commits,
	torvalds

From: Johannes Berg <johannes.berg@intel.com>
Subject: gdb: lx-symbols: store the abspath()

If we store the relative path, the user might later cd to a different
directory, and that would break the automatic symbol resolving that
happens when a module is loaded into the target kernel.  Fix this by
storing the abspath() of each path given, just like we already do for the
cwd (os.getcwd() is absolute.)
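
The effect can be reproduced without gdb.  This is a minimal sketch; the
helper names normalize_module_paths() and demo() are hypothetical and do not
exist in scripts/gdb.  It stores an absolute path, changes directory, and
shows that re-resolving the same relative name now gives a different result.

```python
import os
import tempfile


def normalize_module_paths(arg):
    """Split a whitespace-separated argument into absolute paths.

    Hypothetical helper mirroring the list comprehension in the patch.
    """
    return [os.path.abspath(os.path.expanduser(p)) for p in arg.split()]


def demo():
    """Return (path stored before chdir, same name resolved after chdir)."""
    start_dir = os.getcwd()
    with tempfile.TemporaryDirectory() as first, \
            tempfile.TemporaryDirectory() as second:
        try:
            os.chdir(first)
            stored = normalize_module_paths("modules")[0]
            os.chdir(second)
            # A relative path kept as-is would now resolve to this instead.
            reresolved = os.path.abspath("modules")
        finally:
            os.chdir(start_dir)
    return stored, reresolved


stored, reresolved = demo()
```

The stored value stays valid after the user changes directory, whereas the
bare relative name points somewhere else.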

Link: https://lkml.kernel.org/r/20201217091747.bf4332cf2b35.I10ebbdb7e9b80ab1a5cddebf53d073be8232d656@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/gdb/linux/symbols.py |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/scripts/gdb/linux/symbols.py~gdb-lx-symbols-store-the-abspath
+++ a/scripts/gdb/linux/symbols.py
@@ -164,7 +164,8 @@ lx-symbols command."""
             saved_state['breakpoint'].enabled = saved_state['enabled']
 
     def invoke(self, arg, from_tty):
-        self.module_paths = [os.path.expanduser(p) for p in arg.split()]
+        self.module_paths = [os.path.abspath(os.path.expanduser(p))
+                             for p in arg.split()]
         self.module_paths.append(os.getcwd())
 
         # enforce update
_


* [patch 59/91] scripts/gdb: document lx_current is only supported by x86
  2021-05-07  1:01 incoming Andrew Morton
                   ` (57 preceding siblings ...)
  2021-05-07  1:05 ` [patch 58/91] gdb: lx-symbols: store the abspath() Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 60/91] scripts/gdb: add lx_current support for arm64 Andrew Morton
                   ` (32 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, corbet, jan.kiszka, kbingham, linux-mm, mm-commits,
	song.bao.hua, torvalds

From: Barry Song <song.bao.hua@hisilicon.com>
Subject: scripts/gdb: document lx_current is only supported by x86

Patch series "scripts/gdb: clarify the platforms supporting lx_current and add arm64 support", v2.

lx_current depends on the per_cpu current_task variable, which exists on
x86 only, so it actually works on x86 only.  The 1st patch documents this
clearly; the 2nd patch adds support for arm64.


This patch (of 2):

x86 is the only architecture which has per_cpu current_task:
arch$ git grep current_task | grep -i per_cpu
x86/include/asm/current.h:DECLARE_PER_CPU(struct task_struct *, current_task);
x86/kernel/cpu/common.c:DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned =
x86/kernel/cpu/common.c:EXPORT_PER_CPU_SYMBOL(current_task);
x86/kernel/cpu/common.c:DEFINE_PER_CPU(struct task_struct *, current_task) = &init_task;
x86/kernel/cpu/common.c:EXPORT_PER_CPU_SYMBOL(current_task);
x86/kernel/smpboot.c:	per_cpu(current_task, cpu) = idle;

On other architectures, lx_current() will lead to a Python exception:
(gdb) p $lx_current().pid
Python Exception <class 'gdb.error'> No symbol "current_task" in current context.:
Error occurred in Python: No symbol "current_task" in current context.

To avoid more people struggling and wasting time on other architectures,
document it.

Link: https://lkml.kernel.org/r/20210314203444.15188-1-song.bao.hua@hisilicon.com
Link: https://lkml.kernel.org/r/20210314203444.15188-2-song.bao.hua@hisilicon.com
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/dev-tools/gdb-kernel-debugging.rst |    2 +-
 scripts/gdb/linux/cpus.py                        |   10 ++++++++--
 2 files changed, 9 insertions(+), 3 deletions(-)

--- a/Documentation/dev-tools/gdb-kernel-debugging.rst~scripts-gdb-document-lx_current-is-only-supported-by-x86
+++ a/Documentation/dev-tools/gdb-kernel-debugging.rst
@@ -114,7 +114,7 @@ Examples of using the Linux-provided gdb
     [     0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
     ....
 
-- Examine fields of the current task struct::
+- Examine fields of the current task struct(supported by x86 only)::
 
     (gdb) p $lx_current().pid
     $1 = 4998
--- a/scripts/gdb/linux/cpus.py~scripts-gdb-document-lx_current-is-only-supported-by-x86
+++ a/scripts/gdb/linux/cpus.py
@@ -156,6 +156,13 @@ Note that VAR has to be quoted as string
 
 PerCpu()
 
+def get_current_task(cpu):
+    if utils.is_target_arch("x86"):
+         var_ptr = gdb.parse_and_eval("&current_task")
+         return per_cpu(var_ptr, cpu).dereference()
+    else:
+        raise gdb.GdbError("Sorry, obtaining the current task is not yet "
+                           "supported with this arch")
 
 class LxCurrentFunc(gdb.Function):
     """Return current task.
@@ -167,8 +174,7 @@ number. If CPU is omitted, the CPU of th
         super(LxCurrentFunc, self).__init__("lx_current")
 
     def invoke(self, cpu=-1):
-        var_ptr = gdb.parse_and_eval("&current_task")
-        return per_cpu(var_ptr, cpu).dereference()
+        return get_current_task(cpu)
 
 
 LxCurrentFunc()
_


* [patch 60/91] scripts/gdb: add lx_current support for arm64
  2021-05-07  1:01 incoming Andrew Morton
                   ` (58 preceding siblings ...)
  2021-05-07  1:05 ` [patch 59/91] scripts/gdb: document lx_current is only supported by x86 Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 61/91] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources Andrew Morton
                   ` (31 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, corbet, jan.kiszka, kbingham, linux-mm, mm-commits,
	song.bao.hua, torvalds

From: Barry Song <song.bao.hua@hisilicon.com>
Subject: scripts/gdb: add lx_current support for arm64

arm64 uses SP_EL0 to save the current task_struct address.  While running
in EL0, SP_EL0 is clobbered by userspace.  So if the upper bit is not 1
(not TTBR1), the current address is invalid.  This patch checks the upper
bit of SP_EL0: if the upper bit is 1, lx_current() on arm64 will return
the dereference of the current task.  Otherwise, lx_current() will tell
users they are running in userspace (EL0).

While arm64 is running in EL0, it is actually pointless to print current
task as the memory of kernel space is not accessible in EL0.
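
The upper-bit test can be tried on plain integers.  This is a sketch only:
is_kernel_address() is a hypothetical name, and the two sample addresses
are made up for illustration rather than taken from a real target.

```python
KERNEL_BIT = 63  # top bit of a 64-bit virtual address
MASK_64 = 0xFFFFFFFFFFFFFFFF


def is_kernel_address(addr):
    """Return True if bit 63 of a 64-bit address is set.

    Hypothetical helper; the patch applies the same shift to the value
    read from $SP_EL0.
    """
    return ((addr & MASK_64) >> KERNEL_BIT) != 0


# Made-up sample values: one with the top bit set, one without.
kernel_like = 0xFFFF000012345678
user_like = 0x0000FFFFABCD0000

in_kernel = is_kernel_address(kernel_like)
in_user = is_kernel_address(user_like)
```

Only when the check succeeds is it meaningful to cast the value to a
task_struct pointer and dereference it.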

Link: https://lkml.kernel.org/r/20210314203444.15188-3-song.bao.hua@hisilicon.com
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kieran Bingham <kbingham@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/dev-tools/gdb-kernel-debugging.rst |    2 +-
 scripts/gdb/linux/cpus.py                        |   13 +++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

--- a/Documentation/dev-tools/gdb-kernel-debugging.rst~scripts-gdb-add-lx_current-support-for-arm64
+++ a/Documentation/dev-tools/gdb-kernel-debugging.rst
@@ -114,7 +114,7 @@ Examples of using the Linux-provided gdb
     [     0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
     ....
 
-- Examine fields of the current task struct(supported by x86 only)::
+- Examine fields of the current task struct(supported by x86 and arm64 only)::
 
     (gdb) p $lx_current().pid
     $1 = 4998
--- a/scripts/gdb/linux/cpus.py~scripts-gdb-add-lx_current-support-for-arm64
+++ a/scripts/gdb/linux/cpus.py
@@ -16,6 +16,9 @@ import gdb
 from linux import tasks, utils
 
 
+task_type = utils.CachedType("struct task_struct")
+
+
 MAX_CPUS = 4096
 
 
@@ -157,9 +160,19 @@ Note that VAR has to be quoted as string
 PerCpu()
 
 def get_current_task(cpu):
+    task_ptr_type = task_type.get_type().pointer()
+
     if utils.is_target_arch("x86"):
          var_ptr = gdb.parse_and_eval("&current_task")
          return per_cpu(var_ptr, cpu).dereference()
+    elif utils.is_target_arch("aarch64"):
+         current_task_addr = gdb.parse_and_eval("$SP_EL0")
+         if((current_task_addr >> 63) != 0):
+             current_task = current_task_addr.cast(task_ptr_type)
+             return current_task.dereference()
+         else:
+             raise gdb.GdbError("Sorry, obtaining the current task is not allowed "
+                                "while running in userspace(EL0)")
     else:
         raise gdb.GdbError("Sorry, obtaining the current task is not yet "
                            "supported with this arch")
_


* [patch 61/91] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
  2021-05-07  1:01 incoming Andrew Morton
                   ` (59 preceding siblings ...)
  2021-05-07  1:05 ` [patch 60/91] scripts/gdb: add lx_current support for arm64 Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 62/91] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources Andrew Morton
                   ` (30 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, bhe, bp, brijesh.singh, cai,
	dan.j.williams, daniel.vetter, dave.hansen, david, dyoung,
	ebiederm, gregkh, hpa, keith.busch, linux-mm, mchehab+huawei,
	mhocko, mingo, mm-commits, osalvador, tglx, thomas.lendacky,
	torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources

Patch series "kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree", v2.

Playing with kdump+virtio-mem I noticed that kexec_file_load() does not
consider System RAM added via dax/kmem and virtio-mem when preparing the
elf header for kdump.  Looking into the details, the logic used in
walk_system_ram_res() and walk_mem_res() seems to be outdated.

walk_system_ram_range() already does the right thing, so let's change
walk_system_ram_res() and walk_mem_res(), and clean up.

Loading a kdump kernel via "kexec -p -s" ...  will now result in the kdump
kernel also dumping dax/kmem and virtio-mem added System RAM.

Note: kexec-tools on x86-64 also has to be updated to consider this
memory in the kexec_load() case when processing /proc/iomem.


This patch (of 3):

It used to be true that we can have system RAM (IORESOURCE_SYSTEM_RAM |
IORESOURCE_BUSY) only on the first level in the resource tree.  However,
this no longer holds for driver-managed system RAM (i.e., added via
dax/kmem and virtio-mem), which gets added on lower levels, for example,
inside device containers.

We have two users of walk_system_ram_res(), which currently only
considers the first level:

a) kernel/kexec_file.c:kexec_walk_resources() -- We properly skip
   IORESOURCE_SYSRAM_DRIVER_MANAGED resources via
   locate_mem_hole_callback(), so even after this change, we won't be
   placing kexec images onto dax/kmem and virtio-mem added memory.  No
   change.

b) arch/x86/kernel/crash.c:fill_up_crash_elf_data() -- we're currently
   not adding relevant ranges to the crash elf header, resulting in them
   not getting dumped via kdump.

This change fixes loading a crashkernel via kexec_file_load() and
including dax/kmem and virtio-mem added System RAM in the crashdump on
x86-64.  Note that, e.g., arm64 relies on memblock data and, therefore,
always considers all added System RAM already.

Let's find all IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY resources, making
the function behave like walk_system_ram_range().
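
The difference between the two walks can be sketched on a toy tree.  This
is a simplified Python model, not the kernel implementation: the Resource
class, the flag values and the resource names are assumptions, and the
sibling_only switch is reduced to a plain boolean for the whole walk.

```python
SYSRAM, BUSY = 0x1, 0x2  # illustrative flag values, not the kernel's


class Resource:
    """Toy resource node linked through parent/child/sibling pointers."""

    def __init__(self, name, flags=0, parent=None):
        self.name, self.flags = name, flags
        self.parent, self.child, self.sibling = parent, None, None
        if parent is not None:
            if parent.child is None:
                parent.child = self
            else:
                last = parent.child
                while last.sibling is not None:
                    last = last.sibling
                last.sibling = self


def next_resource(p, sibling_only):
    if sibling_only:
        return p.sibling      # stay on the current level
    if p.child is not None:
        return p.child        # descend first
    while p.sibling is None and p.parent is not None:
        p = p.parent          # climb until a sibling exists
    return p.sibling


def walk(root, flags, sibling_only):
    """Collect names of resources that have all bits in flags set."""
    found = []
    p = root.child
    while p is not None:
        if (p.flags & flags) == flags:
            found.append(p.name)
        p = next_resource(p, sibling_only)
    return found


root = Resource("iomem")
Resource("System RAM", SYSRAM | BUSY, root)
container = Resource("virtio0", 0, root)
Resource("System RAM (virtio_mem)", SYSRAM | BUSY, container)

first_level = walk(root, SYSRAM | BUSY, sibling_only=True)
whole_tree = walk(root, SYSRAM | BUSY, sibling_only=False)
```

In this model the first-level walk misses the RAM that sits inside the
device container, while the whole-tree walk reports both ranges.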

Link: https://lkml.kernel.org/r/20210325115326.7826-1-david@redhat.com
Link: https://lkml.kernel.org/r/20210325115326.7826-2-david@redhat.com
Fixes: ebf71552bb0e ("virtio-mem: Add parent resource for all added "System RAM"")
Fixes: c221c0b0308f ("device-dax: "Hotplug" persistent memory for use like normal RAM")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/resource.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/resource.c~kernel-resource-make-walk_system_ram_res-find-all-busy-ioresource_system_ram-resources
+++ a/kernel/resource.c
@@ -457,7 +457,7 @@ int walk_system_ram_res(u64 start, u64 e
 {
 	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
 				     arg, func);
 }
 
_


* [patch 62/91] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources
  2021-05-07  1:01 incoming Andrew Morton
                   ` (60 preceding siblings ...)
  2021-05-07  1:05 ` [patch 61/91] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 63/91] kernel/resource: remove first_lvl / siblings_only logic Andrew Morton
                   ` (29 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, bhe, bp, brijesh.singh, cai,
	dan.j.williams, daniel.vetter, dave.hansen, david, dyoung,
	ebiederm, gregkh, hpa, keith.busch, linux-mm, mchehab+huawei,
	mhocko, mingo, mm-commits, osalvador, tglx, thomas.lendacky,
	torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources

It used to be true that we can have system RAM (IORESOURCE_SYSTEM_RAM |
IORESOURCE_BUSY) only on the first level in the resource tree.  However,
this no longer holds for driver-managed system RAM (i.e., added via
dax/kmem and virtio-mem), which gets added on lower levels, for example,
inside device containers.

IORESOURCE_SYSTEM_RAM is defined as IORESOURCE_MEM | IORESOURCE_SYSRAM and
is just a special type of IORESOURCE_MEM.

The function walk_mem_res() only considers the first level and is used in
arch/x86/mm/ioremap.c:__ioremap_check_mem() only.  We currently fail to
identify System RAM added by dax/kmem and virtio-mem as
"IORES_MAP_SYSTEM_RAM", for example, allowing for remapping of such
"normal RAM" in __ioremap_caller().

Let's find all IORESOURCE_MEM | IORESOURCE_BUSY resources, making the
function behave similarly to walk_system_ram_res().

Link: https://lkml.kernel.org/r/20210325115326.7826-3-david@redhat.com
Fixes: ebf71552bb0e ("virtio-mem: Add parent resource for all added "System RAM"")
Fixes: c221c0b0308f ("device-dax: "Hotplug" persistent memory for use like normal RAM")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/resource.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/resource.c~kernel-resource-make-walk_mem_res-find-all-busy-ioresource_mem-resources
+++ a/kernel/resource.c
@@ -470,7 +470,7 @@ int walk_mem_res(u64 start, u64 end, voi
 {
 	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
 				     arg, func);
 }
 
_


* [patch 63/91] kernel/resource: remove first_lvl / siblings_only logic
  2021-05-07  1:01 incoming Andrew Morton
                   ` (61 preceding siblings ...)
  2021-05-07  1:05 ` [patch 62/91] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 64/91] kernel/resource: allow region_intersects users to hold resource_lock Andrew Morton
                   ` (28 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, bhe, bp, brijesh.singh, cai,
	dan.j.williams, daniel.vetter, dave.hansen, david, dyoung,
	ebiederm, gregkh, hpa, keith.busch, linux-mm, mchehab+huawei,
	mhocko, mingo, mm-commits, osalvador, tglx, thomas.lendacky,
	torvalds, vgoyal

From: David Hildenbrand <david@redhat.com>
Subject: kernel/resource: remove first_lvl / siblings_only logic

All functions that search for IORESOURCE_SYSTEM_RAM or IORESOURCE_MEM
resources now properly consider the whole resource tree, not just the
first level.  Let's drop the unused first_lvl / siblings_only logic.

Remove documentation that indicates that some functions behave differently;
all consider the full resource tree now.

Link: https://lkml.kernel.org/r/20210325115326.7826-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/resource.c |   45 +++++++++++---------------------------------
 1 file changed, 12 insertions(+), 33 deletions(-)

--- a/kernel/resource.c~kernel-resource-remove-first_lvl-siblings_only-logic
+++ a/kernel/resource.c
@@ -64,12 +64,8 @@ static DEFINE_RWLOCK(resource_lock);
 static struct resource *bootmem_resource_free;
 static DEFINE_SPINLOCK(bootmem_resource_lock);
 
-static struct resource *next_resource(struct resource *p, bool sibling_only)
+static struct resource *next_resource(struct resource *p)
 {
-	/* Caller wants to traverse through siblings only */
-	if (sibling_only)
-		return p->sibling;
-
 	if (p->child)
 		return p->child;
 	while (!p->sibling && p->parent)
@@ -81,7 +77,7 @@ static void *r_next(struct seq_file *m,
 {
 	struct resource *p = v;
 	(*pos)++;
-	return (void *)next_resource(p, false);
+	return (void *)next_resource(p);
 }
 
 #ifdef CONFIG_PROC_FS
@@ -330,14 +326,10 @@ EXPORT_SYMBOL(release_resource);
  * of the resource that's within [@start..@end]; if none is found, returns
  * -ENODEV.  Returns -EINVAL for invalid parameters.
  *
- * This function walks the whole tree and not just first level children
- * unless @first_lvl is true.
- *
  * @start:	start address of the resource searched for
  * @end:	end address of same resource
  * @flags:	flags which the resource must have
  * @desc:	descriptor the resource must have
- * @first_lvl:	walk only the first level children, if set
  * @res:	return ptr, if resource found
  *
  * The caller must specify @start, @end, @flags, and @desc
@@ -345,9 +337,8 @@ EXPORT_SYMBOL(release_resource);
  */
 static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 			       unsigned long flags, unsigned long desc,
-			       bool first_lvl, struct resource *res)
+			       struct resource *res)
 {
-	bool siblings_only = true;
 	struct resource *p;
 
 	if (!res)
@@ -358,7 +349,7 @@ static int find_next_iomem_res(resource_
 
 	read_lock(&resource_lock);
 
-	for (p = iomem_resource.child; p; p = next_resource(p, siblings_only)) {
+	for (p = iomem_resource.child; p; p = next_resource(p)) {
 		/* If we passed the resource we are looking for, stop */
 		if (p->start > end) {
 			p = NULL;
@@ -369,13 +360,6 @@ static int find_next_iomem_res(resource_
 		if (p->end < start)
 			continue;
 
-		/*
-		 * Now that we found a range that matches what we look for,
-		 * check the flags and the descriptor. If we were not asked to
-		 * use only the first level, start looking at children as well.
-		 */
-		siblings_only = first_lvl;
-
 		if ((p->flags & flags) != flags)
 			continue;
 		if ((desc != IORES_DESC_NONE) && (desc != p->desc))
@@ -402,14 +386,14 @@ static int find_next_iomem_res(resource_
 
 static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
 				 unsigned long flags, unsigned long desc,
-				 bool first_lvl, void *arg,
+				 void *arg,
 				 int (*func)(struct resource *, void *))
 {
 	struct resource res;
 	int ret = -EINVAL;
 
 	while (start < end &&
-	       !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
+	       !find_next_iomem_res(start, end, flags, desc, &res)) {
 		ret = (*func)(&res, arg);
 		if (ret)
 			break;
@@ -431,7 +415,6 @@ static int __walk_iomem_res_desc(resourc
  * @arg: function argument for the callback @func
  * @func: callback function that is called for each qualifying resource area
  *
- * This walks through whole tree and not just first level children.
  * All the memory ranges which overlap start,end and also match flags and
  * desc are valid candidates.
  *
@@ -441,7 +424,7 @@ static int __walk_iomem_res_desc(resourc
 int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
 		u64 end, void *arg, int (*func)(struct resource *, void *))
 {
-	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
+	return __walk_iomem_res_desc(start, end, flags, desc, arg, func);
 }
 EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
 
@@ -457,8 +440,8 @@ int walk_system_ram_res(u64 start, u64 e
 {
 	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
-				     arg, func);
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
+				     func);
 }
 
 /*
@@ -470,17 +453,14 @@ int walk_mem_res(u64 start, u64 end, voi
 {
 	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, false,
-				     arg, func);
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
+				     func);
 }
 
 /*
  * This function calls the @func callback against all memory ranges of type
  * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY.
  * It is to be used only for System RAM.
- *
- * This will find System RAM ranges that are children of top-level resources
- * in addition to top-level System RAM resources.
  */
 int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 			  void *arg, int (*func)(unsigned long, unsigned long, void *))
@@ -495,8 +475,7 @@ int walk_system_ram_range(unsigned long
 	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
 	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	while (start < end &&
-	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
-				    false, &res)) {
+	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE, &res)) {
 		pfn = PFN_UP(res.start);
 		end_pfn = PFN_DOWN(res.end + 1);
 		if (end_pfn > pfn)
_


* [patch 64/91] kernel/resource: allow region_intersects users to hold resource_lock
  2021-05-07  1:01 incoming Andrew Morton
                   ` (62 preceding siblings ...)
  2021-05-07  1:05 ` [patch 63/91] kernel/resource: remove first_lvl / siblings_only logic Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 65/91] kernel/resource: refactor __request_region to allow external locking Andrew Morton
                   ` (27 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, apopple, bsingharora, dan.j.williams, daniel.vetter, david,
	gregkh, jglisse, jhubbard, linux-mm, mm-commits, smuchun,
	torvalds

From: Alistair Popple <apopple@nvidia.com>
Subject: kernel/resource: allow region_intersects users to hold resource_lock

Introduce a version of region_intersects() that can be called with the
resource_lock already held. This is used in a future fix to
__request_free_mem_region().

[akpm@linux-foundation.org: make __region_intersects static]
Link: https://lkml.kernel.org/r/20210419070109.4780-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Muchun Song <smuchun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/resource.c |   52 ++++++++++++++++++++++++++------------------
 1 file changed, 31 insertions(+), 21 deletions(-)

--- a/kernel/resource.c~kernel-resource-allow-region_intersects-users-to-hold-resource_lock
+++ a/kernel/resource.c
@@ -502,6 +502,34 @@ int __weak page_is_ram(unsigned long pfn
 }
 EXPORT_SYMBOL_GPL(page_is_ram);
 
+static int __region_intersects(resource_size_t start, size_t size,
+			unsigned long flags, unsigned long desc)
+{
+	struct resource res;
+	int type = 0; int other = 0;
+	struct resource *p;
+
+	res.start = start;
+	res.end = start + size - 1;
+
+	for (p = iomem_resource.child; p ; p = p->sibling) {
+		bool is_type = (((p->flags & flags) == flags) &&
+				((desc == IORES_DESC_NONE) ||
+				 (desc == p->desc)));
+
+		if (resource_overlaps(p, &res))
+			is_type ? type++ : other++;
+	}
+
+	if (type == 0)
+		return REGION_DISJOINT;
+
+	if (other == 0)
+		return REGION_INTERSECTS;
+
+	return REGION_MIXED;
+}
+
 /**
  * region_intersects() - determine intersection of region with known resources
  * @start: region start address
@@ -525,31 +553,13 @@ EXPORT_SYMBOL_GPL(page_is_ram);
 int region_intersects(resource_size_t start, size_t size, unsigned long flags,
 		      unsigned long desc)
 {
-	struct resource res;
-	int type = 0; int other = 0;
-	struct resource *p;
-
-	res.start = start;
-	res.end = start + size - 1;
+	int ret;
 
 	read_lock(&resource_lock);
-	for (p = iomem_resource.child; p ; p = p->sibling) {
-		bool is_type = (((p->flags & flags) == flags) &&
-				((desc == IORES_DESC_NONE) ||
-				 (desc == p->desc)));
-
-		if (resource_overlaps(p, &res))
-			is_type ? type++ : other++;
-	}
+	ret = __region_intersects(start, size, flags, desc);
 	read_unlock(&resource_lock);
 
-	if (type == 0)
-		return REGION_DISJOINT;
-
-	if (other == 0)
-		return REGION_INTERSECTS;
-
-	return REGION_MIXED;
+	return ret;
 }
 EXPORT_SYMBOL_GPL(region_intersects);
 
_


* [patch 65/91] kernel/resource: refactor __request_region to allow external locking
  2021-05-07  1:01 incoming Andrew Morton
                   ` (63 preceding siblings ...)
  2021-05-07  1:05 ` [patch 64/91] kernel/resource: allow region_intersects users to hold resource_lock Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 66/91] kernel/resource: fix locking in request_free_mem_region Andrew Morton
                   ` (26 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, apopple, bsingharora, dan.j.williams, daniel.vetter, david,
	gregkh, jglisse, jhubbard, linux-mm, mm-commits, smuchun,
	torvalds

From: Alistair Popple <apopple@nvidia.com>
Subject: kernel/resource: refactor __request_region to allow external locking

Refactor the portion of __request_region() done whilst holding the
resource_lock into a separate function to allow callers to hold the lock.

Link: https://lkml.kernel.org/r/20210419070109.4780-2-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Muchun Song <smuchun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/resource.c |   52 +++++++++++++++++++++++++++-----------------
 1 file changed, 32 insertions(+), 20 deletions(-)

--- a/kernel/resource.c~kernel-resource-refactor-__request_region-to-allow-external-locking
+++ a/kernel/resource.c
@@ -1160,31 +1160,16 @@ struct address_space *iomem_get_mapping(
 	return smp_load_acquire(&iomem_inode)->i_mapping;
 }
 
-/**
- * __request_region - create a new busy resource region
- * @parent: parent resource descriptor
- * @start: resource start address
- * @n: resource region size
- * @name: reserving caller's ID string
- * @flags: IO resource flags
- */
-struct resource * __request_region(struct resource *parent,
+static int __request_region_locked(struct resource *res, struct resource *parent,
 				   resource_size_t start, resource_size_t n,
 				   const char *name, int flags)
 {
 	DECLARE_WAITQUEUE(wait, current);
-	struct resource *res = alloc_resource(GFP_KERNEL);
-	struct resource *orig_parent = parent;
-
-	if (!res)
-		return NULL;
 
 	res->name = name;
 	res->start = start;
 	res->end = start + n - 1;
 
-	write_lock(&resource_lock);
-
 	for (;;) {
 		struct resource *conflict;
 
@@ -1220,13 +1205,40 @@ struct resource * __request_region(struc
 			continue;
 		}
 		/* Uhhuh, that didn't work out.. */
-		free_resource(res);
-		res = NULL;
-		break;
+		return -EBUSY;
 	}
+
+	return 0;
+}
+
+/**
+ * __request_region - create a new busy resource region
+ * @parent: parent resource descriptor
+ * @start: resource start address
+ * @n: resource region size
+ * @name: reserving caller's ID string
+ * @flags: IO resource flags
+ */
+struct resource *__request_region(struct resource *parent,
+				  resource_size_t start, resource_size_t n,
+				  const char *name, int flags)
+{
+	struct resource *res = alloc_resource(GFP_KERNEL);
+	int ret;
+
+	if (!res)
+		return NULL;
+
+	write_lock(&resource_lock);
+	ret = __request_region_locked(res, parent, start, n, name, flags);
 	write_unlock(&resource_lock);
 
-	if (res && orig_parent == &iomem_resource)
+	if (ret) {
+		free_resource(res);
+		return NULL;
+	}
+
+	if (parent == &iomem_resource)
 		revoke_iomem(res);
 
 	return res;
_


* [patch 66/91] kernel/resource: fix locking in request_free_mem_region
  2021-05-07  1:01 incoming Andrew Morton
                   ` (64 preceding siblings ...)
  2021-05-07  1:05 ` [patch 65/91] kernel/resource: refactor __request_region to allow external locking Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 67/91] selftests: remove duplicate include Andrew Morton
                   ` (25 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, apopple, bsingharora, dan.j.williams, daniel.vetter, david,
	gregkh, jglisse, jhubbard, linux-mm, mm-commits, smuchun,
	torvalds

From: Alistair Popple <apopple@nvidia.com>
Subject: kernel/resource: fix locking in request_free_mem_region

request_free_mem_region() is used to find an empty range of physical
addresses for hotplugging ZONE_DEVICE memory.  It does this by iterating
over the range of possible addresses using region_intersects() to see if
the range is free before calling request_mem_region() to allocate the
region.

However, the resource_lock is dropped between these two calls, meaning
that by the time request_mem_region() is called in
request_free_mem_region(), another thread may have already reserved the
requested region.  This results in unexpected failures and a message in
the kernel log from hitting this condition:

        /*
         * mm/hmm.c reserves physical addresses which then
         * become unavailable to other users.  Conflicts are
         * not expected.  Warn to aid debugging if encountered.
         */
        if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
                pr_warn("Unaddressable device %s %pR conflicts with %pR",
                        conflict->name, conflict, res);

These unexpected failures can be corrected by holding resource_lock across
the two calls.  This also requires memory allocation to be performed prior
to taking the lock.
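
The shape of the fix (allocate first, then check and reserve under a single
lock hold) can be sketched in userspace.  This is only an illustration
under stated assumptions: a pthread mutex stands in for resource_lock, and
struct region, region_busy_locked() and reserve_free_region() are
hypothetical names, not the kernel functions touched by this patch.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical reserved-region list; bounds are inclusive. */
struct region {
	unsigned long start, end;
	struct region *next;
};

static pthread_mutex_t region_lock = PTHREAD_MUTEX_INITIALIZER;
static struct region *regions;

/* Caller must hold region_lock. */
static int region_busy_locked(unsigned long start, unsigned long end)
{
	struct region *p;

	for (p = regions; p; p = p->next)
		if (p->start <= end && p->end >= start)
			return 1;
	return 0;
}

/*
 * Search downwards from 'limit' (exclusive) in steps of 'size' for a
 * free slot and reserve it.  The check and the insertion happen under
 * one lock hold, so no other thread can claim the slot in between.
 */
struct region *reserve_free_region(unsigned long limit, unsigned long size)
{
	/* Allocate before taking the lock, as the changelog describes. */
	struct region *res = malloc(sizeof(*res));
	unsigned long addr;

	if (!res || !size || limit < size) {
		free(res);
		return NULL;
	}

	pthread_mutex_lock(&region_lock);
	for (addr = limit - size; ; addr -= size) {
		if (!region_busy_locked(addr, addr + size - 1)) {
			res->start = addr;
			res->end = addr + size - 1;
			res->next = regions;
			regions = res;
			pthread_mutex_unlock(&region_lock);
			return res;
		}
		if (addr < size)	/* next step would underflow */
			break;
	}
	pthread_mutex_unlock(&region_lock);

	free(res);
	return NULL;
}
```

With the lock dropped between the busy check and the insertion, two
threads could both see the same slot as free; holding it across both steps
removes that window.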

Link: https://lkml.kernel.org/r/20210419070109.4780-3-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Muchun Song <smuchun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/resource.c |   45 +++++++++++++++++++++++++++++++++++++-------
 1 file changed, 38 insertions(+), 7 deletions(-)

--- a/kernel/resource.c~kernel-resource-fix-locking-in-request_free_mem_region
+++ a/kernel/resource.c
@@ -1780,25 +1780,56 @@ static struct resource *__request_free_m
 {
 	resource_size_t end, addr;
 	struct resource *res;
+	struct region_devres *dr = NULL;
 
 	size = ALIGN(size, 1UL << PA_SECTION_SHIFT);
 	end = min_t(unsigned long, base->end, (1UL << MAX_PHYSMEM_BITS) - 1);
 	addr = end - size + 1UL;
 
+	res = alloc_resource(GFP_KERNEL);
+	if (!res)
+		return ERR_PTR(-ENOMEM);
+
+	if (dev) {
+		dr = devres_alloc(devm_region_release,
+				sizeof(struct region_devres), GFP_KERNEL);
+		if (!dr) {
+			free_resource(res);
+			return ERR_PTR(-ENOMEM);
+		}
+	}
+
+	write_lock(&resource_lock);
 	for (; addr > size && addr >= base->start; addr -= size) {
-		if (region_intersects(addr, size, 0, IORES_DESC_NONE) !=
+		if (__region_intersects(addr, size, 0, IORES_DESC_NONE) !=
 				REGION_DISJOINT)
 			continue;
 
-		if (dev)
-			res = devm_request_mem_region(dev, addr, size, name);
-		else
-			res = request_mem_region(addr, size, name);
-		if (!res)
-			return ERR_PTR(-ENOMEM);
+		if (__request_region_locked(res, &iomem_resource, addr, size,
+						name, 0))
+			break;
+
+		if (dev) {
+			dr->parent = &iomem_resource;
+			dr->start = addr;
+			dr->n = size;
+			devres_add(dev, dr);
+		}
+
 		res->desc = IORES_DESC_DEVICE_PRIVATE_MEMORY;
+		write_unlock(&resource_lock);
+
+		/*
+		 * A driver is claiming this region so revoke any mappings.
+		 */
+		revoke_iomem(res);
 		return res;
 	}
+	write_unlock(&resource_lock);
+
+	free_resource(res);
+	if (dr)
+		devres_free(dr);
 
 	return ERR_PTR(-ERANGE);
 }
_


* [patch 67/91] selftests: remove duplicate include
  2021-05-07  1:01 incoming Andrew Morton
                   ` (65 preceding siblings ...)
  2021-05-07  1:05 ` [patch 66/91] kernel/resource: fix locking in request_free_mem_region Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 68/91] kernel/async.c: stop guarding pr_debug() statements Andrew Morton
                   ` (24 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, pbonzini, shuah, torvalds, zhang.yunkai

From: Zhang Yunkai <zhang.yunkai@zte.com.cn>
Subject: selftests: remove duplicate include

'assert.h' included in 'sparsebit.c' is duplicated.
It is also included on the 161st line.
'string.h' included in 'mincore_selftest.c' is duplicated.
It is also included on the 15th line.
'sched.h' included in 'tlbie_test.c' is duplicated.
It is also included on the 33rd line.

Link: https://lkml.kernel.org/r/20210316073336.426255-1-zhang.yunkai@zte.com.cn
Signed-off-by: Zhang Yunkai <zhang.yunkai@zte.com.cn>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/kvm/lib/sparsebit.c        |    1 -
 tools/testing/selftests/mincore/mincore_selftest.c |    1 -
 tools/testing/selftests/powerpc/mm/tlbie_test.c    |    1 -
 3 files changed, 3 deletions(-)

--- a/tools/testing/selftests/kvm/lib/sparsebit.c~selftests-remove-duplicate-include
+++ a/tools/testing/selftests/kvm/lib/sparsebit.c
@@ -1890,7 +1890,6 @@ void sparsebit_validate_internal(struct
  */
 
 #include <stdlib.h>
-#include <assert.h>
 
 struct range {
 	sparsebit_idx_t first, last;
--- a/tools/testing/selftests/mincore/mincore_selftest.c~selftests-remove-duplicate-include
+++ a/tools/testing/selftests/mincore/mincore_selftest.c
@@ -14,7 +14,6 @@
 #include <sys/mman.h>
 #include <string.h>
 #include <fcntl.h>
-#include <string.h>
 
 #include "../kselftest.h"
 #include "../kselftest_harness.h"
--- a/tools/testing/selftests/powerpc/mm/tlbie_test.c~selftests-remove-duplicate-include
+++ a/tools/testing/selftests/powerpc/mm/tlbie_test.c
@@ -33,7 +33,6 @@
 #include <sched.h>
 #include <time.h>
 #include <stdarg.h>
-#include <sched.h>
 #include <pthread.h>
 #include <signal.h>
 #include <sys/prctl.h>
_


* [patch 68/91] kernel/async.c: stop guarding pr_debug() statements
  2021-05-07  1:01 incoming Andrew Morton
                   ` (66 preceding siblings ...)
  2021-05-07  1:05 ` [patch 67/91] selftests: remove duplicate include Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 69/91] kernel/async.c: remove async_unregister_domain() Andrew Morton
                   ` (23 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, linux-mm, linux, mm-commits, tj, torvalds

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Subject: kernel/async.c: stop guarding pr_debug() statements

It's currently nigh impossible to get these pr_debug()s to print
something.  Being guarded by initcall_debug means one has to enable tons
of other debug output during boot, and the system_state condition further
means it's impossible to get them when loading modules later.

Also, the compiler can't know that these global conditions do not change,
so there are W=2 warnings:

kernel/async.c:125:9: warning: `calltime' may be used uninitialized in this function [-Wmaybe-uninitialized]
kernel/async.c:300:9: warning: `starttime' may be used uninitialized in this function [-Wmaybe-uninitialized]

Make it possible, for a DYNAMIC_DEBUG kernel, to get these to print their
messages by booting with appropriate 'dyndbg="file async.c +p"' command
line argument.  For a non-DYNAMIC_DEBUG kernel, pr_debug() compiles to
nothing.

This does cost doing an unconditional ktime_get() for the starttime value,
but the corresponding ktime_get for the end time can be elided by
factoring it into a function which only gets called if the printk()
arguments end up being evaluated.
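
The trick in the last paragraph relies on the debug macro evaluating its
arguments only when the message is actually emitted.  A small userspace
sketch of the idea, assuming a plain boolean in place of the dynamic debug
machinery (debug_enabled, clock_reads, debug_print(), timed_call() and
sample_work() are hypothetical names used only for illustration):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

bool debug_enabled;	/* stand-in for a per-callsite debug switch */
int clock_reads;	/* counts how often the clock is sampled */

static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	clock_reads++;
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * ns >> 10 divides by 1024 rather than 1000, so the result is a cheap
 * approximation of microseconds that reads roughly 2% low.
 */
static long long microseconds_since(long long start_ns)
{
	return (now_ns() - start_ns) >> 10;
}

/* The arguments are evaluated only when debug_enabled is true. */
#define debug_print(...)					\
	do {							\
		if (debug_enabled)				\
			fprintf(stderr, __VA_ARGS__);		\
	} while (0)

void sample_work(void)
{
}

void timed_call(void (*fn)(void))
{
	long long start = now_ns();	/* unconditional start sample */

	fn();
	/* The second clock read happens only if the message is printed. */
	debug_print("returned after %lld usecs\n",
		    microseconds_since(start));
}
```

With the switch off, each timed_call() costs one clock read; with it on,
two.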

Link: https://lkml.kernel.org/r/20210309151723.1907838-1-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/async.c |   48 +++++++++++++++++++----------------------------
 1 file changed, 20 insertions(+), 28 deletions(-)

--- a/kernel/async.c~kernel-asyncc-stop-guarding-pr_debug-statements
+++ a/kernel/async.c
@@ -78,6 +78,12 @@ static DECLARE_WAIT_QUEUE_HEAD(async_don
 
 static atomic_t entry_count;
 
+static long long microseconds_since(ktime_t start)
+{
+	ktime_t now = ktime_get();
+	return ktime_to_ns(ktime_sub(now, start)) >> 10;
+}
+
 static async_cookie_t lowest_in_progress(struct async_domain *domain)
 {
 	struct async_entry *first = NULL;
@@ -111,24 +117,18 @@ static void async_run_entry_fn(struct wo
 	struct async_entry *entry =
 		container_of(work, struct async_entry, work);
 	unsigned long flags;
-	ktime_t calltime, delta, rettime;
+	ktime_t calltime;
 
 	/* 1) run (and print duration) */
-	if (initcall_debug && system_state < SYSTEM_RUNNING) {
-		pr_debug("calling  %lli_%pS @ %i\n",
-			(long long)entry->cookie,
-			entry->func, task_pid_nr(current));
-		calltime = ktime_get();
-	}
+	pr_debug("calling  %lli_%pS @ %i\n", (long long)entry->cookie,
+		 entry->func, task_pid_nr(current));
+	calltime = ktime_get();
+
 	entry->func(entry->data, entry->cookie);
-	if (initcall_debug && system_state < SYSTEM_RUNNING) {
-		rettime = ktime_get();
-		delta = ktime_sub(rettime, calltime);
-		pr_debug("initcall %lli_%pS returned after %lld usecs\n",
-			(long long)entry->cookie,
-			entry->func,
-			(long long)ktime_to_ns(delta) >> 10);
-	}
+
+	pr_debug("initcall %lli_%pS returned after %lld usecs\n",
+		 (long long)entry->cookie, entry->func,
+		 microseconds_since(calltime));
 
 	/* 2) remove self from the pending queues */
 	spin_lock_irqsave(&async_lock, flags);
@@ -287,23 +287,15 @@ EXPORT_SYMBOL_GPL(async_synchronize_full
  */
 void async_synchronize_cookie_domain(async_cookie_t cookie, struct async_domain *domain)
 {
-	ktime_t starttime, delta, endtime;
+	ktime_t starttime;
 
-	if (initcall_debug && system_state < SYSTEM_RUNNING) {
-		pr_debug("async_waiting @ %i\n", task_pid_nr(current));
-		starttime = ktime_get();
-	}
+	pr_debug("async_waiting @ %i\n", task_pid_nr(current));
+	starttime = ktime_get();
 
 	wait_event(async_done, lowest_in_progress(domain) >= cookie);
 
-	if (initcall_debug && system_state < SYSTEM_RUNNING) {
-		endtime = ktime_get();
-		delta = ktime_sub(endtime, starttime);
-
-		pr_debug("async_continuing @ %i after %lli usec\n",
-			task_pid_nr(current),
-			(long long)ktime_to_ns(delta) >> 10);
-	}
+	pr_debug("async_continuing @ %i after %lli usec\n", task_pid_nr(current),
+		 microseconds_since(starttime));
 }
 EXPORT_SYMBOL_GPL(async_synchronize_cookie_domain);
 
_


* [patch 69/91] kernel/async.c: remove async_unregister_domain()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (67 preceding siblings ...)
  2021-05-07  1:05 ` [patch 68/91] kernel/async.c: stop guarding pr_debug() statements Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 70/91] init/initramfs.c: do unpacking asynchronously Andrew Morton
                   ` (22 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, linux-mm, linux, mm-commits, tj, torvalds

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Subject: kernel/async.c: remove async_unregister_domain()

No callers in the tree.

Link: https://lkml.kernel.org/r/20210309151723.1907838-2-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/async.h |    1 -
 kernel/async.c        |   18 ------------------
 2 files changed, 19 deletions(-)

--- a/include/linux/async.h~kernel-asyncc-remove-async_unregister_domain
+++ a/include/linux/async.h
@@ -112,7 +112,6 @@ async_schedule_dev_domain(async_func_t f
 	return async_schedule_node_domain(func, dev, dev_to_node(dev), domain);
 }
 
-void async_unregister_domain(struct async_domain *domain);
 extern void async_synchronize_full(void);
 extern void async_synchronize_full_domain(struct async_domain *domain);
 extern void async_synchronize_cookie(async_cookie_t cookie);
--- a/kernel/async.c~kernel-asyncc-remove-async_unregister_domain
+++ a/kernel/async.c
@@ -246,24 +246,6 @@ void async_synchronize_full(void)
 EXPORT_SYMBOL_GPL(async_synchronize_full);
 
 /**
- * async_unregister_domain - ensure no more anonymous waiters on this domain
- * @domain: idle domain to flush out of any async_synchronize_full instances
- *
- * async_synchronize_{cookie|full}_domain() are not flushed since callers
- * of these routines should know the lifetime of @domain
- *
- * Prefer ASYNC_DOMAIN_EXCLUSIVE() declarations over flushing
- */
-void async_unregister_domain(struct async_domain *domain)
-{
-	spin_lock_irq(&async_lock);
-	WARN_ON(!domain->registered || !list_empty(&domain->pending));
-	domain->registered = 0;
-	spin_unlock_irq(&async_lock);
-}
-EXPORT_SYMBOL_GPL(async_unregister_domain);
-
-/**
  * async_synchronize_full_domain - synchronize all asynchronous function within a certain domain
  * @domain: the domain to synchronize
  *
_


* [patch 70/91] init/initramfs.c: do unpacking asynchronously
  2021-05-07  1:01 incoming Andrew Morton
                   ` (68 preceding siblings ...)
  2021-05-07  1:05 ` [patch 69/91] kernel/async.c: remove async_unregister_domain() Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 71/91] modules: add CONFIG_MODPROBE_PATH Andrew Morton
                   ` (21 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, bp, corbet, gregkh, jeyu, linux-mm, linux, mcgrof,
	mm-commits, ndesaulniers, tiwai, torvalds

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Subject: init/initramfs.c: do unpacking asynchronously

Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3.

These two patches are independent, but better together.

The second is a rather trivial patch that simply allows the developer to
change "/sbin/modprobe" to something else - e.g.  the empty string, so
that all request_module() calls during early boot return -ENOENT early,
without even spawning a usermode helper, which would needlessly
synchronize with the initramfs unpacking.

The first patch delegates decompressing the initramfs to a worker thread,
allowing do_initcalls() in main.c to proceed to the device_ and late_
initcalls without waiting for that decompression (and populating of
rootfs) to finish.  Obviously, some of those later calls may rely on the
initramfs being available, so I've added synchronization points in the
firmware loader and usermodehelper paths - there might be other places
that would need this, but so far no one has been able to think of any
places I have missed.

There's not much to win if most of the functionality needed during boot is
only available as modules.  But systems with a custom-made .config and
initramfs can boot faster, partly due to utilizing more than one cpu
earlier, partly by avoiding known-futile modprobe calls (which would still
trigger synchronization with the initramfs unpacking, thus eliminating
most of the first benefit).


This patch (of 2):

Most of the boot process doesn't actually need anything from the
initramfs, until of course PID1 is to be executed.  So instead of doing
the decompressing and populating of the initramfs synchronously in
populate_rootfs() itself, push that off to a worker thread.

This is primarily motivated by an embedded ppc target, where unpacking
even the rather modest sized initramfs takes 0.6 seconds, which is long
enough that the external watchdog becomes unhappy that it doesn't get
attention soon enough.  By doing the initramfs decompression in a worker
thread, we get to do the device_initcalls and hence start petting the
watchdog much sooner.

Normal desktops might benefit as well.  On my mostly stock Ubuntu kernel,
my initramfs is a 26M xz-compressed blob, decompressing to around 126M. 
That takes almost two seconds:

[    0.201454] Trying to unpack rootfs image as initramfs...
[    1.976633] Freeing initrd memory: 29416K

Before this patch, these lines occur consecutively in dmesg.  With this
patch, the timestamps on these two lines are roughly the same as above,
but with 172 lines in between - so more than one cpu has been kept busy
doing work that would otherwise only happen after populate_rootfs()
finished.

Should one of the initcalls done after rootfs_initcall time (i.e., device_
and late_ initcalls) need something from the initramfs (say, a kernel
module or a firmware blob), it will simply wait for the initramfs
unpacking to be done before proceeding, which should in theory make this
completely safe.

But if some driver pokes around in the filesystem directly and not via one
of the official kernel interfaces (i.e.  request_firmware*(),
call_usermodehelper*) that theory may not hold - also, I certainly might
have missed a spot when sprinkling wait_for_initramfs().  So there is an
escape hatch in the form of an initramfs_async= command line parameter.
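
The overall scheme (start the unpacking in the background, block only at
the points that need its result, and keep a synchronous fallback) can be
sketched in userspace with a pthread.  This is an analogy under stated
assumptions, not the kernel's async API: start_unpack(), wait_for_unpack(),
do_unpack() and files_unpacked are hypothetical names, and the waiter is
assumed to be called from a single thread.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

int files_unpacked;		/* result produced by the worker */

static pthread_t unpack_thread;
static bool unpack_started;	/* true while a worker is outstanding */

static void *do_unpack(void *unused)
{
	(void)unused;
	files_unpacked = 3;	/* pretend to populate the filesystem */
	return NULL;
}

/*
 * async == true:  hand the work to a background thread and return.
 * async == false: historical behaviour, finish before returning.
 */
int start_unpack(bool async)
{
	if (!async) {
		do_unpack(NULL);
		return 0;
	}
	if (pthread_create(&unpack_thread, NULL, do_unpack, NULL))
		return -1;
	unpack_started = true;
	return 0;
}

/* Synchronization point for code that needs the unpacked files. */
void wait_for_unpack(void)
{
	if (!unpack_started)
		return;		/* nothing outstanding */
	pthread_join(unpack_thread, NULL);
	unpack_started = false;
}
```

Code that reads files_unpacked calls wait_for_unpack() first; everything
else proceeds without waiting, which is where the boot-time win described
above would come from.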

Link: https://lkml.kernel.org/r/20210313212528.2956377-1-linux@rasmusvillemoes.dk
Link: https://lkml.kernel.org/r/20210313212528.2956377-2-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/kernel-parameters.txt |   12 ++++
 drivers/base/firmware_loader/main.c             |    2 
 include/linux/initrd.h                          |    2 
 init/initramfs.c                                |   38 +++++++++++++-
 init/main.c                                     |    1 
 kernel/umh.c                                    |    2 
 6 files changed, 56 insertions(+), 1 deletion(-)

--- a/Documentation/admin-guide/kernel-parameters.txt~init-initramfsc-do-unpacking-asynchronously
+++ a/Documentation/admin-guide/kernel-parameters.txt
@@ -1839,6 +1839,18 @@
 			initcall functions.  Useful for debugging built-in
 			modules and initcalls.
 
+	initramfs_async= [KNL]
+			Format: <bool>
+			Default: 1
+			This parameter controls whether the initramfs
+			image is unpacked asynchronously, concurrently
+			with devices being probed and
+			initialized. This should normally just work,
+			but as a debugging aid, one can get the
+			historical behaviour of the initramfs
+			unpacking being completed before device_ and
+			late_ initcalls.
+
 	initrd=		[BOOT] Specify the location of the initial ramdisk
 
 	initrdmem=	[KNL] Specify a physical address and size from which to
--- a/drivers/base/firmware_loader/main.c~init-initramfsc-do-unpacking-asynchronously
+++ a/drivers/base/firmware_loader/main.c
@@ -15,6 +15,7 @@
 #include <linux/kernel_read_file.h>
 #include <linux/module.h>
 #include <linux/init.h>
+#include <linux/initrd.h>
 #include <linux/timer.h>
 #include <linux/vmalloc.h>
 #include <linux/interrupt.h>
@@ -504,6 +505,7 @@ fw_get_filesystem_firmware(struct device
 	if (!path)
 		return -ENOMEM;
 
+	wait_for_initramfs();
 	for (i = 0; i < ARRAY_SIZE(fw_path); i++) {
 		size_t file_size = 0;
 		size_t *file_size_ptr = NULL;
--- a/include/linux/initrd.h~init-initramfsc-do-unpacking-asynchronously
+++ a/include/linux/initrd.h
@@ -20,8 +20,10 @@ extern void free_initrd_mem(unsigned lon
 
 #ifdef CONFIG_BLK_DEV_INITRD
 extern void __init reserve_initrd_mem(void);
+extern void wait_for_initramfs(void);
 #else
 static inline void __init reserve_initrd_mem(void) {}
+static inline void wait_for_initramfs(void) {}
 #endif
 
 extern phys_addr_t phys_initrd_start;
--- a/init/initramfs.c~init-initramfsc-do-unpacking-asynchronously
+++ a/init/initramfs.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/init.h>
+#include <linux/async.h>
 #include <linux/fs.h>
 #include <linux/slab.h>
 #include <linux/types.h>
@@ -541,6 +542,14 @@ static int __init keepinitrd_setup(char
 __setup("keepinitrd", keepinitrd_setup);
 #endif
 
+static bool __initdata initramfs_async = true;
+static int __init initramfs_async_setup(char *str)
+{
+	strtobool(str, &initramfs_async);
+	return 1;
+}
+__setup("initramfs_async=", initramfs_async_setup);
+
 extern char __initramfs_start[];
 extern unsigned long __initramfs_size;
 #include <linux/initrd.h>
@@ -658,7 +667,7 @@ static void __init populate_initrd_image
 }
 #endif /* CONFIG_BLK_DEV_RAM */
 
-static int __init populate_rootfs(void)
+static void __init do_populate_rootfs(void *unused, async_cookie_t cookie)
 {
 	/* Load the built in initramfs */
 	char *err = unpack_to_rootfs(__initramfs_start, __initramfs_size);
@@ -693,6 +702,33 @@ done:
 	initrd_end = 0;
 
 	flush_delayed_fput();
+}
+
+static ASYNC_DOMAIN_EXCLUSIVE(initramfs_domain);
+static async_cookie_t initramfs_cookie;
+
+void wait_for_initramfs(void)
+{
+	if (!initramfs_cookie) {
+		/*
+		 * Something before rootfs_initcall wants to access
+		 * the filesystem/initramfs. Probably a bug. Make a
+		 * note, avoid deadlocking the machine, and let the
+		 * caller's access fail as it used to.
+		 */
+		pr_warn_once("wait_for_initramfs() called before rootfs_initcalls\n");
+		return;
+	}
+	async_synchronize_cookie_domain(initramfs_cookie + 1, &initramfs_domain);
+}
+EXPORT_SYMBOL_GPL(wait_for_initramfs);
+
+static int __init populate_rootfs(void)
+{
+	initramfs_cookie = async_schedule_domain(do_populate_rootfs, NULL,
+						 &initramfs_domain);
+	if (!initramfs_async)
+		wait_for_initramfs();
 	return 0;
 }
 rootfs_initcall(populate_rootfs);
--- a/init/main.c~init-initramfsc-do-unpacking-asynchronously
+++ a/init/main.c
@@ -1561,6 +1561,7 @@ static noinline void __init kernel_init_
 
 	kunit_run_all_tests();
 
+	wait_for_initramfs();
 	console_on_rootfs();
 
 	/*
--- a/kernel/umh.c~init-initramfsc-do-unpacking-asynchronously
+++ a/kernel/umh.c
@@ -27,6 +27,7 @@
 #include <linux/ptrace.h>
 #include <linux/async.h>
 #include <linux/uaccess.h>
+#include <linux/initrd.h>
 
 #include <trace/events/module.h>
 
@@ -107,6 +108,7 @@ static int call_usermodehelper_exec_asyn
 
 	commit_creds(new);
 
+	wait_for_initramfs();
 	retval = kernel_execve(sub_info->path,
 			       (const char *const *)sub_info->argv,
 			       (const char *const *)sub_info->envp);
_

* [patch 71/91] modules: add CONFIG_MODPROBE_PATH
  2021-05-07  1:01 incoming Andrew Morton
                   ` (69 preceding siblings ...)
  2021-05-07  1:05 ` [patch 70/91] init/initramfs.c: do unpacking asynchronously Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 72/91] ipc/sem.c: mundane typo fixes Andrew Morton
                   ` (20 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, bp, corbet, gregkh, jeyu, linux-mm, linux, mcgrof,
	mm-commits, ndesaulniers, tiwai, torvalds

From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Subject: modules: add CONFIG_MODPROBE_PATH

Allow the developer to specify the initial value of the modprobe_path[]
string.  This can be used to set it to the empty string initially, thus
effectively disabling request_module() during early boot until userspace
writes a new value via the /proc/sys/kernel/modprobe interface.  [1]

When building a custom kernel (often for an embedded target), it's normal
to build everything into the kernel that is needed for booting, and indeed
the initramfs often contains no modules at all, so every such
request_module() done before userspace init has mounted the real rootfs is
a waste of time.

This is particularly useful when combined with the previous patch, which
made the initramfs unpacking asynchronous - for that to work, it had to
make any usermodehelper call wait for the unpacking to finish before
attempting to invoke the userspace helper.  By eliminating all such
(known-to-be-futile) calls of usermodehelper, the initramfs unpacking and
the {device,late}_initcalls can proceed in parallel for much longer.

For a relatively slow ppc board I'm working on, the two patches combined
lead to 0.2s faster boot - but more importantly, the fact that the
initramfs unpacking proceeds completely in the background while devices
get probed means I get to handle the gpio watchdog in time without getting
reset.

[1] __request_module() already has an early -ENOENT return when
modprobe_path is the empty string.
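
As a rough user-space illustration of [1] (a sketch, not the kernel code
itself): the model below shows how an empty modprobe_path short-circuits a
module request with -ENOENT until a path is written, as userspace would do
via /proc/sys/kernel/modprobe.  The model_* function names are made up for
this example; only modprobe_path, KMOD_PATH_LEN and the -ENOENT return are
taken from the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Assumed to match the kernel's KMOD_PATH_LEN. */
#define KMOD_PATH_LEN 256

/* Stand-in for the kernel's modprobe_path[]; "" models CONFIG_MODPROBE_PATH="". */
char modprobe_path[KMOD_PATH_LEN] = "";

/*
 * Hypothetical model of the early exit in __request_module(): with no
 * helper path configured, fail fast instead of spawning a usermode helper.
 */
int model_request_module(const char *name)
{
	(void)name;		/* the module name is irrelevant to the early exit */

	if (!modprobe_path[0])
		return -ENOENT;

	return 0;		/* the real code would go on to run the helper */
}

/* Hypothetical model of userspace writing /proc/sys/kernel/modprobe. */
void model_set_modprobe_path(const char *path)
{
	assert(path != NULL);
	snprintf(modprobe_path, sizeof(modprobe_path), "%s", path);
}
```

In this model, model_request_module() keeps returning -ENOENT until
model_set_modprobe_path("/sbin/modprobe") has been called, which mirrors
the "disabled during early boot, enabled later by userspace" usage
described above.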

Link: https://lkml.kernel.org/r/20210313212528.2956377-3-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 init/Kconfig  |   12 ++++++++++++
 kernel/kmod.c |    2 +-
 2 files changed, 13 insertions(+), 1 deletion(-)

--- a/init/Kconfig~modules-add-config_modprobe_path
+++ a/init/Kconfig
@@ -2299,6 +2299,18 @@ config MODULE_ALLOW_MISSING_NAMESPACE_IM
 
 	  If unsure, say N.
 
+config MODPROBE_PATH
+	string "Path to modprobe binary"
+	default "/sbin/modprobe"
+	help
+	  When kernel code requests a module, it does so by calling
+	  the "modprobe" userspace utility. This option allows you to
+	  set the path where that binary is found. This can be changed
+	  at runtime via the sysctl file
+	  /proc/sys/kernel/modprobe. Setting this to the empty string
+	  removes the kernel's ability to request modules (but
+	  userspace can still load modules explicitly).
+
 config TRIM_UNUSED_KSYMS
 	bool "Trim unused exported kernel symbols" if EXPERT
 	depends on !COMPILE_TEST
--- a/kernel/kmod.c~modules-add-config_modprobe_path
+++ a/kernel/kmod.c
@@ -58,7 +58,7 @@ static DECLARE_WAIT_QUEUE_HEAD(kmod_wq);
 /*
 	modprobe_path is set via /proc/sys.
 */
-char modprobe_path[KMOD_PATH_LEN] = "/sbin/modprobe";
+char modprobe_path[KMOD_PATH_LEN] = CONFIG_MODPROBE_PATH;
 
 static void free_modprobe_argv(struct subprocess_info *info)
 {
_

* [patch 72/91] ipc/sem.c: mundane typo fixes
  2021-05-07  1:01 incoming Andrew Morton
                   ` (70 preceding siblings ...)
  2021-05-07  1:05 ` [patch 71/91] modules: add CONFIG_MODPROBE_PATH Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 73/91] mm: fix some typos and code style problems Andrew Morton
                   ` (19 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, rdunlap, torvalds, unixbhaskar

From: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Subject: ipc/sem.c: mundane typo fixes

s/runtine/runtime/
s/AQUIRE/ACQUIRE/
s/seperately/separately/
s/wont/won\'t/
s/succesfull/successful/

Link: https://lkml.kernel.org/r/20210326022240.26375-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 ipc/sem.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/ipc/sem.c~ipc-semc-mundane-typo-fixes
+++ a/ipc/sem.c
@@ -36,7 +36,7 @@
  * - two Linux specific semctl() commands: SEM_STAT, SEM_INFO.
  * - undo adjustments at process exit are limited to 0..SEMVMX.
  * - namespace are supported.
- * - SEMMSL, SEMMNS, SEMOPM and SEMMNI can be configured at runtine by writing
+ * - SEMMSL, SEMMNS, SEMOPM and SEMMNI can be configured at runtime by writing
  *   to /proc/sys/kernel/sem.
  * - statistics about the usage are reported in /proc/sysvipc/sem.
  *
@@ -224,7 +224,7 @@ static int sysvipc_sem_proc_show(struct
  * Setting it to a result code is a RELEASE, this is ensured by both a
  * smp_store_release() (for case a) and while holding sem_lock()
  * (for case b).
- * The AQUIRE when reading the result code without holding sem_lock() is
+ * The ACQUIRE when reading the result code without holding sem_lock() is
  * achieved by using READ_ONCE() + smp_acquire__after_ctrl_dep().
  * (case a above).
  * Reading the result code while holding sem_lock() needs no further barriers,
@@ -821,7 +821,7 @@ static inline int check_restart(struct s
 
 	/* It is impossible that someone waits for the new value:
 	 * - complex operations always restart.
-	 * - wait-for-zero are handled seperately.
+	 * - wait-for-zero are handled separately.
 	 * - q is a previously sleeping simple operation that
 	 *   altered the array. It must be a decrement, because
 	 *   simple increments never sleep.
@@ -1046,7 +1046,7 @@ static void do_smart_update(struct sem_a
 			 * - No complex ops, thus all sleeping ops are
 			 *   decrease.
 			 * - if we decreased the value, then any sleeping
-			 *   semaphore ops wont be able to run: If the
+			 *   semaphore ops won't be able to run: If the
 			 *   previous value was too small, then the new
 			 *   value will be too small, too.
 			 */
@@ -2108,7 +2108,7 @@ static long do_semtimedop(int semid, str
 	queue.dupsop = dupsop;
 
 	error = perform_atomic_semop(sma, &queue);
-	if (error == 0) { /* non-blocking succesfull path */
+	if (error == 0) { /* non-blocking successful path */
 		DEFINE_WAKE_Q(wake_q);
 
 		/*
_

* [patch 73/91] mm: fix some typos and code style problems
  2021-05-07  1:01 incoming Andrew Morton
                   ` (71 preceding siblings ...)
  2021-05-07  1:05 ` [patch 72/91] ipc/sem.c: mundane typo fixes Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:05 ` [patch 74/91] drivers/char: remove /dev/kmem for good Andrew Morton
                   ` (18 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, linmiaohe, linux-mm, luoshijie1, mm-commits, torvalds

From: Shijie Luo <luoshijie1@huawei.com>
Subject: mm: fix some typos and code style problems

Fix some typos and code style problems in mm.

gfp.h: s/MAXNODES/MAX_NUMNODES
mmzone.h: s/then/than
rmap.c: s/__vma_split()/__vma_adjust()
swap.c: s/__mod_zone_page_stat/__mod_zone_page_state, s/is is/is
swap_state.c: s/whoes/whose
z3fold.c: code style problem fix in z3fold_unregister_migration
zsmalloc.c: s/of/or, s/give/given

Link: https://lkml.kernel.org/r/20210419083057.64820-1-luoshijie1@huawei.com
Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/gfp.h    |    2 +-
 include/linux/mmzone.h |    2 +-
 mm/rmap.c              |    2 +-
 mm/swap.c              |    4 ++--
 mm/swap_state.c        |    2 +-
 mm/z3fold.c            |    2 +-
 mm/zsmalloc.c          |    4 ++--
 7 files changed, 9 insertions(+), 9 deletions(-)

--- a/include/linux/gfp.h~mm-fix-some-typos-and-code-style-problems
+++ a/include/linux/gfp.h
@@ -490,7 +490,7 @@ static inline int gfp_zonelist(gfp_t fla
 
 /*
  * We get the zone list from the current node and the gfp_mask.
- * This zone list contains a maximum of MAXNODES*MAX_NR_ZONES zones.
+ * This zone list contains a maximum of MAX_NUMNODES*MAX_NR_ZONES zones.
  * There are two zonelists per node, one for all zones with memory and
  * one containing just zones from the node the zonelist belongs to.
  *
--- a/include/linux/mmzone.h~mm-fix-some-typos-and-code-style-problems
+++ a/include/linux/mmzone.h
@@ -55,7 +55,7 @@ enum migratetype {
 	 * pageblocks to MIGRATE_CMA which can be done by
 	 * __free_pageblock_cma() function.  What is important though
 	 * is that a range of pageblocks must be aligned to
-	 * MAX_ORDER_NR_PAGES should biggest page be bigger then
+	 * MAX_ORDER_NR_PAGES should biggest page be bigger than
 	 * a single pageblock.
 	 */
 	MIGRATE_CMA,
--- a/mm/rmap.c~mm-fix-some-typos-and-code-style-problems
+++ a/mm/rmap.c
@@ -257,7 +257,7 @@ static inline void unlock_anon_vma_root(
  * Attach the anon_vmas from src to dst.
  * Returns 0 on success, -ENOMEM on failure.
  *
- * anon_vma_clone() is called by __vma_split(), __split_vma(), copy_vma() and
+ * anon_vma_clone() is called by __vma_adjust(), __split_vma(), copy_vma() and
  * anon_vma_fork(). The first three want an exact copy of src, while the last
  * one, anon_vma_fork(), may try to reuse an existing anon_vma to prevent
  * endless growth of anon_vma. Since dst->anon_vma is set to NULL before call,
--- a/mm/swap.c~mm-fix-some-typos-and-code-style-problems
+++ a/mm/swap.c
@@ -496,7 +496,7 @@ void lru_cache_add_inactive_or_unevictab
 	if (unlikely(unevictable) && !TestSetPageMlocked(page)) {
 		int nr_pages = thp_nr_pages(page);
 		/*
-		 * We use the irq-unsafe __mod_zone_page_stat because this
+		 * We use the irq-unsafe __mod_zone_page_state because this
 		 * counter is not modified from interrupt context, and the pte
 		 * lock is held(spinlock), which implies preemption disabled.
 		 */
@@ -808,7 +808,7 @@ inline void __lru_add_drain_all(bool for
 	 * below which drains the page vectors.
 	 *
 	 * Let x, y, and z represent some system CPU numbers, where x < y < z.
-	 * Assume CPU #z is is in the middle of the for_each_online_cpu loop
+	 * Assume CPU #z is in the middle of the for_each_online_cpu loop
 	 * below and has already reached CPU #y's per-cpu data. CPU #x comes
 	 * along, adds some pages to its per-cpu vectors, then calls
 	 * lru_add_drain_all().
--- a/mm/swap_state.c~mm-fix-some-typos-and-code-style-problems
+++ a/mm/swap_state.c
@@ -792,7 +792,7 @@ static void swap_ra_info(struct vm_fault
  *
  * Returns the struct page for entry and addr, after queueing swapin.
  *
- * Primitive swap readahead code. We simply read in a few pages whoes
+ * Primitive swap readahead code. We simply read in a few pages whose
  * virtual addresses are around the fault address in the same vma.
  *
  * Caller must hold read mmap_lock if vmf->vma is not NULL.
--- a/mm/z3fold.c~mm-fix-some-typos-and-code-style-problems
+++ a/mm/z3fold.c
@@ -391,7 +391,7 @@ static void z3fold_unregister_migration(
 {
 	if (pool->inode)
 		iput(pool->inode);
- }
+}
 
 /* Initializes the z3fold header of a newly allocated z3fold page */
 static struct z3fold_header *init_z3fold_page(struct page *page, bool headless,
--- a/mm/zsmalloc.c~mm-fix-some-typos-and-code-style-problems
+++ a/mm/zsmalloc.c
@@ -61,7 +61,7 @@
 #define ZSPAGE_MAGIC	0x58
 
 /*
- * This must be power of 2 and greater than of equal to sizeof(link_free).
+ * This must be power of 2 and greater than or equal to sizeof(link_free).
  * These two conditions ensure that any 'struct link_free' itself doesn't
  * span more than 1 page which avoids complex case of mapping 2 pages simply
  * to restore link_free pointer values.
@@ -530,7 +530,7 @@ static void set_zspage_mapping(struct zs
  * class maintains a list of zspages where each zspage is divided
  * into equal sized chunks. Each allocation falls into one of these
  * classes depending on its size. This function returns index of the
- * size class which has chunk size big enough to hold the give size.
+ * size class which has chunk size big enough to hold the given size.
  */
 static int get_size_class_index(int size)
 {
_

* [patch 74/91] drivers/char: remove /dev/kmem for good
  2021-05-07  1:01 incoming Andrew Morton
                   ` (72 preceding siblings ...)
  2021-05-07  1:05 ` [patch 73/91] mm: fix some typos and code style problems Andrew Morton
@ 2021-05-07  1:05 ` Andrew Morton
  2021-05-07  1:06 ` [patch 75/91] mm: remove xlate_dev_kmem_ptr() Andrew Morton
                   ` (17 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:05 UTC (permalink / raw)
  To: akpm, alexandre.belloni, andrew, andrey.zhizhikin, arnd, bcain,
	benh, bigeasy, borntraeger, chris, christophe.leroy, clabbe,
	corbet, dalias, davem, david, deller, ebiederm, geert,
	gerald.schaefer, gor, grandmaster, green.hu, gregkh,
	gregory.clement, hca, hdanton, huang.ying.caritas, ink,
	James.Bottomley, james.troup, jcmvbkbc, jiaxun.yang, jonas,
	kasong, keescook, krzk, kuninori.morimoto.gx, linux-mm, linux,
	liviu.dudau, lorenzo.pieralisi, luc.vanoostenryck, mattst88,
	mcgrof, mhocko, minchan, mingo, mm-commits, mpatocka, mpe,
	nixiaoming, oleksiy.avramchenko, palmerdabbelt, paulus, pavel,
	pavel, peterz, pmorel, rdunlap, robh, rostedt, rppt, rric, rth,
	sam, schnelle, sebastian.hesselbarth, shorne,
	stefan.kristiansson, sudeep.holla, tblodt, tglx, torvalds,
	tsbogend, viresh.kumar, viro, wcohen, willy, ysato

From: David Hildenbrand <david@redhat.com>
Subject: drivers/char: remove /dev/kmem for good

Patch series "drivers/char: remove /dev/kmem for good".

Exploring /dev/kmem and /dev/mem in the context of memory hot(un)plug and
memory ballooning, I started questioning the existence of /dev/kmem.

Comparing it with the /proc/kcore implementation, it does not seem to be
able to deal with things like

a) Pages unmapped from the direct mapping (e.g., to be used by secretmem)
  -> kern_addr_valid(). virt_addr_valid() is not sufficient.

b) Special cases like gart aperture memory that is not to be touched
  -> mem_pfn_is_ram()

Unless I am missing something, it's at least broken in some cases and might
fault/crash the machine.

Looks like its existence has been questioned before in 2005 and 2010 [1];
after ~11 additional years, it might make sense to revive the discussion.

CONFIG_DEVKMEM is only enabled in a single defconfig (on purpose or by
mistake?).  All distributions disable it: in Ubuntu it has been disabled
for more than 10 years, in Debian since 2.6.31, in Fedora at least
starting with FC3, in RHEL starting with RHEL4, in SUSE starting from
15sp2, and OpenSUSE has it disabled as well.

1) /dev/kmem was popular for rootkits [2] before it got disabled
   basically everywhere. Ubuntu documents [3] "There is no modern user of
   /dev/kmem any more beyond attackers using it to load kernel rootkits.".
   RHEL documents in a BZ [5] "it served no practical purpose other than to
   serve as a potential security problem or to enable binary module drivers
   to access structures/functions they shouldn't be touching".

2) /proc/kcore is a decent interface to have a controlled way to read
   kernel memory for debugging purposes. (It will need some extensions to
   deal with memory offlining/unplug, memory ballooning, and poisoned
   pages, though.)

3) It might be useful for corner case debugging [1]. KDB/KGDB might be a
   better fit, especially for writing random memory; it is harder to shoot
   yourself in the foot.

4) "Kernel Memory Editor" [4] hasn't seen any updates since 2000 and seems
   to be incompatible with 64bit [1]. For educational purposes,
   /proc/kcore might be used to monitor value updates -- or older
   kernels can be used.

5) It's broken on arm64, and therefore, completely disabled there.

Looks like it's essentially unused and has been replaced by better
suited interfaces for individual tasks (/proc/kcore, KDB/KGDB). Let's
just remove it.

[1] https://lwn.net/Articles/147901/
[2] https://www.linuxjournal.com/article/10505
[3] https://wiki.ubuntu.com/Security/Features#A.2Fdev.2Fkmem_disabled
[4] https://sourceforge.net/projects/kme/
[5] https://bugzilla.redhat.com/show_bug.cgi?id=154796

Link: https://lkml.kernel.org/r/20210324102351.6932-1-david@redhat.com
Link: https://lkml.kernel.org/r/20210324102351.6932-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Alexander A. Klimov" <grandmaster@al2klimov.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Andrey Zhizhikin <andrey.zhizhikin@leica-geosystems.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Chris Zankel <chris@zankel.net>
Cc: Corentin Labbe <clabbe@baylibre.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Gregory Clement <gregory.clement@bootlin.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hillf Danton <hdanton@sina.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: James Troup <james.troup@canonical.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kairui Song <kasong@redhat.com>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: openrisc@lists.librecores.org
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Pavel Machek (CIP)" <pavel@denx.de>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Pierre Morel <pmorel@linux.ibm.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Cc: sparclinux@vger.kernel.org
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Theodore Dubois <tblodt@icloud.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: William Cohen <wcohen@redhat.com>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/devices.txt     |    2 
 arch/arm/configs/dove_defconfig           |    1 
 arch/arm/configs/magician_defconfig       |    1 
 arch/arm/configs/moxart_defconfig         |    1 
 arch/arm/configs/mps2_defconfig           |    1 
 arch/arm/configs/mvebu_v5_defconfig       |    1 
 arch/arm/configs/xcep_defconfig           |    1 
 arch/hexagon/configs/comet_defconfig      |    1 
 arch/m68k/configs/amcore_defconfig        |    1 
 arch/openrisc/configs/or1ksim_defconfig   |    1 
 arch/sh/configs/edosk7705_defconfig       |    1 
 arch/sh/configs/se7206_defconfig          |    1 
 arch/sh/configs/sh2007_defconfig          |    1 
 arch/sh/configs/sh7724_generic_defconfig  |    1 
 arch/sh/configs/sh7770_generic_defconfig  |    1 
 arch/sh/configs/sh7785lcr_32bit_defconfig |    1 
 arch/sparc/configs/sparc64_defconfig      |    1 
 arch/xtensa/configs/xip_kc705_defconfig   |    1 
 drivers/char/Kconfig                      |   10 
 drivers/char/mem.c                        |  231 --------------------
 include/linux/fs.h                        |    2 
 include/linux/vmalloc.h                   |    2 
 kernel/configs/android-base.config        |    1 
 mm/ksm.c                                  |    2 
 mm/vmalloc.c                              |    2 
 25 files changed, 5 insertions(+), 264 deletions(-)

--- a/arch/arm/configs/dove_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/arm/configs/dove_defconfig
@@ -63,7 +63,6 @@ CONFIG_INPUT_EVDEV=y
 # CONFIG_MOUSE_PS2 is not set
 # CONFIG_SERIO is not set
 CONFIG_LEGACY_PTY_COUNT=16
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_8250_RUNTIME_UARTS=2
--- a/arch/arm/configs/magician_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/arm/configs/magician_defconfig
@@ -72,7 +72,6 @@ CONFIG_INPUT_TOUCHSCREEN=y
 CONFIG_INPUT_MISC=y
 CONFIG_INPUT_UINPUT=m
 # CONFIG_SERIO is not set
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_PXA=y
 # CONFIG_LEGACY_PTYS is not set
 # CONFIG_HW_RANDOM is not set
--- a/arch/arm/configs/moxart_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/arm/configs/moxart_defconfig
@@ -79,7 +79,6 @@ CONFIG_INPUT_EVBUG=y
 # CONFIG_SERIO is not set
 # CONFIG_VT is not set
 # CONFIG_LEGACY_PTYS is not set
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_8250_NR_UARTS=1
--- a/arch/arm/configs/mps2_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/arm/configs/mps2_defconfig
@@ -69,7 +69,6 @@ CONFIG_SMSC911X=y
 # CONFIG_VT is not set
 # CONFIG_LEGACY_PTYS is not set
 CONFIG_SERIAL_NONSTANDARD=y
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_MPS2_UART_CONSOLE=y
 CONFIG_SERIAL_MPS2_UART=y
 # CONFIG_HW_RANDOM is not set
--- a/arch/arm/configs/mvebu_v5_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/arm/configs/mvebu_v5_defconfig
@@ -100,7 +100,6 @@ CONFIG_INPUT_EVDEV=y
 CONFIG_KEYBOARD_GPIO=y
 # CONFIG_INPUT_MOUSE is not set
 CONFIG_LEGACY_PTY_COUNT=16
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_8250_RUNTIME_UARTS=2
--- a/arch/arm/configs/xcep_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/arm/configs/xcep_defconfig
@@ -53,7 +53,6 @@ CONFIG_NET_ETHERNET=y
 # CONFIG_INPUT_KEYBOARD is not set
 # CONFIG_INPUT_MOUSE is not set
 # CONFIG_SERIO is not set
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_PXA=y
 CONFIG_SERIAL_PXA_CONSOLE=y
 # CONFIG_LEGACY_PTYS is not set
--- a/arch/hexagon/configs/comet_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/hexagon/configs/comet_defconfig
@@ -34,7 +34,6 @@ CONFIG_NET_ETHERNET=y
 # CONFIG_SERIO is not set
 # CONFIG_CONSOLE_TRANSLATIONS is not set
 CONFIG_LEGACY_PTY_COUNT=64
-# CONFIG_DEVKMEM is not set
 # CONFIG_HW_RANDOM is not set
 CONFIG_SPI=y
 CONFIG_SPI_DEBUG=y
--- a/arch/m68k/configs/amcore_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/m68k/configs/amcore_defconfig
@@ -60,7 +60,6 @@ CONFIG_DM9000=y
 # CONFIG_VT is not set
 # CONFIG_UNIX98_PTYS is not set
 # CONFIG_DEVMEM is not set
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_MCF=y
 CONFIG_SERIAL_MCF_BAUDRATE=115200
 CONFIG_SERIAL_MCF_CONSOLE=y
--- a/arch/openrisc/configs/or1ksim_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/openrisc/configs/or1ksim_defconfig
@@ -43,7 +43,6 @@ CONFIG_MICREL_PHY=y
 # CONFIG_SERIO is not set
 # CONFIG_VT is not set
 # CONFIG_LEGACY_PTYS is not set
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_OF_PLATFORM=y
--- a/arch/sh/configs/edosk7705_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/sh/configs/edosk7705_defconfig
@@ -23,7 +23,6 @@ CONFIG_SH_PCLK_FREQ=31250000
 # CONFIG_INPUT is not set
 # CONFIG_SERIO is not set
 # CONFIG_VT is not set
-# CONFIG_DEVKMEM is not set
 # CONFIG_UNIX98_PTYS is not set
 # CONFIG_LEGACY_PTYS is not set
 # CONFIG_HW_RANDOM is not set
--- a/arch/sh/configs/se7206_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/sh/configs/se7206_defconfig
@@ -71,7 +71,6 @@ CONFIG_SMC91X=y
 # CONFIG_INPUT is not set
 # CONFIG_SERIO is not set
 # CONFIG_VT is not set
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_SH_SCI=y
 CONFIG_SERIAL_SH_SCI_NR_UARTS=4
 CONFIG_SERIAL_SH_SCI_CONSOLE=y
--- a/arch/sh/configs/sh2007_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/sh/configs/sh2007_defconfig
@@ -75,7 +75,6 @@ CONFIG_INPUT_FF_MEMLESS=y
 # CONFIG_INPUT_MOUSE is not set
 # CONFIG_SERIO is not set
 CONFIG_VT_HW_CONSOLE_BINDING=y
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_SH_SCI=y
 CONFIG_SERIAL_SH_SCI_CONSOLE=y
 # CONFIG_LEGACY_PTYS is not set
--- a/arch/sh/configs/sh7724_generic_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/sh/configs/sh7724_generic_defconfig
@@ -18,7 +18,6 @@ CONFIG_CPU_IDLE=y
 # CONFIG_INPUT is not set
 # CONFIG_SERIO is not set
 # CONFIG_VT is not set
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_SH_SCI=y
 CONFIG_SERIAL_SH_SCI_NR_UARTS=6
 CONFIG_SERIAL_SH_SCI_CONSOLE=y
--- a/arch/sh/configs/sh7770_generic_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/sh/configs/sh7770_generic_defconfig
@@ -20,7 +20,6 @@ CONFIG_CPU_IDLE=y
 # CONFIG_INPUT is not set
 # CONFIG_SERIO is not set
 # CONFIG_VT is not set
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_SH_SCI=y
 CONFIG_SERIAL_SH_SCI_NR_UARTS=6
 CONFIG_SERIAL_SH_SCI_CONSOLE=y
--- a/arch/sh/configs/sh7785lcr_32bit_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/sh/configs/sh7785lcr_32bit_defconfig
@@ -66,7 +66,6 @@ CONFIG_INPUT_FF_MEMLESS=m
 CONFIG_INPUT_EVDEV=y
 CONFIG_INPUT_EVBUG=m
 CONFIG_VT_HW_CONSOLE_BINDING=y
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_SH_SCI=y
 CONFIG_SERIAL_SH_SCI_NR_UARTS=6
 CONFIG_SERIAL_SH_SCI_CONSOLE=y
--- a/arch/sparc/configs/sparc64_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/sparc/configs/sparc64_defconfig
@@ -122,7 +122,6 @@ CONFIG_INPUT_SPARCSPKR=y
 # CONFIG_SERIO_SERPORT is not set
 CONFIG_SERIO_PCIPS2=m
 CONFIG_SERIO_RAW=m
-# CONFIG_DEVKMEM is not set
 CONFIG_SERIAL_SUNSU=y
 CONFIG_SERIAL_SUNSU_CONSOLE=y
 CONFIG_SERIAL_SUNSAB=y
--- a/arch/xtensa/configs/xip_kc705_defconfig~drivers-char-remove-dev-kmem-for-good
+++ a/arch/xtensa/configs/xip_kc705_defconfig
@@ -72,7 +72,6 @@ CONFIG_MARVELL_PHY=y
 # CONFIG_INPUT_KEYBOARD is not set
 # CONFIG_INPUT_MOUSE is not set
 # CONFIG_SERIO is not set
-CONFIG_DEVKMEM=y
 CONFIG_SERIAL_8250=y
 # CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
 CONFIG_SERIAL_8250_CONSOLE=y
--- a/Documentation/admin-guide/devices.txt~drivers-char-remove-dev-kmem-for-good
+++ a/Documentation/admin-guide/devices.txt
@@ -4,7 +4,7 @@
 
    1 char	Memory devices
 		  1 = /dev/mem		Physical memory access
-		  2 = /dev/kmem		Kernel virtual memory access
+		  2 = /dev/kmem		OBSOLETE - replaced by /proc/kcore
 		  3 = /dev/null		Null device
 		  4 = /dev/port		I/O port access
 		  5 = /dev/zero		Null byte source
--- a/drivers/char/Kconfig~drivers-char-remove-dev-kmem-for-good
+++ a/drivers/char/Kconfig
@@ -334,16 +334,6 @@ config DEVMEM
 	  memory.
 	  When in doubt, say "Y".
 
-config DEVKMEM
-	bool "/dev/kmem virtual device support"
-	# On arm64, VMALLOC_START < PAGE_OFFSET, which confuses kmem read/write
-	depends on !ARM64
-	help
-	  Say Y here if you want to support the /dev/kmem device. The
-	  /dev/kmem device is rarely used, but can be used for certain
-	  kind of kernel debugging operations.
-	  When in doubt, say "N".
-
 config NVRAM
 	tristate "/dev/nvram support"
 	depends on X86 || HAVE_ARCH_NVRAM_OPS
--- a/drivers/char/mem.c~drivers-char-remove-dev-kmem-for-good
+++ a/drivers/char/mem.c
@@ -403,221 +403,6 @@ static int mmap_mem(struct file *file, s
 	return 0;
 }
 
-static int mmap_kmem(struct file *file, struct vm_area_struct *vma)
-{
-	unsigned long pfn;
-
-	/* Turn a kernel-virtual address into a physical page frame */
-	pfn = __pa((u64)vma->vm_pgoff << PAGE_SHIFT) >> PAGE_SHIFT;
-
-	/*
-	 * RED-PEN: on some architectures there is more mapped memory than
-	 * available in mem_map which pfn_valid checks for. Perhaps should add a
-	 * new macro here.
-	 *
-	 * RED-PEN: vmalloc is not supported right now.
-	 */
-	if (!pfn_valid(pfn))
-		return -EIO;
-
-	vma->vm_pgoff = pfn;
-	return mmap_mem(file, vma);
-}
-
-/*
- * This function reads the *virtual* memory as seen by the kernel.
- */
-static ssize_t read_kmem(struct file *file, char __user *buf,
-			 size_t count, loff_t *ppos)
-{
-	unsigned long p = *ppos;
-	ssize_t low_count, read, sz;
-	char *kbuf; /* k-addr because vread() takes vmlist_lock rwlock */
-	int err = 0;
-
-	read = 0;
-	if (p < (unsigned long) high_memory) {
-		low_count = count;
-		if (count > (unsigned long)high_memory - p)
-			low_count = (unsigned long)high_memory - p;
-
-#ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED
-		/* we don't have page 0 mapped on sparc and m68k.. */
-		if (p < PAGE_SIZE && low_count > 0) {
-			sz = size_inside_page(p, low_count);
-			if (clear_user(buf, sz))
-				return -EFAULT;
-			buf += sz;
-			p += sz;
-			read += sz;
-			low_count -= sz;
-			count -= sz;
-		}
-#endif
-		while (low_count > 0) {
-			sz = size_inside_page(p, low_count);
-
-			/*
-			 * On ia64 if a page has been mapped somewhere as
-			 * uncached, then it must also be accessed uncached
-			 * by the kernel or data corruption may occur
-			 */
-			kbuf = xlate_dev_kmem_ptr((void *)p);
-			if (!virt_addr_valid(kbuf))
-				return -ENXIO;
-
-			if (copy_to_user(buf, kbuf, sz))
-				return -EFAULT;
-			buf += sz;
-			p += sz;
-			read += sz;
-			low_count -= sz;
-			count -= sz;
-			if (should_stop_iteration()) {
-				count = 0;
-				break;
-			}
-		}
-	}
-
-	if (count > 0) {
-		kbuf = (char *)__get_free_page(GFP_KERNEL);
-		if (!kbuf)
-			return -ENOMEM;
-		while (count > 0) {
-			sz = size_inside_page(p, count);
-			if (!is_vmalloc_or_module_addr((void *)p)) {
-				err = -ENXIO;
-				break;
-			}
-			sz = vread(kbuf, (char *)p, sz);
-			if (!sz)
-				break;
-			if (copy_to_user(buf, kbuf, sz)) {
-				err = -EFAULT;
-				break;
-			}
-			count -= sz;
-			buf += sz;
-			read += sz;
-			p += sz;
-			if (should_stop_iteration())
-				break;
-		}
-		free_page((unsigned long)kbuf);
-	}
-	*ppos = p;
-	return read ? read : err;
-}
-
-
-static ssize_t do_write_kmem(unsigned long p, const char __user *buf,
-				size_t count, loff_t *ppos)
-{
-	ssize_t written, sz;
-	unsigned long copied;
-
-	written = 0;
-#ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED
-	/* we don't have page 0 mapped on sparc and m68k.. */
-	if (p < PAGE_SIZE) {
-		sz = size_inside_page(p, count);
-		/* Hmm. Do something? */
-		buf += sz;
-		p += sz;
-		count -= sz;
-		written += sz;
-	}
-#endif
-
-	while (count > 0) {
-		void *ptr;
-
-		sz = size_inside_page(p, count);
-
-		/*
-		 * On ia64 if a page has been mapped somewhere as uncached, then
-		 * it must also be accessed uncached by the kernel or data
-		 * corruption may occur.
-		 */
-		ptr = xlate_dev_kmem_ptr((void *)p);
-		if (!virt_addr_valid(ptr))
-			return -ENXIO;
-
-		copied = copy_from_user(ptr, buf, sz);
-		if (copied) {
-			written += sz - copied;
-			if (written)
-				break;
-			return -EFAULT;
-		}
-		buf += sz;
-		p += sz;
-		count -= sz;
-		written += sz;
-		if (should_stop_iteration())
-			break;
-	}
-
-	*ppos += written;
-	return written;
-}
-
-/*
- * This function writes to the *virtual* memory as seen by the kernel.
- */
-static ssize_t write_kmem(struct file *file, const char __user *buf,
-			  size_t count, loff_t *ppos)
-{
-	unsigned long p = *ppos;
-	ssize_t wrote = 0;
-	ssize_t virtr = 0;
-	char *kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */
-	int err = 0;
-
-	if (p < (unsigned long) high_memory) {
-		unsigned long to_write = min_t(unsigned long, count,
-					       (unsigned long)high_memory - p);
-		wrote = do_write_kmem(p, buf, to_write, ppos);
-		if (wrote != to_write)
-			return wrote;
-		p += wrote;
-		buf += wrote;
-		count -= wrote;
-	}
-
-	if (count > 0) {
-		kbuf = (char *)__get_free_page(GFP_KERNEL);
-		if (!kbuf)
-			return wrote ? wrote : -ENOMEM;
-		while (count > 0) {
-			unsigned long sz = size_inside_page(p, count);
-			unsigned long n;
-
-			if (!is_vmalloc_or_module_addr((void *)p)) {
-				err = -ENXIO;
-				break;
-			}
-			n = copy_from_user(kbuf, buf, sz);
-			if (n) {
-				err = -EFAULT;
-				break;
-			}
-			vwrite(kbuf, (char *)p, sz);
-			count -= sz;
-			buf += sz;
-			virtr += sz;
-			p += sz;
-			if (should_stop_iteration())
-				break;
-		}
-		free_page((unsigned long)kbuf);
-	}
-
-	*ppos = p;
-	return virtr + wrote ? : err;
-}
-
 static ssize_t read_port(struct file *file, char __user *buf,
 			 size_t count, loff_t *ppos)
 {
@@ -855,7 +640,6 @@ static int open_port(struct inode *inode
 #define write_zero	write_null
 #define write_iter_zero	write_iter_null
 #define open_mem	open_port
-#define open_kmem	open_mem
 
 static const struct file_operations __maybe_unused mem_fops = {
 	.llseek		= memory_lseek,
@@ -869,18 +653,6 @@ static const struct file_operations __ma
 #endif
 };
 
-static const struct file_operations __maybe_unused kmem_fops = {
-	.llseek		= memory_lseek,
-	.read		= read_kmem,
-	.write		= write_kmem,
-	.mmap		= mmap_kmem,
-	.open		= open_kmem,
-#ifndef CONFIG_MMU
-	.get_unmapped_area = get_unmapped_area_mem,
-	.mmap_capabilities = memory_mmap_capabilities,
-#endif
-};
-
 static const struct file_operations null_fops = {
 	.llseek		= null_lseek,
 	.read		= read_null,
@@ -925,9 +697,6 @@ static const struct memdev {
 #ifdef CONFIG_DEVMEM
 	 [DEVMEM_MINOR] = { "mem", 0, &mem_fops, FMODE_UNSIGNED_OFFSET },
 #endif
-#ifdef CONFIG_DEVKMEM
-	 [2] = { "kmem", 0, &kmem_fops, FMODE_UNSIGNED_OFFSET },
-#endif
 	 [3] = { "null", 0666, &null_fops, 0 },
 #ifdef CONFIG_DEVPORT
 	 [4] = { "port", 0, &port_fops, 0 },
--- a/include/linux/fs.h~drivers-char-remove-dev-kmem-for-good
+++ a/include/linux/fs.h
@@ -145,7 +145,7 @@ typedef int (dio_iodone_t)(struct kiocb
 /* Expect random access pattern */
 #define FMODE_RANDOM		((__force fmode_t)0x1000)
 
-/* File is huge (eg. /dev/kmem): treat loff_t as unsigned */
+/* File is huge (eg. /dev/mem): treat loff_t as unsigned */
 #define FMODE_UNSIGNED_OFFSET	((__force fmode_t)0x2000)
 
 /* File is opened with O_PATH; almost nothing can be done with it */
--- a/include/linux/vmalloc.h~drivers-char-remove-dev-kmem-for-good
+++ a/include/linux/vmalloc.h
@@ -227,7 +227,7 @@ static inline void set_vm_flush_reset_pe
 }
 #endif
 
-/* for /dev/kmem */
+/* for /proc/kcore */
 extern long vread(char *buf, char *addr, unsigned long count);
 extern long vwrite(char *buf, char *addr, unsigned long count);
 
--- a/kernel/configs/android-base.config~drivers-char-remove-dev-kmem-for-good
+++ a/kernel/configs/android-base.config
@@ -1,5 +1,4 @@
 #  KEEP ALPHABETICALLY SORTED
-# CONFIG_DEVKMEM is not set
 # CONFIG_DEVMEM is not set
 # CONFIG_FHANDLE is not set
 # CONFIG_INET_LRO is not set
--- a/mm/ksm.c~drivers-char-remove-dev-kmem-for-good
+++ a/mm/ksm.c
@@ -459,7 +459,7 @@ static inline bool ksm_test_exit(struct
  * but taking great care only to touch a ksm page, in a VM_MERGEABLE vma,
  * in case the application has unmapped and remapped mm,addr meanwhile.
  * Could a ksm page appear anywhere else?  Actually yes, in a VM_PFNMAP
- * mmap of /dev/mem or /dev/kmem, where we would not want to touch it.
+ * mmap of /dev/mem, where we would not want to touch it.
  *
  * FAULT_FLAG/FOLL_REMOTE are because we do this outside the context
  * of the process that owns 'vma'.  We also do not want to enforce
--- a/mm/vmalloc.c~drivers-char-remove-dev-kmem-for-good
+++ a/mm/vmalloc.c
@@ -3219,7 +3219,7 @@ static int aligned_vwrite(char *buf, cha
  * Note: In usual ops, vread() is never necessary because the caller
  * should know vmalloc() area is valid and can use memcpy().
  * This is for routines which have to access vmalloc area without
- * any information, as /dev/kmem.
+ * any information, as /proc/kcore.
  *
  * Return: number of bytes for which addr and buf should be increased
  * (same number as @count) or %0 if [addr...addr+count) doesn't
_

* [patch 75/91] mm: remove xlate_dev_kmem_ptr()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (73 preceding siblings ...)
  2021-05-07  1:05 ` [patch 74/91] drivers/char: remove /dev/kmem for good Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 76/91] mm/vmalloc: remove vwrite() Andrew Morton
                   ` (16 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, arnd, bcain, benh, bigeasy, borntraeger, christophe.leroy,
	dalias, davem, david, deller, geert, gerald.schaefer, gor,
	green.hu, gregkh, hca, ink, James.Bottomley, jiaxun.yang, krzk,
	kuninori.morimoto.gx, linux-mm, linux, luc.vanoostenryck,
	mattst88, mcgrof, mhocko, mingo, mm-commits, mpatocka, mpe,
	palmerdabbelt, paulus, peterz, pmorel, rdunlap, rppt, rth,
	schnelle, torvalds, tsbogend, ysato

From: David Hildenbrand <david@redhat.com>
Subject: mm: remove xlate_dev_kmem_ptr()

Since /dev/kmem has been removed, let's remove the xlate_dev_kmem_ptr()
leftovers.
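
For context, the diff below touches both per-architecture headers and
include/asm-generic/io.h because the helper followed the usual override
idiom: an architecture announces its own version through a macro of the
same name, and the generic header only compiles its fallback when that
macro is absent.  A minimal, self-contained sketch of the idiom follows;
sketch_xlate_ptr() and its one-byte "translation" are hypothetical and
exist only to make the override observable.

```c
/* "Arch" part: provide an override and announce it via a same-named macro. */
#define sketch_xlate_ptr sketch_xlate_ptr
static inline void *sketch_xlate_ptr(void *addr)
{
	/* Hypothetical arch-specific translation: skip one byte. */
	return (char *)addr + 1;
}

/* "Generic" part: the fallback is compiled only if no override exists. */
#ifndef sketch_xlate_ptr
#define sketch_xlate_ptr sketch_xlate_ptr
static inline void *sketch_xlate_ptr(void *addr)
{
	return addr;
}
#endif
```

Because the override is defined first, the #ifndef block is skipped and
callers get the "arch" version; removing a helper therefore means removing
both halves.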

Link: https://lkml.kernel.org/r/20210324102351.6932-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Pierre Morel <pmorel@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/include/asm/io.h     |    5 -----
 arch/arm/include/asm/io.h       |    5 -----
 arch/hexagon/include/asm/io.h   |    1 -
 arch/ia64/include/asm/io.h      |    1 -
 arch/ia64/include/asm/uaccess.h |   18 ------------------
 arch/m68k/include/asm/io_mm.h   |    5 -----
 arch/mips/include/asm/io.h      |    5 -----
 arch/parisc/include/asm/io.h    |    5 -----
 arch/powerpc/include/asm/io.h   |    5 -----
 arch/s390/include/asm/io.h      |    5 -----
 arch/sh/include/asm/io.h        |    5 -----
 arch/sparc/include/asm/io_64.h  |    5 -----
 include/asm-generic/io.h        |   11 -----------
 13 files changed, 76 deletions(-)

--- a/arch/alpha/include/asm/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/alpha/include/asm/io.h
@@ -602,11 +602,6 @@ extern void outsl (unsigned long port, c
  */
 #define xlate_dev_mem_ptr(p)	__va(p)
 
-/*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#define xlate_dev_kmem_ptr(p)	p
-
 #endif /* __KERNEL__ */
 
 #endif /* __ALPHA_IO_H */
--- a/arch/arm/include/asm/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/arm/include/asm/io.h
@@ -430,11 +430,6 @@ extern void pci_iounmap(struct pci_dev *
  */
 #define xlate_dev_mem_ptr(p)	__va(p)
 
-/*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#define xlate_dev_kmem_ptr(p)	p
-
 #include <asm-generic/io.h>
 
 #ifdef CONFIG_MMU
--- a/arch/hexagon/include/asm/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/hexagon/include/asm/io.h
@@ -64,7 +64,6 @@ static inline void *phys_to_virt(unsigne
  * convert a physical pointer to a virtual kernel pointer for
  * /dev/mem access.
  */
-#define xlate_dev_kmem_ptr(p)    __va(p)
 #define xlate_dev_mem_ptr(p)    __va(p)
 
 /*
--- a/arch/ia64/include/asm/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/ia64/include/asm/io.h
@@ -277,7 +277,6 @@ extern void memset_io(volatile void __io
 #define memcpy_fromio memcpy_fromio
 #define memcpy_toio memcpy_toio
 #define memset_io memset_io
-#define xlate_dev_kmem_ptr xlate_dev_kmem_ptr
 #define xlate_dev_mem_ptr xlate_dev_mem_ptr
 #include <asm-generic/io.h>
 #undef PCI_IOBASE
--- a/arch/ia64/include/asm/uaccess.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/ia64/include/asm/uaccess.h
@@ -272,22 +272,4 @@ xlate_dev_mem_ptr(phys_addr_t p)
 	return ptr;
 }
 
-/*
- * Convert a virtual cached kernel memory pointer to an uncached pointer
- */
-static __inline__ void *
-xlate_dev_kmem_ptr(void *p)
-{
-	struct page *page;
-	void *ptr;
-
-	page = virt_to_page((unsigned long)p);
-	if (PageUncached(page))
-		ptr = (void *)__pa(p) + __IA64_UNCACHED_OFFSET;
-	else
-		ptr = p;
-
-	return ptr;
-}
-
 #endif /* _ASM_IA64_UACCESS_H */
--- a/arch/m68k/include/asm/io_mm.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/m68k/include/asm/io_mm.h
@@ -397,11 +397,6 @@ static inline void isa_delay(void)
  */
 #define xlate_dev_mem_ptr(p)	__va(p)
 
-/*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#define xlate_dev_kmem_ptr(p)	p
-
 #define readb_relaxed(addr)	readb(addr)
 #define readw_relaxed(addr)	readw(addr)
 #define readl_relaxed(addr)	readl(addr)
--- a/arch/mips/include/asm/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/mips/include/asm/io.h
@@ -564,11 +564,6 @@ extern void (*_dma_cache_inv)(unsigned l
  */
 #define xlate_dev_mem_ptr(p)	__va(p)
 
-/*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#define xlate_dev_kmem_ptr(p)	p
-
 void __ioread64_copy(void *to, const void __iomem *from, size_t count);
 
 #endif /* _ASM_IO_H */
--- a/arch/parisc/include/asm/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/parisc/include/asm/io.h
@@ -316,11 +316,6 @@ extern void iowrite64be(u64 val, void __
  */
 #define xlate_dev_mem_ptr(p)	__va(p)
 
-/*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#define xlate_dev_kmem_ptr(p)	p
-
 extern int devmem_is_allowed(unsigned long pfn);
 
 #endif
--- a/arch/powerpc/include/asm/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/powerpc/include/asm/io.h
@@ -663,11 +663,6 @@ static inline void name at					\
 #define xlate_dev_mem_ptr(p)	__va(p)
 
 /*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#define xlate_dev_kmem_ptr(p)	p
-
-/*
  * We don't do relaxed operations yet, at least not with this semantic
  */
 #define readb_relaxed(addr)	readb(addr)
--- a/arch/s390/include/asm/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/s390/include/asm/io.h
@@ -20,11 +20,6 @@ void *xlate_dev_mem_ptr(phys_addr_t phys
 #define unxlate_dev_mem_ptr unxlate_dev_mem_ptr
 void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr);
 
-/*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#define xlate_dev_kmem_ptr(p)	p
-
 #define IO_SPACE_LIMIT 0
 
 void __iomem *ioremap_prot(phys_addr_t addr, size_t size, unsigned long prot);
--- a/arch/sh/include/asm/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/sh/include/asm/io.h
@@ -283,11 +283,6 @@ static inline void __iomem *ioremap_prot
  */
 #define xlate_dev_mem_ptr(p)	__va(p)
 
-/*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#define xlate_dev_kmem_ptr(p)	p
-
 #define ARCH_HAS_VALID_PHYS_ADDR_RANGE
 int valid_phys_addr_range(phys_addr_t addr, size_t size);
 int valid_mmap_phys_addr_range(unsigned long pfn, size_t size);
--- a/arch/sparc/include/asm/io_64.h~mm-remove-xlate_dev_kmem_ptr
+++ a/arch/sparc/include/asm/io_64.h
@@ -454,11 +454,6 @@ void sbus_set_sbus64(struct device *, in
  */
 #define xlate_dev_mem_ptr(p)	__va(p)
 
-/*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#define xlate_dev_kmem_ptr(p)	p
-
 #endif
 
 #endif /* !(__SPARC64_IO_H) */
--- a/include/asm-generic/io.h~mm-remove-xlate_dev_kmem_ptr
+++ a/include/asm-generic/io.h
@@ -1064,17 +1064,6 @@ static inline void pci_iounmap(struct pc
 #endif
 #endif /* CONFIG_GENERIC_IOMAP */
 
-/*
- * Convert a virtual cached pointer to an uncached pointer
- */
-#ifndef xlate_dev_kmem_ptr
-#define xlate_dev_kmem_ptr xlate_dev_kmem_ptr
-static inline void *xlate_dev_kmem_ptr(void *addr)
-{
-	return addr;
-}
-#endif
-
 #ifndef xlate_dev_mem_ptr
 #define xlate_dev_mem_ptr xlate_dev_mem_ptr
 static inline void *xlate_dev_mem_ptr(phys_addr_t addr)
_

* [patch 76/91] mm/vmalloc: remove vwrite()
  2021-05-07  1:01 incoming Andrew Morton
                   ` (74 preceding siblings ...)
  2021-05-07  1:06 ` [patch 75/91] mm: remove xlate_dev_kmem_ptr() Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 77/91] arm: print alloc free paths for address in registers Andrew Morton
                   ` (15 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, david, gregkh, hdanton, huang.ying.caritas, linux-mm,
	mhocko, minchan, mm-commits, oleksiy.avramchenko, rostedt,
	torvalds, willy

From: David Hildenbrand <david@redhat.com>
Subject: mm/vmalloc: remove vwrite()

The last user (/dev/kmem) is gone. Let's drop it.

Link: https://lkml.kernel.org/r/20210324102351.6932-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: huang ying <huang.ying.caritas@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/vmalloc.h |    1 
 mm/nommu.c              |   10 ---
 mm/vmalloc.c            |  116 --------------------------------------
 3 files changed, 1 insertion(+), 126 deletions(-)

--- a/include/linux/vmalloc.h~mm-vmalloc-remove-vwrite
+++ a/include/linux/vmalloc.h
@@ -229,7 +229,6 @@ static inline void set_vm_flush_reset_pe
 
 /* for /proc/kcore */
 extern long vread(char *buf, char *addr, unsigned long count);
-extern long vwrite(char *buf, char *addr, unsigned long count);
 
 /*
  *	Internals.  Dont't use..
--- a/mm/nommu.c~mm-vmalloc-remove-vwrite
+++ a/mm/nommu.c
@@ -210,16 +210,6 @@ long vread(char *buf, char *addr, unsign
 	return count;
 }
 
-long vwrite(char *buf, char *addr, unsigned long count)
-{
-	/* Don't allow overflow */
-	if ((unsigned long) addr + count < count)
-		count = -(unsigned long) addr;
-
-	memcpy(addr, buf, count);
-	return count;
-}
-
 /*
  *	vmalloc  -  allocate virtually contiguous memory
  *
--- a/mm/vmalloc.c~mm-vmalloc-remove-vwrite
+++ a/mm/vmalloc.c
@@ -3146,10 +3146,7 @@ static int aligned_vread(char *buf, char
 		 * kmap() and get small overhead in this access function.
 		 */
 		if (p) {
-			/*
-			 * we can expect USER0 is not used (see vread/vwrite's
-			 * function description)
-			 */
+			/* We can expect USER0 is not used -- see vread() */
 			void *map = kmap_atomic(p);
 			memcpy(buf, map + offset, length);
 			kunmap_atomic(map);
@@ -3164,43 +3161,6 @@ static int aligned_vread(char *buf, char
 	return copied;
 }
 
-static int aligned_vwrite(char *buf, char *addr, unsigned long count)
-{
-	struct page *p;
-	int copied = 0;
-
-	while (count) {
-		unsigned long offset, length;
-
-		offset = offset_in_page(addr);
-		length = PAGE_SIZE - offset;
-		if (length > count)
-			length = count;
-		p = vmalloc_to_page(addr);
-		/*
-		 * To do safe access to this _mapped_ area, we need
-		 * lock. But adding lock here means that we need to add
-		 * overhead of vmalloc()/vfree() calles for this _debug_
-		 * interface, rarely used. Instead of that, we'll use
-		 * kmap() and get small overhead in this access function.
-		 */
-		if (p) {
-			/*
-			 * we can expect USER0 is not used (see vread/vwrite's
-			 * function description)
-			 */
-			void *map = kmap_atomic(p);
-			memcpy(map + offset, buf, length);
-			kunmap_atomic(map);
-		}
-		addr += length;
-		buf += length;
-		copied += length;
-		count -= length;
-	}
-	return copied;
-}
-
 /**
  * vread() - read vmalloc area in a safe way.
  * @buf:     buffer for reading data
@@ -3283,80 +3243,6 @@ finished:
 	return buflen;
 }
 
-/**
- * vwrite() - write vmalloc area in a safe way.
- * @buf:      buffer for source data
- * @addr:     vm address.
- * @count:    number of bytes to be read.
- *
- * This function checks that addr is a valid vmalloc'ed area, and
- * copy data from a buffer to the given addr. If specified range of
- * [addr...addr+count) includes some valid address, data is copied from
- * proper area of @buf. If there are memory holes, no copy to hole.
- * IOREMAP area is treated as memory hole and no copy is done.
- *
- * If [addr...addr+count) doesn't includes any intersects with alive
- * vm_struct area, returns 0. @buf should be kernel's buffer.
- *
- * Note: In usual ops, vwrite() is never necessary because the caller
- * should know vmalloc() area is valid and can use memcpy().
- * This is for routines which have to access vmalloc area without
- * any information, as /dev/kmem.
- *
- * Return: number of bytes for which addr and buf should be
- * increased (same number as @count) or %0 if [addr...addr+count)
- * doesn't include any intersection with valid vmalloc area
- */
-long vwrite(char *buf, char *addr, unsigned long count)
-{
-	struct vmap_area *va;
-	struct vm_struct *vm;
-	char *vaddr;
-	unsigned long n, buflen;
-	int copied = 0;
-
-	/* Don't allow overflow */
-	if ((unsigned long) addr + count < count)
-		count = -(unsigned long) addr;
-	buflen = count;
-
-	spin_lock(&vmap_area_lock);
-	list_for_each_entry(va, &vmap_area_list, list) {
-		if (!count)
-			break;
-
-		if (!va->vm)
-			continue;
-
-		vm = va->vm;
-		vaddr = (char *) vm->addr;
-		if (addr >= vaddr + get_vm_area_size(vm))
-			continue;
-		while (addr < vaddr) {
-			if (count == 0)
-				goto finished;
-			buf++;
-			addr++;
-			count--;
-		}
-		n = vaddr + get_vm_area_size(vm) - addr;
-		if (n > count)
-			n = count;
-		if (!(vm->flags & VM_IOREMAP)) {
-			aligned_vwrite(buf, addr, n);
-			copied++;
-		}
-		buf += n;
-		addr += n;
-		count -= n;
-	}
-finished:
-	spin_unlock(&vmap_area_lock);
-	if (!copied)
-		return 0;
-	return buflen;
-}
-
 /**
  * remap_vmalloc_range_partial - map vmalloc pages to userspace
  * @vma:		vma to cover
_

* [patch 77/91] arm: print alloc free paths for address in registers
  2021-05-07  1:01 incoming Andrew Morton
                   ` (75 preceding siblings ...)
  2021-05-07  1:06 ` [patch 76/91] mm/vmalloc: remove vwrite() Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 78/91] scripts/spelling.txt: add "overlfow" Andrew Morton
                   ` (14 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: 0x7f454c46, akpm, cl, iamjoonsoo.kim, linux-mm, linux,
	maninder1.s, mm-commits, paulmck, penberg, rientjes, torvalds,
	v.narang, vbabka, viro

From: Maninder Singh <maninder1.s@samsung.com>
Subject: arm: print alloc free paths for address in registers

In case of a use after free kernel oops, the freeing path of the object is
required to debug further.  In most cases the object address is present
in one of the registers.

Thus check the register's address and if it belongs to slab, print its
alloc and free path.

For example, in the issue below, register r6 belongs to slab, and a use
after free issue occurred on one of its dereferenced values:

[   20.182197] Unable to handle kernel paging request at virtual address 6b6b6b6f
....
[   20.185035] pc : [<c0538afc>]    lr : [<c0465674>]    psr: 60000013
[   20.185271] sp : c8927d40  ip : ffffefff  fp : c8aa8020
[   20.185462] r10: c8927e10  r9 : 00000001  r8 : 00400cc0
[   20.185674] r7 : 00000000  r6 : c8ab0180  r5 : c1804a80  r4 : c8aa8008
[   20.185924] r3 : c1a5661c  r2 : 00000000  r1 : 6b6b6b6b  r0 : c139bf48
.....
[   20.191499] Register r6 information: slab kmalloc-64 start c8ab0140 data offset 64 pointer offset 0 size 64 allocated at meminfo_proc_show+0x40/0x4fc
[   20.192078]     meminfo_proc_show+0x40/0x4fc
[   20.192263]     seq_read_iter+0x18c/0x4c4
[   20.192430]     proc_reg_read_iter+0x84/0xac
[   20.192617]     generic_file_splice_read+0xe8/0x17c
[   20.192816]     splice_direct_to_actor+0xb8/0x290
[   20.193008]     do_splice_direct+0xa0/0xe0
[   20.193185]     do_sendfile+0x2d0/0x438
[   20.193345]     sys_sendfile64+0x12c/0x140
[   20.193523]     ret_fast_syscall+0x0/0x58
[   20.193695]     0xbeeacde4
[   20.193822]  Free path:
[   20.193935]     meminfo_proc_show+0x5c/0x4fc
[   20.194115]     seq_read_iter+0x18c/0x4c4
[   20.194285]     proc_reg_read_iter+0x84/0xac
[   20.194475]     generic_file_splice_read+0xe8/0x17c
[   20.194685]     splice_direct_to_actor+0xb8/0x290
[   20.194870]     do_splice_direct+0xa0/0xe0
[   20.195014]     do_sendfile+0x2d0/0x438
[   20.195174]     sys_sendfile64+0x12c/0x140
[   20.195336]     ret_fast_syscall+0x0/0x58
[   20.195491]     0xbeeacde4
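
The numbers in the log above can be checked by hand.  The sketch below is
not part of the patch: the helper names are hypothetical, the offset
formula is inferred from the fields printed in the "Register r6
information" line, and it assumes that 0x6b is the byte slab debugging
writes into freed objects.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 0x6b is the slab "freed object" poison byte. */
#define SKETCH_POISON_FREE 0x6bu

/* Offset of ptr inside the object's data area, using the log's fields. */
static inline uint32_t sketch_pointer_offset(uint32_t ptr, uint32_t slab_start,
					     uint32_t data_offset)
{
	return ptr - (slab_start + data_offset);
}

/* True if all four bytes of val are the free-poison byte. */
static inline bool sketch_is_free_poison(uint32_t val)
{
	return val == SKETCH_POISON_FREE * 0x01010101u;
}
```

With r6 = c8ab0180, start c8ab0140 and data offset 64, the pointer offset
is 0, as printed.  r1 = 6b6b6b6b is all poison bytes, and the faulting
address 6b6b6b6f is that value plus 4, which is consistent with a field
being read through a pointer loaded from the freed object.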

Link: https://lkml.kernel.org/r/1615891032-29160-3-git-send-email-maninder1.s@samsung.com
Co-developed-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm/include/asm/bug.h |    1 +
 arch/arm/kernel/process.c  |   11 +++++++++++
 arch/arm/kernel/traps.c    |    1 +
 3 files changed, 13 insertions(+)

--- a/arch/arm/include/asm/bug.h~arm-print-alloc-free-paths-for-address-in-registers
+++ a/arch/arm/include/asm/bug.h
@@ -88,5 +88,6 @@ extern asmlinkage void c_backtrace(unsig
 struct mm_struct;
 void show_pte(const char *lvl, struct mm_struct *mm, unsigned long addr);
 extern void __show_regs(struct pt_regs *);
+extern void __show_regs_alloc_free(struct pt_regs *regs);
 
 #endif
--- a/arch/arm/kernel/process.c~arm-print-alloc-free-paths-for-address-in-registers
+++ a/arch/arm/kernel/process.c
@@ -92,6 +92,17 @@ void arch_cpu_idle_exit(void)
 	ledtrig_cpu(CPU_LED_IDLE_END);
 }
 
+void __show_regs_alloc_free(struct pt_regs *regs)
+{
+	int i;
+
+	/* check for r0 - r12 only */
+	for (i = 0; i < 13; i++) {
+		pr_alert("Register r%d information:", i);
+		mem_dump_obj((void *)regs->uregs[i]);
+	}
+}
+
 void __show_regs(struct pt_regs *regs)
 {
 	unsigned long flags;
--- a/arch/arm/kernel/traps.c~arm-print-alloc-free-paths-for-address-in-registers
+++ a/arch/arm/kernel/traps.c
@@ -287,6 +287,7 @@ static int __die(const char *str, int er
 
 	print_modules();
 	__show_regs(regs);
+	__show_regs_alloc_free(regs);
 	pr_emerg("Process %.*s (pid: %d, stack limit = 0x%p)\n",
 		 TASK_COMM_LEN, tsk->comm, task_pid_nr(tsk), end_of_stack(tsk));
 
_

* [patch 78/91] scripts/spelling.txt: add "overlfow"
  2021-05-07  1:01 incoming Andrew Morton
                   ` (76 preceding siblings ...)
  2021-05-07  1:06 ` [patch 77/91] arm: print alloc free paths for address in registers Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 79/91] scripts/spelling.txt: add "diabled" typo Andrew Morton
                   ` (13 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, drew, keescook, linux-mm, mm-commits, torvalds

From: Drew Fustini <drew@beagleboard.org>
Subject: scripts/spelling.txt: add "overlfow"

Add typo "overlfow" for "overflow". This typo was found and fixed in
net/sctp/tsnmap.c.
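
Each entry in scripts/spelling.txt has the form "typo||correction", as the
hunk below shows.  A minimal sketch of splitting one such line is given
here for illustration; sketch_split_entry() is a hypothetical helper and
is not how the in-tree tooling parses the file.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split one "typo||correction" line into its two halves.
 * Returns 0 on success, -1 if the separator is missing, either half is
 * empty, or an output buffer is too small.
 */
static int sketch_split_entry(const char *line, char *typo, size_t typo_len,
			      char *fix, size_t fix_len)
{
	const char *sep = strstr(line, "||");
	size_t n, m;

	if (!sep)
		return -1;
	n = (size_t)(sep - line);		/* length of the typo */
	m = strcspn(sep + 2, "\n");		/* length of the correction */
	if (n == 0 || m == 0 || n >= typo_len || m >= fix_len)
		return -1;
	memcpy(typo, line, n);
	typo[n] = '\0';
	memcpy(fix, sep + 2, m);
	fix[m] = '\0';
	return 0;
}
```

Passing "overlfow||overflow" yields the typo "overlfow" and the correction
"overflow"; a line without "||" is rejected.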

Link: https://lore.kernel.org/netdev/20210304055548.56829-1-drew@beagleboard.org/
Link: https://lkml.kernel.org/r/20210304072657.64577-1-drew@beagleboard.org
Signed-off-by: Drew Fustini <drew@beagleboard.org>
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/spelling.txt |    1 +
 1 file changed, 1 insertion(+)

--- a/scripts/spelling.txt~scripts-spellingtxt-add-overlfow
+++ a/scripts/spelling.txt
@@ -1027,6 +1027,7 @@ oustanding||outstanding
 overaall||overall
 overhread||overhead
 overlaping||overlapping
+overlfow||overflow
 overide||override
 overrided||overridden
 overriden||overridden
_

* [patch 79/91] scripts/spelling.txt: add "diabled" typo
  2021-05-07  1:01 incoming Andrew Morton
                   ` (77 preceding siblings ...)
  2021-05-07  1:06 ` [patch 78/91] scripts/spelling.txt: add "overlfow" Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 80/91] scripts/spelling.txt: add "overflw" Andrew Morton
                   ` (12 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, zuoqilin

From: zuoqilin <zuoqilin@yulong.com>
Subject: scripts/spelling.txt: Add "diabled" typo

Add "diabled" to the list of checked spelling errors.

Link: https://lkml.kernel.org/r/20210304070106.2313-1-zuoqilin1@163.com
Signed-off-by: zuoqilin <zuoqilin@yulong.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/spelling.txt |    1 +
 1 file changed, 1 insertion(+)

--- a/scripts/spelling.txt~scripts-spellingtxt-add-diabled-typo
+++ a/scripts/spelling.txt
@@ -480,6 +480,7 @@ devided||divided
 deviece||device
 devision||division
 diable||disable
+diabled||disabled
 dicline||decline
 dictionnary||dictionary
 didnt||didn't
_

* [patch 80/91] scripts/spelling.txt: add "overflw"
  2021-05-07  1:01 incoming Andrew Morton
                   ` (78 preceding siblings ...)
  2021-05-07  1:06 ` [patch 79/91] scripts/spelling.txt: add "diabled" typo Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 81/91] mm/slab.c: fix spelling mistake "disired" -> "desired" Andrew Morton
                   ` (11 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, drew, gustavoars, linux-mm, mm-commits, torvalds

From: Drew Fustini <drew@beagleboard.org>
Subject: scripts/spelling.txt: add "overflw"

Add typo "overflw" for "overflow".  This typo was found and fixed in
drivers/clocksource/timer-pistachio.c.

Link: https://lore.kernel.org/lkml/20210305090315.384547-1-drew@beagleboard.org/
Link: https://lkml.kernel.org/r/20210305095151.388182-1-drew@beagleboard.org
Signed-off-by: Drew Fustini <drew@beagleboard.org>
Suggested-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/spelling.txt |    1 +
 1 file changed, 1 insertion(+)

--- a/scripts/spelling.txt~scripts-spellingtxt-add-overflw
+++ a/scripts/spelling.txt
@@ -1028,6 +1028,7 @@ oustanding||outstanding
 overaall||overall
 overhread||overhead
 overlaping||overlapping
+overflw||overflow
 overlfow||overflow
 overide||override
 overrided||overridden
_
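
As the hunks above show, each entry in scripts/spelling.txt is a single line of the form typo||correction. The following is only a minimal sketch of how such a list could be loaded and applied; it is not how checkpatch.pl itself is implemented. The names SAMPLE, load_corrections and find_typos are hypothetical, and the handling of '#' comment lines is an assumption about the file format.

```python
import re

# A few entries in the "typo||correction" form used by scripts/spelling.txt.
# Treating lines that start with '#' as comments is an assumption of this sketch.
SAMPLE = """\
# hypothetical excerpt
overflw||overflow
overlfow||overflow
diabled||disabled
"""


def load_corrections(text):
    """Build a {typo: correction} table from spelling.txt-style text."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        typo, sep, fix = line.partition("||")
        if sep:  # ignore lines that lack the "||" separator
            table[typo] = fix
    return table


def find_typos(text, table):
    """Return (word, suggested correction) pairs for known typos in text."""
    hits = []
    for word in re.findall(r"[A-Za-z']+", text):
        fix = table.get(word.lower())
        if fix is not None:
            hits.append((word, fix))
    return hits


if __name__ == "__main__":
    corrections = load_corrections(SAMPLE)
    for word, fix in find_typos("counter overlfow is diabled", corrections):
        print(f"'{word}' may be misspelled - perhaps '{fix}'?")
```

A dictionary lookup per word keeps the check simple; matching typos that span several words would need a different approach.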

* [patch 81/91] mm/slab.c: fix spelling mistake "disired" -> "desired"
  2021-05-07  1:01 incoming Andrew Morton
                   ` (79 preceding siblings ...)
  2021-05-07  1:06 ` [patch 80/91] scripts/spelling.txt: add "overflw" Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 82/91] include/linux/pgtable.h: few spelling fixes Andrew Morton
                   ` (10 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, colin.king, linux-mm, mm-commits, torvalds

From: Colin Ian King <colin.king@canonical.com>
Subject: mm/slab.c: fix spelling mistake "disired" -> "desired"

There is a spelling mistake in a comment. Fix it.

Link: https://lkml.kernel.org/r/20210317094158.5762-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/slab.c~mm-slab-fix-spelling-mistake-disired-desired
+++ a/mm/slab.c
@@ -2284,7 +2284,7 @@ void __kmem_cache_release(struct kmem_ca
  * Because if it is the case, that means we defer the creation of
  * the kmalloc_{dma,}_cache of size sizeof(slab descriptor) to this point.
  * And we eventually call down to __kmem_cache_create(), which
- * in turn looks up in the kmalloc_{dma,}_caches for the disired-size one.
+ * in turn looks up in the kmalloc_{dma,}_caches for the desired-size one.
  * This is a "chicken-and-egg" problem.
  *
  * So the off-slab slab descriptor shall come from the kmalloc_{dma,}_caches,
_

* [patch 82/91] include/linux/pgtable.h: few spelling fixes
  2021-05-07  1:01 incoming Andrew Morton
                   ` (80 preceding siblings ...)
  2021-05-07  1:06 ` [patch 81/91] mm/slab.c: fix spelling mistake "disired" -> "desired" Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 83/91] kernel/umh.c: fix some spelling mistakes Andrew Morton
                   ` (9 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, rdunlap, torvalds, unixbhaskar

From: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Subject: include/linux/pgtable.h: few spelling fixes

Fix a few spelling mistakes throughout the file.

Link: https://lkml.kernel.org/r/20210318201404.6380-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/pgtable.h |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/include/linux/pgtable.h~mm-few-spelling-fixes
+++ a/include/linux/pgtable.h
@@ -426,7 +426,7 @@ static inline void ptep_set_wrprotect(st
 
 /*
  * On some architectures hardware does not set page access bit when accessing
- * memory page, it is responsibilty of software setting this bit. It brings
+ * memory page, it is responsibility of software setting this bit. It brings
  * out extra page fault penalty to track page access bit. For optimization page
  * access bit can be set during all page fault flow on these arches.
  * To be differentiate with macro pte_mkyoung, this macro is used on platforms
@@ -519,7 +519,7 @@ extern pgtable_t pgtable_trans_huge_with
 /*
  * This is an implementation of pmdp_establish() that is only suitable for an
  * architecture that doesn't have hardware dirty/accessed bits. In this case we
- * can't race with CPU which sets these bits and non-atomic aproach is fine.
+ * can't race with CPU which sets these bits and non-atomic approach is fine.
  */
 static inline pmd_t generic_pmdp_establish(struct vm_area_struct *vma,
 		unsigned long address, pmd_t *pmdp, pmd_t pmd)
@@ -852,7 +852,7 @@ static inline void __ptep_modify_prot_co
  * updates, but to prevent any updates it may make from being lost.
  *
  * This does not protect against other software modifications of the
- * pte; the appropriate pte lock must be held over the transation.
+ * pte; the appropriate pte lock must be held over the transaction.
  *
  * Note that this interface is intended to be batchable, meaning that
  * ptep_modify_prot_commit may not actually update the pte, but merely
@@ -1281,13 +1281,13 @@ static inline int pmd_none_or_trans_huge
 	 *
 	 * The complete check uses is_pmd_migration_entry() in linux/swapops.h
 	 * But using that requires moving current function and pmd_trans_unstable()
-	 * to linux/swapops.h to resovle dependency, which is too much code move.
+	 * to linux/swapops.h to resolve dependency, which is too much code move.
 	 *
 	 * !pmd_present() is equivalent to is_pmd_migration_entry() currently,
 	 * because !pmd_present() pages can only be under migration not swapped
 	 * out.
 	 *
-	 * pmd_none() is preseved for future condition checks on pmd migration
+	 * pmd_none() is preserved for future condition checks on pmd migration
 	 * entries and not confusing with this function name, although it is
 	 * redundant with !pmd_present().
 	 */
_

* [patch 83/91] kernel/umh.c: fix some spelling mistakes
  2021-05-07  1:01 incoming Andrew Morton
                   ` (81 preceding siblings ...)
  2021-05-07  1:06 ` [patch 82/91] include/linux/pgtable.h: few spelling fixes Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 84/91] kernel/user_namespace.c: fix typos Andrew Morton
                   ` (8 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, linux-mm, mcgrof, mm-commits, torvalds, zhouchuangao

From: zhouchuangao <zhouchuangao@vivo.com>
Subject: kernel/umh.c: fix some spelling mistakes

Fix some spelling mistakes, and modify the order of the parameter comments
to be consistent with the order of the parameters passed to the function.

Link: https://lkml.kernel.org/r/1615636139-4076-1-git-send-email-zhouchuangao@vivo.com
Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/umh.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/kernel/umh.c~umh-fix-some-spelling-mistakes
+++ a/kernel/umh.c
@@ -338,8 +338,8 @@ static void helper_unlock(void)
  * @argv: arg vector for process
  * @envp: environment for process
  * @gfp_mask: gfp mask for memory allocation
- * @cleanup: a cleanup function
  * @init: an init function
+ * @cleanup: a cleanup function
  * @data: arbitrary context sensitive data
  *
  * Returns either %NULL on allocation failure, or a subprocess_info
@@ -350,7 +350,7 @@ static void helper_unlock(void)
  * exec.  A non-zero return code causes the process to error out, exit,
  * and return the failure to the calling process
  *
- * The cleanup function is just before ethe subprocess_info is about to
+ * The cleanup function is just before the subprocess_info is about to
  * be freed.  This can be used for freeing the argv and envp.  The
  * Function must be runnable in either a process context or the
  * context in which call_usermodehelper_exec is called.
@@ -386,7 +386,7 @@ EXPORT_SYMBOL(call_usermodehelper_setup)
 
 /**
  * call_usermodehelper_exec - start a usermode application
- * @sub_info: information about the subprocessa
+ * @sub_info: information about the subprocess
  * @wait: wait for the application to finish and return status.
  *        when UMH_NO_WAIT don't wait at all, but you get no useful error back
  *        when the program couldn't be exec'ed. This makes it safe to call
_

* [patch 84/91] kernel/user_namespace.c: fix typos
  2021-05-07  1:01 incoming Andrew Morton
                   ` (82 preceding siblings ...)
  2021-05-07  1:06 ` [patch 83/91] kernel/umh.c: fix some spelling mistakes Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 85/91] kernel/up.c: fix typo Andrew Morton
                   ` (7 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, caoxiaofeng, cxfcosmos, linux-mm, mm-commits, torvalds

From: Xiaofeng Cao <cxfcosmos@gmail.com>
Subject: kernel/user_namespace.c: fix typos

change 'verifing' to 'verifying'
change 'certaint' to 'certain'
change 'approprpiate' to 'appropriate'

Link: https://lkml.kernel.org/r/20210317100129.12440-1-caoxiaofeng@yulong.com
Signed-off-by: Xiaofeng Cao <caoxiaofeng@yulong.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/user_namespace.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/kernel/user_namespace.c~kernel-user_namespace-fix-typo-issue
+++ a/kernel/user_namespace.c
@@ -85,7 +85,7 @@ int create_user_ns(struct cred *new)
 	/*
 	 * Verify that we can not violate the policy of which files
 	 * may be accessed that is specified by the root directory,
-	 * by verifing that the root directory is at the root of the
+	 * by verifying that the root directory is at the root of the
 	 * mount namespace which allows all files to be accessed.
 	 */
 	ret = -EPERM;
@@ -1014,7 +1014,7 @@ static ssize_t map_write(struct file *fi
 			goto out;
 		ret = -EINVAL;
 	}
-	/* Be very certaint the new map actually exists */
+	/* Be very certain the new map actually exists */
 	if (new_map.nr_extents == 0)
 		goto out;
 
@@ -1169,7 +1169,7 @@ static bool new_idmap_permitted(const st
 
 	/* Allow the specified ids if we have the appropriate capability
 	 * (CAP_SETUID or CAP_SETGID) over the parent user namespace.
-	 * And the opener of the id file also had the approprpiate capability.
+	 * And the opener of the id file also has the appropriate capability.
 	 */
 	if (ns_capable(ns->parent, cap_setid) &&
 	    file_ns_capable(file, ns->parent, cap_setid))
_

* [patch 85/91] kernel/up.c: fix typo
  2021-05-07  1:01 incoming Andrew Morton
                   ` (83 preceding siblings ...)
  2021-05-07  1:06 ` [patch 84/91] kernel/user_namespace.c: fix typos Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 86/91] kernel/sys.c: " Andrew Morton
                   ` (6 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, unixbhaskar

From: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Subject: kernel/up.c: fix typo

s/condtions/conditions/

Link: https://lkml.kernel.org/r/20210317032732.3260835-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/up.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/up.c~kernel-fix-a-typo-in-the-file-upc
+++ a/kernel/up.c
@@ -38,7 +38,7 @@ EXPORT_SYMBOL(smp_call_function_single_a
 
 /*
  * Preemption is disabled here to make sure the cond_func is called under the
- * same condtions in UP and SMP.
+ * same conditions in UP and SMP.
  */
 void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
 			   void *info, bool wait, const struct cpumask *mask)
_

* [patch 86/91] kernel/sys.c: fix typo
  2021-05-07  1:01 incoming Andrew Morton
                   ` (84 preceding siblings ...)
  2021-05-07  1:06 ` [patch 85/91] kernel/up.c: fix typo Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 87/91] fs: fat: fix spelling typo of values Andrew Morton
                   ` (5 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, caoxiaofeng, linux-mm, mm-commits, rdunlap, torvalds

From: Xiaofeng Cao <caoxiaofeng@yulong.com>
Subject: kernel/sys.c: fix typo

change 'infite'     to 'infinite'
change 'concurent'  to 'concurrent'
change 'memvers'    to 'members'
change 'decendants' to 'descendants'
change 'argumets'   to 'arguments'

Link: https://lkml.kernel.org/r/20210316112904.10661-1-cxfcosmos@gmail.com
Signed-off-by: Xiaofeng Cao <caoxiaofeng@yulong.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/sys.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/kernel/sys.c~kernel-sys-fix-typo-issue
+++ a/kernel/sys.c
@@ -1590,7 +1590,7 @@ int do_prlimit(struct task_struct *tsk,
 
 	/*
 	 * RLIMIT_CPU handling. Arm the posix CPU timer if the limit is not
-	 * infite. In case of RLIM_INFINITY the posix CPU timer code
+	 * infinite. In case of RLIM_INFINITY the posix CPU timer code
 	 * ignores the rlimit.
 	 */
 	 if (!retval && new_rlim && resource == RLIMIT_CPU &&
@@ -2029,7 +2029,7 @@ static int prctl_set_mm_map(int opt, con
 	}
 
 	/*
-	 * arg_lock protects concurent updates but we still need mmap_lock for
+	 * arg_lock protects concurrent updates but we still need mmap_lock for
 	 * read to exclude races with sys_brk.
 	 */
 	mmap_read_lock(mm);
@@ -2041,7 +2041,7 @@ static int prctl_set_mm_map(int opt, con
 	 * output in procfs mostly, except
 	 *
 	 *  - @start_brk/@brk which are used in do_brk_flags but kernel lookups
-	 *    for VMAs when updating these memvers so anything wrong written
+	 *    for VMAs when updating these members so anything wrong written
 	 *    here cause kernel to swear at userspace program but won't lead
 	 *    to any problem in kernel itself
 	 */
@@ -2143,7 +2143,7 @@ static int prctl_set_mm(int opt, unsigne
 	error = -EINVAL;
 
 	/*
-	 * arg_lock protects concurent updates of arg boundaries, we need
+	 * arg_lock protects concurrent updates of arg boundaries, we need
 	 * mmap_lock for a) concurrent sys_brk, b) finding VMA for addr
 	 * validation.
 	 */
@@ -2210,7 +2210,7 @@ static int prctl_set_mm(int opt, unsigne
 	 * If command line arguments and environment
 	 * are placed somewhere else on stack, we can
 	 * set them up here, ARG_START/END to setup
-	 * command line argumets and ENV_START/END
+	 * command line arguments and ENV_START/END
 	 * for environment.
 	 */
 	case PR_SET_MM_START_STACK:
@@ -2258,8 +2258,8 @@ static int prctl_get_tid_address(struct
 static int propagate_has_child_subreaper(struct task_struct *p, void *data)
 {
 	/*
-	 * If task has has_child_subreaper - all its decendants
-	 * already have these flag too and new decendants will
+	 * If task has has_child_subreaper - all its descendants
+	 * already have these flag too and new descendants will
 	 * inherit it on fork, skip them.
 	 *
 	 * If we've found child_reaper - skip descendants in
_

* [patch 87/91] fs: fat: fix spelling typo of values
  2021-05-07  1:01 incoming Andrew Morton
                   ` (85 preceding siblings ...)
  2021-05-07  1:06 ` [patch 86/91] kernel/sys.c: " Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 88/91] ipc/sem.c: spelling fix Andrew Morton
                   ` (4 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, dingsenjie, hirofumi, linux-mm, mm-commits, torvalds

From: dingsenjie <dingsenjie@yulong.com>
Subject: fs: fat: fix spelling typo of values

vaules -> values

Link: https://lkml.kernel.org/r/20210302034817.30384-1-dingsenjie@163.com
Signed-off-by: dingsenjie <dingsenjie@yulong.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/fat/fatent.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/fat/fatent.c~fs-fat-fix-spelling-typo-of-values
+++ a/fs/fat/fatent.c
@@ -771,7 +771,7 @@ int fat_trim_fs(struct inode *inode, str
 	/*
 	 * FAT data is organized as clusters, trim at the granulary of cluster.
 	 *
-	 * fstrim_range is in byte, convert vaules to cluster index.
+	 * fstrim_range is in byte, convert values to cluster index.
 	 * Treat sectors before data region as all used, not to trim them.
 	 */
 	ent_start = max_t(u64, range->start>>sbi->cluster_bits, FAT_START_ENT);
_

* [patch 88/91] ipc/sem.c: spelling fix
  2021-05-07  1:01 incoming Andrew Morton
                   ` (86 preceding siblings ...)
  2021-05-07  1:06 ` [patch 87/91] fs: fat: fix spelling typo of values Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 89/91] treewide: remove editor modelines and cruft Andrew Morton
                   ` (3 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, rdunlap, torvalds, unixbhaskar

From: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Subject: ipc/sem.c: spelling fix

s/purpuse/purpose/

Link: https://lkml.kernel.org/r/20210319221432.26631-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 ipc/sem.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/ipc/sem.c~ipc-semc-couple-of-spelling-fixes
+++ a/ipc/sem.c
@@ -786,7 +786,7 @@ static inline void wake_up_sem_queue_pre
 {
 	get_task_struct(q->sleeper);
 
-	/* see SEM_BARRIER_2 for purpuse/pairing */
+	/* see SEM_BARRIER_2 for purpose/pairing */
 	smp_store_release(&q->status, error);
 
 	wake_q_add_safe(wake_q, q->sleeper);
_

* [patch 89/91] treewide: remove editor modelines and cruft
  2021-05-07  1:01 incoming Andrew Morton
                   ` (87 preceding siblings ...)
  2021-05-07  1:06 ` [patch 88/91] ipc/sem.c: spelling fix Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 90/91] mm: fix typos in comments Andrew Morton
                   ` (2 subsequent siblings)
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, geert, linux-mm, masahiroy, mm-commits, ojeda, torvalds

From: Masahiro Yamada <masahiroy@kernel.org>
Subject: treewide: remove editor modelines and cruft

The section "19) Editor modelines and other cruft" in
Documentation/process/coding-style.rst clearly says, "Do not include any
of these in source files."

I recently received a patch that explicitly adds a new one.

Let's do a treewide cleanup; otherwise some people will follow the existing
code and attempt to upstream their favorite editor setups.

It would be even nicer if scripts/checkpatch.pl could check for it.

If we want to impose a coding style in an editor-independent manner, I think
editorconfig (patch [1]) is a saner solution.

[1] https://lore.kernel.org/lkml/20200703073143.423557-1-danny@kdrag0n.dev/
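
To illustrate the kind of check mentioned above, here is a rough sketch that flags editor cruft in a source file. It is not part of this patch and is not checkpatch.pl code. The names PATTERNS and find_editor_cruft are hypothetical, and the two patterns cover only the styles removed in this series ("vim: set ..." lines and Emacs "Local variables:" blocks), so a real check would likely need more.

```python
import re

# Patterns for the two styles of editor cruft visible in this patch.
# This list is an assumption of the sketch, not an exhaustive rule set.
PATTERNS = [
    re.compile(r"\bvim:\s*set\b"),                     # e.g. /* vim: set ts=8 */
    re.compile(r"\bLocal variables:", re.IGNORECASE),  # start of an Emacs block
]


def find_editor_cruft(source):
    """Return (line number, stripped line) for each line matching a pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if any(p.search(line) for p in PATTERNS):
            hits.append((lineno, line.strip()))
    return hits


SAMPLE = """\
#endif /* _PARISC_PDC_CHASSIS_H */
/* vim: set ts=8 */

/*
 * Local variables:
 *  c-indent-level: 4
 *  tab-width: 8
 * End:
 */
"""

if __name__ == "__main__":
    for lineno, text in find_editor_cruft(SAMPLE):
        print(f"line {lineno}: editor cruft: {text}")
```

The sketch only reports the line that opens an Emacs block; removing the whole comment, as this patch does, still needs a human or a smarter tool to find the matching "End:" line.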

Link: https://lkml.kernel.org/r/20210324054457.1477489-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>	[auxdisplay]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/m68k/atari/time.c                                 |    7 ---
 arch/parisc/include/asm/pdc_chassis.h                  |    1 
 arch/um/drivers/cow.h                                  |    7 ---
 drivers/auxdisplay/panel.c                             |    7 ---
 drivers/gpu/drm/qxl/qxl_drv.c                          |    1 
 drivers/media/usb/pwc/pwc-uncompress.c                 |    3 -
 drivers/net/ethernet/adaptec/starfire.c                |    8 ----
 drivers/net/ethernet/amd/atarilance.c                  |    8 ----
 drivers/net/ethernet/amd/pcnet32.c                     |    7 ---
 drivers/net/wireless/intersil/orinoco/orinoco_nortel.c |    8 ----
 drivers/net/wireless/intersil/orinoco/orinoco_pci.c    |    8 ----
 drivers/net/wireless/intersil/orinoco/orinoco_plx.c    |    8 ----
 drivers/net/wireless/intersil/orinoco/orinoco_tmd.c    |    8 ----
 drivers/parport/parport_ip32.c                         |   12 ------
 drivers/platform/x86/dell/dell_rbu.c                   |    3 -
 drivers/scsi/53c700.c                                  |    1 
 drivers/scsi/53c700.h                                  |    1 
 drivers/scsi/ch.c                                      |    6 ---
 drivers/scsi/ips.c                                     |   20 ----------
 drivers/scsi/ips.h                                     |   20 ----------
 drivers/scsi/lasi700.c                                 |    1 
 drivers/scsi/megaraid/mbox_defs.h                      |    2 -
 drivers/scsi/megaraid/mega_common.h                    |    2 -
 drivers/scsi/megaraid/megaraid_mbox.c                  |    2 -
 drivers/scsi/megaraid/megaraid_mbox.h                  |    2 -
 drivers/scsi/qla1280.c                                 |   12 ------
 drivers/scsi/sni_53c710.c                              |    1 
 drivers/video/fbdev/matrox/matroxfb_base.c             |    9 ----
 drivers/video/fbdev/vga16fb.c                          |   10 -----
 fs/configfs/configfs_internal.h                        |    4 --
 fs/configfs/dir.c                                      |    4 --
 fs/configfs/file.c                                     |    4 --
 fs/configfs/inode.c                                    |    4 --
 fs/configfs/item.c                                     |    4 --
 fs/configfs/mount.c                                    |    4 --
 fs/configfs/symlink.c                                  |    4 --
 fs/nfs/dir.c                                           |    7 ---
 fs/nfs/nfs4proc.c                                      |    6 ---
 fs/nfs/nfs4renewd.c                                    |    6 ---
 fs/nfs/nfs4state.c                                     |    6 ---
 fs/nfs/nfs4xdr.c                                       |    6 ---
 fs/nfsd/nfs4proc.c                                     |    6 ---
 fs/nfsd/nfs4xdr.c                                      |    6 ---
 fs/nfsd/xdr4.h                                         |    6 ---
 fs/ocfs2/acl.c                                         |    4 --
 fs/ocfs2/acl.h                                         |    4 --
 fs/ocfs2/alloc.c                                       |    4 --
 fs/ocfs2/alloc.h                                       |    4 --
 fs/ocfs2/aops.c                                        |    4 --
 fs/ocfs2/aops.h                                        |    4 --
 fs/ocfs2/blockcheck.c                                  |    4 --
 fs/ocfs2/blockcheck.h                                  |    4 --
 fs/ocfs2/buffer_head_io.c                              |    4 --
 fs/ocfs2/buffer_head_io.h                              |    4 --
 fs/ocfs2/cluster/heartbeat.c                           |    4 --
 fs/ocfs2/cluster/heartbeat.h                           |    4 --
 fs/ocfs2/cluster/masklog.c                             |    4 --
 fs/ocfs2/cluster/masklog.h                             |    4 --
 fs/ocfs2/cluster/netdebug.c                            |    4 --
 fs/ocfs2/cluster/nodemanager.c                         |    4 --
 fs/ocfs2/cluster/nodemanager.h                         |    4 --
 fs/ocfs2/cluster/ocfs2_heartbeat.h                     |    4 --
 fs/ocfs2/cluster/ocfs2_nodemanager.h                   |    4 --
 fs/ocfs2/cluster/quorum.c                              |    4 --
 fs/ocfs2/cluster/quorum.h                              |    4 --
 fs/ocfs2/cluster/sys.c                                 |    4 --
 fs/ocfs2/cluster/sys.h                                 |    4 --
 fs/ocfs2/cluster/tcp.c                                 |    4 --
 fs/ocfs2/cluster/tcp.h                                 |    4 --
 fs/ocfs2/cluster/tcp_internal.h                        |    4 --
 fs/ocfs2/dcache.c                                      |    4 --
 fs/ocfs2/dcache.h                                      |    4 --
 fs/ocfs2/dir.c                                         |    4 --
 fs/ocfs2/dir.h                                         |    4 --
 fs/ocfs2/dlm/dlmapi.h                                  |    4 --
 fs/ocfs2/dlm/dlmast.c                                  |    4 --
 fs/ocfs2/dlm/dlmcommon.h                               |    4 --
 fs/ocfs2/dlm/dlmconvert.c                              |    4 --
 fs/ocfs2/dlm/dlmconvert.h                              |    4 --
 fs/ocfs2/dlm/dlmdebug.c                                |    4 --
 fs/ocfs2/dlm/dlmdebug.h                                |    4 --
 fs/ocfs2/dlm/dlmdomain.c                               |    4 --
 fs/ocfs2/dlm/dlmdomain.h                               |    4 --
 fs/ocfs2/dlm/dlmlock.c                                 |    4 --
 fs/ocfs2/dlm/dlmmaster.c                               |    4 --
 fs/ocfs2/dlm/dlmrecovery.c                             |    4 --
 fs/ocfs2/dlm/dlmthread.c                               |    4 --
 fs/ocfs2/dlm/dlmunlock.c                               |    4 --
 fs/ocfs2/dlmfs/dlmfs.c                                 |    4 --
 fs/ocfs2/dlmfs/userdlm.c                               |    4 --
 fs/ocfs2/dlmfs/userdlm.h                               |    4 --
 fs/ocfs2/dlmglue.c                                     |    4 --
 fs/ocfs2/dlmglue.h                                     |    4 --
 fs/ocfs2/export.c                                      |    4 --
 fs/ocfs2/export.h                                      |    4 --
 fs/ocfs2/extent_map.c                                  |    4 --
 fs/ocfs2/extent_map.h                                  |    4 --
 fs/ocfs2/file.c                                        |    4 --
 fs/ocfs2/file.h                                        |    4 --
 fs/ocfs2/filecheck.c                                   |    4 --
 fs/ocfs2/filecheck.h                                   |    4 --
 fs/ocfs2/heartbeat.c                                   |    4 --
 fs/ocfs2/heartbeat.h                                   |    4 --
 fs/ocfs2/inode.c                                       |    4 --
 fs/ocfs2/inode.h                                       |    4 --
 fs/ocfs2/journal.c                                     |    4 --
 fs/ocfs2/journal.h                                     |    4 --
 fs/ocfs2/localalloc.c                                  |    4 --
 fs/ocfs2/localalloc.h                                  |    4 --
 fs/ocfs2/locks.c                                       |    4 --
 fs/ocfs2/locks.h                                       |    4 --
 fs/ocfs2/mmap.c                                        |    4 --
 fs/ocfs2/move_extents.c                                |    4 --
 fs/ocfs2/move_extents.h                                |    4 --
 fs/ocfs2/namei.c                                       |    4 --
 fs/ocfs2/namei.h                                       |    4 --
 fs/ocfs2/ocfs1_fs_compat.h                             |    4 --
 fs/ocfs2/ocfs2.h                                       |    4 --
 fs/ocfs2/ocfs2_fs.h                                    |    4 --
 fs/ocfs2/ocfs2_ioctl.h                                 |    4 --
 fs/ocfs2/ocfs2_lockid.h                                |    4 --
 fs/ocfs2/ocfs2_lockingver.h                            |    4 --
 fs/ocfs2/refcounttree.c                                |    4 --
 fs/ocfs2/refcounttree.h                                |    4 --
 fs/ocfs2/reservations.c                                |    4 --
 fs/ocfs2/reservations.h                                |    4 --
 fs/ocfs2/resize.c                                      |    4 --
 fs/ocfs2/resize.h                                      |    4 --
 fs/ocfs2/slot_map.c                                    |    4 --
 fs/ocfs2/slot_map.h                                    |    4 --
 fs/ocfs2/stack_o2cb.c                                  |    4 --
 fs/ocfs2/stack_user.c                                  |    4 --
 fs/ocfs2/stackglue.c                                   |    4 --
 fs/ocfs2/stackglue.h                                   |    4 --
 fs/ocfs2/suballoc.c                                    |    4 --
 fs/ocfs2/suballoc.h                                    |    4 --
 fs/ocfs2/super.c                                       |    4 --
 fs/ocfs2/super.h                                       |    4 --
 fs/ocfs2/symlink.c                                     |    4 --
 fs/ocfs2/symlink.h                                     |    4 --
 fs/ocfs2/sysfile.c                                     |    4 --
 fs/ocfs2/sysfile.h                                     |    4 --
 fs/ocfs2/uptodate.c                                    |    4 --
 fs/ocfs2/uptodate.h                                    |    4 --
 fs/ocfs2/xattr.c                                       |    4 --
 fs/ocfs2/xattr.h                                       |    4 --
 fs/reiserfs/procfs.c                                   |   10 -----
 include/linux/configfs.h                               |    4 --
 include/linux/genl_magic_func.h                        |    1 
 include/linux/genl_magic_struct.h                      |    1 
 include/uapi/linux/if_bonding.h                        |   11 -----
 include/uapi/linux/nfs4.h                              |    6 ---
 include/xen/interface/elfnote.h                        |   10 -----
 include/xen/interface/hvm/hvm_vcpu.h                   |   10 -----
 include/xen/interface/io/xenbus.h                      |   10 -----
 samples/configfs/configfs_sample.c                     |    2 -
 tools/usb/hcd-tests.sh                                 |    2 -
 157 files changed, 110 insertions(+), 627 deletions(-)

--- a/arch/m68k/atari/time.c~treewide-remove-editor-modelines-and-cruft
+++ a/arch/m68k/atari/time.c
@@ -317,10 +317,3 @@ int atari_tt_hwclk( int op, struct rtc_t
 
     return( 0 );
 }
-
-/*
- * Local variables:
- *  c-indent-level: 4
- *  tab-width: 8
- * End:
- */
--- a/arch/parisc/include/asm/pdc_chassis.h~treewide-remove-editor-modelines-and-cruft
+++ a/arch/parisc/include/asm/pdc_chassis.h
@@ -365,4 +365,3 @@ void parisc_pdc_chassis_init(void);
 					 PDC_CHASSIS_EOM_SET		)
 
 #endif /* _PARISC_PDC_CHASSIS_H */
-/* vim: set ts=8 */
--- a/arch/um/drivers/cow.h~treewide-remove-editor-modelines-and-cruft
+++ a/arch/um/drivers/cow.h
@@ -24,10 +24,3 @@ extern void cow_sizes(int version, __u64
 		      int *data_offset_out);
 
 #endif
-
-/*
- * ---------------------------------------------------------------------------
- * Local variables:
- * c-file-style: "linux"
- * End:
- */
--- a/drivers/auxdisplay/panel.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/auxdisplay/panel.c
@@ -1737,10 +1737,3 @@ module_init(panel_init_module);
 module_exit(panel_cleanup_module);
 MODULE_AUTHOR("Willy Tarreau");
 MODULE_LICENSE("GPL");
-
-/*
- * Local variables:
- *  c-indent-level: 4
- *  tab-width: 8
- * End:
- */
--- a/drivers/gpu/drm/qxl/qxl_drv.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/gpu/drm/qxl/qxl_drv.c
@@ -1,4 +1,3 @@
-/* vim: set ts=8 sw=8 tw=78 ai noexpandtab */
 /* qxl_drv.c -- QXL driver -*- linux-c -*-
  *
  * Copyright 2011 Red Hat, Inc.
--- a/drivers/media/usb/pwc/pwc-uncompress.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/media/usb/pwc/pwc-uncompress.c
@@ -9,9 +9,6 @@
    Please send bug reports and support requests to <luc@saillard.org>.
    The decompression routines have been implemented by reverse-engineering the
    Nemosoft binary pwcx module. Caveat emptor.
-
-
-   vim: set ts=8:
 */
 
 #include <asm/current.h>
--- a/drivers/net/ethernet/adaptec/starfire.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/net/ethernet/adaptec/starfire.c
@@ -2070,11 +2070,3 @@ static void __exit starfire_cleanup (voi
 
 module_init(starfire_init);
 module_exit(starfire_cleanup);
-
-
-/*
- * Local variables:
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
--- a/drivers/net/ethernet/amd/atarilance.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/net/ethernet/amd/atarilance.c
@@ -1156,11 +1156,3 @@ static void __exit atarilance_module_exi
 module_init(atarilance_module_init);
 module_exit(atarilance_module_exit);
 #endif /* MODULE */
-
-
-/*
- * Local variables:
- *  c-indent-level: 4
- *  tab-width: 4
- * End:
- */
--- a/drivers/net/ethernet/amd/pcnet32.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/net/ethernet/amd/pcnet32.c
@@ -3029,10 +3029,3 @@ static void __exit pcnet32_cleanup_modul
 
 module_init(pcnet32_init_module);
 module_exit(pcnet32_cleanup_module);
-
-/*
- * Local variables:
- *  c-indent-level: 4
- *  tab-width: 8
- * End:
- */
--- a/drivers/net/wireless/intersil/orinoco/orinoco_nortel.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/net/wireless/intersil/orinoco/orinoco_nortel.c
@@ -312,11 +312,3 @@ static void __exit orinoco_nortel_exit(v
 
 module_init(orinoco_nortel_init);
 module_exit(orinoco_nortel_exit);
-
-/*
- * Local variables:
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
--- a/drivers/net/wireless/intersil/orinoco/orinoco_pci.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/net/wireless/intersil/orinoco/orinoco_pci.c
@@ -255,11 +255,3 @@ static void __exit orinoco_pci_exit(void
 
 module_init(orinoco_pci_init);
 module_exit(orinoco_pci_exit);
-
-/*
- * Local variables:
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
--- a/drivers/net/wireless/intersil/orinoco/orinoco_plx.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/net/wireless/intersil/orinoco/orinoco_plx.c
@@ -360,11 +360,3 @@ static void __exit orinoco_plx_exit(void
 
 module_init(orinoco_plx_init);
 module_exit(orinoco_plx_exit);
-
-/*
- * Local variables:
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
--- a/drivers/net/wireless/intersil/orinoco/orinoco_tmd.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/net/wireless/intersil/orinoco/orinoco_tmd.c
@@ -235,11 +235,3 @@ static void __exit orinoco_tmd_exit(void
 
 module_init(orinoco_tmd_init);
 module_exit(orinoco_tmd_exit);
-
-/*
- * Local variables:
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
--- a/drivers/parport/parport_ip32.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/parport/parport_ip32.c
@@ -2224,15 +2224,3 @@ MODULE_PARM_DESC(features,
 		 ", bit 2: hardware SPP mode"
 		 ", bit 3: hardware EPP mode"
 		 ", bit 4: hardware ECP mode");
-
-/*--- Inform (X)Emacs about preferred coding style ---------------------*/
-/*
- * Local Variables:
- * mode: c
- * c-file-style: "linux"
- * indent-tabs-mode: t
- * tab-width: 8
- * fill-column: 78
- * ispell-local-dictionary: "american"
- * End:
- */
--- a/drivers/platform/x86/dell/dell_rbu.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/platform/x86/dell/dell_rbu.c
@@ -675,6 +675,3 @@ static __exit void dcdrbu_exit(void)
 
 module_exit(dcdrbu_exit);
 module_init(dcdrbu_init);
-
-/* vim:noet:ts=8:sw=8
-*/
--- a/drivers/scsi/53c700.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/53c700.c
@@ -1,5 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8 -*- */
 
 /* NCR (or Symbios) 53c700 and 53c700-66 Driver
  *
--- a/drivers/scsi/53c700.h~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/53c700.h
@@ -1,5 +1,4 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* -*- mode: c; c-basic-offset: 8 -*- */
 
 /* Driver for 53c700 and 53c700-66 chips from NCR and Symbios
  *
--- a/drivers/scsi/ch.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/ch.c
@@ -1058,9 +1058,3 @@ static void __exit exit_ch_module(void)
 
 module_init(init_ch_module);
 module_exit(exit_ch_module);
-
-/*
- * Local variables:
- * c-basic-offset: 8
- * End:
- */
--- a/drivers/scsi/ips.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/ips.c
@@ -7099,23 +7099,3 @@ ips_init_phase2(int index)
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("IBM ServeRAID Adapter Driver " IPS_VER_STRING);
 MODULE_VERSION(IPS_VER_STRING);
-
-
-/*
- * Overrides for Emacs so that we almost follow Linus's tabbing style.
- * Emacs will notice this stuff at the end of the file and automatically
- * adjust the settings for this buffer only.  This must remain at the end
- * of the file.
- * ---------------------------------------------------------------------------
- * Local variables:
- * c-indent-level: 2
- * c-brace-imaginary-offset: 0
- * c-brace-offset: -2
- * c-argdecl-indent: 2
- * c-label-offset: -2
- * c-continued-statement-offset: 2
- * c-continued-brace-offset: 0
- * indent-tabs-mode: nil
- * tab-width: 8
- * End:
- */
--- a/drivers/scsi/ips.h~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/ips.h
@@ -1211,23 +1211,3 @@ typedef struct {
       IPS_COMPAT_TAMPA, \
       IPS_COMPAT_KEYWEST \
    }
-
-
-/*
- * Overrides for Emacs so that we almost follow Linus's tabbing style.
- * Emacs will notice this stuff at the end of the file and automatically
- * adjust the settings for this buffer only.  This must remain at the end
- * of the file.
- * ---------------------------------------------------------------------------
- * Local variables:
- * c-indent-level: 2
- * c-brace-imaginary-offset: 0
- * c-brace-offset: -2
- * c-argdecl-indent: 2
- * c-label-offset: -2
- * c-continued-statement-offset: 2
- * c-continued-brace-offset: 0
- * indent-tabs-mode: nil
- * tab-width: 8
- * End:
- */
--- a/drivers/scsi/lasi700.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/lasi700.c
@@ -1,5 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8 -*- */
 
 /* PARISC LASI driver for the 53c700 chip
  *
--- a/drivers/scsi/megaraid/mbox_defs.h~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/megaraid/mbox_defs.h
@@ -781,5 +781,3 @@ typedef struct {
 } __attribute__ ((packed)) mbox_sgl32;
 
 #endif		// _MRAID_MBOX_DEFS_H_
-
-/* vim: set ts=8 sw=8 tw=78: */
--- a/drivers/scsi/megaraid/mega_common.h~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/megaraid/mega_common.h
@@ -282,5 +282,3 @@ struct mraid_pci_blk {
 };
 
 #endif // _MEGA_COMMON_H_
-
-// vim: set ts=8 sw=8 tw=78:
--- a/drivers/scsi/megaraid/megaraid_mbox.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/megaraid/megaraid_mbox.c
@@ -4068,5 +4068,3 @@ megaraid_sysfs_show_ldnum(struct device
  */
 module_init(megaraid_init);
 module_exit(megaraid_exit);
-
-/* vim: set ts=8 sw=8 tw=78 ai si: */
--- a/drivers/scsi/megaraid/megaraid_mbox.h~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/megaraid/megaraid_mbox.h
@@ -230,5 +230,3 @@ typedef struct {
 #define WROUTDOOR(rdev, value)	writel(value, (rdev)->baseaddr + 0x2C)
 
 #endif // _MEGARAID_H_
-
-// vim: set ts=8 sw=8 tw=78:
--- a/drivers/scsi/qla1280.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/qla1280.c
@@ -4403,15 +4403,3 @@ MODULE_FIRMWARE("qlogic/1040.bin");
 MODULE_FIRMWARE("qlogic/1280.bin");
 MODULE_FIRMWARE("qlogic/12160.bin");
 MODULE_VERSION(QLA1280_VERSION);
-
-/*
- * Overrides for Emacs so that we almost follow Linus's tabbing style.
- * Emacs will notice this stuff at the end of the file and automatically
- * adjust the settings for this buffer only.  This must remain at the end
- * of the file.
- * ---------------------------------------------------------------------------
- * Local variables:
- * c-basic-offset: 8
- * tab-width: 8
- * End:
- */
--- a/drivers/scsi/sni_53c710.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/scsi/sni_53c710.c
@@ -1,5 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8 -*- */
 
 /* SNI RM driver
  *
--- a/drivers/video/fbdev/matrox/matroxfb_base.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/video/fbdev/matrox/matroxfb_base.c
@@ -2608,12 +2608,3 @@ EXPORT_SYMBOL(matroxfb_register_driver);
 EXPORT_SYMBOL(matroxfb_unregister_driver);
 EXPORT_SYMBOL(matroxfb_wait_for_sync);
 EXPORT_SYMBOL(matroxfb_enable_irq);
-
-/*
- * Overrides for Emacs so that we follow Linus's tabbing style.
- * ---------------------------------------------------------------------------
- * Local variables:
- * c-basic-offset: 8
- * End:
- */
-
--- a/drivers/video/fbdev/vga16fb.c~treewide-remove-editor-modelines-and-cruft
+++ a/drivers/video/fbdev/vga16fb.c
@@ -1451,13 +1451,3 @@ MODULE_DESCRIPTION("Legacy VGA framebuff
 MODULE_LICENSE("GPL");
 module_init(vga16fb_init);
 module_exit(vga16fb_exit);
-
-
-/*
- * Overrides for Emacs so that we follow Linus's tabbing style.
- * ---------------------------------------------------------------------------
- * Local variables:
- * c-basic-offset: 8
- * End:
- */
-
--- a/fs/configfs/configfs_internal.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/configfs/configfs_internal.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset:8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * configfs_internal.h - Internal stuff for configfs
  *
  * Based on sysfs:
--- a/fs/configfs/dir.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/configfs/dir.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dir.c - Operations for configfs directories.
  *
  * Based on sysfs:
--- a/fs/configfs/file.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/configfs/file.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * file.c - operations for regular (text) files.
  *
  * Based on sysfs:
--- a/fs/configfs/inode.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/configfs/inode.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * inode.c - basic inode and dentry operations.
  *
  * Based on sysfs:
--- a/fs/configfs/item.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/configfs/item.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * item.c - library routines for handling generic config items
  *
  * Based on kobject:
--- a/fs/configfs/mount.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/configfs/mount.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * mount.c - operations for initializing and mounting configfs.
  *
  * Based on sysfs:
--- a/fs/configfs/symlink.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/configfs/symlink.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * symlink.c - operations for configfs symlinks.
  *
  * Based on sysfs:
--- a/fs/nfs/dir.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/nfs/dir.c
@@ -3004,10 +3004,3 @@ out_notsup:
 	goto out;
 }
 EXPORT_SYMBOL_GPL(nfs_permission);
-
-/*
- * Local variables:
- *  version-control: t
- *  kept-new-versions: 5
- * End:
- */
--- a/fs/nfsd/nfs4proc.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/nfsd/nfs4proc.c
@@ -3317,9 +3317,3 @@ const struct svc_version nfsd_version4 =
 	.vs_rpcb_optnl		= true,
 	.vs_need_cong_ctrl	= true,
 };
-
-/*
- * Local variables:
- *  c-basic-offset: 8
- * End:
- */
--- a/fs/nfsd/nfs4xdr.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/nfsd/nfs4xdr.c
@@ -5448,9 +5448,3 @@ nfs4svc_encode_compoundres(struct svc_rq
 	nfsd4_sequence_done(resp);
 	return 1;
 }
-
-/*
- * Local variables:
- *  c-basic-offset: 8
- * End:
- */
--- a/fs/nfsd/xdr4.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/nfsd/xdr4.h
@@ -866,9 +866,3 @@ struct nfsd4_operation {
 
 
 #endif
-
-/*
- * Local variables:
- *  c-basic-offset: 8
- * End:
- */
--- a/fs/nfs/nfs4proc.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/nfs/nfs4proc.c
@@ -10427,9 +10427,3 @@ const struct xattr_handler *nfs4_xattr_h
 #endif
 	NULL
 };
-
-/*
- * Local variables:
- *  c-basic-offset: 8
- * End:
- */
--- a/fs/nfs/nfs4renewd.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/nfs/nfs4renewd.c
@@ -149,9 +149,3 @@ void nfs4_set_lease_period(struct nfs_cl
 	/* Cap maximum reconnect timeout at 1/2 lease period */
 	rpc_set_connect_timeout(clp->cl_rpcclient, lease, lease >> 1);
 }
-
-/*
- * Local variables:
- *   c-basic-offset: 8
- * End:
- */
--- a/fs/nfs/nfs4state.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/nfs/nfs4state.c
@@ -2695,9 +2695,3 @@ static int nfs4_run_state_manager(void *
 	module_put_and_exit(0);
 	return 0;
 }
-
-/*
- * Local variables:
- *  c-basic-offset: 8
- * End:
- */
--- a/fs/nfs/nfs4xdr.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/nfs/nfs4xdr.c
@@ -7629,9 +7629,3 @@ const struct rpc_version nfs_version4 =
 	.procs			= nfs4_procedures,
 	.counts			= nfs_version4_counts,
 };
-
-/*
- * Local variables:
- *  c-basic-offset: 8
- * End:
- */
--- a/fs/ocfs2/acl.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/acl.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * acl.c
  *
  * Copyright (C) 2004, 2008 Oracle.  All rights reserved.
--- a/fs/ocfs2/acl.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/acl.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * acl.h
  *
  * Copyright (C) 2004, 2008 Oracle.  All rights reserved.
--- a/fs/ocfs2/alloc.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/alloc.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * alloc.c
  *
  * Extent allocs and frees
--- a/fs/ocfs2/alloc.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/alloc.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * alloc.h
  *
  * Function prototypes
--- a/fs/ocfs2/aops.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/aops.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * Copyright (C) 2002, 2004 Oracle.  All rights reserved.
  */
 
--- a/fs/ocfs2/aops.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/aops.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * Copyright (C) 2002, 2004, 2005 Oracle.  All rights reserved.
  */
 
--- a/fs/ocfs2/blockcheck.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/blockcheck.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * blockcheck.c
  *
  * Checksum and ECC codes for the OCFS2 userspace library.
--- a/fs/ocfs2/blockcheck.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/blockcheck.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * blockcheck.h
  *
  * Checksum and ECC codes for the OCFS2 userspace library.
--- a/fs/ocfs2/buffer_head_io.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/buffer_head_io.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * io.c
  *
  * Buffer cache handling
--- a/fs/ocfs2/buffer_head_io.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/buffer_head_io.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * ocfs2_buffer_head.h
  *
  * Buffer cache handling functions defined
--- a/fs/ocfs2/cluster/heartbeat.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/heartbeat.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * Copyright (C) 2004, 2005 Oracle.  All rights reserved.
  */
 
--- a/fs/ocfs2/cluster/heartbeat.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/heartbeat.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * heartbeat.h
  *
  * Function prototypes
--- a/fs/ocfs2/cluster/masklog.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/masklog.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * Copyright (C) 2004, 2005 Oracle.  All rights reserved.
  */
 
--- a/fs/ocfs2/cluster/masklog.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/masklog.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * Copyright (C) 2005 Oracle.  All rights reserved.
  */
 
--- a/fs/ocfs2/cluster/netdebug.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/netdebug.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * netdebug.c
  *
  * debug functionality for o2net
--- a/fs/ocfs2/cluster/nodemanager.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/nodemanager.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * Copyright (C) 2004, 2005 Oracle.  All rights reserved.
  */
 
--- a/fs/ocfs2/cluster/nodemanager.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/nodemanager.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * nodemanager.h
  *
  * Function prototypes
--- a/fs/ocfs2/cluster/ocfs2_heartbeat.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/ocfs2_heartbeat.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * ocfs2_heartbeat.h
  *
  * On-disk structures for ocfs2_heartbeat
--- a/fs/ocfs2/cluster/ocfs2_nodemanager.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/ocfs2_nodemanager.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * ocfs2_nodemanager.h
  *
  * Header describing the interface between userspace and the kernel
--- a/fs/ocfs2/cluster/quorum.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/quorum.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- *
- * vim: noexpandtab sw=8 ts=8 sts=0:
+/*
  *
  * Copyright (C) 2005 Oracle.  All rights reserved.
  */
--- a/fs/ocfs2/cluster/quorum.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/quorum.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * Copyright (C) 2005 Oracle.  All rights reserved.
  */
 
--- a/fs/ocfs2/cluster/sys.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/sys.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * sys.c
  *
  * OCFS2 cluster sysfs interface
--- a/fs/ocfs2/cluster/sys.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/sys.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * sys.h
  *
  * Function prototypes for o2cb sysfs interface
--- a/fs/ocfs2/cluster/tcp.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/tcp.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- *
- * vim: noexpandtab sw=8 ts=8 sts=0:
+/*
  *
  * Copyright (C) 2004 Oracle.  All rights reserved.
  *
--- a/fs/ocfs2/cluster/tcp.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/tcp.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * tcp.h
  *
  * Function prototypes
--- a/fs/ocfs2/cluster/tcp_internal.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/cluster/tcp_internal.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * Copyright (C) 2005 Oracle.  All rights reserved.
  */
 
--- a/fs/ocfs2/dcache.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dcache.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dcache.c
  *
  * dentry cache handling code
--- a/fs/ocfs2/dcache.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dcache.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dcache.h
  *
  * Function prototypes
--- a/fs/ocfs2/dir.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dir.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dir.c
  *
  * Creates, reads, walks and deletes directory-nodes
--- a/fs/ocfs2/dir.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dir.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dir.h
  *
  * Function prototypes
--- a/fs/ocfs2/dlm/dlmapi.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmapi.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmapi.h
  *
  * externally exported dlm interfaces
--- a/fs/ocfs2/dlm/dlmast.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmast.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmast.c
  *
  * AST and BAST functionality for local and remote nodes
--- a/fs/ocfs2/dlm/dlmcommon.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmcommon.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmcommon.h
  *
  * Copyright (C) 2004 Oracle.  All rights reserved.
--- a/fs/ocfs2/dlm/dlmconvert.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmconvert.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmconvert.c
  *
  * underlying calls for lock conversion
--- a/fs/ocfs2/dlm/dlmconvert.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmconvert.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmconvert.h
  *
  * Copyright (C) 2004 Oracle.  All rights reserved.
--- a/fs/ocfs2/dlm/dlmdebug.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmdebug.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmdebug.c
  *
  * debug functionality for the dlm
--- a/fs/ocfs2/dlm/dlmdebug.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmdebug.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmdebug.h
  *
  * Copyright (C) 2008 Oracle.  All rights reserved.
--- a/fs/ocfs2/dlm/dlmdomain.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmdomain.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmdomain.c
  *
  * defines domain join / leave apis
--- a/fs/ocfs2/dlm/dlmdomain.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmdomain.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmdomain.h
  *
  * Copyright (C) 2004 Oracle.  All rights reserved.
--- a/fs/ocfs2/dlm/dlmlock.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmlock.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmlock.c
  *
  * underlying calls for lock creation
--- a/fs/ocfs2/dlm/dlmmaster.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmmaster.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmmod.c
  *
  * standalone DLM module
--- a/fs/ocfs2/dlm/dlmrecovery.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmrecovery.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmrecovery.c
  *
  * recovery stuff
--- a/fs/ocfs2/dlm/dlmthread.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmthread.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmthread.c
  *
  * standalone DLM module
--- a/fs/ocfs2/dlm/dlmunlock.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlm/dlmunlock.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmunlock.c
  *
  * underlying calls for unlocking locks
--- a/fs/ocfs2/dlmfs/dlmfs.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlmfs/dlmfs.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmfs.c
  *
  * Code which implements the kernel side of a minimal userspace
--- a/fs/ocfs2/dlmfs/userdlm.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlmfs/userdlm.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * userdlm.c
  *
  * Code which implements the kernel side of a minimal userspace
--- a/fs/ocfs2/dlmfs/userdlm.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlmfs/userdlm.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * userdlm.h
  *
  * Userspace dlm defines
--- a/fs/ocfs2/dlmglue.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlmglue.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmglue.c
  *
  * Code which implements an OCFS2 specific interface to our DLM.
--- a/fs/ocfs2/dlmglue.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/dlmglue.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * dlmglue.h
  *
  * description here
--- a/fs/ocfs2/export.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/export.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * export.c
  *
  * Functions to facilitate NFS exporting
--- a/fs/ocfs2/export.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/export.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * export.h
  *
  * Function prototypes
--- a/fs/ocfs2/extent_map.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/extent_map.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * extent_map.c
  *
  * Block/Cluster mapping functions
--- a/fs/ocfs2/extent_map.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/extent_map.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * extent_map.h
  *
  * In-memory file extent mappings for OCFS2.
--- a/fs/ocfs2/file.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/file.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * file.c
  *
  * File open, close, extend, truncate
--- a/fs/ocfs2/filecheck.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/filecheck.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * filecheck.c
  *
  * Code which implements online file check.
--- a/fs/ocfs2/filecheck.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/filecheck.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * filecheck.h
  *
  * Online file check.
--- a/fs/ocfs2/file.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/file.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * file.h
  *
  * Function prototypes
--- a/fs/ocfs2/heartbeat.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/heartbeat.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * heartbeat.c
  *
  * Register ourselves with the heartbaet service, keep our node maps
--- a/fs/ocfs2/heartbeat.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/heartbeat.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * heartbeat.h
  *
  * Function prototypes
--- a/fs/ocfs2/inode.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/inode.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * inode.c
  *
  * vfs' aops, fops, dops and iops
--- a/fs/ocfs2/inode.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/inode.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * inode.h
  *
  * Function prototypes
--- a/fs/ocfs2/journal.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/journal.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * journal.c
  *
  * Defines functions of journalling api
--- a/fs/ocfs2/journal.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/journal.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * journal.h
  *
  * Defines journalling api and structures.
--- a/fs/ocfs2/localalloc.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/localalloc.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * localalloc.c
  *
  * Node local data allocation
--- a/fs/ocfs2/localalloc.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/localalloc.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * localalloc.h
  *
  * Function prototypes
--- a/fs/ocfs2/locks.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/locks.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * locks.c
  *
  * Userspace file locking support
--- a/fs/ocfs2/locks.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/locks.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * locks.h
  *
  * Function prototypes for Userspace file locking support
--- a/fs/ocfs2/mmap.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/mmap.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * mmap.c
  *
  * Code to deal with the mess that is clustered mmap.
--- a/fs/ocfs2/move_extents.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/move_extents.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * move_extents.c
  *
  * Copyright (C) 2011 Oracle.  All rights reserved.
--- a/fs/ocfs2/move_extents.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/move_extents.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * move_extents.h
  *
  * Copyright (C) 2011 Oracle.  All rights reserved.
--- a/fs/ocfs2/namei.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/namei.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * namei.c
  *
  * Create and rename file, directory, symlinks
--- a/fs/ocfs2/namei.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/namei.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * namei.h
  *
  * Function prototypes
--- a/fs/ocfs2/ocfs1_fs_compat.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/ocfs1_fs_compat.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * ocfs1_fs_compat.h
  *
  * OCFS1 volume header definitions.  OCFS2 creates valid but unmountable
--- a/fs/ocfs2/ocfs2_fs.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/ocfs2_fs.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * ocfs2_fs.h
  *
  * On-disk structures for OCFS2.
--- a/fs/ocfs2/ocfs2.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/ocfs2.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * ocfs2.h
  *
  * Defines macros and structures used in OCFS2
--- a/fs/ocfs2/ocfs2_ioctl.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/ocfs2_ioctl.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * ocfs2_ioctl.h
  *
  * Defines OCFS2 ioctls.
--- a/fs/ocfs2/ocfs2_lockid.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/ocfs2_lockid.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * ocfs2_lockid.h
  *
  * Defines OCFS2 lockid bits.
--- a/fs/ocfs2/ocfs2_lockingver.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/ocfs2_lockingver.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * ocfs2_lockingver.h
  *
  * Defines OCFS2 Locking version values.
--- a/fs/ocfs2/refcounttree.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/refcounttree.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * refcounttree.c
  *
  * Copyright (C) 2009 Oracle.  All rights reserved.
--- a/fs/ocfs2/refcounttree.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/refcounttree.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * refcounttree.h
  *
  * Copyright (C) 2009 Oracle.  All rights reserved.
--- a/fs/ocfs2/reservations.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/reservations.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * reservations.c
  *
  * Allocation reservations implementation
--- a/fs/ocfs2/reservations.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/reservations.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * reservations.h
  *
  * Allocation reservations function prototypes and structures.
--- a/fs/ocfs2/resize.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/resize.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * resize.c
  *
  * volume resize.
--- a/fs/ocfs2/resize.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/resize.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * resize.h
  *
  * Function prototypes
--- a/fs/ocfs2/slot_map.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/slot_map.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * slot_map.c
  *
  * Copyright (C) 2002, 2004 Oracle.  All rights reserved.
--- a/fs/ocfs2/slot_map.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/slot_map.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * slotmap.h
  *
  * description here
--- a/fs/ocfs2/stackglue.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/stackglue.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * stackglue.c
  *
  * Code which implements an OCFS2 specific interface to underlying
--- a/fs/ocfs2/stackglue.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/stackglue.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * stackglue.h
  *
  * Glue to the underlying cluster stack.
--- a/fs/ocfs2/stack_o2cb.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/stack_o2cb.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * stack_o2cb.c
  *
  * Code which interfaces ocfs2 with the o2cb stack.
--- a/fs/ocfs2/stack_user.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/stack_user.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * stack_user.c
  *
  * Code which interfaces ocfs2 with fs/dlm and a userspace stack.
--- a/fs/ocfs2/suballoc.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/suballoc.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * suballoc.c
  *
  * metadata alloc and free
--- a/fs/ocfs2/suballoc.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/suballoc.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * suballoc.h
  *
  * Defines sub allocator api
--- a/fs/ocfs2/super.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/super.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * super.c
  *
  * load/unload driver, mount/dismount volumes
--- a/fs/ocfs2/super.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/super.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * super.h
  *
  * Function prototypes
--- a/fs/ocfs2/symlink.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/symlink.c
@@ -1,6 +1,4 @@
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  *  linux/cluster/ssi/cfs/symlink.c
  *
  *	This program is free software; you can redistribute it and/or
--- a/fs/ocfs2/symlink.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/symlink.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * symlink.h
  *
  * Function prototypes
--- a/fs/ocfs2/sysfile.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/sysfile.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * sysfile.c
  *
  * Initialize, read, write, etc. system files.
--- a/fs/ocfs2/sysfile.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/sysfile.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * sysfile.h
  *
  * Function prototypes
--- a/fs/ocfs2/uptodate.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/uptodate.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * uptodate.c
  *
  * Tracking the up-to-date-ness of a local buffer_head with respect to
--- a/fs/ocfs2/uptodate.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/uptodate.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * uptodate.h
  *
  * Cluster uptodate tracking
--- a/fs/ocfs2/xattr.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/xattr.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-only
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * xattr.c
  *
  * Copyright (C) 2004, 2008 Oracle.  All rights reserved.
--- a/fs/ocfs2/xattr.h~treewide-remove-editor-modelines-and-cruft
+++ a/fs/ocfs2/xattr.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * xattr.h
  *
  * Copyright (C) 2004, 2008 Oracle.  All rights reserved.
--- a/fs/reiserfs/procfs.c~treewide-remove-editor-modelines-and-cruft
+++ a/fs/reiserfs/procfs.c
@@ -488,13 +488,3 @@ int reiserfs_proc_info_global_done(void)
  * (available at http://www.namesys.com/legalese.html)
  *
  */
-
-/*
- * Make Linus happy.
- * Local variables:
- * c-indentation-style: "K&R"
- * mode-name: "LC"
- * c-basic-offset: 8
- * tab-width: 8
- * End:
- */
--- a/include/linux/configfs.h~treewide-remove-editor-modelines-and-cruft
+++ a/include/linux/configfs.h
@@ -1,7 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
-/* -*- mode: c; c-basic-offset: 8; -*-
- * vim: noexpandtab sw=8 ts=8 sts=0:
- *
+/*
  * configfs.h - definitions for the device driver filesystem
  *
  * Based on sysfs:
--- a/include/linux/genl_magic_func.h~treewide-remove-editor-modelines-and-cruft
+++ a/include/linux/genl_magic_func.h
@@ -404,4 +404,3 @@ s_fields								\
 
 /* }}}1 */
 #endif /* GENL_MAGIC_FUNC_H */
-/* vim: set foldmethod=marker foldlevel=1 nofoldenable : */
--- a/include/linux/genl_magic_struct.h~treewide-remove-editor-modelines-and-cruft
+++ a/include/linux/genl_magic_struct.h
@@ -283,4 +283,3 @@ enum {									\
 
 /* }}}1 */
 #endif /* GENL_MAGIC_STRUCT_H */
-/* vim: set foldmethod=marker nofoldenable : */
--- a/include/uapi/linux/if_bonding.h~treewide-remove-editor-modelines-and-cruft
+++ a/include/uapi/linux/if_bonding.h
@@ -153,14 +153,3 @@ enum {
 #define BOND_3AD_STAT_MAX (__BOND_3AD_STAT_MAX - 1)
 
 #endif /* _LINUX_IF_BONDING_H */
-
-/*
- * Local variables:
- *  version-control: t
- *  kept-new-versions: 5
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
-
--- a/include/uapi/linux/nfs4.h~treewide-remove-editor-modelines-and-cruft
+++ a/include/uapi/linux/nfs4.h
@@ -178,9 +178,3 @@
 #define NFS4_MAX_BACK_CHANNEL_OPS 2
 
 #endif /* _UAPI_LINUX_NFS4_H */
-
-/*
- * Local variables:
- *  c-basic-offset: 8
- * End:
- */
--- a/include/xen/interface/elfnote.h~treewide-remove-editor-modelines-and-cruft
+++ a/include/xen/interface/elfnote.h
@@ -208,13 +208,3 @@
 #define XEN_ELFNOTE_MAX XEN_ELFNOTE_PHYS32_ENTRY
 
 #endif /* __XEN_PUBLIC_ELFNOTE_H__ */
-
-/*
- * Local variables:
- * mode: C
- * c-set-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
--- a/include/xen/interface/hvm/hvm_vcpu.h~treewide-remove-editor-modelines-and-cruft
+++ a/include/xen/interface/hvm/hvm_vcpu.h
@@ -131,13 +131,3 @@ struct vcpu_hvm_context {
 typedef struct vcpu_hvm_context vcpu_hvm_context_t;
 
 #endif /* __XEN_PUBLIC_HVM_HVM_VCPU_H__ */
-
-/*
- * Local variables:
- * mode: C
- * c-file-style: "BSD"
- * c-basic-offset: 4
- * tab-width: 4
- * indent-tabs-mode: nil
- * End:
- */
--- a/include/xen/interface/io/xenbus.h~treewide-remove-editor-modelines-and-cruft
+++ a/include/xen/interface/io/xenbus.h
@@ -39,13 +39,3 @@ enum xenbus_state
 };
 
 #endif /* _XEN_PUBLIC_IO_XENBUS_H */
-
-/*
- * Local variables:
- *  c-file-style: "linux"
- *  indent-tabs-mode: t
- *  c-indent-level: 8
- *  c-basic-offset: 8
- *  tab-width: 8
- * End:
- */
--- a/samples/configfs/configfs_sample.c~treewide-remove-editor-modelines-and-cruft
+++ a/samples/configfs/configfs_sample.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
- * vim: noexpandtab ts=8 sts=0 sw=8:
- *
  * configfs_example_macros.c - This file is a demonstration module
  *      containing a number of configfs subsystems.  It uses the helper
  *      macros defined by configfs.h
--- a/tools/usb/hcd-tests.sh~treewide-remove-editor-modelines-and-cruft
+++ a/tools/usb/hcd-tests.sh
@@ -272,5 +272,3 @@ do
 	echo ''
     done
 done
-
-# vim: sw=4
_

* [patch 90/91] mm: fix typos in comments
  2021-05-07  1:01 incoming Andrew Morton
                   ` (88 preceding siblings ...)
  2021-05-07  1:06 ` [patch 89/91] treewide: remove editor modelines and cruft Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  1:06 ` [patch 91/91] " Andrew Morton
  2021-05-07  7:12 ` incoming Linus Torvalds
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, linux-mm, mingo, mm-commits, rdunlap, torvalds, unixbhaskar, willy

From: Ingo Molnar <mingo@kernel.org>
Subject: mm: fix typos in comments

Fix ~94 single-word typos in mm code comments, plus a few
very obvious grammar mistakes.

Link: https://lkml.kernel.org/r/20210322212624.GA1963421@gmail.com
Link: https://lore.kernel.org/r/20210322205203.GB1959563@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h      |    2 +-
 include/linux/vmalloc.h |    4 ++--
 mm/balloon_compaction.c |    4 ++--
 mm/compaction.c         |    4 ++--
 mm/filemap.c            |    2 +-
 mm/gup.c                |    2 +-
 mm/highmem.c            |    2 +-
 mm/huge_memory.c        |    6 +++---
 mm/hugetlb.c            |    6 +++---
 mm/internal.h           |    2 +-
 mm/kasan/kasan.h        |    8 ++++----
 mm/kasan/quarantine.c   |    4 ++--
 mm/kasan/shadow.c       |    4 ++--
 mm/kfence/report.c      |    2 +-
 mm/khugepaged.c         |    2 +-
 mm/ksm.c                |    4 ++--
 mm/madvise.c            |    4 ++--
 mm/memcontrol.c         |   18 +++++++++---------
 mm/memory-failure.c     |    2 +-
 mm/memory.c             |   10 +++++-----
 mm/mempolicy.c          |    4 ++--
 mm/migrate.c            |    8 ++++----
 mm/mmap.c               |    4 ++--
 mm/mprotect.c           |    2 +-
 mm/mremap.c             |    2 +-
 mm/oom_kill.c           |    2 +-
 mm/page-writeback.c     |    4 ++--
 mm/page_alloc.c         |   14 +++++++-------
 mm/page_owner.c         |    2 +-
 mm/percpu-internal.h    |    2 +-
 mm/percpu.c             |    2 +-
 mm/pgalloc-track.h      |    6 +++---
 mm/slab.c               |    6 +++---
 mm/slub.c               |    2 +-
 mm/swap_slots.c         |    2 +-
 mm/vmalloc.c            |    6 +++---
 mm/vmstat.c             |    2 +-
 mm/zpool.c              |    2 +-
 mm/zsmalloc.c           |    2 +-
 39 files changed, 83 insertions(+), 83 deletions(-)

--- a/include/linux/mm.h~mm-fix-typos-in-comments
+++ a/include/linux/mm.h
@@ -106,7 +106,7 @@ extern int mmap_rnd_compat_bits __read_m
  * embedding these tags into addresses that point to these memory regions, and
  * checking that the memory and the pointer tags match on memory accesses)
  * redefine this macro to strip tags from pointers.
- * It's defined as noop for arcitectures that don't support memory tagging.
+ * It's defined as noop for architectures that don't support memory tagging.
  */
 #ifndef untagged_addr
 #define untagged_addr(addr) (addr)
--- a/include/linux/vmalloc.h~mm-fix-typos-in-comments
+++ a/include/linux/vmalloc.h
@@ -33,7 +33,7 @@ struct notifier_block;		/* in notifier.h
  *
  * If IS_ENABLED(CONFIG_KASAN_VMALLOC), VM_KASAN is set on a vm_struct after
  * shadow memory has been mapped. It's used to handle allocation errors so that
- * we don't try to poision shadow on free if it was never allocated.
+ * we don't try to poison shadow on free if it was never allocated.
  *
  * Otherwise, VM_KASAN is set for kasan_module_alloc() allocations and used to
  * determine which allocations need the module shadow freed.
@@ -43,7 +43,7 @@ struct notifier_block;		/* in notifier.h
 
 /*
  * Maximum alignment for ioremap() regions.
- * Can be overriden by arch-specific value.
+ * Can be overridden by arch-specific value.
  */
 #ifndef IOREMAP_MAX_ORDER
 #define IOREMAP_MAX_ORDER	(7 + PAGE_SHIFT)	/* 128 pages */
--- a/mm/balloon_compaction.c~mm-fix-typos-in-comments
+++ a/mm/balloon_compaction.c
@@ -58,7 +58,7 @@ EXPORT_SYMBOL_GPL(balloon_page_list_enqu
 /**
  * balloon_page_list_dequeue() - removes pages from balloon's page list and
  *				 returns a list of the pages.
- * @b_dev_info: balloon device decriptor where we will grab a page from.
+ * @b_dev_info: balloon device descriptor where we will grab a page from.
  * @pages: pointer to the list of pages that would be returned to the caller.
  * @n_req_pages: number of requested pages.
  *
@@ -157,7 +157,7 @@ EXPORT_SYMBOL_GPL(balloon_page_enqueue);
 /*
  * balloon_page_dequeue - removes a page from balloon's page list and returns
  *			  its address to allow the driver to release the page.
- * @b_dev_info: balloon device decriptor where we will grab a page from.
+ * @b_dev_info: balloon device descriptor where we will grab a page from.
  *
  * Driver must call this function to properly dequeue a previously enqueued page
  * before definitively releasing it back to the guest system.
--- a/mm/compaction.c~mm-fix-typos-in-comments
+++ a/mm/compaction.c
@@ -2012,8 +2012,8 @@ static unsigned int fragmentation_score_
 	unsigned int wmark_low;
 
 	/*
-	 * Cap the low watermak to avoid excessive compaction
-	 * activity in case a user sets the proactivess tunable
+	 * Cap the low watermark to avoid excessive compaction
+	 * activity in case a user sets the proactiveness tunable
 	 * close to 100 (maximum).
 	 */
 	wmark_low = max(100U - sysctl_compaction_proactiveness, 5U);
--- a/mm/filemap.c~mm-fix-typos-in-comments
+++ a/mm/filemap.c
@@ -2755,7 +2755,7 @@ unsigned int seek_page_size(struct xa_st
  * entirely memory-based such as tmpfs, and filesystems which support
  * unwritten extents.
  *
- * Return: The requested offset on successs, or -ENXIO if @whence specifies
+ * Return: The requested offset on success, or -ENXIO if @whence specifies
  * SEEK_DATA and there is no data after @start.  There is an implicit hole
  * after @end - 1, so SEEK_HOLE returns @end if all the bytes between @start
  * and @end contain data.
--- a/mm/gup.c~mm-fix-typos-in-comments
+++ a/mm/gup.c
@@ -1575,7 +1575,7 @@ finish_or_fault:
  * Returns NULL on any kind of failure - a hole must then be inserted into
  * the corefile, to preserve alignment with its headers; and also returns
  * NULL wherever the ZERO_PAGE, or an anonymous pte_none, has been found -
- * allowing a hole to be left in the corefile to save diskspace.
+ * allowing a hole to be left in the corefile to save disk space.
  *
  * Called without mmap_lock (takes and releases the mmap_lock by itself).
  */
--- a/mm/highmem.c~mm-fix-typos-in-comments
+++ a/mm/highmem.c
@@ -519,7 +519,7 @@ void *__kmap_local_pfn_prot(unsigned lon
 
 	/*
 	 * Disable migration so resulting virtual address is stable
-	 * accross preemption.
+	 * across preemption.
 	 */
 	migrate_disable();
 	preempt_disable();
--- a/mm/huge_memory.c~mm-fix-typos-in-comments
+++ a/mm/huge_memory.c
@@ -1792,8 +1792,8 @@ bool move_huge_pmd(struct vm_area_struct
 /*
  * Returns
  *  - 0 if PMD could not be locked
- *  - 1 if PMD was locked but protections unchange and TLB flush unnecessary
- *  - HPAGE_PMD_NR is protections changed and TLB flush necessary
+ *  - 1 if PMD was locked but protections unchanged and TLB flush unnecessary
+ *  - HPAGE_PMD_NR if protections changed and TLB flush necessary
  */
 int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long addr, pgprot_t newprot, unsigned long cp_flags)
@@ -2469,7 +2469,7 @@ static void __split_huge_page(struct pag
 		xa_lock(&swap_cache->i_pages);
 	}
 
-	/* lock lru list/PageCompound, ref freezed by page_ref_freeze */
+	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
 	lruvec = lock_page_lruvec(head);
 
 	for (i = nr - 1; i >= 1; i--) {
--- a/mm/hugetlb.c~mm-fix-typos-in-comments
+++ a/mm/hugetlb.c
@@ -466,7 +466,7 @@ static int allocate_file_region_entries(
 			      resv->region_cache_count;
 
 		/* At this point, we should have enough entries in the cache
-		 * for all the existings adds_in_progress. We should only be
+		 * for all the existing adds_in_progress. We should only be
 		 * needing to allocate for regions_needed.
 		 */
 		VM_BUG_ON(resv->region_cache_count < resv->adds_in_progress);
@@ -5536,8 +5536,8 @@ void adjust_range_if_pmd_sharing_possibl
 		v_end = ALIGN_DOWN(vma->vm_end, PUD_SIZE);
 
 	/*
-	 * vma need span at least one aligned PUD size and the start,end range
-	 * must at least partialy within it.
+	 * vma needs to span at least one aligned PUD size, and the range
+	 * must be at least partially within it.
 	 */
 	if (!(vma->vm_flags & VM_MAYSHARE) || !(v_end > v_start) ||
 		(*end <= v_start) || (*start >= v_end))
--- a/mm/internal.h~mm-fix-typos-in-comments
+++ a/mm/internal.h
@@ -334,7 +334,7 @@ static inline bool is_exec_mapping(vm_fl
 }
 
 /*
- * Stack area - atomatically grows in one direction
+ * Stack area - automatically grows in one direction
  *
  * VM_GROWSUP / VM_GROWSDOWN VMAs are always private anonymous:
  * do_mmap() forbids all other combinations.
--- a/mm/kasan/kasan.h~mm-fix-typos-in-comments
+++ a/mm/kasan/kasan.h
@@ -55,9 +55,9 @@ extern bool kasan_flag_async __ro_after_
 #define KASAN_TAG_MAX		0xFD /* maximum value for random tags */
 
 #ifdef CONFIG_KASAN_HW_TAGS
-#define KASAN_TAG_MIN		0xF0 /* mimimum value for random tags */
+#define KASAN_TAG_MIN		0xF0 /* minimum value for random tags */
 #else
-#define KASAN_TAG_MIN		0x00 /* mimimum value for random tags */
+#define KASAN_TAG_MIN		0x00 /* minimum value for random tags */
 #endif
 
 #ifdef CONFIG_KASAN_GENERIC
@@ -403,7 +403,7 @@ static inline bool kasan_byte_accessible
 #else /* CONFIG_KASAN_HW_TAGS */
 
 /**
- * kasan_poison - mark the memory range as unaccessible
+ * kasan_poison - mark the memory range as inaccessible
  * @addr - range start address, must be aligned to KASAN_GRANULE_SIZE
  * @size - range size, must be aligned to KASAN_GRANULE_SIZE
  * @value - value that's written to metadata for the range
@@ -434,7 +434,7 @@ bool kasan_byte_accessible(const void *a
 
 /**
  * kasan_poison_last_granule - mark the last granule of the memory range as
- * unaccessible
+ * inaccessible
  * @addr - range start address, must be aligned to KASAN_GRANULE_SIZE
  * @size - range size
  *
--- a/mm/kasan/quarantine.c~mm-fix-typos-in-comments
+++ a/mm/kasan/quarantine.c
@@ -27,7 +27,7 @@
 /* Data structure and operations for quarantine queues. */
 
 /*
- * Each queue is a signle-linked list, which also stores the total size of
+ * Each queue is a single-linked list, which also stores the total size of
  * objects inside of it.
  */
 struct qlist_head {
@@ -138,7 +138,7 @@ static void qlink_free(struct qlist_node
 		local_irq_save(flags);
 
 	/*
-	 * As the object now gets freed from the quaratine, assume that its
+	 * As the object now gets freed from the quarantine, assume that its
 	 * free track is no longer valid.
 	 */
 	*(u8 *)kasan_mem_to_shadow(object) = KASAN_KMALLOC_FREE;
--- a/mm/kasan/shadow.c~mm-fix-typos-in-comments
+++ a/mm/kasan/shadow.c
@@ -316,7 +316,7 @@ int kasan_populate_vmalloc(unsigned long
 	 * // rest of vmalloc process		<data dependency>
 	 * STORE p, a				LOAD shadow(x+99)
 	 *
-	 * If there is no barrier between the end of unpoisioning the shadow
+	 * If there is no barrier between the end of unpoisoning the shadow
 	 * and the store of the result to p, the stores could be committed
 	 * in a different order by CPU#0, and CPU#1 could erroneously observe
 	 * poison in the shadow.
@@ -384,7 +384,7 @@ static int kasan_depopulate_vmalloc_pte(
  * How does this work?
  * -------------------
  *
- * We have a region that is page aligned, labelled as A.
+ * We have a region that is page aligned, labeled as A.
  * That might not map onto the shadow in a way that is page-aligned:
  *
  *                    start                     end
--- a/mm/kfence/report.c~mm-fix-typos-in-comments
+++ a/mm/kfence/report.c
@@ -263,6 +263,6 @@ void kfence_report_error(unsigned long a
 	if (panic_on_warn)
 		panic("panic_on_warn set ...\n");
 
-	/* We encountered a memory unsafety error, taint the kernel! */
+	/* We encountered a memory safety error, taint the kernel! */
 	add_taint(TAINT_BAD_PAGE, LOCKDEP_STILL_OK);
 }
--- a/mm/khugepaged.c~mm-fix-typos-in-comments
+++ a/mm/khugepaged.c
@@ -667,7 +667,7 @@ static int __collapse_huge_page_isolate(
 		 *
 		 * The page table that maps the page has been already unlinked
 		 * from the page table tree and this process cannot get
-		 * an additinal pin on the page.
+		 * an additional pin on the page.
 		 *
 		 * New pins can come later if the page is shared across fork,
 		 * but not from this process. The other process cannot write to
--- a/mm/ksm.c~mm-fix-typos-in-comments
+++ a/mm/ksm.c
@@ -1065,7 +1065,7 @@ static int write_protect_page(struct vm_
 		/*
 		 * Ok this is tricky, when get_user_pages_fast() run it doesn't
 		 * take any lock, therefore the check that we are going to make
-		 * with the pagecount against the mapcount is racey and
+		 * with the pagecount against the mapcount is racy and
 		 * O_DIRECT can happen right after the check.
 		 * So we clear the pte and flush the tlb before the check
 		 * this assure us that no O_DIRECT can happen after the check
@@ -1435,7 +1435,7 @@ static struct page *stable_node_dup(stru
 			 */
 			*_stable_node = found;
 			/*
-			 * Just for robustneess as stable_node is
+			 * Just for robustness, as stable_node is
 			 * otherwise left as a stable pointer, the
 			 * compiler shall optimize it away at build
 			 * time.
--- a/mm/madvise.c~mm-fix-typos-in-comments
+++ a/mm/madvise.c
@@ -799,7 +799,7 @@ static long madvise_dontneed_free(struct
 		if (end > vma->vm_end) {
 			/*
 			 * Don't fail if end > vma->vm_end. If the old
-			 * vma was splitted while the mmap_lock was
+			 * vma was split while the mmap_lock was
 			 * released the effect of the concurrent
 			 * operation may not cause madvise() to
 			 * have an undefined result. There may be an
@@ -1039,7 +1039,7 @@ process_madvise_behavior_valid(int behav
  *  MADV_DODUMP - cancel MADV_DONTDUMP: no longer exclude from core dump.
  *  MADV_COLD - the application is not expected to use this memory soon,
  *		deactivate pages in this range so that they can be reclaimed
- *		easily if memory pressure hanppens.
+ *		easily if memory pressure happens.
  *  MADV_PAGEOUT - the application is not expected to use this memory soon,
  *		page out the pages in this range immediately.
  *
--- a/mm/memcontrol.c~mm-fix-typos-in-comments
+++ a/mm/memcontrol.c
@@ -215,7 +215,7 @@ enum res_type {
 #define MEMFILE_PRIVATE(x, val)	((x) << 16 | (val))
 #define MEMFILE_TYPE(val)	((val) >> 16 & 0xffff)
 #define MEMFILE_ATTR(val)	((val) & 0xffff)
-/* Used for OOM nofiier */
+/* Used for OOM notifier */
 #define OOM_CONTROL		(0)
 
 /*
@@ -786,7 +786,7 @@ void __mod_lruvec_kmem_state(void *p, en
  * __count_memcg_events - account VM events in a cgroup
  * @memcg: the memory cgroup
  * @idx: the event item
- * @count: the number of events that occured
+ * @count: the number of events that occurred
  */
 void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 			  unsigned long count)
@@ -904,7 +904,7 @@ struct mem_cgroup *get_mem_cgroup_from_m
 	rcu_read_lock();
 	do {
 		/*
-		 * Page cache insertions can happen withou an
+		 * Page cache insertions can happen without an
 		 * actual mm context, e.g. during disk probing
 		 * on boot, loopback IO, acct() writes etc.
 		 */
@@ -1712,7 +1712,7 @@ static void mem_cgroup_unmark_under_oom(
 	struct mem_cgroup *iter;
 
 	/*
-	 * Be careful about under_oom underflows becase a child memcg
+	 * Be careful about under_oom underflows because a child memcg
 	 * could have been added after mem_cgroup_mark_under_oom.
 	 */
 	spin_lock(&memcg_oom_lock);
@@ -1884,7 +1884,7 @@ bool mem_cgroup_oom_synchronize(bool han
 		/*
 		 * There is no guarantee that an OOM-lock contender
 		 * sees the wakeups triggered by the OOM kill
-		 * uncharges.  Wake any sleepers explicitely.
+		 * uncharges.  Wake any sleepers explicitly.
 		 */
 		memcg_oom_recover(memcg);
 	}
@@ -4364,7 +4364,7 @@ void mem_cgroup_wb_stats(struct bdi_writ
  * Foreign dirty flushing
  *
  * There's an inherent mismatch between memcg and writeback.  The former
- * trackes ownership per-page while the latter per-inode.  This was a
+ * tracks ownership per-page while the latter per-inode.  This was a
  * deliberate design decision because honoring per-page ownership in the
  * writeback path is complicated, may lead to higher CPU and IO overheads
  * and deemed unnecessary given that write-sharing an inode across
@@ -4379,9 +4379,9 @@ void mem_cgroup_wb_stats(struct bdi_writ
  * triggering background writeback.  A will be slowed down without a way to
  * make writeback of the dirty pages happen.
  *
- * Conditions like the above can lead to a cgroup getting repatedly and
+ * Conditions like the above can lead to a cgroup getting repeatedly and
  * severely throttled after making some progress after each
- * dirty_expire_interval while the underyling IO device is almost
+ * dirty_expire_interval while the underlying IO device is almost
  * completely idle.
  *
  * Solving this problem completely requires matching the ownership tracking
@@ -5774,7 +5774,7 @@ static int mem_cgroup_can_attach(struct
 		return 0;
 
 	/*
-	 * We are now commited to this value whatever it is. Changes in this
+	 * We are now committed to this value whatever it is. Changes in this
 	 * tunable will only affect upcoming migrations, not the current one.
 	 * So we need to save it, and keep it going.
 	 */
--- a/mm/memory.c~mm-fix-typos-in-comments
+++ a/mm/memory.c
@@ -3727,7 +3727,7 @@ vm_fault_t do_set_pmd(struct vm_fault *v
 		return ret;
 
 	/*
-	 * Archs like ppc64 need additonal space to store information
+	 * Archs like ppc64 need additional space to store information
 	 * related to pte entry. Use the preallocated table for that.
 	 */
 	if (arch_needs_pgtable_deposit() && !vmf->prealloc_pte) {
@@ -4503,7 +4503,7 @@ retry_pud:
 }
 
 /**
- * mm_account_fault - Do page fault accountings
+ * mm_account_fault - Do page fault accounting
  *
  * @regs: the pt_regs struct pointer.  When set to NULL, will skip accounting
  *        of perf event counters, but we'll still do the per-task accounting to
@@ -4512,9 +4512,9 @@ retry_pud:
  * @flags: the fault flags.
  * @ret: the fault retcode.
  *
- * This will take care of most of the page fault accountings.  Meanwhile, it
+ * This will take care of most of the page fault accounting.  Meanwhile, it
  * will also include the PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN] perf counter
- * updates.  However note that the handling of PERF_COUNT_SW_PAGE_FAULTS should
+ * updates.  However, note that the handling of PERF_COUNT_SW_PAGE_FAULTS should
  * still be in per-arch page fault handlers at the entry of page fault.
  */
 static inline void mm_account_fault(struct pt_regs *regs,
@@ -4848,7 +4848,7 @@ out:
 /**
  * generic_access_phys - generic implementation for iomem mmap access
  * @vma: the vma to access
- * @addr: userspace addres, not relative offset within @vma
+ * @addr: userspace address, not relative offset within @vma
  * @buf: buffer to read/write
  * @len: length of transfer
  * @write: set to FOLL_WRITE when writing, otherwise reading
--- a/mm/memory-failure.c~mm-fix-typos-in-comments
+++ a/mm/memory-failure.c
@@ -75,7 +75,7 @@ static bool page_handle_poison(struct pa
 		if (dissolve_free_huge_page(page) || !take_page_off_buddy(page))
 			/*
 			 * We could fail to take off the target page from buddy
-			 * for example due to racy page allocaiton, but that's
+			 * for example due to racy page allocation, but that's
 			 * acceptable because soft-offlined page is not broken
 			 * and if someone really want to use it, they should
 			 * take it.
--- a/mm/mempolicy.c~mm-fix-typos-in-comments
+++ a/mm/mempolicy.c
@@ -1867,7 +1867,7 @@ static int apply_policy_zone(struct memp
 	 * we apply policy when gfp_zone(gfp) = ZONE_MOVABLE only.
 	 *
 	 * policy->v.nodes is intersect with node_states[N_MEMORY].
-	 * so if the following test faile, it implies
+	 * so if the following test fails, it implies
 	 * policy->v.nodes has movable memory only.
 	 */
 	if (!nodes_intersects(policy->v.nodes, node_states[N_HIGH_MEMORY]))
@@ -2098,7 +2098,7 @@ bool init_nodemask_of_mempolicy(nodemask
  *
  * If tsk's mempolicy is "default" [NULL], return 'true' to indicate default
  * policy.  Otherwise, check for intersection between mask and the policy
- * nodemask for 'bind' or 'interleave' policy.  For 'perferred' or 'local'
+ * nodemask for 'bind' or 'interleave' policy.  For 'preferred' or 'local'
  * policy, always return true since it may allocate elsewhere on fallback.
  *
  * Takes task_lock(tsk) to prevent freeing of its mempolicy.
--- a/mm/migrate.c~mm-fix-typos-in-comments
+++ a/mm/migrate.c
@@ -2779,11 +2779,11 @@ restore:
  *
  * For empty entries inside CPU page table (pte_none() or pmd_none() is true) we
  * do set MIGRATE_PFN_MIGRATE flag inside the corresponding source array thus
- * allowing the caller to allocate device memory for those unback virtual
- * address.  For this the caller simply has to allocate device memory and
+ * allowing the caller to allocate device memory for those unbacked virtual
+ * addresses.  For this the caller simply has to allocate device memory and
  * properly set the destination entry like for regular migration.  Note that
- * this can still fails and thus inside the device driver must check if the
- * migration was successful for those entries after calling migrate_vma_pages()
+ * this can still fail, and thus inside the device driver you must check if the
+ * migration was successful for those entries after calling migrate_vma_pages(),
  * just like for regular migration.
  *
  * After that, the callers must call migrate_vma_pages() to go over each entry
--- a/mm/mmap.c~mm-fix-typos-in-comments
+++ a/mm/mmap.c
@@ -612,7 +612,7 @@ static unsigned long count_vma_pages_ran
 	unsigned long nr_pages = 0;
 	struct vm_area_struct *vma;
 
-	/* Find first overlaping mapping */
+	/* Find first overlapping mapping */
 	vma = find_vma_intersection(mm, addr, end);
 	if (!vma)
 		return 0;
@@ -2875,7 +2875,7 @@ int __do_munmap(struct mm_struct *mm, un
 	if (unlikely(uf)) {
 		/*
 		 * If userfaultfd_unmap_prep returns an error the vmas
-		 * will remain splitted, but userland will get a
+		 * will remain split, but userland will get a
 		 * highly unexpected error anyway. This is no
 		 * different than the case where the first of the two
 		 * __split_vma fails, but we don't undo the first
--- a/mm/mprotect.c~mm-fix-typos-in-comments
+++ a/mm/mprotect.c
@@ -699,7 +699,7 @@ SYSCALL_DEFINE1(pkey_free, int, pkey)
 	mmap_write_unlock(current->mm);
 
 	/*
-	 * We could provie warnings or errors if any VMA still
+	 * We could provide warnings or errors if any VMA still
 	 * has the pkey set here.
 	 */
 	return ret;
--- a/mm/mremap.c~mm-fix-typos-in-comments
+++ a/mm/mremap.c
@@ -730,7 +730,7 @@ static unsigned long mremap_to(unsigned
 	 * So, to avoid such scenario we can pre-compute if the whole
 	 * operation has high chances to success map-wise.
 	 * Worst-scenario case is when both vma's (new_addr and old_addr) get
-	 * split in 3 before unmaping it.
+	 * split in 3 before unmapping it.
 	 * That means 2 more maps (1 for each) to the ones we already hold.
 	 * Check whether current map count plus 2 still leads us to 4 maps below
 	 * the threshold, otherwise return -ENOMEM here to be more safe.
--- a/mm/oom_kill.c~mm-fix-typos-in-comments
+++ a/mm/oom_kill.c
@@ -74,7 +74,7 @@ static inline bool is_memcg_oom(struct o
 
 #ifdef CONFIG_NUMA
 /**
- * oom_cpuset_eligible() - check task eligiblity for kill
+ * oom_cpuset_eligible() - check task eligibility for kill
  * @start: task struct of which task to consider
  * @oc: pointer to struct oom_control
  *
--- a/mm/page_alloc.c~mm-fix-typos-in-comments
+++ a/mm/page_alloc.c
@@ -893,7 +893,7 @@ compaction_capture(struct capture_contro
 		return false;
 
 	/*
-	 * Do not let lower order allocations polluate a movable pageblock.
+	 * Do not let lower order allocations pollute a movable pageblock.
 	 * This might let an unmovable request use a reclaimable pageblock
 	 * and vice-versa but no more than normal fallback logic which can
 	 * have trouble finding a high-order free page.
@@ -2776,7 +2776,7 @@ static bool unreserve_highatomic_pageblo
 			/*
 			 * In page freeing path, migratetype change is racy so
 			 * we can counter several free pages in a pageblock
-			 * in this loop althoug we changed the pageblock type
+			 * in this loop although we changed the pageblock type
 			 * from highatomic to ac->migratetype. So we should
 			 * adjust the count once.
 			 */
@@ -3080,7 +3080,7 @@ static void drain_local_pages_wq(struct
 	 * drain_all_pages doesn't use proper cpu hotplug protection so
 	 * we can race with cpu offline when the WQ can move this from
 	 * a cpu pinned worker to an unbound one. We can operate on a different
-	 * cpu which is allright but we also have to make sure to not move to
+	 * cpu which is alright but we also have to make sure to not move to
 	 * a different one.
 	 */
 	preempt_disable();
@@ -5929,7 +5929,7 @@ static int build_zonerefs_node(pg_data_t
 static int __parse_numa_zonelist_order(char *s)
 {
 	/*
-	 * We used to support different zonlists modes but they turned
+	 * We used to support different zonelists modes but they turned
 	 * out to be just not useful. Let's keep the warning in place
 	 * if somebody still use the cmd line parameter so that we do
 	 * not fail it silently
@@ -7670,7 +7670,7 @@ static void check_for_memory(pg_data_t *
 }
 
 /*
- * Some architecturs, e.g. ARC may have ZONE_HIGHMEM below ZONE_NORMAL. For
+ * Some architectures, e.g. ARC may have ZONE_HIGHMEM below ZONE_NORMAL. For
  * such cases we allow max_zone_pfn sorted in the descending order
  */
 bool __weak arch_has_descending_max_zone_pfns(void)
@@ -8728,7 +8728,7 @@ static int __alloc_contig_migrate_range(
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
  * @end:	one-past-the-last PFN to allocate
- * @migratetype:	migratetype of the underlaying pageblocks (either
+ * @migratetype:	migratetype of the underlying pageblocks (either
  *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
  *			in range must have the same migratetype and it must
  *			be either of the two.
@@ -8988,7 +8988,7 @@ EXPORT_SYMBOL(free_contig_range);
 
 /*
  * The zone indicated has a new number of managed_pages; batch sizes and percpu
- * page high values need to be recalulated.
+ * page high values need to be recalculated.
  */
 void __meminit zone_pcp_update(struct zone *zone)
 {
--- a/mm/page_owner.c~mm-fix-typos-in-comments
+++ a/mm/page_owner.c
@@ -233,7 +233,7 @@ void __copy_page_owner(struct page *oldp
 	/*
 	 * We don't clear the bit on the oldpage as it's going to be freed
 	 * after migration. Until then, the info can be useful in case of
-	 * a bug, and the overal stats will be off a bit only temporarily.
+	 * a bug, and the overall stats will be off a bit only temporarily.
 	 * Also, migrate_misplaced_transhuge_page() can still fail the
 	 * migration and then we want the oldpage to retain the info. But
 	 * in that case we also don't need to explicitly clear the info from
--- a/mm/page-writeback.c~mm-fix-typos-in-comments
+++ a/mm/page-writeback.c
@@ -1806,7 +1806,7 @@ pause:
 			break;
 
 		/*
-		 * In the case of an unresponding NFS server and the NFS dirty
+		 * In the case of an unresponsive NFS server and the NFS dirty
 		 * pages exceeds dirty_thresh, give the other good wb's a pipe
 		 * to go through, so that tasks on them still remain responsive.
 		 *
@@ -2216,7 +2216,7 @@ int write_cache_pages(struct address_spa
 			 * Page truncated or invalidated. We can freely skip it
 			 * then, even for data integrity operations: the page
 			 * has disappeared concurrently, so there could be no
-			 * real expectation of this data interity operation
+			 * real expectation of this data integrity operation
 			 * even if there is now a new, dirty page at the same
 			 * pagecache address.
 			 */
--- a/mm/percpu.c~mm-fix-typos-in-comments
+++ a/mm/percpu.c
@@ -1862,7 +1862,7 @@ fail:
 			pr_info("limit reached, disable warning\n");
 	}
 	if (is_atomic) {
-		/* see the flag handling in pcpu_blance_workfn() */
+		/* see the flag handling in pcpu_balance_workfn() */
 		pcpu_atomic_alloc_failed = true;
 		pcpu_schedule_balance_work();
 	} else {
--- a/mm/percpu-internal.h~mm-fix-typos-in-comments
+++ a/mm/percpu-internal.h
@@ -170,7 +170,7 @@ struct percpu_stats {
 	u64 nr_max_alloc;	/* max # of live allocations */
 	u32 nr_chunks;		/* current # of live chunks */
 	u32 nr_max_chunks;	/* max # of live chunks */
-	size_t min_alloc_size;	/* min allocaiton size */
+	size_t min_alloc_size;	/* min allocation size */
 	size_t max_alloc_size;	/* max allocation size */
 };
 
--- a/mm/pgalloc-track.h~mm-fix-typos-in-comments
+++ a/mm/pgalloc-track.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _LINUX_PGALLLC_TRACK_H
-#define _LINUX_PGALLLC_TRACK_H
+#ifndef _LINUX_PGALLOC_TRACK_H
+#define _LINUX_PGALLOC_TRACK_H
 
 #if defined(CONFIG_MMU)
 static inline p4d_t *p4d_alloc_track(struct mm_struct *mm, pgd_t *pgd,
@@ -48,4 +48,4 @@ static inline pmd_t *pmd_alloc_track(str
 	  (__pte_alloc_kernel(pmd) || ({*(mask)|=PGTBL_PMD_MODIFIED;0;})))?\
 		NULL: pte_offset_kernel(pmd, address))
 
-#endif /* _LINUX_PGALLLC_TRACK_H */
+#endif /* _LINUX_PGALLOC_TRACK_H */
--- a/mm/slab.c~mm-fix-typos-in-comments
+++ a/mm/slab.c
@@ -259,7 +259,7 @@ static void kmem_cache_node_init(struct
 
 #define BATCHREFILL_LIMIT	16
 /*
- * Optimization question: fewer reaps means less probability for unnessary
+ * Optimization question: fewer reaps means less probability for unnecessary
  * cpucache drain/refill cycles.
  *
  * OTOH the cpuarrays can contain lots of objects,
@@ -2381,8 +2381,8 @@ union freelist_init_state {
 };
 
 /*
- * Initialize the state based on the randomization methode available.
- * return true if the pre-computed list is available, false otherwize.
+ * Initialize the state based on the randomization method available.
+ * return true if the pre-computed list is available, false otherwise.
  */
 static bool freelist_state_initialize(union freelist_init_state *state,
 				struct kmem_cache *cachep,
--- a/mm/slub.c~mm-fix-typos-in-comments
+++ a/mm/slub.c
@@ -3403,7 +3403,7 @@ EXPORT_SYMBOL(kmem_cache_alloc_bulk);
  */
 
 /*
- * Mininum / Maximum order of slab pages. This influences locking overhead
+ * Minimum / Maximum order of slab pages. This influences locking overhead
  * and slab fragmentation. A higher order reduces the number of partial slabs
  * and increases the number of allocations possible without having to
  * take the list_lock.
--- a/mm/swap_slots.c~mm-fix-typos-in-comments
+++ a/mm/swap_slots.c
@@ -16,7 +16,7 @@
  * to local caches without needing to acquire swap_info
  * lock.  We do not reuse the returned slots directly but
  * move them back to the global pool in a batch.  This
- * allows the slots to coaellesce and reduce fragmentation.
+ * allows the slots to coalesce and reduce fragmentation.
  *
  * The swap entry allocated is marked with SWAP_HAS_CACHE
  * flag in map_count that prevents it from being allocated
--- a/mm/vmalloc.c~mm-fix-typos-in-comments
+++ a/mm/vmalloc.c
@@ -1583,7 +1583,7 @@ static unsigned long lazy_max_pages(void
 static atomic_long_t vmap_lazy_nr = ATOMIC_LONG_INIT(0);
 
 /*
- * Serialize vmap purging.  There is no actual criticial section protected
+ * Serialize vmap purging.  There is no actual critical section protected
  * by this look, but we want to avoid concurrent calls for performance
  * reasons and to make the pcpu_get_vm_areas more deterministic.
  */
@@ -2628,7 +2628,7 @@ static void __vfree(const void *addr)
  * May sleep if called *not* from interrupt context.
  * Must not be called in NMI context (strictly speaking, it could be
  * if we have CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG, but making the calling
- * conventions for vfree() arch-depenedent would be a really bad idea).
+ * conventions for vfree() arch-dependent would be a really bad idea).
  */
 void vfree(const void *addr)
 {
@@ -3141,7 +3141,7 @@ static int aligned_vread(char *buf, char
 		/*
 		 * To do safe access to this _mapped_ area, we need
 		 * lock. But adding lock here means that we need to add
-		 * overhead of vmalloc()/vfree() calles for this _debug_
+		 * overhead of vmalloc()/vfree() calls for this _debug_
 		 * interface, rarely used. Instead of that, we'll use
 		 * kmap() and get small overhead in this access function.
 		 */
--- a/mm/vmstat.c~mm-fix-typos-in-comments
+++ a/mm/vmstat.c
@@ -934,7 +934,7 @@ void cpu_vm_stats_fold(int cpu)
 
 /*
  * this is only called if !populated_zone(zone), which implies no other users of
- * pset->vm_stat_diff[] exsist.
+ * pset->vm_stat_diff[] exist.
  */
 void drain_zonestat(struct zone *zone, struct per_cpu_pageset *pset)
 {
--- a/mm/zpool.c~mm-fix-typos-in-comments
+++ a/mm/zpool.c
@@ -336,7 +336,7 @@ int zpool_shrink(struct zpool *zpool, un
  * This may hold locks, disable interrupts, and/or preemption,
  * and the zpool_unmap_handle() must be called to undo those
  * actions.  The code that uses the mapped handle should complete
- * its operatons on the mapped handle memory quickly and unmap
+ * its operations on the mapped handle memory quickly and unmap
  * as soon as possible.  As the implementation may use per-cpu
  * data, multiple handles should not be mapped concurrently on
  * any cpu.
--- a/mm/zsmalloc.c~mm-fix-typos-in-comments
+++ a/mm/zsmalloc.c
@@ -1227,7 +1227,7 @@ EXPORT_SYMBOL_GPL(zs_get_total_pages);
  * zs_map_object - get address of allocated object from handle.
  * @pool: pool from which the object was allocated
  * @handle: handle returned from zs_malloc
- * @mm: maping mode to use
+ * @mm: mapping mode to use
  *
  * Before using an object allocated from zs_malloc, it must be mapped using
  * this function. When done with the object, it must be unmapped using
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* [patch 91/91] mm: fix typos in comments
  2021-05-07  1:01 incoming Andrew Morton
                   ` (89 preceding siblings ...)
  2021-05-07  1:06 ` [patch 90/91] mm: fix typos in comments Andrew Morton
@ 2021-05-07  1:06 ` Andrew Morton
  2021-05-07  7:12 ` incoming Linus Torvalds
  91 siblings, 0 replies; 119+ messages in thread
From: Andrew Morton @ 2021-05-07  1:06 UTC (permalink / raw)
  To: akpm, linux-mm, lujialin4, mm-commits, torvalds

From: Lu Jialin <lujialin4@huawei.com>
Subject: mm: fix typos in comments

succed -> succeed in mm/hugetlb.c
wil -> will in mm/mempolicy.c
wit -> with in mm/page_alloc.c
Retruns -> Returns in mm/page_vma_mapped.c
confict -> conflict in mm/secretmem.c
No functionality changed.

Link: https://lkml.kernel.org/r/20210408140027.60623-1-lujialin4@huawei.com
Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mempolicy.c       |    2 +-
 mm/page_alloc.c      |    2 +-
 mm/page_vma_mapped.c |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

--- a/mm/mempolicy.c~mm-fix-typos-in-comments-2
+++ a/mm/mempolicy.c
@@ -994,7 +994,7 @@ static long do_get_mempolicy(int *policy
 		if (flags & MPOL_F_ADDR) {
 			/*
 			 * Take a refcount on the mpol, lookup_node()
-			 * wil drop the mmap_lock, so after calling
+			 * will drop the mmap_lock, so after calling
 			 * lookup_node() only "pol" remains valid, "vma"
 			 * is stale.
 			 */
--- a/mm/page_alloc.c~mm-fix-typos-in-comments-2
+++ a/mm/page_alloc.c
@@ -4173,7 +4173,7 @@ out:
 }
 
 /*
- * Maximum number of compaction retries wit a progress before OOM
+ * Maximum number of compaction retries with a progress before OOM
  * killer is consider as the only way to move forward.
  */
 #define MAX_COMPACT_RETRIES 16
--- a/mm/page_vma_mapped.c~mm-fix-typos-in-comments-2
+++ a/mm/page_vma_mapped.c
@@ -134,7 +134,7 @@ static bool check_pte(struct page_vma_ma
  * regardless of which page table level the page is mapped at. @pvmw->pmd is
  * NULL.
  *
- * Retruns false if there are no more page table entries for the page in
+ * Returns false if there are no more page table entries for the page in
  * the vma. @pvmw->ptl is unlocked and @pvmw->pte is unmapped.
  *
  * If you need to stop the walk before page_vma_mapped_walk() returned false,
_

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 04/91] proc: save LOC in __xlate_proc_name()
  2021-05-07  1:02 ` [patch 04/91] proc: save LOC in __xlate_proc_name() Andrew Morton
@ 2021-05-07  2:24   ` Linus Torvalds
  0 siblings, 0 replies; 119+ messages in thread
From: Linus Torvalds @ 2021-05-07  2:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alexey Dobriyan, Linux-MM, mm-commits

On Thu, May 6, 2021 at 6:02 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> From: Alexey Dobriyan <adobriyan@gmail.com>
> Subject: proc: save LOC in __xlate_proc_name()
..
> +       while ((next = strchr(cp, '/'))) {

Please don't do this.

Yes, gcc suggests that double parentheses syntax around an assignment
to avoid warnings.

gcc is wrong, and is being completely stupid.

The proper way to avoid the "assignment in conditional" warning is to
(surprise, surprise) USE A CONDITIONAL.

So that

          while ((next = strchr(cp, '/'))) {

is the crazy rantings of a misguided compiler. No sane human should
ever care about some odd double parenthesis syntax. We're not writing
LISP, for chrissake.

The proper way to write this is

          while ((next = strchr(cp, '/')) != NULL) {

which makes sense to not just a machine, but to a human, and avoids
the whole "assignment used as a conditional" warning very naturally.

See? Now it uses a conditional as a conditional. Doesn't that make a
whole lot more sense than the crazy ramblings of a broken machine
mind?
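
As a minimal, self-contained sketch of the explicit form (userspace C, and
not the kernel's __xlate_proc_name(); the helper name last_component() is
hypothetical and exists only for illustration), a loop that walks a
'/'-separated path could look something like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only: return a pointer to the
 * last component of a '/'-separated path (the text after the final '/').
 */
static const char *last_component(const char *path)
{
	const char *cp = path;
	const char *next;

	assert(path != NULL);

	/*
	 * The result of the assignment is compared against NULL
	 * explicitly, so the loop condition really is a conditional.
	 */
	while ((next = strchr(cp, '/')) != NULL)
		cp = next + 1;

	return cp;
}
```

With this sketch, a path with no '/' is returned unchanged, and a path
ending in '/' yields an empty string.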

I fixed it up manually, I just wanted to rant against this kind of
"mindlessly take advice from the compiler without thinking about it".

                   Linus

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: incoming
  2021-05-07  1:01 incoming Andrew Morton
                   ` (90 preceding siblings ...)
  2021-05-07  1:06 ` [patch 91/91] " Andrew Morton
@ 2021-05-07  7:12 ` Linus Torvalds
  91 siblings, 0 replies; 119+ messages in thread
From: Linus Torvalds @ 2021-05-07  7:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Thu, May 6, 2021 at 6:01 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> I've been wobbly about the secretmem patches due to doubts about
> whether the feature is sufficiently useful to justify inclusion, but
> developers are now weighing in with helpful information and I've asked Mike
> for an extensively updated [0/n] changelog.  This will take a few days
> to play out so it is possible that I will prevail upon you for a post-rc1
> merge.

Oh, much too late for this release by now.

> If that's a problem, there's always 5.13-rc1.

5.13-rc1 is two days from now, it would be for 5.14-rc1.. How time -
and version numbers - fly.

             Linus

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-07  1:04 ` [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation Andrew Morton
@ 2021-05-07  7:25   ` Linus Torvalds
  2021-05-08  3:13     ` Baoquan He
  2021-05-07  8:16   ` David Hildenbrand
  1 sibling, 1 reply; 119+ messages in thread
From: Linus Torvalds @ 2021-05-07  7:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andrey Konovalov, Baoquan He, Christian Brauner, Colin King,
	Jonathan Corbet, dyoung, Frederic Weisbecker, gpiccoli,
	john.p.donnelly, Josh Poimboeuf, Kees Cook, Linux-MM,
	Masahiro Yamada, Mauro Carvalho Chehab, Mike Kravetz,
	Ingo Molnar, mm-commits, Paul E. McKenney, Peter Zijlstra,
	Randy Dunlap, Steven Rostedt, Mike Rapoport,
	saeed.mirzamohammadi, Sami Tolvanen, Stephen Boyd,
	Thomas Gleixner, Vivek Goyal, yifeifz2

On Thu, May 6, 2021 at 6:04 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> From: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>
> Subject: kernel/crash_core: add crashkernel=auto for vmcore creation
>
> This adds crashkernel=auto feature to configure reserved memory for vmcore
> creation.  CONFIG_CRASH_AUTO_STR is defined to be set for different kernel
> distributions and different archs based on their needs.

Ugh. I didn't realize how nasty this was until after I'd applied this patch.

I'm going to drop this patch, because the Kconfig thing for it is an
unmitigated mess. I was confused by the question, and then the help
message was actively misleading.

This is wrong for so many reasons:

 - this is a classic case of "you shouldn't ask a user this".

   The question makes no sense to any normal person, it certainly
   didn't to me. Don't ask questions that don't have sane answers.

 - the config help text is actively misleading, and claims that the
   option is about how much memory is reserved for a crash kernel

   Not so. It's the default string for when somebody uses "crashkernel=auto"

 - this shouldn't be a config option at all, it's clearly a distro
   setting, and should be on the kernel command line with the other
   distro settings.

So I'm dropping this, and I don't see it ever being applied in this
form for the above reasons.
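
For reference, the range syntax used by the patch's default string
("1G-64G:128M,64G-1T:256M,1T-:512M") can be sketched in userspace C roughly
as below. This is an illustration under stated assumptions, not the kernel's
parser: parse_size() and crash_size_for() are hypothetical names, the
optional [@offset] part is not handled, and the upper bound of each range is
treated as exclusive, following the wording of the help text.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "<number>[KMGT]" into bytes.
 * On return, *end points just past the number and its optional suffix;
 * if no digits were found, *end == s and 0 is returned.
 */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 10);
	unsigned int shift;

	if (*end == s)
		return 0;

	switch (**end) {
	case 'K': shift = 10; break;
	case 'M': shift = 20; break;
	case 'G': shift = 30; break;
	case 'T': shift = 40; break;
	default:  return v;	/* no suffix: plain bytes */
	}
	(*end)++;
	return v << shift;
}

/*
 * Hypothetical helper: return the reservation (in bytes) that a spec of
 * the form <start>-[<end>]:<size>[,...] selects for 'ram' bytes of
 * system RAM. Returns 0 if no range matches or the spec is malformed.
 */
static unsigned long long crash_size_for(const char *spec,
					 unsigned long long ram)
{
	const char *cur = spec;
	char *end;

	assert(spec != NULL);

	while (*cur != '\0') {
		unsigned long long start, stop = 0, size;

		start = parse_size(cur, &end);
		if (end == cur || *end != '-')
			return 0;
		cur = end + 1;

		if (*cur != ':') {
			/* Bounded range; "start-" alone has no upper limit. */
			stop = parse_size(cur, &end);
			if (end == cur)
				return 0;
			cur = end;
		}
		if (*cur != ':')
			return 0;
		cur++;

		size = parse_size(cur, &end);
		if (end == cur)
			return 0;
		cur = end;

		/* start is inclusive, stop is exclusive (0 means unbounded). */
		if (ram >= start && (stop == 0 || ram < stop))
			return size;

		if (*cur == ',')
			cur++;
		else if (*cur != '\0')
			return 0;
	}
	return 0;
}
```

Under these assumptions, a machine with 8G of RAM would select the 128M
entry from the default string, and a machine with less than 1G would get no
reservation at all.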

People, I've said this before, and apparently I need to say it again:
the kernel config is likely the nastiest part of building a local
kernel, and the biggest impediment to people actually building their
own kernels.

And people building their own kernel is the first step to becoming a
kernel developer.

So the kernel configuration is already one of the less pleasant parts
of the kernel, but that does NOT mean that we should strive to make it
even worse.

Obscure, odd, strange config questions like this are a no-no. We're
not making an already bad experience worse for something like this.

           Linus

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-07  1:04 ` [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation Andrew Morton
  2021-05-07  7:25   ` Linus Torvalds
@ 2021-05-07  8:16   ` David Hildenbrand
  2021-05-08  8:51     ` Baoquan He
  1 sibling, 1 reply; 119+ messages in thread
From: David Hildenbrand @ 2021-05-07  8:16 UTC (permalink / raw)
  To: Andrew Morton, andreyknvl, bhe, christian.brauner, colin.king,
	corbet, dyoung, frederic, gpiccoli, john.p.donnelly, jpoimboe,
	keescook, linux-mm, masahiroy, mchehab+huawei, mike.kravetz,
	mingo, mm-commits, paulmck, peterz, rdunlap, rostedt, rppt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2

On 07.05.21 03:04, Andrew Morton wrote:
> From: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>
> Subject: kernel/crash_core: add crashkernel=auto for vmcore creation
> 
> This adds crashkernel=auto feature to configure reserved memory for vmcore
> creation.  CONFIG_CRASH_AUTO_STR is defined to be set for different kernel
> distributions and different archs based on their needs.
> 
> Link: https://lkml.kernel.org/r/20210223174153.72802-1-saeed.mirzamohammadi@oracle.com
> Signed-off-by: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>
> Signed-off-by: John Donnelly <john.p.donnelly@oracle.com>
> Tested-by: John Donnelly <john.p.donnelly@oracle.com>
> ed-by: Dave Young <dyoung@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: "Guilherme G. Piccoli" <gpiccoli@canonical.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> Cc: YiFei Zhu <yifeifz2@illinois.edu>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Christian Brauner <christian.brauner@ubuntu.com>
> Cc: Stephen Boyd <sboyd@kernel.org>
> Cc: Andrey Konovalov <andreyknvl@google.com>
> Cc: Colin Ian King <colin.king@canonical.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>   Documentation/admin-guide/kdump/kdump.rst       |    3 +-
>   Documentation/admin-guide/kernel-parameters.txt |    6 ++++
>   arch/Kconfig                                    |   20 ++++++++++++++
>   kernel/crash_core.c                             |    7 ++++
>   4 files changed, 35 insertions(+), 1 deletion(-)
> 
> --- a/arch/Kconfig~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
> +++ a/arch/Kconfig
> @@ -14,6 +14,26 @@ menu "General architecture-dependent opt
>   config CRASH_CORE
>   	bool
>   
> +config CRASH_AUTO_STR
> +	string "Memory reserved for crash kernel"
> +	depends on CRASH_CORE
> +	default "1G-64G:128M,64G-1T:256M,1T-:512M"
> +	help
> +	  This configures the reserved memory dependent
> +	  on the value of System RAM. The syntax is:
> +	  crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
> +	              range=start-[end]
> +
> +	  For example:
> +	      crashkernel=512M-2G:64M,2G-:128M
> +
> +	  This would mean:
> +
> +	      1) if the RAM is smaller than 512M, then don't reserve anything
> +	         (this is the "rescue" case)
> +	      2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
> +	      3) if the RAM size is larger than 2G, then reserve 128M
> +
>   config KEXEC_CORE
>   	select CRASH_CORE
>   	bool
> --- a/Documentation/admin-guide/kdump/kdump.rst~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
> +++ a/Documentation/admin-guide/kdump/kdump.rst
> @@ -285,7 +285,8 @@ This would mean:
>       2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
>       3) if the RAM size is larger than 2G, then reserve 128M
>   
> -
> +Or you can use crashkernel=auto to choose the crash kernel memory size
> +based on the recommended configuration set for each arch.
>   
>   Boot into System Kernel
>   =======================
> --- a/Documentation/admin-guide/kernel-parameters.txt~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
> +++ a/Documentation/admin-guide/kernel-parameters.txt
> @@ -751,6 +751,12 @@
>   			a memory unit (amount[KMG]). See also
>   			Documentation/admin-guide/kdump/kdump.rst for an example.
>   
> +	crashkernel=auto
> +			[KNL] This parameter will set the reserved memory for
> +			the crash kernel based on the value of the CRASH_AUTO_STR
> +			that is the best effort estimation for each arch. See also
> +			arch/Kconfig for further details.
> +
>   	crashkernel=size[KMG],high
>   			[KNL, X86-64] range could be above 4G. Allow kernel
>   			to allocate physical memory region from top, so could
> --- a/kernel/crash_core.c~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
> +++ a/kernel/crash_core.c
> @@ -7,6 +7,7 @@
>   #include <linux/crash_core.h>
>   #include <linux/utsname.h>
>   #include <linux/vmalloc.h>
> +#include <linux/kexec.h>
>   
>   #include <asm/page.h>
>   #include <asm/sections.h>
> @@ -250,6 +251,12 @@ static int __init __parse_crashkernel(ch
>   	if (suffix)
>   		return parse_crashkernel_suffix(ck_cmdline, crash_size,
>   				suffix);
> +#ifdef CONFIG_CRASH_AUTO_STR
> +	if (strncmp(ck_cmdline, "auto", 4) == 0) {
> +		ck_cmdline = CONFIG_CRASH_AUTO_STR;
> +		pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
> +	}
> +#endif
I remember that the original "crashkernel=auto" as once proposed by Red 
Hat people did not receive a warm welcome.

Let me take a look .... oh, there it is from 2009

https://marc.info/?t=125006512600002&r=1&w=2

and then we had it in 2018

https://lkml.org/lkml/2018/5/20/262


The issue I have with this: it's just plain wrong when you take memory 
hotplug into serious account as we see it quite heavily in VMs. You 
don't know what you'll need when building a kernel. Just pass it via the 
cmdline ...

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-07  7:25   ` Linus Torvalds
@ 2021-05-08  3:13     ` Baoquan He
  2021-05-08  3:29       ` Baoquan He
  0 siblings, 1 reply; 119+ messages in thread
From: Baoquan He @ 2021-05-08  3:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Andrey Konovalov, Christian Brauner, Colin King,
	Jonathan Corbet, dyoung, Frederic Weisbecker, gpiccoli,
	john.p.donnelly, Josh Poimboeuf, Kees Cook, Linux-MM,
	Masahiro Yamada, Mauro Carvalho Chehab, Mike Kravetz,
	Ingo Molnar, mm-commits, Paul E. McKenney, Peter Zijlstra,
	Randy Dunlap, Steven Rostedt, Mike Rapoport,
	saeed.mirzamohammadi, Sami Tolvanen, Stephen Boyd,
	Thomas Gleixner, Vivek Goyal, yifeifz2, Hari Bathini, piliu

Hi Linus,

On 05/07/21 at 12:25am, Linus Torvalds wrote:
> On Thu, May 6, 2021 at 6:04 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > From: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>
> > Subject: kernel/crash_core: add crashkernel=auto for vmcore creation
> >
> > This adds crashkernel=auto feature to configure reserved memory for vmcore
> > creation.  CONFIG_CRASH_AUTO_STR is defined to be set for different kernel
> > distributions and different archs based on their needs.


> 
> Ugh. I didn't realize how nasty this was until after I'd applied this patch.
> 
> I'm going to drop this patch, because the Kconfig thing for it is an
> unmitigated mess. I was confused by the question, and then the help
> message was actively misleading.
> 
> This is wrong for so many reasons:
> 
>  - this is a classic case of "you shouldn't ask a user this".
> 
>    The question makes no sense to any normal person, it certainly
> didn't to me. Don't ask questions that don't have sane answers.
> 
>  - the config help text is actively misleading, and claims that the
> option is about how much memory is reserved for a crash kernel
> 
>    Not so. It's the default string for when somebody uses "crashkernel=auto"

Sorry for the confusion. We should have reviewed the commit log and the
kernel config description more carefully.
> 
>  - this shouldn't be a config option at all, it's clearly a distro
> setting, and should be on the kernel command line with the other
> distro settings.

I didn't know that kernel config options are sometimes disliked; I will
keep that in mind and be more cautious about adding them in the future.

crashkernel=auto has existed in our distros for many years and, as David
mentioned in the other thread, we have been trying to add crashkernel=auto
support to upstream. We are pursuing crashkernel=auto upstream because:

1) An empirical value is given to the user by default;

It was originally requested by customers, and has since become an
important part of the kdump feature, supported on several major arches.
With crashkernel=auto, people without much knowledge of kdump details can
use kdump to debug. Distros can provide suggested values through
crashkernel=auto, obtained by investigation and analysis and widely
tested in test environments.

2) Cover corner cases/special cases;

In some cases the kernel needs extra memory, and the kdump kernel is no
exception. E.g. when SME/SEV is enabled, SWIOTLB is necessarily enabled
too, even in the kdump kernel (see the SME/SEV related commits below for
reference). An extra 64M then needs to be reserved for the crashkernel.
The user doesn't need to know this; we have already done it for them.

commit c7753208a94c ("x86, swiotlb: Add memory encryption support")
commit aba2d9a6385a ("iommu/amd: Do not disable SWIOTLB if SME is active")
commit d7b417fa08d1 ("x86/mm: Add DMA support for SEV memory encryption")

We are eager to push crashkernel=auto upstream because of our
UPSTREAM FIRST rule. Since it has been in RHEL for many years, each time
a new major RHEL release is anchored to an upstream kernel release and
prepared, these RHEL-only crashkernel=auto patches need to be reviewed
inside Red Hat, and we are then questioned and challenged about why they
are not upstream.

As for how to implement crashkernel=auto, we have tried several ways.

1) Add to the kernel command line

If the suggested value is passed on the kernel command line, it has to be
stored in user space and then handed to the kernel. This separates the
suggested value from the kernel itself, which is not what we want,
because the suggested crashkernel value is strongly tied to the distro
release. We may adjust the value between sub-releases of the kernel
because of kernel changes. Putting the values on the kernel command line
makes us lose track of them in the kernel.

2) Add a weak generic function and several arch dependent functions
3) Hardcode values in __parse_crashkernel()

Method 2) is used in RHEL7 and 3) in RHEL8; both are added by RHEL-only
patches. If we pushed either of them upstream, any later value
adjustment would need an upstream patch posting. Otherwise a RHEL-only
patch would have to be introduced again, and Red Hat internal reviewers
would challenge us again. (The hard-coded values are at the bottom for
reference.)

4) Add a kernel config option for the default value

This is what this patch does. With the kernel config CRASH_AUTO_STR,
distros can set a default value and adjust it at any time in the future
without bothering upstream. If crashkernel=auto is specified, only the
few lines below are added, which parse CONFIG_CRASH_AUTO_STR directly.

@@ -250,6 +251,12 @@ static int __init __parse_crashkernel(ch
        if (suffix)
                return parse_crashkernel_suffix(ck_cmdline, crash_size,
                                suffix);
+#ifdef CONFIG_CRASH_AUTO_STR
+       if (strncmp(ck_cmdline, "auto", 4) == 0) {
+               ck_cmdline = CONFIG_CRASH_AUTO_STR;
+               pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
+       }
+#endif
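
To illustrate what such a range string means once it has been substituted
for "auto", here is a rough user-space sketch, not the kernel's parser.
The names parse_size() and resolve_crash_size() are made up for this
example, the optional @offset is not handled, and the matching rule
(start <= RAM < end, open end meaning "no upper limit") is an assumption
based on the help text quoted above.

```c
#include <stdlib.h>

/* Hypothetical helper: parse "<number>[K|M|G|T]" into bytes. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 10);

	switch (**end) {
	case 'T': v <<= 10; /* fall through */
	case 'G': v <<= 10; /* fall through */
	case 'M': v <<= 10; /* fall through */
	case 'K': v <<= 10; (*end)++; break;
	default: break;
	}
	return v;
}

/*
 * Hypothetical helper: resolve "start-[end]:size[,...]" for a given
 * amount of system RAM. Returns the size to reserve in bytes, or 0 if
 * no range matches or the string is malformed.
 */
unsigned long long resolve_crash_size(const char *spec,
				      unsigned long long system_ram)
{
	const char *cur = spec;

	while (*cur) {
		unsigned long long start, end = ~0ULL, size;
		const char *q;
		char *p;

		start = parse_size(cur, &p);
		if (p == cur || *p != '-')
			return 0;
		p++;
		if (*p != ':') {		/* explicit upper bound */
			q = p;
			end = parse_size(q, &p);
			if (p == q || end <= start)
				return 0;
		}
		if (*p != ':')
			return 0;
		p++;
		q = p;
		size = parse_size(q, &p);
		if (p == q)
			return 0;
		if (system_ram >= start && system_ram < end)
			return size;
		if (*p == ',')
			p++;
		else if (*p != '\0')
			return 0;	/* e.g. "@offset", not handled here */
		cur = p;
	}
	return 0;
}
```

With the default string from the patch, a machine with 8G of RAM would
fall into the first range, and a machine below 1G would get no
reservation at all.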


Before this, we didn't know Saeed Mirzamohammadi, the patch author. He
may have gone through the same torture. We were overjoyed when we noticed
his patch. We had been planning a new round of posting to add
crashkernel=auto, with a kernel config option as our final choice too. We
may have been too happy and forgot to polish the commit log.

I'm not sure whether I have made myself clear. Basically, we would like
crashkernel=auto to be added to the upstream kernel. As for how to
implement it in the kernel, we would like to hear upstream people's
suggestions.

Thanks
Baoquan


Hard code crashkernel=auto values in __parse_crashkernel()
===========================================================
static int __init __parse_crashkernel(char *cmdline,
                             unsigned long long system_ram,
                             unsigned long long *crash_size,
                             unsigned long long *crash_base,
                             const char *name,
                             const char *suffix)
{
......
        if (strncmp(ck_cmdline, "auto", 4) == 0) {
#if defined(CONFIG_X86_64) || defined(CONFIG_S390)
                ck_cmdline = "1G-4G:160M,4G-64G:192M,64G-1T:256M,1T-:512M";
#elif defined(CONFIG_ARM64)
                ck_cmdline = "2G-:448M";
#elif defined(CONFIG_PPC64)
                char *fadump_cmdline;

                fadump_cmdline = get_last_crashkernel(cmdline, "fadump=", NULL);
                fadump_cmdline = fadump_cmdline ?
                                fadump_cmdline + strlen("fadump=") : NULL;
                if (!fadump_cmdline || (strncmp(fadump_cmdline, "off", 3) == 0))
                        ck_cmdline = "2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G";
                else
                        ck_cmdline = "4G-16G:768M,16G-64G:1G,64G-128G:2G,128G-1T:4G,1T-2T:6G,2T-4T:12G,4T-8T:20G,8T-16T:36G,16T-32T:64G,32T-64T:128G,64T-:180G";
#endif
                pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
        }

......
}
==================================================================





^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-08  3:13     ` Baoquan He
@ 2021-05-08  3:29       ` Baoquan He
  0 siblings, 0 replies; 119+ messages in thread
From: Baoquan He @ 2021-05-08  3:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Andrey Konovalov, Christian Brauner, Colin King,
	Jonathan Corbet, dyoung, Frederic Weisbecker, gpiccoli,
	john.p.donnelly, Josh Poimboeuf, Kees Cook, Linux-MM,
	Masahiro Yamada, Mauro Carvalho Chehab, Mike Kravetz,
	Ingo Molnar, mm-commits, Paul E. McKenney, Peter Zijlstra,
	Randy Dunlap, Steven Rostedt, Mike Rapoport,
	saeed.mirzamohammadi, Sami Tolvanen, Stephen Boyd,
	Thomas Gleixner, Vivek Goyal, yifeifz2, Hari Bathini, piliu,
	kasong

Add Kairui to CC since he is taking care of the crashkernel=auto code in
our Distros.

On 05/08/21 at 11:13am, Baoquan He wrote:
> Hi Linus,
> 
> On 05/07/21 at 12:25am, Linus Torvalds wrote:
> > On Thu, May 6, 2021 at 6:04 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > From: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>
> > > Subject: kernel/crash_core: add crashkernel=auto for vmcore creation
> > >
> > > This adds crashkernel=auto feature to configure reserved memory for vmcore
> > > creation.  CONFIG_CRASH_AUTO_STR is defined to be set for different kernel
> > > distributions and different archs based on their needs.
> 
> 
> > 
> > Ugh. I didn't realize how nasty this was until after I'd applied this patch.
> > 
> > I'm going to drop this patch, because the Kconfig thing for it is an
> > unmitigated mess. I was confused by the question, and then the help
> > message was actively misleading.
> > 
> > This is wrong for so many reasons:
> > 
> >  - this is a classic case of "you shouldn't ask a user this".
> > 
> >    The question makes no sense to any normal person, it certainly
> > didn't to me. Don't ask questions that don't have sane answers.
> > 
> >  - the config help text is actively misleading, and claims that the
> > option is about how much memory is reserved for a crash kernel
> > 
> >    Not so. It's the default string for when somebody uses "crashkernel=auto"
> 
> Sorry for the confusion, we should have been more careful to reivew and
> add the commit log and kernel config description.
> > 
> >  - this shouldn't be a config option at all, it's clearly a distro
> > setting, and should be on the kernel command line with the other
> > distro settings.
> 
> Don't know kernel config is disliked sometime, will remember it in the
> future and more cautiously to add. 
> 
> Crashkernel=auto exists in our distros for many years, and as David
> mentioned in other thread, we have been trying to adding rashkernel=auto
> support into upstream. We pursue crashkernel=auto being added to upstream
> because:
> 
> 1) Empirical value is given to user by default;
> 
> It was required by customer originally, now has been an important part
> of kdump feature and supported in several main ARCHes. With crashkernel=auto,
> people w/o much knowledge of kdump details can use kdump to debug. Distros
> can provide the suggested values with crashkernel=auto which are got by
> investigation, analysis and tested widely on test environment. 
> 
> 2) Cover corner case/special case;
> 
> In some cases, kernel may need extra memory to handle, kdump kernel is
> not exceptional. E.g when sme/sev enabled, SWIOTLB will be enabled
> necessarily, even in kdump kernel. (Below sme/sev related commits for
> reference). Then extra 64M need be reserved for crashkernel. User
> doesn't need to know this, we already have done it for them.
> 
> commit c7753208a94c ("x86, swiotlb: Add memory encryption support")
> commit aba2d9a6385a ("iommu/amd: Do not disable SWIOTLB if SME is active")
> commit d7b417fa08d1 ("x86/mm: Add DMA support for SEV memory encryption")
> 
> We are eager to push crashkernel=auto to upstream becasue of our
> UPSTREAM FIRST rule. Since it has been in RHEL for many years, each time
> a new RHEL main release anchor a upstream kernel release and is prepared,
> these crashkernel=auto RHEL-only patches need be reviewed inside Redhat,
> then we will be questioned and challenged why they are not in upstream.
> 
> As for how to implement crashkernel=auto, we have tried several ways.
> 
> 1) Add into kernel command line
> 
> The suggested value need be stored in user space if added into kernel
> command line, then added into kernel. This makes the suggested value
> separated from kernel itself. It's not what we expect to see. Because
> the suggested crashkernel value is strongly related to distros release.
> We could adjust the value between sub-releases of kernel because of
> of kernel change. Adding them into kernel command line make us lose the
> track of them in kernel.
> 
> 2) Add a weak generic function and several arch dependent functions
> 3) Hardcode values in __parse_crashkernel()
> 
> Method 2) is taken in our RHEL7, 3) is used in RHEL8, RHEL-only patches
> add them. If we try to push them into upstram, any later value
> adjustment need a upstream patch posting. Otherwise, RHEL-only patch
> need be introduced again, Redhat internal reviewer will challenge us
> again. (Put the value hard coding pieces at bottom for reference).
> 
> 4) Add kernel config to add default value
> 
> It's done in this patch. With the kernel config CRASH_AUTO_STR, Distros can
> add default value, and adjust it anytime in the future w/o bothering
> upstream. If crashkernel=auto is specified, only below 3 LOC added, to
> go to parse the CONFIG_CRASH_AUTO_STR directly.
> 
> @@ -250,6 +251,12 @@ static int __init __parse_crashkernel(ch
>         if (suffix)
>                 return parse_crashkernel_suffix(ck_cmdline, crash_size,
>                                 suffix);
> +#ifdef CONFIG_CRASH_AUTO_STR
> +       if (strncmp(ck_cmdline, "auto", 4) == 0) {
> +               ck_cmdline = CONFIG_CRASH_AUTO_STR;
> +               pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
> +       }
> +#endif
> 
> 
> Before this, we don't know Saeed Mirzamohammadi, the patch author. He
> could experience the same torture. We were wild with joy when noticing
> his patch. We were planning to launch new round of post to add
> crashkernel=auto, kernel config is our final option too. We could be too
> happy to forget polishing the commit log.
> 
> Not sure if I make myself clear. Basically, we expect crashkernel=auto
> to be added in upstream kernel. About how to implement it in kernel, we would
> like to hear upstream people's suggestion.
> 
> Thanks
> Baoquan
> 
> 
> Hard code crashkernel=auto values in __parse_crashkernel()
> ===========================================================
> static int __init __parse_crashkernel(char *cmdline,
>                              unsigned long long system_ram,
>                              unsigned long long *crash_size,
>                              unsigned long long *crash_base,
>                              const char *name,
>                              const char *suffix)
> {
> ......
>         if (strncmp(ck_cmdline, "auto", 4) == 0) {
> #if defined(CONFIG_X86_64) || defined(CONFIG_S390)
>                 ck_cmdline = "1G-4G:160M,4G-64G:192M,64G-1T:256M,1T-:512M";
> #elif defined(CONFIG_ARM64)
>                 ck_cmdline = "2G-:448M";
> #elif defined(CONFIG_PPC64)
>                 char *fadump_cmdline;
> 
>                 fadump_cmdline = get_last_crashkernel(cmdline, "fadump=", NULL);
>                 fadump_cmdline = fadump_cmdline ?
>                                 fadump_cmdline + strlen("fadump=") : NULL;
>                 if (!fadump_cmdline || (strncmp(fadump_cmdline, "off", 3) == 0))
>                         ck_cmdline = "2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G";
>                 else
>                         ck_cmdline = "4G-16G:768M,16G-64G:1G,64G-128G:2G,128G-1T:4G,1T-2T:6G,2T-4T:12G,4T-8T:20G,8T-16T:36G,16T-32T:64G,32T-64T:128G,64T-:180G";
> #endif
>                 pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
>         }
> 
> ......
> }
> ==================================================================
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-07  8:16   ` David Hildenbrand
@ 2021-05-08  8:51     ` Baoquan He
  2021-05-08  9:22       ` David Hildenbrand
  0 siblings, 1 reply; 119+ messages in thread
From: Baoquan He @ 2021-05-08  8:51 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, andreyknvl, christian.brauner, colin.king, corbet,
	dyoung, frederic, gpiccoli, john.p.donnelly, jpoimboe, keescook,
	linux-mm, masahiroy, mchehab+huawei, mike.kravetz, mingo,
	mm-commits, paulmck, peterz, rdunlap, rostedt, rppt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2

On 05/07/21 at 10:16am, David Hildenbrand wrote:
> On 07.05.21 03:04, Andrew Morton wrote:
......
> > 
> >   Documentation/admin-guide/kdump/kdump.rst       |    3 +-
> >   Documentation/admin-guide/kernel-parameters.txt |    6 ++++
> >   arch/Kconfig                                    |   20 ++++++++++++++
> >   kernel/crash_core.c                             |    7 ++++
> >   4 files changed, 35 insertions(+), 1 deletion(-)
> > 
> > --- a/arch/Kconfig~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
> > +++ a/arch/Kconfig
> > @@ -14,6 +14,26 @@ menu "General architecture-dependent opt
> >   config CRASH_CORE
> >   	bool
> > +config CRASH_AUTO_STR
> > +	string "Memory reserved for crash kernel"
> > +	depends on CRASH_CORE
> > +	default "1G-64G:128M,64G-1T:256M,1T-:512M"
> > +	help
> > +	  This configures the reserved memory dependent
> > +	  on the value of System RAM. The syntax is:
> > +	  crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
> > +	              range=start-[end]
> > +
> > +	  For example:
> > +	      crashkernel=512M-2G:64M,2G-:128M
> > +
> > +	  This would mean:
> > +
> > +	      1) if the RAM is smaller than 512M, then don't reserve anything
> > +	         (this is the "rescue" case)
> > +	      2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
> > +	      3) if the RAM size is larger than 2G, then reserve 128M
> > +
> >   config KEXEC_CORE
> >   	select CRASH_CORE
> >   	bool
> > --- a/Documentation/admin-guide/kdump/kdump.rst~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
> > +++ a/Documentation/admin-guide/kdump/kdump.rst
> > @@ -285,7 +285,8 @@ This would mean:
> >       2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
> >       3) if the RAM size is larger than 2G, then reserve 128M
> > -
> > +Or you can use crashkernel=auto to choose the crash kernel memory size
> > +based on the recommended configuration set for each arch.
> >   Boot into System Kernel
> >   =======================
> > --- a/Documentation/admin-guide/kernel-parameters.txt~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
> > +++ a/Documentation/admin-guide/kernel-parameters.txt
> > @@ -751,6 +751,12 @@
> >   			a memory unit (amount[KMG]). See also
> >   			Documentation/admin-guide/kdump/kdump.rst for an example.
> > +	crashkernel=auto
> > +			[KNL] This parameter will set the reserved memory for
> > +			the crash kernel based on the value of the CRASH_AUTO_STR
> > +			that is the best effort estimation for each arch. See also
> > +			arch/Kconfig for further details.
> > +
> >   	crashkernel=size[KMG],high
> >   			[KNL, X86-64] range could be above 4G. Allow kernel
> >   			to allocate physical memory region from top, so could
> > --- a/kernel/crash_core.c~kernel-crash_core-add-crashkernel=auto-for-vmcore-creation
> > +++ a/kernel/crash_core.c
> > @@ -7,6 +7,7 @@
> >   #include <linux/crash_core.h>
> >   #include <linux/utsname.h>
> >   #include <linux/vmalloc.h>
> > +#include <linux/kexec.h>
> >   #include <asm/page.h>
> >   #include <asm/sections.h>
> > @@ -250,6 +251,12 @@ static int __init __parse_crashkernel(ch
> >   	if (suffix)
> >   		return parse_crashkernel_suffix(ck_cmdline, crash_size,
> >   				suffix);
> > +#ifdef CONFIG_CRASH_AUTO_STR
> > +	if (strncmp(ck_cmdline, "auto", 4) == 0) {
> > +		ck_cmdline = CONFIG_CRASH_AUTO_STR;
> > +		pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
> > +	}
> > +#endif
> I remember that the original "crashkernel=auto" as once proposed by Red Hat
> people did not receive a warm welcome.
> 
> Let me take a look .... oh, there it is from 2009
> 
> https://marc.info/?t=125006512600002&r=1&w=2
> 
> and then we had it in 2018
> 
> https://lkml.org/lkml/2018/5/20/262

Thanks for digging these two out; otherwise I would have needed to do it
so that people know the history better.

> 
> 
> The issue I have with this: it's just plain wrong when you take memory
> hotplug into serious account as we see it quite heavily in VMs. You don't
> know what you'll need when building a kernel. Just pass it via the cmdline

Hmm, kdump may have no issue with memory hotplug as far as the
crashkernel reservation is concerned. The system RAM size is not directly
correlated to the crashkernel size; that's why the default value in this
patch is not linearly related to system RAM size. What we take into
account is the proportion of crashkernel size to total RAM size. Usually
a 160M crashkernel is enough on most systems. If the system RAM size is
larger, extra memory can be added just in case, without much impact on
the system.

From our investigation, PCIe devices impact the crashkernel size, as does
the number of CPUs. There are always PCI devices whose drivers require
tens of KB of memory, or even MB. E.g. in the patch below, my colleague
Coiby found that the i40e network card costs as much as 1.5G of memory to
initialize its ring buffers on ppc, and 85M on x86_64.

[PATCH v1 0/3] Reducing memory usage of i40e for kdump
http://lists.infradead.org/pipermail/kexec/2021-March/022117.html

Even though not all PCI devices need a surprisingly large amount of
memory like i40e, a system with hundreds of PCI devices can also cost more
memory than expected. Such a system is usually a high-end server, where a
specific crashkernel value needs to be set manually.

So system RAM size is the least important factor influencing crashkernel
cost. Take my x1 laptop: even if I extended its RAM to 100TB, a 160M
crashkernel would still be enough. We would just like to add a tiny extra
amount to the crashkernel if the total RAM is very large; that's the rule
for crashkernel=auto. As for VMs, given their very few devices (virtio
disk, NAT NIC, etc.), no matter how much memory is deployed and hot
added/removed, the crashkernel size won't be influenced very much. That is
my personal understanding.

Thanks
Baoquan


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-08  8:51     ` Baoquan He
@ 2021-05-08  9:22       ` David Hildenbrand
  2021-05-10  4:53         ` Baoquan He
  0 siblings, 1 reply; 119+ messages in thread
From: David Hildenbrand @ 2021-05-08  9:22 UTC (permalink / raw)
  To: Baoquan He
  Cc: Andrew Morton, andreyknvl, christian.brauner, colin.king, corbet,
	dyoung, frederic, gpiccoli, john.p.donnelly, jpoimboe, keescook,
	linux-mm, masahiroy, mchehab+huawei, mike.kravetz, mingo,
	mm-commits, paulmck, peterz, rdunlap, rostedt, rppt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2

>> Let me take a look .... oh, there it is from 2009
>>
>> https://marc.info/?t=125006512600002&r=1&w=2
>>
>> and then we had it in 2018
>>
>> https://lkml.org/lkml/2018/5/20/262
> 
> Thanks for digging these two out, otherwise I may need do for people to
> know the history better.

Sure, I stumbled over this myself recently when wondering about what 
fadump is.


>> The issue I have with this: it's just plain wrong when you take memory
>> hotplug into serious account as we see it quite heavily in VMs. You don't
>> know what you'll need when building a kernel. Just pass it via the cmdline
> 
> Hmm, kdump may have no issue with memory hotplug in crashkernel
> reservation aspect. The system RAM size is not correlated to
> crashkernel size directly, that's why the default value in this patch is

"Not correlated directly" ...

"1G-64G:128M,64G-1T:256M,1T-:512M"

Am I still asleep and dreaming? :)


> not linear related to system RAM size. The proportion of crashkernel
> size to the total RAM size is thing we take into account. Usually
> crashkernel 160M is enough on most of systems. If system RAM size is
> larger, extra memory can be added just in case, and not bring much
> impact to system.

So, all the rules we have are essentially broken because they rely 
completely on the system RAM during boot.

> 
> With our investigation, PCIe devices impact the crashkernel size, and
> cpu number. There are always pci devices which driver require tens of KB
> meomry, even MB. E.g in below patch, my colleague Coiby found out the
> i40e network card even cost 1.5G memory to initialize its ringbuffer on
> ppc, and 85M on x86_64.
> 
> [PATCH v1 0/3] Reducing memory usage of i40e for kdump
> http://lists.infradead.org/pipermail/kexec/2021-March/022117.html
> 
> Even though not all pci devices need surprisingly large memory like
> i40e, system with hundreds of pci devices can also cost more memory than
> expected. This kind of system usually is high end server, specified
> crashkernel value need be set manually.
> 
> So system RAM size is the least important part to influence crashkernel

Aehm, not with fadump, no?

> costing. Say my x1 laptop, even though I extended the RAM to 100TB, 160M
> crashkernel is still enough. Just we would like to get a tiny extra part
> to add to crashkernel if the total RAM is very large, that's the rule
> for crashkernel=auto. As for VMs, given their very few devices, virtio
> disk, NAT nic, etc, no matter how much memory is deployed and hot
> added/removed, crashkernel size won't be influenced very much. My
> personal understanding about it.

That's an interesting observation. But you're telling me that we end up 
wasting memory for the crashkernel because "crashkernel=auto" which is 
supposed to do something magical good automatically does something very 
suboptimal? Oh my ... this is broken.

Long story short: crashkernel=auto is pure ugliness.


Why can't we construct a crashkernel in user space when 
installing/activating kdump and requiring a reboot for kdump to be 
active as long as that crashkernel setting is not properly respected?

Just have a look at the system properties (is_qemu(), #PCI, ...) and 
propose a value for "crashkernel=". Check that that value is at least 
active when activating kdump. Otherwise don't enable kdump and fail.

Yes, it can be difficult with some newer/older kernels having some 
different demands, but things shouldn't change drastically, and a distro 
can always update its advice along with the kernel, no?

You could even have a kernel interface that gives you the current 
crashkernel size (maybe already there) vs. the recommended crashkernel 
size. Make kdump or *whoever* activate that in the cmdline and let kdump 
check if both values are satisfied when booting up.
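
A minimal sketch of such a user-space check, assuming the current
reservation is read from the existing /sys/kernel/kexec_crash_size file
(size in bytes) and that the recommended value comes from wherever the
distro tooling computes it. The function names and the policy itself are
hypothetical, for illustration only:

```c
#include <stdio.h>

/* Existing sysfs file reporting the crashkernel reservation in bytes. */
#define KEXEC_CRASH_SIZE_PATH "/sys/kernel/kexec_crash_size"

/* Read one decimal number from 'path'; returns -1 if it cannot be read. */
long long read_crash_size(const char *path)
{
	FILE *f = fopen(path, "r");
	long long size;

	if (!f)
		return -1;
	if (fscanf(f, "%lld", &size) != 1)
		size = -1;
	fclose(f);
	return size;
}

/*
 * Hypothetical policy: only allow kdump to be activated when the
 * reservation made at boot is at least the recommended size.
 */
int kdump_may_be_enabled(long long current, long long recommended)
{
	return current >= 0 && current >= recommended;
}
```

A kdump service could call
kdump_may_be_enabled(read_crash_size(KEXEC_CRASH_SIZE_PATH), recommended)
at startup and refuse to activate (asking for a cmdline update and a
reboot) when it returns 0.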

Also: this approach here doesn't make any sense when you want to do 
something dependent on other cmdline parameters. Take "fadump=on" vs 
"fadump=off" as an example. You just cannot handle it properly as 
proposed in this patch. To me the approach in this patch makes least 
sense TBH.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-08  9:22       ` David Hildenbrand
@ 2021-05-10  4:53         ` Baoquan He
  2021-05-10  8:32           ` David Hildenbrand
  0 siblings, 1 reply; 119+ messages in thread
From: Baoquan He @ 2021-05-10  4:53 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, andreyknvl, christian.brauner, colin.king, corbet,
	dyoung, frederic, gpiccoli, john.p.donnelly, jpoimboe, keescook,
	linux-mm, masahiroy, mchehab+huawei, mike.kravetz, mingo,
	mm-commits, paulmck, peterz, rdunlap, rostedt, rppt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2

On 05/08/21 at 11:22am, David Hildenbrand wrote:
> > > Let me take a look .... oh, there it is from 2009
> > > 
> > > https://marc.info/?t=125006512600002&r=1&w=2
> > > 
> > > and then we had it in 2018
> > > 
> > > https://lkml.org/lkml/2018/5/20/262
> > 
> > Thanks for digging these two out, otherwise I may need do for people to
> > know the history better.
> 
> Sure, I stumbled over this myself recently when wondering about what fadump
> is.
> 
> 
> > > The issue I have with this: it's just plain wrong when you take memory
> > > hotplug into serious account as we see it quite heavily in VMs. You don't
> > > know what you'll need when building a kernel. Just pass it via the cmdline
> > 
> > Hmm, kdump may have no issue with memory hotplug in crashkernel
> > reservation aspect. The system RAM size is not correlated to
> > crashkernel size directly, that's why the default value in this patch is
> 
> "Not correlated directly" ...
> 
> "1G-64G:128M,64G-1T:256M,1T-:512M"
> 
> Am I still asleep and dreaming? :)

Well, I said 'Not correlated directly', then gave sentences to explain
the reason. I would like to repeat them:

1) Crashkernel needs more memory on some systems mainly because of
device drivers. Take any system: no matter how much you increase or
decrease the total system RAM size, the crashkernel size needed stays
the same.

  - The extreme case I have given is the i40e.
  - And the more devices, naturally the more memory is needed.

2) About "1G-64G:128M,64G-1T:256M,1T-:512M", I also said the values
differ because we take a very low proportion of extra memory to avoid
potential risk, which is cost effective. Here, adding another 90M is
0.13% of 64G, 0.0085% of 1TB.
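
The percentages in 2) can be checked with a few lines of arithmetic. This is
only a sketch, assuming binary units (1M = 2^20 bytes), which is how the
kernel interprets size suffixes; the helper name is made up for illustration.

```python
# Binary units, matching how the kernel interprets K/M/G/T size suffixes.
MiB = 1 << 20
GiB = 1 << 30
TiB = 1 << 40


def percent_of_ram(extra_bytes: int, ram_bytes: int) -> float:
    """Return extra_bytes as a percentage of ram_bytes."""
    return 100.0 * extra_bytes / ram_bytes


if __name__ == "__main__":
    extra = 90 * MiB
    print(f"{percent_of_ram(extra, 64 * GiB):.4f}% of 64G")
    print(f"{percent_of_ram(extra, 1 * TiB):.4f}% of 1T")
```

This gives roughly 0.137% of 64G and 0.0086% of 1T, in line with the figures
above.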

Hope it can help people sober up.

> 
> 
> > not linear related to system RAM size. The proportion of crashkernel
> > size to the total RAM size is thing we take into account. Usually
> > crashkernel 160M is enough on most of systems. If system RAM size is
> > larger, extra memory can be added just in case, and not bring much
> > impact to system.
> 
> So, all the rules we have are essentially broken because they rely
> completely on the system RAM during boot.

How do you get this?

Crashkernel=auto is a default value. PCs, VMs, normal workstations and
servers, which are the overall majority, work well with it. I can say the
number is 99%. Only a very few high-end workstations and servers which
contain many PCI devices need investigation to decide the crashkernel size.
A manual setting and a reboot may be needed for them. You call this
'essentially broken'? So what you later suggested, constructing the
crashkernel value in user space and rebooting, is not broken, even though
it's a similar thing? What is the logic behind your conclusion?

Crashkernel=auto mainly targets the majority of systems and helps people
w/o much knowledge of the kdump implementation to use it for debugging.

I can say more about the benefit of crashkernel=auto. On Fedora, the
community distro sponsored by Red Hat, kexec/kdump is also maintained
by us. The Fedora kernel is the mainline kernel, so no crashkernel=auto is
provided. We almost never get bug reports from users, which means almost
nobody uses it. We hope Fedora users' usage can help test the functionality
of the component.
> 
> > 
> > With our investigation, PCIe devices impact the crashkernel size, and
> > cpu number. There are always pci devices which driver require tens of KB
> > memory, even MB. E.g. in the below patch, my colleague Coiby found out the
> > i40e network card even cost 1.5G memory to initialize its ringbuffer on
> > ppc, and 85M on x86_64.
> > 
> > [PATCH v1 0/3] Reducing memory usage of i40e for kdump
> > http://lists.infradead.org/pipermail/kexec/2021-March/022117.html
> > 
> > Even though not all pci devices need surprisingly large memory like
> > i40e, system with hundreds of pci devices can also cost more memory than
> > expected. This kind of system usually is high end server, specified
> > crashkernel value need be set manually.
> > 
> > So system RAM size is the least important part to influence crashkernel
> 
> Aehm, not with fadump, no?

Fadump makes use of the crashkernel reservation, but has a different
dumping mechanism. It needs a kernel config too if this patch is accepted,
or it can add the value to the command line from a user space program; I
will talk about that later. This depends on IBM's decision. I have added
Hari to CC; they will make the best choice after consideration.

> 
> > costing. Say my x1 laptop, even though I extended the RAM to 100TB, 160M
> > crashkernel is still enough. Just we would like to get a tiny extra part
> > to add to crashkernel if the total RAM is very large, that's the rule
> > for crashkernel=auto. As for VMs, given their very few devices, virtio
> > disk, NAT nic, etc, no matter how much memory is deployed and hot
> > added/removed, crashkernel size won't be influenced very much. My
> > personal understanding about it.
> 
> That's an interesting observation. But you're telling me that we end up
> wasting memory for the crashkernel because "crashkernel=auto" which is
> supposed to do something magical good automatically does something very
> suboptimal? Oh my ... this is broken.
> 
> Long story short: crashkernel=auto is pure ugliness.

Very interesting. Your long story is clear to me, but your short story
confuses me a lot.

Let me try to sort this out and understand. In your first reply, you
asserted "it's plain wrong when taking memory hotplug serious account as
we see it quite heavily in VMs", which means you plainly don't know if it's
wrong, but you say it's plain wrong. I answered 'no, not at all' with a
detailed explanation, which is the plain opposite of your assertion.
So then you quickly came to 'crashkernel=auto is pure ugliness'. If a
simple crashkernel=auto is added to cover 99% of systems, and an advanced
operation only needs to be done for the rest, which is a tiny proportion,
and this is called pure ugliness, then what's pure beauty? And in saying
99%, I could be very conservative.

> 
> Why can't we construct a crashkernel in user space when
> installing/activating kdump and requiring a reboot for kdump to be active as
> long as that crashkernel setting is not properly respected?
> 
> Just have a look at the system properties (is_qemu(), #PCI, ...) and propose
> a value for "crashkernel=". Check that that value is at least active when
> activating kdump. Otherwise don't enable kdump and fail.
> 
> Yes, it can be difficult with some newer/older kernels having some different
> demands, but things should change drastically, and a distro can always
> update its advises along with the kernel, no?
> 
> You could even have a kernel interface that gives you the current
> crashkernel size (maybe already there) vs. the recommended crashkernel size.
> Make kdump or *whoever* activate that in the cmdline and let kdump check if
> both values are satisfied when booting up.

Now, let's go to your long story.

Yes. In case you haven't seen our patch on the Fedora kexec-tools mailing
list: your suggested approach is exactly the same thing we are doing.
Please check the patch below.

[PATCH v2] kdumpctl: Add kdumpctl estimate
https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/thread/YCEOJHQXKVEIVNB23M2TDAJGYVNP5MJZ/

We will provide a new feature in the user space script to let users check
whether their current crashkernel size is good or not. If not, they can
adjust accordingly.

But where does the current crashkernel size come from? Surely
crashkernel=auto. You wouldn't add a random crashkernel size, then
compare it with the recommended crashkernel size, then reboot, would you?
If crashkernel=auto gets the expected size, there is no need to reboot.
That means 99% of systems have no need to reboot. Only very few systems
need a reboot after checking the recommended size.

Long story short: crashkernel=auto will give a default value, trying to
cover most systems. (Very few high-end servers need to check if it's
enough and adjust with the help of user space tools, then reboot.)


> 
> Also: this approach here doesn't make any sense when you want to do
> something dependent on other cmdline parameters. Take "fadump=on" vs
> "fadump=off" as an example. You just cannot handle it properly as proposed
> in this patch. To me the approach in this patch makes least sense TBH.

Why? Don't we have this kind of judgement in the kernel? Crashkernel=auto
is a generic mechanism and was added much earlier. Fadump was added
later by IBM for their needs on ppc only; it relies on the crashkernel
reservation but uses a different dumping mechanism. If it needs a different
value than kdump, special handling is certainly needed. Who says it has to
be 'fadump=on'? They can check the value in a user space program and add it
to the cmdline as you suggested, or they can also make it part of auto.
Whatever is most suitable is best.

And I have several questions; I hope you can help answer them:

1) Have you ever seen crashkernel=auto broken on a virt platform?

I am asking this because you are from the Virt team, crashkernel=auto has
been in RHEL for many years, and we have been working with the Virt team to
support dumping. We haven't seen any bug report or complaint about
crashkernel=auto from Virt.

2) Option one: add crashkernel=auto, and use the kdumpctl estimate user
space program to get a recommended size, then reboot only if needed.
Option two: remove crashkernel=auto, and use only the kdumpctl estimate to
get a recommended size, always rebooting. In RHEL we will take the 1st
option. Are you willing to take the 2nd one for the Virt platform, since
you think crashkernel=auto is plain wrong, pure ugliness, essentially
broken, and makes least sense?

Thanks
Baoquan



* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-10  4:53         ` Baoquan He
@ 2021-05-10  8:32           ` David Hildenbrand
  2021-05-10 10:43             ` Baoquan He
  0 siblings, 1 reply; 119+ messages in thread
From: David Hildenbrand @ 2021-05-10  8:32 UTC (permalink / raw)
  To: Baoquan He
  Cc: Andrew Morton, andreyknvl, christian.brauner, colin.king, corbet,
	dyoung, frederic, gpiccoli, john.p.donnelly, jpoimboe, keescook,
	linux-mm, masahiroy, mchehab+huawei, mike.kravetz, mingo,
	mm-commits, paulmck, peterz, rdunlap, rostedt, rppt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2


>> "Not correlated directly" ...
>>
>> "1G-64G:128M,64G-1T:256M,1T-:512M"
>>
>> Am I still asleep and dreaming? :)
> 
> Well, I said 'Not correlated directly', then gave sentences to explain
> the reason. I would like to repeat them:
> 
> 1) Crashkernel needs more memory on some systems mainly because of
> device drivers. Take any system: no matter how much you increase or
> decrease the total system RAM size, the crashkernel size needed stays
> the same.
>
>    - The extreme case I have given is the i40e.
>    - And the more devices, naturally the more memory is needed.
> 
> 2) About "1G-64G:128M,64G-1T:256M,1T-:512M", I also said the different
> value is because taking very low proportion of extra memory to avoid
> potential risk, it's cost effective. Here, add another 90M which is
> 0.13% of 64G, 0.0085% of 1TB.

Just let me clarify the problem I am having with all of this:

We model the crashkernel size as a function of the memory size. Yet, 
it's pretty much independent of the memory size. That screams for "ugly".

The main problem is that early during boot we don't have a clue how much
crashkernel memory we may need. So what I see is that we are mostly
using a heuristic based on the memory size to guess how many devices we
might have. That just feels very wrong.

I can understand the reasoning of "using a fraction of the memory size
when booting up just to be on the safe side as we don't know", and that
motivation is much better than what I read so far. But then I wonder if
we cannot handle that any better? Because this feels very suboptimal to
me and I feel like there can be cases where the heuristic is just wrong.

As one example, can I add a whole bunch of devices to a 32GB VM and 
break "crashkernel=auto"?

As another example, when I boot a 64G VM, the crashkernel size will be 
512MB, although I really only might need 128MB. That's an effective 
overhead of 0.5%. And especially when we take memory ballooning etc. 
into account it can effectively be more than that.

Let's do a more detailed look. PPC64 in kernel-ark:

"2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G";

Assume I would only need 385M on a simple 16GB VM. We would have an 
overhead of ~4%. But maybe on ppc64 we do have to take the memory size 
into account (my assumption and, thus, my comment regarding memory hotplug)?
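
To make these numbers easier to check, here is a minimal Python sketch of how
a range rule of this form maps a RAM size to a reservation. It assumes the
semantics of the existing crashkernel=<range>:<size> syntax, where a range
covers start <= RAM < end and an empty end means no upper bound. The function
names are illustrative only, and this is not the kernel's parser.

```python
# Suffix multipliers, binary as in the kernel's size parsing.
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text: str) -> int:
    """Convert a string such as '128M' or '1T' to bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(text[:-1]) * UNITS[text[-1].upper()]
    return int(text)


def reservation_for(rule: str, ram_bytes: int) -> int:
    """Return the reservation for ram_bytes, or 0 if no range matches."""
    for entry in rule.split(","):
        ram_range, size = entry.split(":")
        start, _, end = ram_range.partition("-")
        lo = parse_size(start)
        hi = parse_size(end) if end else None  # "128G-" has no upper bound
        if ram_bytes >= lo and (hi is None or ram_bytes < hi):
            return parse_size(size)
    return 0


def overhead_percent(rule: str, ram_bytes: int, needed_bytes: int) -> float:
    """Memory reserved beyond what is needed, as a percentage of RAM."""
    reserved = reservation_for(rule, ram_bytes)
    return 100.0 * (reserved - needed_bytes) / ram_bytes


PPC64_RULE = "2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G"

if __name__ == "__main__":
    ram = parse_size("16G")
    print(reservation_for(PPC64_RULE, ram) // UNITS["M"], "MiB reserved")
    print(f"{overhead_percent(PPC64_RULE, ram, parse_size('385M')):.1f}% overhead")
```

Assuming exactly 16G of system RAM is reported, the 16G-64G range applies,
1G is reserved, and the overhead against a 385M requirement is about 3.9%.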


I wonder if we could always try allocating larger granularity (falling 
back to smaller if it fails), and once the kernel is able to come up 
with a better answer how many devices there are and, thus, how big the 
crashkernel area really should be, shrink the preallocated crashkernel 
(either from the kernel or from user space)? Not completely trivial but 
possible I think. It's trivial when we allocate a memmap for the 
crashkernel (I think we mostly do but I might be wrong).

Then "crashkernel=auto" would really do something magically good instead
of implementing some heuristic based on the memory size.

[...]
>> So, all the rules we have are essentially broken because they rely
>> completely on the system RAM during boot.
> 
> How do you get this?
> 
> Crashkernel=auto is a default value. PC, VMs, normal workstation and server
> which are the overall majority can work well with it. I can say the number
> is 99%. Only very few high end workstation, servers which contain
> many PCI devices need investigation to decide crashkernel size. A possible
> manual setting and rebooting is needed for them. You call this
> 'essentially broken'? So you later suggested constructing crashkernel value
> in user space and rebooting is not broken? Even though it's the similar
> thing? what is your logic behind your conclusion?

A kernel early during boot can only guess. A kernel late during boot 
knows. Please correct me if I'm wrong.
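
As an illustration of what is only known late: once the system is up, user
space can count the PCI devices the kernel has enumerated, for example
through the entries under /sys/bus/pci/devices. A minimal Python sketch
follows. The function name is made up, and how a device count would translate
into a crashkernel size is deliberately left open.

```python
import os


def count_pci_devices(sysfs_dir: str = "/sys/bus/pci/devices") -> int:
    """Count enumerated PCI devices; returns 0 if the directory cannot be read."""
    try:
        return len(os.listdir(sysfs_dir))
    except OSError:
        return 0


if __name__ == "__main__":
    print("PCI devices visible to the running kernel:", count_pci_devices())
```

None of this information exists at the point where the crashkernel area is
reserved, which is why the early decision has to rely on something else.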

> 
> Crashkernel=auto is mainly targeting most of systems, help people
> w/o much knowledge of kdump implementation to use it for debugging.
> 
> I can say more about the benefit of crashkernel=auto. On Fedora, the
> community distro sponsored by Red Hat, the kexec/kdump is also maintained
> by us. Fedora kernel is mainline kernel, so no crashkernel=auto
> provided. We almost never get bug report from users, means almost nobody
> uses it. We hope Fedora users' usage can help test functionality of
> component.

I know how helpful "crashkernel=auto" was so far, but I am also aware
that there was strong pushback in the past and, as I remember, for the
reasons I gave. IMHO we should refine that approach instead of trying to
push the same thing upstream every couple of years.

I ran into the "512MB crashkernel" on a 64G VM with memory ballooning 
issue already but didn't report a BZ, because so far, I was under the 
impression that more memory means more crashkernel. But you explained to 
me that I was just running into a (for my use case) bad heuristic.

>>> So system RAM size is the least important part to influence crashkernel
>>
>> Aehm, not with fadump, no?
> 
> Fadump makes use of crashkernel reservation, but has different mechanism
> to dumping. It needs a kernel config too if this patch is accepted, or
> it can add it to command line from a user space program, I will talk
> about that later. This depends on IBM's decision, I have added Hari to CC,
> they will make the best choice after consideration.
> 

I was looking at RHEL8, and there we have

fadump_cmdline = get_last_crashkernel(cmdline, "fadump=", NULL);
...
if (!fadump_cmdline || (strncmp(fadump_cmdline, "off", 3) == 0))
	ck_cmdline = ...
else
	ck_cmdline = ...

which was a runtime check for fadump.

Something that cannot be modeled properly at least with this patch here.
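
The runtime check above boils down to: take the last "fadump=" on the command
line and treat a missing value, or one starting with "off", as fadump being
disabled. A small Python sketch of that decision, as an illustration only:
the function name is hypothetical, the command line is split on whitespace
for simplicity, and the elided rule strings are left out.

```python
def fadump_enabled(cmdline: str) -> bool:
    """Last 'fadump=' wins; absent or starting with 'off' means disabled."""
    value = None
    for token in cmdline.split():
        if token.startswith("fadump="):
            value = token[len("fadump="):]  # later occurrences override earlier ones
    return value is not None and not value.startswith("off")


if __name__ == "__main__":
    for cmdline in ("root=/dev/sda1 crashkernel=auto",
                    "crashkernel=auto fadump=on",
                    "fadump=on crashkernel=auto fadump=off"):
        print(f"{cmdline!r}: fadump enabled = {fadump_enabled(cmdline)}")
```

Because the outcome depends on another command line parameter, the choice
between the two rules can only be made at boot, not when the kernel is built.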

>>
>>> costing. Say my x1 laptop, even though I extended the RAM to 100TB, 160M
>>> crashkernel is still enough. Just we would like to get a tiny extra part
>>> to add to crashkernel if the total RAM is very large, that's the rule
>>> for crashkernel=auto. As for VMs, given their very few devices, virtio
>>> disk, NAT nic, etc, no matter how much memory is deployed and hot
>>> added/removed, crashkernel size won't be influenced very much. My
>>> personal understanding about it.
>>
>> That's an interesting observation. But you're telling me that we end up
>> wasting memory for the crashkernel because "crashkernel=auto" which is
>> supposed to do something magical good automatically does something very
>> suboptimal? Oh my ... this is broken.
>>
>> Long story short: crashkernel=auto is pure ugliness.
> 
> Very interesting. Your long story is clear to me, but your short story
> confuses me a lot.
> 
> Let me try to sort out and understand. In your first reply, you asserted
> "it's plain wrong when taking memory hotplug serious account as
> we see it quite heavily in VMs", means you plain don't know if it's
> wrong, but you say it's plain wrong. I answered you 'no, not at all'
> with detailed explanation, means it's plain opposite to your assertion.

Yep, I might be partially wrong about the memory hotplug thingy, mostly
because I had the RHEL8 rule for ppc64 (including fadump) in mind. For
dynamic resizing of VMs, the current rules for VMs can be very sub-optimal.

Let's relax "plain wrong" to "the heuristic can be very suboptimal 
because it uses something mostly unrelated to come up with an answer". 
And it's simply not plain wrong because in practice it gets the job 
done. Mostly.


> So then you quickly came to 'crashkernel=auto is pure ugliness'. If a
> simple crashkernel=auto is added to cover 99% systems, and advanced
> operation only need be done for the rest which is tiny proportion,
> this is called pure ugliness, what's pure beauty? Here I say 99%, I
> could be very conservative.

I don't like wasting memory just because we cannot come up with a better 
heuristic. Yes, it somewhat gets the job done, but I call that ugly. My 
humble opinion.

[...]

> 
> Yes, if you haven't seen our patch in fedora kexec-tools mailing list,
> your suggested approach is the exactly same thing we are doing, please
> check below patch.
> 
> [PATCH v2] kdumpctl: Add kdumpctl estimate
> https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/thread/YCEOJHQXKVEIVNB23M2TDAJGYVNP5MJZ/
> 
> We will provide a new feature in user space script, to let user check if
> their current crashkernel size is good or not. If not, they can adjust
> accordingly.

That's good, thanks for the pointer -- wasn't aware of that.

> 
> But, where's the current crashkernel size coming from? Surely
> crashkernel=auto. You wouldn't add a random crashkernel size then
> compared with the recommended crashkernel size, then reboot, will you?
> If crashkernel=auto get the expected size, no need to reboot. Means 99%
> of systems has no need to reboot. Only very few of systems, need reboot
> after checking the recommended size.
> 
> Long story short. crashkernel=auto will give a default value, trying to
> cover most of systems. (Very few high end server need check if it's
> enough and adjust with the help of user space tools. Then reboot.)

Then we might really want to look into shrinking a possibly larger
allocation dynamically during boot.

>>
>> Also: this approach here doesn't make any sense when you want to do
>> something dependent on other cmdline parameters. Take "fadump=on" vs
>> "fadump=off" as an example. You just cannot handle it properly as proposed
>> in this patch. To me the approach in this patch makes least sense TBH.
> 
> Why? We don't have this kind of judgement in kernel? Crashkernel=auto is
> a generic mechanism, and has been added much earlier. Fadump was added
> later by IBM for their need on ppc only, it relies on crashkernel
> reservation but different mechanism of dumping. If it has different value
> than kdump, a special handling is certainly needed. Who tell it has to be
> 'fadump=on'? They can check the value in user space program and add into
> cmdline as you suggested, they can also make it into auto. The most suitable
> is the best.

Take a look at the RHEL8 handling to see where my comment is coming from.

> 
> And I have several questions to ask, hope you can help answer:
> 
> 1) Have you ever met crashkernel=auto broken on virt platform?

I have encountered it being very suboptimal. I call wasting hundreds of
MB problematic, especially when dynamically resizing VMs (for example,
using memory ballooning).

> 
> Asking this because you are from Virt team, and crashkernel=auto has been
> there in RHEL for many years, and we have been working with Virt team to
> support dumping. We haven't seen any bug report or complaint about
> crashkernel=auto from Virt.

I've had plenty of bug reports where people try inflating the balloon 
fairly heavily but don't take the crashkernel size into account. The 
bigger the crashkernel size, the bigger the issue when people try 
squeezing the last couple of MB out of their VMs. I keep repeating to 
them "with crashkernel=auto, you have to be careful about how much 
memory might get set aside for the crashkernel and, therefore, reduces 
your effective guest OS RAM size and reduces the maximum balloon size".

> 
> 2) Adding crashkernel=auto, and the kdumpctl estimate as user space
> program to get a recommended size, then reboot. Removing crashkernel=auto,
> only the kdumpctl estimate to get a recommended size, always reboot.
> In RHEL we will take the 1st option. Are you willing to take the 2nd one
> for Virt platform since you think crashkernel=auto is plain wrong, pure
> ugliness, essentially broken, least sense?

We are talking about upstreaming stuff here and I am wearing my upstream 
hat here. I'm stating (just like people decades ago) that this might not 
be the right approach for upstream, at least not as it stands.

And no, I don't have time to solve problems/implement solutions/upstream 
patches to tackle fundamental issues that have been there for decades.

I'll be happy to help looking into dynamic shrinking of the crashkernel 
size if that approach makes sense. We could even let user space trigger 
that resizing -- without a reboot.

-- 
Thanks,

David / dhildenb



* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-10  8:32           ` David Hildenbrand
@ 2021-05-10 10:43             ` Baoquan He
  2021-05-10 11:01               ` David Hildenbrand
  0 siblings, 1 reply; 119+ messages in thread
From: Baoquan He @ 2021-05-10 10:43 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, andreyknvl, christian.brauner, colin.king, corbet,
	dyoung, frederic, gpiccoli, john.p.donnelly, jpoimboe, keescook,
	linux-mm, masahiroy, mchehab+huawei, mike.kravetz, mingo,
	mm-commits, paulmck, peterz, rdunlap, rostedt, rppt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2

On 05/10/21 at 10:32am, David Hildenbrand wrote:
> 
> > > "Not correlated directly" ...
> > > 
> > > "1G-64G:128M,64G-1T:256M,1T-:512M"
> > > 
> > > Am I still asleep and dreaming? :)
> > 
> > Well, I said 'Not correlated directly', then gave sentences to explain
> > the reason. I would like to repeat them:
> > 
> > 1) Crashkernel needs more memory on some systems mainly because of
> > device drivers. Take any system: no matter how much you increase or
> > decrease the total system RAM size, the crashkernel size needed stays
> > the same.
> >
> >    - The extreme case I have given is the i40e.
> >    - And the more devices, naturally the more memory is needed.
> > 
> > 2) About "1G-64G:128M,64G-1T:256M,1T-:512M", I also said the different
> > value is because taking very low proportion of extra memory to avoid
> > potential risk, it's cost effective. Here, add another 90M which is
> > 0.13% of 64G, 0.0085% of 1TB.
> 
> Just let me clarify the problem I am having with all of this:
> 
> We model the crashkernel size as a function of the memory size. Yet, it's
> pretty much independent of the memory size. That screams for "ugly".
> 
> The main problem is that early during boot we don't have a clue how much
> crashkernel memory we may need. So what I see is that we are mostly using a
> heuristic based on the memory size to come up with the right answer how many
> devices we might have. That just feels very wrong.
> 
> I can understand the reasoning of "using a fraction of the memory size" when
> booting up just to be on the safe side as we don't know", and that
> motivation is much better than what I read so far. But then I wonder if we
> cannot handle that any better? Because this feels very suboptimal to me and
> I feel like there can be cases where the heuristic is just wrong.

Yes, I understand what you said. Our headache mainly comes from bare metal
systems, where we worry that the reservation is not enough because of the
many devices.

On VMs, it is truly different. With far fewer devices, it does waste some
memory. Usually a fixed minimal size can cover 99.9% of systems unless
too many devices are attached/added to the VM; I am not sure how likely
that is. However, with the help of /sys/kernel/kexec_crash_size,
you can shrink the reservation to a size that is small but still
sufficient. You may just need to reload the kdump kernel afterwards,
because the previously loaded kernel would have been erased and be out of
control. The shrinking should be done at an early stage of kernel running,
I would say, lest a crash happen during that period.
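
The shrinking described above might look roughly like this from user space.
This is a minimal Python sketch, assuming /sys/kernel/kexec_crash_size
reports the reserved size in bytes and accepts a smaller value to release the
difference. The helper name is made up and writing needs root. To my
understanding the kernel rejects the write while a crash kernel is loaded, so
the kdump kernel would be unloaded first and reloaded afterwards.

```python
KEXEC_CRASH_SIZE = "/sys/kernel/kexec_crash_size"


def shrink_crashkernel(new_size: int, path: str = KEXEC_CRASH_SIZE) -> int:
    """Shrink the crashkernel reservation to new_size bytes.

    Returns the size read back afterwards. Refuses to grow the reservation,
    since the interface only supports shrinking.
    """
    with open(path) as f:
        current = int(f.read().strip())
    if new_size > current:
        raise ValueError(f"cannot grow reservation from {current} to {new_size}")
    if new_size < current:
        with open(path, "w") as f:
            f.write(str(new_size))
    with open(path) as f:
        return int(f.read().strip())
```

For example, shrink_crashkernel(160 * 1024 * 1024) would ask for a 160M
reservation. The path parameter exists only so the helper can be tried
against an ordinary file.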

We once tried several different ways to enlarge the crashkernel size
dynamically, but didn't come up with a good one.

> 
> As one example, can I add a whole bunch of devices to a 32GB VM and break
> "crashkernel=auto"?
> 
> As another example, when I boot a 64G VM, the crashkernel size will be
> 512MB, although I really only might need 128MB. That's an effective overhead
> of 0.5%. And especially when we take memory ballooning etc. into account it
> can effectively be more than that.
> 
> Let's do a more detailed look. PPC64 in kernel-ark:
> 
> "2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G";

Yes, the waste mainly happens on ppc. Its 64K page size causes the
difference from other ARCHes. On x86_64 and s390, it's much better: assuming
most VMs won't have more than 64G of memory, their crashkernel size will
be 160M most of the time.

> 
> Assume I would only need 385M on a simple 16GB VM. We would have an overhead
> of ~4%. But maybe on ppc64 we do have to take the memory size into account
> (my assumption and, thus, my comment regarding memory hotplug)?
> 
> 
> I wonder if we could always try allocating larger granularity (falling back
> to smaller if it fails), and once the kernel is able to come up with a
> better answer how many devices there are and, thus, how big the crashkernel
> area really should be, shrink the preallocated crashkernel (either from the
> kernel or from user space)? Not completely trivial but possible I think.
> It's trivial when we allocate a memmap for the crashkernel (I think we
> mostly do but I might be wrong).
> 
> The "crashkernel=auto" would really do something magical good instead of
> implement some heuristic based on the memory size.
> 
> [...]
> > > So, all the rules we have are essentially broken because they rely
> > > completely on the system RAM during boot.
> > 
> > How do you get this?
> > 
> > Crashkernel=auto is a default value. PC, VMs, normal workstation and server
> > which are the overall majority can work well with it. I can say the number
> > is 99%. Only very few high end workstation, servers which contain
> > many PCI devices need investigation to decide crashkernel size. A possible
> > manual setting and rebooting is needed for them. You call this
> > 'essentially broken'? So you later suggested constructing crashkernel value
> > in user space and rebooting is not broken? Even though it's the similar
> > thing? what is your logic behind your conclusion?
> 
> A kernel early during boot can only guess. A kernel late during boot knows.
> Please correct me if I'm wrong.

Well, I would not say it's a guess; I would rather call them empirical
values from statistical data. With an a priori value given by 'auto',
normal users of kdump basically don't need to care about the setting.
E.g. on Fedora, 'auto' can cover all systems, assuming nobody would deploy
it on a high-end server. Everything we do is to make things simple enough.
If you don't know how to set it, just add 'crashkernel=auto' to the cmdline,
and everything is done. I believe you agree that not everybody would
like to dig into kexec/kdump just to find out how big the crashkernel size
needs to be when they want to use the kdump functionality.

> 
> > 
> > Crashkernel=auto is mainly targeting most of systems, help people
> > w/o much knowledge of kdump implementation to use it for debugging.
> > 
> > I can say more about the benefit of crashkernel=auto. On Fedora, the
> > community distro sponsored by Red Hat, the kexec/kdump is also maintained
> > by us. Fedora kernel is mainline kernel, so no crashkernel=auto
> > provided. We almost never get bug report from users, means almost nobody
> > uses it. We hope Fedora users' usage can help test functionality of
> > component.
> 
> I know how helpful "crashkernel=auto" was so far, but I am also aware that
> there was strong pushback in the past, and I remember for the reasons I
> gave. IMHO we should refine that approach instead of trying to push the same
> thing upstream every couple of years.
> 
> I ran into the "512MB crashkernel" on a 64G VM with memory ballooning issue
> already but didn't report a BZ, because so far, I was under the impression
> that more memory means more crashkernel. But you explained to me that I was
> just running into a (for my use case) bad heuristic.

I re-read the old posts and didn't see strong push-back. People just gave
some different ideas instead. While we were silent, we tried different
ways, e.g. enlarging the crashkernel at run time as mentioned above, but
failed. Reusing free pages and user space pages of the 1st kernel in the
kdump kernel also failed. We also asked people whether it's doable to
remove 'auto' support; nobody would give an affirmative answer. I know
SUSE has been using the way you mentioned to get a recommended size for a
long time, but it needs several more steps and a reboot. We prefer to take
that way too, as an improvement. The simpler, the better.

Besides, 'auto' doesn't introduce tons of complicated code, and it is not
something we thought up on a whim and are now trying to push to pollute
the kernel.

> 
> > > > So system RAM size is the least important part to influence crashkernel
> > > 
> > > Aehm, not with fadump, no?
> > 
> > Fadump makes use of crashkernel reservation, but has different mechanism
> > to dumping. It needs a kernel config too if this patch is accepted, or
> > it can add it to command line from a user space program, I will talk
> > about that later. This depends on IBM's decision, I have added Hari to CC,
> > they will make the best choice after consideration.
> > 
> 
> I was looking at RHEL8, and there we have
> 
> fadump_cmdline = get_last_crashkernel(cmdline, "fadump=", NULL);
> ...
> if (!fadump_cmdline || (strncmp(fadump_cmdline, "off", 3) == 0))
> 	ck_cmdline = ...
> else
> 	ck_cmdline = ...
> 
> which was a runtime check for fadump.
> 
> Something that cannot be modeled properly at least with this patch here.

Yes, I believe it won't be done like that. A static detection or a
global switch variable can solve it.

> 
> > > 
> > > > costing. Say my x1 laptop, even though I extended the RAM to 100TB, 160M
> > > > crashkernel is still enough. Just we would like to get a tiny extra part
> > > > to add to crashkernel if the total RAM is very large, that's the rule
> > > > for crashkernel=auto. As for VMs, given their very few devices, virtio
> > > > disk, NAT nic, etc, no matter how much memory is deployed and hot
> > > > added/removed, crashkernel size won't be influenced very much. My
> > > > personal understanding about it.
> > > 
> > > That's an interesting observation. But you're telling me that we end up
> > > wasting memory for the crashkernel because "crashkernel=auto" which is
> > > supposed to do something magical good automatically does something very
> > > suboptimal? Oh my ... this is broken.
> > > 
> > > Long story short: crashkernel=auto is pure ugliness.
> > 
> > Very interesting. Your long story is clear to me, but your short story
> > confuses me a lot.
> > 
> > Let me try to sort out and understand. In your first reply, you asserted
> > "it's plain wrong when taking memory hotplug serious account as
> > we see it quite heavily in VMs", means you plain don't know if it's
> > wrong, but you say it's plain wrong. I answered you 'no, not at all'
> > with detailed explanation, means it's plain opposite to your assertion.
> 
> Yep, I might be partially wrong about memory hotplug thingy, mostly because
> I had the RHEL8 rule for ppc64 (including fadump) in mind. For dynamic
> resizing of VMs, the current rules for VMs can be very sub-optimal.
> 
> Let's relax "plain wrong" to "the heuristic can be very suboptimal because
> it uses something mostly unrelated to come up with an answer". And it's
> simply not plain wrong because in practice it gets the job done. Mostly.
> 
> 
> > So then you quickly came to 'crashkernel=auto is pure ugliness'. If a
> > simple crashkernel=auto is added to cover 99% systems, and advanced
> > operation only need be done for the rest which is tiny proportion,
> > this is called pure ugliness, what's pure beauty? Here I say 99%, I
> > could be very conservative.
> 
> I don't like wasting memory just because we cannot come up with a better
> heuristic. Yes, it somewhat gets the job done, but I call that ugly. My
> humble opinion.
> 
> [...]
> 
> > 
> > Yes, if you haven't seen our patch in fedora kexec-tools mailing list,
> > your suggested approach is the exactly same thing we are doing, please
> > check below patch.
> > 
> > [PATCH v2] kdumpctl: Add kdumpctl estimate
> > https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/thread/YCEOJHQXKVEIVNB23M2TDAJGYVNP5MJZ/
> > 
> > We will provide a new feature in user space script, to let user check if
> > their current crashkernel size is good or not. If not, they can adjust
> > accordingly.
> 
> That's good, thanks for the pointer -- wasn't aware of that.
> 
> > 
> > But, where's the current crashkernel size coming from? Surely
> > crashkernel=auto. You wouldn't add a random crashkernel size then
> > compared with the recommended crashkernel size, then reboot, will you?
> > If crashkernel=auto get the expected size, no need to reboot. Means 99%
> > of systems has no need to reboot. Only very few of systems, need reboot
> > after checking the recommended size.
> > 
> > Long story short. crashkernel=auto will give a default value, trying to
> > cover most of systems. (Very few high end server need check if it's
> > enough and adjust with the help of user space tools. Then reboot.)
> 
> Then we might really want to investigate into shrinking a possibly larger
> allocation dynamically during boot.
> 
> > > 
> > > Also: this approach here doesn't make any sense when you want to do
> > > something dependent on other cmdline parameters. Take "fadump=on" vs
> > > "fadump=off" as an example. You just cannot handle it properly as proposed
> > > in this patch. To me the approach in this patch makes least sense TBH.
> > 
> > Why? We don't have this kind of judgement in kernel? Crashkernel=auto is
> > a generic mechanism, and has been added much earlier. Fadump was added
> > later by IBM for their need on ppc only, it relies on crashkernel
> > reservation but different mechanism of dumping. If it has different value
> > than kdump, a special handling is certainly needed. Who tell it has to be
> > 'fadump=on'? They can check the value in user space program and add into
> > cmdline as you suggested, they can also make it into auto. The most suitable
> > is the best.
> 
> Take a look at the RHEL8 handling to see where my comment is coming from.
> 
> > 
> > And I have several questions to ask, hope you can help answer:
> > 
> > 1) Have you ever met crashkernel=auto broken on virt platform?
> 
> I have encountered it being very suboptimal. I call wasting hundreds of MB
> problematic, especially when dynamically resizing of VMs (for example, using
> memory ballooning)
> 
> > 
> > Asking this because you are from Virt team, and crashkernel=auto has been
> > there in RHEL for many years, and we have been working with Virt team to
> > support dumping. We haven't seen any bug report or complaint about
> > crashkernel=auto from Virt.
> 
> I've had plenty of bug reports where people try inflating the balloon fairly
> heavily but don't take the crashkernel size into account. The bigger the
> crashkernel size, the bigger the issue when people try squeezing the last
> couple of MB out of their VMs. I keep repeating to them "with
> crashkernel=auto, you have to be careful about how much memory might get set
> aside for the crashkernel and, therefore, reduces your effective guest OS
> RAM size and reduces the maximum balloon size".
> 
> > 
> > 2) Adding crashkernel=auto, and the kdumpctl estimate as user space
> > program to get a recommended size, then reboot. Removing crashkernel=auto,
> > only the kdumpctl estimate to get a recommended size, always reboot.
> > In RHEL we will take the 1st option. Are you willing to take the 2nd one
> > for Virt platform since you think crashkernel=auto is plain wrong, pure
> > ugliness, essentially broken, least sense?
> 
> We are talking about upstreaming stuff here and I am wearing my upstream hat
> here. I'm stating (just like people decades ago) that this might not be the
> right approach for upstream, at least not as it stands.
> 
> And no, I don't have time to solve problems/implement solutions/upstream
> patches to tackle fundamental issues that have been there for decades.
> 
> I'll be happy to help looking into dynamic shrinking of the crashkernel size
> if that approach makes sense. We could even let user space trigger that
> resizing -- without a reboot.

I won't reply to each inline comment since I believe they have been
covered by the earlier reply. Thanks for looking into this and sharing
your thoughts, and for letting us know that you really do care about the
extra memory on VMs. We had noticed that overhead, but hadn't realized
it actually causes issues.

Thanks
Baoquan


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-10 10:43             ` Baoquan He
@ 2021-05-10 11:01               ` David Hildenbrand
  2021-05-10 11:44                 ` Dave Young
  0 siblings, 1 reply; 119+ messages in thread
From: David Hildenbrand @ 2021-05-10 11:01 UTC (permalink / raw)
  To: Baoquan He
  Cc: Andrew Morton, andreyknvl, christian.brauner, colin.king, corbet,
	dyoung, frederic, gpiccoli, john.p.donnelly, jpoimboe, keescook,
	linux-mm, masahiroy, mchehab+huawei, mike.kravetz, mingo,
	mm-commits, paulmck, peterz, rdunlap, rostedt, rppt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2, Michal Hocko


>> I can understand the reasoning of "using a fraction of the memory size" when
>> booting up just to be on the safe side as we don't know", and that
>> motivation is much better than what I read so far. But then I wonder if we
>> cannot handle that any better? Because this feels very suboptimal to me and
>> I feel like there can be cases where the heuristic is just wrong.
> 
> Yes, I understand what you said. Our headache is mainly from bare metal
> system worrying the reservation is not enough because of many devices.
> 
> On VM, it is truly different. With much less devices, it does waste some
> memory. Usually a fixed minimal size can cover 99.9% of system unless
> too many devices attached/added to VM, I am not sure what's the
> probability it could happen. While, by the help of /sys/kernel/kexec_crash_size,
> you can shrink it to an small enough but available size. Just you may
> need to reload kdump kernel because the loaded kernel should have been
> erased and out of control. The shrinking should be done at early stage of
> kernel running, I would say, lest crash may happen during that period.
> 
> We ever tried several different ways to enlarge the crashkernel size
> dynamically, but didn't think of a good way.

Yes, enlarging it at runtime is much more difficult than shrinking.

[...]

>> A kernel early during boot can only guess. A kernel late during boot knows.
>> Please correct me if I'm wrong.
> 
> Well, I would not say it's guess, and would like to call them empirical
> values from statistical data. With a priori value given by 'auto',
> basically normal users of kdump don't need to care about the setting.
> E.g on Fedora, 'auto' can cover all systems, assume nobody would deploy
> it on a high end server. Everything we do is to make thing simple enough.
> If you don't know how to set, just add 'crashkernel=auto' to cmdline,
> then everything is done. I believe you agree that not everybody would
> like to dig into kexec/kdump just for getting how big crashkernel size
> need be set when they want to use kdump functionality.

Oh absolutely. But OTOH, most users will leave the value untouched if it
works -- and complain, at least in the VM environment, to me about the
surprising waste of system RAM with "crashkernel=auto".

[...]

>> I know how helpful "crashkernel=auto" was so far, but I am also aware that
>> there was strong pushback in the past, and I remember for the reasons I
>> gave. IMHO we should refine that approach instead of trying to push the same
>> thing upstream every couple of years.
>>
>> I ran into the "512MB crashkernel" on a 64G VM with memory ballooning issue
>> already but didn't report a BZ, because so far, I was under the impression
>> that more memory means more crashkernel. But you explained to me that I was
>> just running into a (for my use case) bad heuristic.
> 
> I re-read the old posts, didn't see strong push-back. People just gave
> some different ideas instead. When we were silent, we tried different
> way, e.g the enlarging crashkernel at run time as told at above, but
> failed. Reusing free pages and user space pages of 1st kernel in kdump
> kernel, also failed. We also talked with people to consult if it's

Thanks for an insight into the history.

> doable to remove 'auto' support, nobody would like to give an affirmative
> answer. I know SUSE is using the way you mentioned to get a recommended
> size for long time, but it needs several more steps and need reboot. We
> prefer to take that way too as an improvement. The simpler, the better.

At least I'm happy to hear that other people had the same idea as me ;)

I can understand the desire for simplicity. It would be great to hear
SUSE's perception of the problem and how they would ideally want to move
forward with this.

[...]

>> I'll be happy to help looking into dynamic shrinking of the crashkernel size
>> if that approach makes sense. We could even let user space trigger that
>> resizing -- without a reboot.
> 
> Don't reply each inline comment since I believe they have been covered
> by the earlier reply. Thanks for looking to this and telling your
> thought, to let us know that in fact you really care about the extra
> memory on VMs which we have realized, but didn't realized it really cause
> issue.

I mess with dynamic resizing of VMs, that's why I usually take a closer
look at all things that do stuff based on the initial VM size; yes,
there are still a lot of other such things out there.

It also bugged me for quite a bit that we don't have a sane way to 
achieve what we're doing here upstream. It somewhat feels like "this 
doesn't belong in the kernel and is user policy" but then, the existing 
kernel support is suboptimal.

Maybe reserving some "maybe too big but okayish to boot the system in a 
sane environment -- e.g., X% of system RAM and at least Y" size first 
and shrinking it later as triggered by user space early (where we do 
seem to have a way to pre-calculate things now) might actually be a good 
direction to look into.
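
To make the "reserve big first, shrink later" idea a bit more concrete,
here is a minimal user space sketch in Python. It assumes that the
existing /sys/kernel/kexec_crash_size file (mentioned above) is the
shrink interface and that some tool has already come up with an
estimate. The function name and the way the estimate is obtained are
made up for illustration; this is not what any existing tool does.

```python
def shrink_crashkernel(estimated_need, size_file="/sys/kernel/kexec_crash_size"):
    """Shrink the crashkernel reservation to estimated_need bytes.

    Hypothetical helper: reads the current reservation size from
    size_file and writes the smaller value back. Returns the size
    that is in effect afterwards.
    """
    with open(size_file) as f:
        current = int(f.read().strip())

    # Only shrinking is attempted; a larger estimate leaves things as is.
    if estimated_need >= current:
        return current

    with open(size_file, "w") as f:
        f.write(str(estimated_need))
    return estimated_need
```

As noted above, the kdump kernel may have to be (re)loaded after
shrinking; the sketch leaves that part out.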

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-10 11:01               ` David Hildenbrand
@ 2021-05-10 11:44                 ` Dave Young
  2021-05-10 11:56                   ` David Hildenbrand
  0 siblings, 1 reply; 119+ messages in thread
From: Dave Young @ 2021-05-10 11:44 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Baoquan He, Andrew Morton, andreyknvl, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, rppt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko

Hi David,
On 05/10/21 at 01:01pm, David Hildenbrand wrote:
[snip]
> It also bugged me for quite a bit that we don't have a sane way to achieve
> what we're doing here upstream. It somewhat feels like "this doesn't belong
> in the kernel and is user policy" but then, the existing kernel support is
> suboptimal.
> 
> Maybe reserving some "maybe too big but okayish to boot the system in a sane
> environment -- e.g., X% of system RAM and at least Y" size first and
> shrinking it later as triggered by user space early (where we do seem to
> have a way to pre-calculate things now) might actually be a good direction
> to look into.

Hmm, that is also an option we considered before.  Even for your
suggestion we still need a kernel option to set the default ratio/value,
and the ratio/value should go into another patch which extends the
crashkernel syntax.

Actually the kconfig help text in this patch is indeed misleading. The
patch is not introducing crashkernel=a:b..., so there is no need to
explain the crashkernel syntax there. The config option is just an
interface through which any valid crashkernel setting can be supplied as
the default. So the current help text describes the default value of the
crash auto str, instead of describing what the crash auto str is.

And crashkernel=auto makes this more flexible. We can tune the values
easily when upgrading.  But if we pass a fixed value from userspace, we
cannot know whether the value was set automatically by the distribution
or manually by the user, thus we cannot blindly update it.

Thanks
Dave


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-10 11:44                 ` Dave Young
@ 2021-05-10 11:56                   ` David Hildenbrand
  2021-05-11 13:36                     ` Baoquan He
  2021-05-12  7:42                     ` Dave Young
  0 siblings, 2 replies; 119+ messages in thread
From: David Hildenbrand @ 2021-05-10 11:56 UTC (permalink / raw)
  To: Dave Young
  Cc: Baoquan He, Andrew Morton, andreyknvl, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, rppt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko

On 10.05.21 13:44, Dave Young wrote:
> Hi David,

Hi Dave,

> On 05/10/21 at 01:01pm, David Hildenbrand wrote:
> [snip]
>> It also bugged me for quite a bit that we don't have a sane way to achieve
>> what we're doing here upstream. It somewhat feels like "this doesn't belong
>> in the kernel and is user policy" but then, the existing kernel support is
>> suboptimal.
>>
>> Maybe reserving some "maybe too big but okayish to boot the system in a sane
>> environment -- e.g., X% of system RAM and at least Y" size first and
>> shrinking it later as triggered by user space early (where we do seem to
>> have a way to pre-calculate things now) might actually be a good direction
>> to look into.
> 
> Hmm, that is also an option we considered before.  Even for your
> suggestion we still need a kernel option to set the default ratio/value.
> and the ratio/value should be another patch which expands crashkernel
> syntax.

Right.

> 
> Actually the kconfig help text in this patch is indeed misleading, it is
> not introducing crashkernel=a:b... and no need to explain about the
> crashkernel syntax, the config option is actually just some interface we
> can add any valid crashkernel settings to be used by default. So current
> patch help text describes the default value of crash auto str, instead
> of describes what crash auto str is.

Right. And I would much rather prefer either

a) handling "auto" completely in the kernel, not just setting some 
questionable default at compile time
b) passing it explicitly in via the cmdline

> 
> And crashkernel=auto makes this more flexibly. We can tune the values
> easily when upgrading.  But if we pass a fixed value in userspace we
> can not know if the value is set by distribution automatically or by user
> manually thus we can not blindly update it.

I think there are two different cases:


1. kernel space updates the value later during boot. "crashkernel=auto" 
really does the right thing, meaning

a) allocate something reasonable and safe during early boot
b) update the allocation during late boot when we know what kind of 
system we're running on

Then, we indeed care about "crashkernel=auto" in the kernel and I think 
it would be a nice thing to have. The only question is on how to make 
that a little configurable, depending on different thingies we might 
want to run in the crashkernel (assuming someone doesn't want kdump).


2. user space updates the value later during boot

IMHO we don't really care who decided on the value as we do the update
from user space. If an admin messes with crashkernel=, the admin can 
also mess with kdump not doing any overwrites (e.g., make that 
configurable, or detect the overwrite in kdump somehow).

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-10 11:56                   ` David Hildenbrand
@ 2021-05-11 13:36                     ` Baoquan He
  2021-05-11 16:31                       ` Mike Rapoport
  2021-05-12  7:42                     ` Dave Young
  1 sibling, 1 reply; 119+ messages in thread
From: Baoquan He @ 2021-05-11 13:36 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Dave Young, Andrew Morton, christian.brauner, colin.king, corbet,
	frederic, gpiccoli, john.p.donnelly, jpoimboe, keescook,
	linux-mm, masahiroy, mchehab+huawei, mike.kravetz, mingo,
	mm-commits, paulmck, peterz, rdunlap, rostedt, rppt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2, Michal Hocko, kasong

On 05/10/21 at 01:56pm, David Hildenbrand wrote:
> On 10.05.21 13:44, Dave Young wrote:
> > Hi David,
> 
> Hi Dave,
> 
> > On 05/10/21 at 01:01pm, David Hildenbrand wrote:
> > [snip]
> > > It also bugged me for quite a bit that we don't have a sane way to achieve
> > > what we're doing here upstream. It somewhat feels like "this doesn't belong
> > > in the kernel and is user policy" but then, the existing kernel support is
> > > suboptimal.
> > > 
> > > Maybe reserving some "maybe too big but okayish to boot the system in a sane
> > > environment -- e.g., X% of system RAM and at least Y" size first and
> > > shrinking it later as triggered by user space early (where we do seem to
> > > have a way to pre-calculate things now) might actually be a good direction
> > > to look into.
> > 
> > Hmm, that is also an option we considered before.  Even for your
> > suggestion we still need a kernel option to set the default ratio/value.
> > and the ratio/value should be another patch which expands crashkernel
> > syntax.
> 
> Right.
> 
> > 
> > Actually the kconfig help text in this patch is indeed misleading, it is
> > not introducing crashkernel=a:b... and no need to explain about the
> > crashkernel syntax, the config option is actually just some interface we
> > can add any valid crashkernel settings to be used by default. So current
> > patch help text describes the default value of crash auto str, instead
> > of describes what crash auto str is.
> 
> Right. And I would much rather prefer either
> 
> a) handling "auto" completely in the kernel, not just setting some
> questionable default at compile time

Thanks for the suggestions.

If adding a default value to the kernel config is disliked, this
option a) looks good. We can compute the value as x% of system RAM, but
clamp it with CRASH_KERNEL_MIN/MAX. CRASH_KERNEL_MIN/MAX may need to be
defined with a default value for each ARCH. That is very close to
our current implementation, and handles 'auto' in the kernel.

And could a kernel config be provided so that people can tune the MIN/MAX
values without having to post a patch each time tuning is needed?
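
To illustrate the clamping, here is a rough model in Python rather than
kernel code. The percentage and the MIN/MAX numbers used in the example
call are placeholders picked only for illustration, not proposed
defaults, and the function name is made up.

```python
MIB = 1 << 20
GIB = 1 << 30

def auto_crashkernel_size(system_ram, percent, min_size, max_size):
    """Take percent% of system RAM, clamped to [min_size, max_size].

    min_size and max_size stand in for CRASH_KERNEL_MIN/MAX; all
    values are in bytes.
    """
    wanted = system_ram * percent // 100
    return max(min_size, min(wanted, max_size))

# Placeholder numbers: 2% of RAM, at least 160M, at most 512M.
size = auto_crashkernel_size(16 * GIB, 2, 160 * MIB, 512 * MIB)
```

With these placeholder numbers a small system ends up at the minimum and
a very large one at the maximum; only systems in between scale with RAM.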


> b) passing it explicitly in via the cmdline
> 
> > 
> > And crashkernel=auto makes this more flexibly. We can tune the values
> > easily when upgrading.  But if we pass a fixed value in userspace we
> > can not know if the value is set by distribution automatically or by user
> > manually thus we can not blindly update it.
> 
> I think there are two different cases:
> 
> 
> 1. kernel space updates the value later during boot. "crashkernel=auto"
> really does the right thing, meaning
> 
> a) allocate something reasonable and safe during early boot
> b) update the allocation during late boot when we know what kind of system
> we're running on
> 
> Then, we indeed care about "crashkernel=auto" in the kernel and I think it
> would be a nice thing to have. The only question is on how to make that a
> little configurable, depending on different thingies we might want to run in
> the crashkernel (assuming someone doesn't want kdump).
> 
> 
> 2. user space updates the value later during boot
> 
> IMHO we don't really car who decided on the value as we do the update from
> user space. If an admin messes with crashkernel=, the admin can also mess
> with kdump not doing any overwrites (e.g., make that configurable, or detect
> the overwrite in kdump somehow).
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-11 13:36                     ` Baoquan He
@ 2021-05-11 16:31                       ` Mike Rapoport
  2021-05-11 17:07                         ` David Hildenbrand
  2021-05-12 14:13                         ` Baoquan He
  0 siblings, 2 replies; 119+ messages in thread
From: Mike Rapoport @ 2021-05-11 16:31 UTC (permalink / raw)
  To: Baoquan He
  Cc: David Hildenbrand, Dave Young, Andrew Morton, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong

Hi Baoquan,

On Tue, May 11, 2021 at 09:36:41PM +0800, Baoquan He wrote:
> On 05/10/21 at 01:56pm, David Hildenbrand wrote:
> > On 10.05.21 13:44, Dave Young wrote:
> > > Hi David,
> > 
> > Hi Dave,
> > 
> > > On 05/10/21 at 01:01pm, David Hildenbrand wrote:
> > > [snip]
> > > > It also bugged me for quite a bit that we don't have a sane way to achieve
> > > > what we're doing here upstream. It somewhat feels like "this doesn't belong
> > > > in the kernel and is user policy" but then, the existing kernel support is
> > > > suboptimal.
> > > > 
> > > > Maybe reserving some "maybe too big but okayish to boot the system in a sane
> > > > environment -- e.g., X% of system RAM and at least Y" size first and
> > > > shrinking it later as triggered by user space early (where we do seem to
> > > > have a way to pre-calculate things now) might actually be a good direction
> > > > to look into.
> > > 
> > > Hmm, that is also an option we considered before.  Even for your
> > > suggestion we still need a kernel option to set the default ratio/value.
> > > and the ratio/value should be another patch which expands crashkernel
> > > syntax.
> > 
> > Right.
> > 
> > > 
> > > Actually the kconfig help text in this patch is indeed misleading, it is
> > > not introducing crashkernel=a:b... and no need to explain about the
> > > crashkernel syntax, the config option is actually just some interface we
> > > can add any valid crashkernel settings to be used by default. So current
> > > patch help text describes the default value of crash auto str, instead
> > > of describes what crash auto str is.
> > 
> > Right. And I would much rather prefer either
> > 
> > a) handling "auto" completely in the kernel, not just setting some
> > questionable default at compile time
> 
> Thanks for the suggestions.
> 
> If the way adding default value into kernel config is disliked,
> this a) option looks good. We can get value with x% of system RAM, but
> clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
> defined with a default value for different ARCHes. It's very close to
> our current implementation, and handling 'auto' in kernel.
> 
> And kernel config provided so that people can tune the MIN/MAX value,
> but no need to post patch to do the tuning each time if have to?
 
Maybe I'm missing something, but the whole point is to avoid a kernel
configuration option at all. If crashkernel=auto works well for 99% of
the cases, there is no need to provide build time configuration along with
it. There are plenty of ways users can control crashkernel reservations
with the existing 2-4 (depending on architecture) command line options.

Simply hard coding reasonable defaults (e.g.
"1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
crashkernel=auto is set would cover the same 99% of users you referred to.
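
As a rough sketch of how such a hard coded table would be consulted,
here is a small Python model (not kernel code). It assumes the usual
range semantics of the crashkernel= syntax, with an inclusive start and
an exclusive end, and it ignores the optional @offset part. The function
names are made up for illustration.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}

def parse_size(text):
    """Convert a string such as '128M' or '1T' into bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(text[:-1]) * UNITS[text[-1].upper()]
    return int(text)

def pick_reservation(ranges, system_ram):
    """Return the size of the first range containing system_ram, else 0.

    Each entry looks like 'start-[end]:size'. The start is inclusive,
    the end is exclusive, and a missing end means no upper limit.
    """
    for entry in ranges.split(","):
        span, size = entry.split(":")
        start, _, end = span.partition("-")
        low = parse_size(start)
        high = parse_size(end) if end else None
        if system_ram >= low and (high is None or system_ram < high):
            return parse_size(size)
    return 0

# The example defaults from above, used when crashkernel=auto is given.
AUTO_DEFAULT = "1G-64G:128M,64G-1T:256M,1T-:512M"
```

In this model a machine with less than 1G of RAM gets no reservation at
all, which is just what the example string above says.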

If we can resize the reservation later during boot this will also address
David's concern about the wasted memory.

You mentioned that the amount of memory that is required for the crash
kernel reservation depends on the devices present on the system. Is it
possible to detect how much memory is required at late stages of boot?

> > b) passing it explicitly in via the cmdline
> > 
> > > 
> > > And crashkernel=auto makes this more flexibly. We can tune the values
> > > easily when upgrading.  But if we pass a fixed value in userspace we
> > > can not know if the value is set by distribution automatically or by user
> > > manually thus we can not blindly update it.
> > 
> > I think there are two different cases:
> > 
> > 
> > 1. kernel space updates the value later during boot. "crashkernel=auto"
> > really does the right thing, meaning
> > 
> > a) allocate something reasonable and safe during early boot
> > b) update the allocation during late boot when we know what kind of system
> > we're running on
> > 
> > Then, we indeed care about "crashkernel=auto" in the kernel and I think it
> > would be a nice thing to have. The only question is on how to make that a
> > little configurable, depending on different thingies we might want to run in
> > the crashkernel (assuming someone doesn't want kdump).
> > 
> > 
> > 2. user space updates the value later during boot
> > 
> > IMHO we don't really car who decided on the value as we do the update from
> > user space. If an admin messes with crashkernel=, the admin can also mess
> > with kdump not doing any overwrites (e.g., make that configurable, or detect
> > the overwrite in kdump somehow).
> > 
> > -- 
> > Thanks,
> > 
> > David / dhildenb
> > 
> 
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-11 16:31                       ` Mike Rapoport
@ 2021-05-11 17:07                         ` David Hildenbrand
  2021-05-12 14:51                           ` Baoquan He
  2021-05-12 14:13                         ` Baoquan He
  1 sibling, 1 reply; 119+ messages in thread
From: David Hildenbrand @ 2021-05-11 17:07 UTC (permalink / raw)
  To: Mike Rapoport, Baoquan He
  Cc: Dave Young, Andrew Morton, christian.brauner, colin.king, corbet,
	frederic, gpiccoli, john.p.donnelly, jpoimboe, keescook,
	linux-mm, masahiroy, mchehab+huawei, mike.kravetz, mingo,
	mm-commits, paulmck, peterz, rdunlap, rostedt,
	saeed.mirzamohammadi, samitolvanen, sboyd, tglx, torvalds,
	vgoyal, yifeifz2, Michal Hocko, kasong

>> If the way adding default value into kernel config is disliked,
>> this a) option looks good. We can get value with x% of system RAM, but
>> clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
>> defined with a default value for different ARCHes. It's very close to
>> our current implementation, and handling 'auto' in kernel.
>>
>> And kernel config provided so that people can tune the MIN/MAX value,
>> but no need to post patch to do the tuning each time if have to?
>   
> Maybe I'm missing something, but the whole point is to avoid kernel
> configuration option at all. If the crashkernel=auto works good for 99% of
> the cases, there is no need to provide build time configuration along with
> it. There are plenty of ways users can control crashkernel reservations
> with the existing 2-4 (depending on architecture) command line options.
> 
> Simply hard coding a reasonable defaults (e.g.
> "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
> crashkernel=auto is set would cover the same 99% of users you referred to.

Right, and we can easily allocate a bit more as a safety net temporarily 
when we can actually shrink the area later.

> 
> If we can resize the reservation later during boot this will also address
> David's concern about the wasted memory.
> 

Yes.

> You mentioned that amount of memory that is required for crash kernel
> reservation depends on the devices present on the system. Is is possible to
> detect how much memory is required at late stages of boot?

Here is my thinking:

There seems to be some kind of formula we can roughly use to come up 
with the final crashkernel size. Baoquan for sure knows all the dirty 
details, I assume it's roughly "core kernel + drivers + user space".

In the kernel, we can only come up with "core kernel + drivers" 
expecting that we will run

a) roughly the same kernel
b) with roughly the same drivers

The "user space" part is completely under user space control, depending 
on what application will be run after kexec.

So I wonder if something like

crashkernel=auto,100M

whereby "100M" corresponds to the user space demands, added on top of
the variable part that depends on the current kernel + drivers, would
already be somewhat sufficient for the main use cases I guess.

Of course, that approach will get more complicated if the user space 
portion heavily depends on the drivers etc. Then we need more tunables.
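
Spelled out as a tiny Python model, the idea would look roughly like the
following. Note that "crashkernel=auto,<size>" is only the syntax I am
wondering about here, not something that exists today, and the kernel
and driver numbers would have to come from the running kernel somehow;
in the sketch they are plain parameters and all names are made up.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

def parse_size(text):
    """Convert a string such as '100M' into bytes."""
    if text and text[-1].upper() in UNITS:
        return int(text[:-1]) * UNITS[text[-1].upper()]
    return int(text)

def auto_reservation(option, core_kernel, drivers):
    """Model of 'core kernel + drivers + user space'.

    option is 'auto' or the hypothetical 'auto,<size>', where <size>
    is the user space demand. core_kernel and drivers are byte counts
    the kernel would have to estimate itself.
    """
    mode, _, user = option.partition(",")
    if mode != "auto":
        raise ValueError("not an auto request: " + option)
    user_space = parse_size(user) if user else 0
    return core_kernel + drivers + user_space
```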

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-10 11:56                   ` David Hildenbrand
  2021-05-11 13:36                     ` Baoquan He
@ 2021-05-12  7:42                     ` Dave Young
  1 sibling, 0 replies; 119+ messages in thread
From: Dave Young @ 2021-05-12  7:42 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Baoquan He, Andrew Morton, andreyknvl, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, rppt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong

On 05/10/21 at 01:56pm, David Hildenbrand wrote:
> On 10.05.21 13:44, Dave Young wrote:
> > Hi David,
> 
> Hi Dave,
> 
> > On 05/10/21 at 01:01pm, David Hildenbrand wrote:
> > [snip]
> > > It also bugged me for quite a bit that we don't have a sane way to achieve
> > > what we're doing here upstream. It somewhat feels like "this doesn't belong
> > > in the kernel and is user policy" but then, the existing kernel support is
> > > suboptimal.
> > > 
> > > Maybe reserving some "maybe too big but okayish to boot the system in a sane
> > > environment -- e.g., X% of system RAM and at least Y" size first and
> > > shrinking it later as triggered by user space early (where we do seem to
> > > have a way to pre-calculate things now) might actually be a good direction
> > > to look into.
> > 
> > Hmm, that is also an option we considered before.  Even for your
> > suggestion we still need a kernel option to set the default ratio/value.
> > and the ratio/value should be another patch which expands crashkernel
> > syntax.
> 
> Right.
> 
> > 
> > Actually the kconfig help text in this patch is indeed misleading, it is
> > not introducing crashkernel=a:b... and no need to explain about the
> > crashkernel syntax, the config option is actually just some interface we
> > can add any valid crashkernel settings to be used by default. So current
> > patch help text describes the default value of crash auto str, instead
> > of describes what crash auto str is.
> 
> Right. And I would much rather prefer either
> 
> a) handling "auto" completely in the kernel, not just setting some
> questionable default at compile time
> b) passing it explicitly in via the cmdline
> 
> > 
> > And crashkernel=auto makes this more flexibly. We can tune the values
> > easily when upgrading.  But if we pass a fixed value in userspace we
> > can not know if the value is set by distribution automatically or by user
> > manually thus we can not blindly update it.
> 
> I think there are two different cases:
> 
> 
> 1. kernel space updates the value later during boot. "crashkernel=auto"
> really does the right thing, meaning
> 
> a) allocate something reasonable and safe during early boot
> b) update the allocation during late boot when we know what kind of system
> we're running on

Sorry for my laggy reply :)

As for the kernel's late boot action, the other notable issue is that most
device drivers are kernel modules, which are loaded by udev. Some complex
storage/network drivers in particular often use a lot of memory.
Kairui has a tool named "memstrack" which can be used to monitor the
peak memory of the module loading phase.  But that can only be done in
userspace for now.

And we have some different setups for the normal boot and the kdump
kernel, e.g. some special cmdline options such as nr_cpus=1, and also some
in-kernel handling, for example patches merged in networking drivers to
use less memory in the kdump kernel via smaller queues etc.

Otherwise, the other kernel memory requirements can be estimated in the
kernel.

> 
> Then, we indeed care about "crashkernel=auto" in the kernel and I think it
> would be a nice thing to have. The only question is on how to make that a
> little configurable, depending on different thingies we might want to run in
> the crashkernel (assuming someone doesn't want kdump).
> 
> 
> 2. user space updates the value later during boot
> 
> IMHO we don't really care who decided on the value as we do the update from
> user space. If an admin messes with crashkernel=, the admin can also mess
> with kdump not doing any overwrites (e.g., make that configurable, or detect
> the overwrite in kdump somehow).
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 

Thanks
Dave


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-11 16:31                       ` Mike Rapoport
  2021-05-11 17:07                         ` David Hildenbrand
@ 2021-05-12 14:13                         ` Baoquan He
  1 sibling, 0 replies; 119+ messages in thread
From: Baoquan He @ 2021-05-12 14:13 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: David Hildenbrand, Dave Young, Andrew Morton, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong, piliu

On 05/11/21 at 07:31pm, Mike Rapoport wrote:
> Hi Baoquan,
> 
> On Tue, May 11, 2021 at 09:36:41PM +0800, Baoquan He wrote:
> > On 05/10/21 at 01:56pm, David Hildenbrand wrote:
> > > On 10.05.21 13:44, Dave Young wrote:
> > > > Hi David,
> > > 
> > > Hi Dave,
> > > 
> > > > On 05/10/21 at 01:01pm, David Hildenbrand wrote:
> > > > [snip]
> > > > > It also bugged me for quite a bit that we don't have a sane way to achieve
> > > > > what we're doing here upstream. It somewhat feels like "this doesn't belong
> > > > > in the kernel and is user policy" but then, the existing kernel support is
> > > > > suboptimal.
> > > > > 
> > > > > Maybe reserving some "maybe too big but okayish to boot the system in a sane
> > > > > environment -- e.g., X% of system RAM and at least Y" size first and
> > > > > shrinking it later as triggered by user space early (where we do seem to
> > > > > have a way to pre-calculate things now) might actually be a good direction
> > > > > to look into.
> > > > 
> > > > Hmm, that is also an option we considered before.  Even for your
> > > > suggestion we still need a kernel option to set the default ratio/value.
> > > > and the ratio/value should be another patch which expands crashkernel
> > > > syntax.
> > > 
> > > Right.
> > > 
> > > > 
> > > > Actually the kconfig help text in this patch is indeed misleading, it is
> > > > not introducing crashkernel=a:b... and no need to explain about the
> > > > crashkernel syntax, the config option is actually just some interface we
> > > > can add any valid crashkernel settings to be used by default. So current
> > > > patch help text describes the default value of crash auto str, instead
> > > > of describes what crash auto str is.
> > > 
> > > Right. And I would much rather prefer either
> > > 
> > > a) handling "auto" completely in the kernel, not just setting some
> > > questionable default at compile time
> > 
> > Thanks for the suggestions.
> > 
> > If the way adding default value into kernel config is disliked,
> > this a) option looks good. We can get value with x% of system RAM, but
> > clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
> > defined with a default value for different ARCHes. It's very close to
> > our current implementation, and handling 'auto' in kernel.
> > 
> > And kernel config provided so that people can tune the MIN/MAX value,
> > but no need to post patch to do the tuning each time if have to?
>  
> Maybe I'm missing something, but the whole point is to avoid kernel
> configuration option at all. If the crashkernel=auto works good for 99% of
> the cases, there is no need to provide build time configuration along with
> it. There are plenty of ways users can control crashkernel reservations
> with the existing 2-4 (depending on architecture) command line options.
> 
> Simply hard coding a reasonable defaults (e.g.
> "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
> crashkernel=auto is set would cover the same 99% of users you referred to.

Thanks for looking into this, Mike.

crashkernel=auto works well for 99% of systems, with the prerequisite
that the values of 'auto' correspond to a certain kernel, e.g. a distro
kernel. I say so because the kernel config of a distro kernel decides the
kernel size, and also the initrd size. A generic default value for
crashkernel=auto doesn't make much sense when we bring it into distros.
That's why we originally wanted to add the default value into the kernel
config. Asking for a minimal size with a kernel config tunable is then the
second-best choice, when handling 'auto' in the kernel as David's option a)
suggested.
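
For reference, a minimal sketch (Python, for brevity) of how a range
string like the one Mike quoted selects a size from total system RAM. It
assumes the documented crashkernel range semantics (start inclusive, end
exclusive, open-ended last range); the function names are made up for
illustration and this is not the kernel's parser:

```python
# Sketch only: mimics the "start-[end]:size" range semantics of the
# existing crashkernel= syntax. Names here are illustrative.

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Parse sizes such as '128M' or '1T', or a plain byte count."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(text[:-1]) * UNITS[text[-1].upper()]
    return int(text)


def pick_reservation(ranges, system_ram):
    """Return the size of the first range containing system_ram, else 0."""
    for entry in ranges.split(","):
        span, size = entry.split(":")
        start, _, end = span.partition("-")
        low = parse_size(start)
        high = parse_size(end) if end else float("inf")
        if low <= system_ram < high:
            return parse_size(size)
    return 0  # no range matched: nothing is reserved
```

Whether such a table lives in the source, in a kconfig string or on the
cmdline, the lookup is the same; the disagreement here is only about who
owns the numbers.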

It's not quite clear to me why a kernel config has to be avoided here.
We already have this kind of tunable, e.g. CONFIG_CMA_SIZE_SEL_MBYTES.

> 
> If we can resize the reservation later during boot this will also address
> David's concern about the wasted memory.

We can't grow the reservation; currently we can only shrink it.

> 
> You mentioned that amount of memory that is required for crash kernel
> reservation depends on the devices present on the system. Is it possible to
> detect how much memory is required at late stages of boot?

It may be doable to detect this at a late stage of boot, but that needs
investigation; for now we are working on doing it after system bootup.
The thing is that the detection is very coarse-grained. We count all
loaded kernel modules. But in the kdump kernel, only the strictly
necessary modules are added in our distros, e.g. if we dump through the
network, NIC modules are included; otherwise we filter them out to reduce
memory usage in the kdump kernel. For most normal systems with dozens of
devices, the memory required by device drivers in the kdump kernel is
limited. On VM guests it's even less, since only the strictly necessary
devices are added, e.g. disk/NIC/serial.

So when I said 99% of systems can be covered by a default value, that was
based on a certain kernel with fixed kernel configs, mainly related to
distros. Adding a permanent default value to the upstream kernel doesn't
make much sense if no tunable is provided for distros to adjust it.

> 
> > > b) passing it explicitly in via the cmdline
> > > 
> > > > 
> > > > And crashkernel=auto makes this more flexibly. We can tune the values
> > > > easily when upgrading.  But if we pass a fixed value in userspace we
> > > > can not know if the value is set by distribution automatically or by user
> > > > manually thus we can not blindly update it.
> > > 
> > > I think there are two different cases:
> > > 
> > > 
> > > 1. kernel space updates the value later during boot. "crashkernel=auto"
> > > really does the right thing, meaning
> > > 
> > > a) allocate something reasonable and safe during early boot
> > > b) update the allocation during late boot when we know what kind of system
> > > we're running on
> > > 
> > > Then, we indeed care about "crashkernel=auto" in the kernel and I think it
> > > would be a nice thing to have. The only question is on how to make that a
> > > little configurable, depending on different thingies we might want to run in
> > > the crashkernel (assuming someone doesn't want kdump).
> > > 
> > > 
> > > 2. user space updates the value later during boot
> > > 
> > > IMHO we don't really care who decided on the value as we do the update from
> > > user space. If an admin messes with crashkernel=, the admin can also mess
> > > with kdump not doing any overwrites (e.g., make that configurable, or detect
> > > the overwrite in kdump somehow).
> > > 
> > > -- 
> > > Thanks,
> > > 
> > > David / dhildenb
> > > 
> > 
> > 
> 
> -- 
> Sincerely yours,
> Mike.
> 


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-11 17:07                         ` David Hildenbrand
@ 2021-05-12 14:51                           ` Baoquan He
  2021-05-12 15:07                             ` David Hildenbrand
                                               ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: Baoquan He @ 2021-05-12 14:51 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Mike Rapoport, Dave Young, Andrew Morton, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong

On 05/11/21 at 07:07pm, David Hildenbrand wrote:
> > > If the way adding default value into kernel config is disliked,
> > > this a) option looks good. We can get value with x% of system RAM, but
> > > clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
> > > defined with a default value for different ARCHes. It's very close to
> > > our current implementation, and handling 'auto' in kernel.
> > > 
> > > And kernel config provided so that people can tune the MIN/MAX value,
> > > but no need to post patch to do the tuning each time if have to?
> > Maybe I'm missing something, but the whole point is to avoid kernel
> > configuration option at all. If the crashkernel=auto works good for 99% of
> > the cases, there is no need to provide build time configuration along with
> > it. There are plenty of ways users can control crashkernel reservations
> > with the existing 2-4 (depending on architecture) command line options.
> > 
> > Simply hard coding a reasonable defaults (e.g.
> > "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
> > crashkernel=auto is set would cover the same 99% of users you referred to.
> 
> Right, and we can easily allocate a bit more as a safety net temporarily
> when we can actually shrink the area later.
> 
> > 
> > If we can resize the reservation later during boot this will also address
> > David's concern about the wasted memory.
> > 
> 
> Yes.
> 
> > You mentioned that amount of memory that is required for crash kernel
> > reservation depends on the devices present on the system. Is it possible to
> > detect how much memory is required at late stages of boot?
> 
> Here is my thinking:
> 
> There seems to be some kind of formula we can roughly use to come up with
> the final crashkernel size. Baoquan for sure knows all the dirty details, I
> assume it's roughly "core kernel + drivers + user space".
> 
> In the kernel, we can only come up with "core kernel + drivers" expecting
> that we will run
> 
> a) roughly the same kernel
> b) with roughly the same drivers

As I replied to Mike, the kernel size is not fixed: it differs between
kernels with different configs. We can define a default minimal size to
cover the kernel and drivers on systems with not many devices, but
hardcoding the size into upstream is not helpful. If the size is too big,
users will always be asked to check and shrink it. If the size is too
small, a new value needs to be worked out and added to the cmdline,
followed by a reboot.

> 
> The "user space" part is completely under user space control, depending on
> what application will be run after kexec.
> 
> So I wonder if something like
> 
> crashkernel=auto,100M
> 
> whereby "100M" corresponds to user space demands in addition to the variable
> part depend on the current kernel + drivers.
> 
> would already be somewhat sufficient for main use cases I guess.
> 
> Of course, that approach will get more complicated if the user space portion
> heavily depends on the drivers etc. Then we need more tunables.
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-12 14:51                           ` Baoquan He
@ 2021-05-12 15:07                             ` David Hildenbrand
  2021-05-13  5:04                               ` Baoquan He
  2021-05-12 19:03                             ` Kairui Song
  2021-05-17  8:22                             ` David Hildenbrand
  2 siblings, 1 reply; 119+ messages in thread
From: David Hildenbrand @ 2021-05-12 15:07 UTC (permalink / raw)
  To: Baoquan He
  Cc: Mike Rapoport, Dave Young, Andrew Morton, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong

On 12.05.21 16:51, Baoquan He wrote:
> On 05/11/21 at 07:07pm, David Hildenbrand wrote:
>>>> If the way adding default value into kernel config is disliked,
>>>> this a) option looks good. We can get value with x% of system RAM, but
>>>> clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
>>>> defined with a default value for different ARCHes. It's very close to
>>>> our current implementation, and handling 'auto' in kernel.
>>>>
>>>> And kernel config provided so that people can tune the MIN/MAX value,
>>>> but no need to post patch to do the tuning each time if have to?
>>> Maybe I'm missing something, but the whole point is to avoid kernel
>>> configuration option at all. If the crashkernel=auto works good for 99% of
>>> the cases, there is no need to provide build time configuration along with
>>> it. There are plenty of ways users can control crashkernel reservations
>>> with the existing 2-4 (depending on architecture) command line options.
>>>
>>> Simply hard coding a reasonable defaults (e.g.
>>> "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
>>> crashkernel=auto is set would cover the same 99% of users you referred to.
>>
>> Right, and we can easily allocate a bit more as a safety net temporarily
>> when we can actually shrink the area later.
>>
>>>
>>> If we can resize the reservation later during boot this will also address
>>> David's concern about the wasted memory.
>>>
>>
>> Yes.
>>
>>> You mentioned that amount of memory that is required for crash kernel
>>> reservation depends on the devices present on the system. Is it possible to
>>> detect how much memory is required at late stages of boot?
>>
>> Here is my thinking:
>>
>> There seems to be some kind of formula we can roughly use to come up with
>> the final crashkernel size. Baoquan for sure knows all the dirty details, I
>> assume it's roughly "core kernel + drivers + user space".
>>
>> In the kernel, we can only come up with "core kernel + drivers" expecting
>> that we will run
>>
>> a) roughly the same kernel
>> b) with roughly the same drivers
> 
> As replied to Mike, kernel size is undecided for different kernel with
> different configs. We can define a default minimal size to cover kernel
> and driver on systems with not many devices, but hardcoding the size

I never talked about hardcoding, did I?

> into upstream is not helpful. If the size is big, users will be asked to
> check and shrink always. If the size is too small, a new value need be
> got and added to cmdline and reboot.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-12 14:51                           ` Baoquan He
  2021-05-12 15:07                             ` David Hildenbrand
@ 2021-05-12 19:03                             ` Kairui Song
  2021-05-17  8:22                             ` David Hildenbrand
  2 siblings, 0 replies; 119+ messages in thread
From: Kairui Song @ 2021-05-12 19:03 UTC (permalink / raw)
  To: David Hildenbrand, Baoquan He
  Cc: Mike Rapoport, Dave Young, Andrew Morton, Christian Brauner,
	Colin Ian King, Jonathan Corbet, Frederic Weisbecker,
	Guilherme G. Piccoli, John Donnelly, Josh Poimboeuf, Kees Cook,
	linux-mm, Masahiro Yamada, Mauro Carvalho Chehab, Mike Kravetz,
	Ingo Molnar, mm-commits, Paul E. McKenney, Peter Zijlstra,
	Randy Dunlap, Steven Rostedt (VMware),
	Saeed Mirzamohammadi, Sami Tolvanen, Stephen Boyd,
	Thomas Gleixner, torvalds, Vivek Goyal, YiFei Zhu, Michal Hocko

On Wed, May 12, 2021 at 10:52 PM Baoquan He <bhe@redhat.com> wrote:
> On 05/11/21 at 07:07pm, David Hildenbrand wrote:
> > > > If the way adding default value into kernel config is disliked,
> > > > this a) option looks good. We can get value with x% of system RAM, but
> > > > clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
> > > > defined with a default value for different ARCHes. It's very close to
> > > > our current implementation, and handling 'auto' in kernel.
> > > >
> > > > And kernel config provided so that people can tune the MIN/MAX value,
> > > > but no need to post patch to do the tuning each time if have to?
> > > Maybe I'm missing something, but the whole point is to avoid kernel
> > > configuration option at all. If the crashkernel=auto works good for 99% of
> > > the cases, there is no need to provide build time configuration along with
> > > it. There are plenty of ways users can control crashkernel reservations
> > > with the existing 2-4 (depending on architecture) command line options.
> > >
> > > Simply hard coding a reasonable defaults (e.g.
> > > "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
> > > crashkernel=auto is set would cover the same 99% of users you referred to.
> >
> > Right, and we can easily allocate a bit more as a safety net temporarily
> > when we can actually shrink the area later.
> >
> > >
> > > If we can resize the reservation later during boot this will also address
> > > David's concern about the wasted memory.
> > >
> >
> > Yes.
> >
> > > You mentioned that amount of memory that is required for crash kernel
> > > > reservation depends on the devices present on the system. Is it possible to
> > > detect how much memory is required at late stages of boot?
> >
> > Here is my thinking:
> >
> > There seems to be some kind of formula we can roughly use to come up with
> > the final crashkernel size. Baoquan for sure knows all the dirty details, I
> > assume it's roughly "core kernel + drivers + user space".
> >
> > In the kernel, we can only come up with "core kernel + drivers" expecting
> > that we will run
> >
> > a) roughly the same kernel
> > b) with roughly the same drivers
>
> As replied to Mike, kernel size is undecided for different kernel with
> different configs. We can define a default minimal size to cover kernel
> and driver on systems with not many devices, but hardcoding the size
> into upstream is not helpful. If the size is big, users will be asked to
> check and shrink always. If the size is too small, a new value need be
> got and added to cmdline and reboot.
>
> >
> > The "user space" part is completely under user space control, depending on
> > what application will be run after kexec.
> >
> > So I wonder if something like
> >
> > crashkernel=auto,100M
> >
> > whereby "100M" corresponds to user space demands in addition to the variable
> > part depend on the current kernel + drivers.
> >
> > would already be somewhat sufficient for main use cases I guess.
> >
> > Of course, that approach will get more complicated if the user space portion
> > heavily depends on the drivers etc. Then we need more tunables.
> >

I actually like this idea of "crashkernel=auto,100M" at first look: it
gives some tunable space for userspace, and the kernel can just take care
of its own memory usage. Userspace usage is completely undeterminable.

But unfortunately, estimating kernel usage for kdump is also very hard.
It's heavily related to the kdump kernel's cmdline, the kernel has
many kdump-specific behaviors/workarounds that affect memory usage, and
the kernel kconfig also affects it.

Just for example, `nr_cpus=1` and `noefi` are commonly used on the kdump
kernel cmdline to reduce memory usage, but it's also completely
acceptable not to use such kernel params for the kdump kernel. Even a
rough estimation most likely won't work; those moving parts can change
the memory usage by a lot.

So basically kdump's memory usage (userspace or kernel) is not
estimable from the kernel side in a generic way. It's strictly bound to
the distro implementation and config.

And that's also why this patch started with adding a kconfig, so
distros can set a value that corresponds to their default setup.

Baoquan has added reasons why passing the `crashkernel=` config via
cmdline also messes things up. So at the time this patch was sent, having
a tunable (via kconfig) `crashkernel=auto` seemed the most helpful
way. I'm not sure if there is a better way to make it distro-tunable
if not through kconfig.

--
Best Regards,
Kairui Song


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-12 15:07                             ` David Hildenbrand
@ 2021-05-13  5:04                               ` Baoquan He
  0 siblings, 0 replies; 119+ messages in thread
From: Baoquan He @ 2021-05-13  5:04 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Mike Rapoport, Dave Young, Andrew Morton, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong

On 05/12/21 at 05:07pm, David Hildenbrand wrote:
> On 12.05.21 16:51, Baoquan He wrote:
> > On 05/11/21 at 07:07pm, David Hildenbrand wrote:
> > > > > If the way adding default value into kernel config is disliked,
> > > > > this a) option looks good. We can get value with x% of system RAM, but
> > > > > clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
> > > > > defined with a default value for different ARCHes. It's very close to
> > > > > our current implementation, and handling 'auto' in kernel.
> > > > > 
> > > > > And kernel config provided so that people can tune the MIN/MAX value,
> > > > > but no need to post patch to do the tuning each time if have to?
> > > > Maybe I'm missing something, but the whole point is to avoid kernel
> > > > configuration option at all. If the crashkernel=auto works good for 99% of
> > > > the cases, there is no need to provide build time configuration along with
> > > > it. There are plenty of ways users can control crashkernel reservations
> > > > with the existing 2-4 (depending on architecture) command line options.
> > > > 
> > > > Simply hard coding a reasonable defaults (e.g.
> > > > "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
> > > > crashkernel=auto is set would cover the same 99% of users you referred to.
> > > 
> > > Right, and we can easily allocate a bit more as a safety net temporarily
> > > when we can actually shrink the area later.
> > > 
> > > > 
> > > > If we can resize the reservation later during boot this will also address
> > > > David's concern about the wasted memory.
> > > > 
> > > 
> > > Yes.
> > > 
> > > > You mentioned that amount of memory that is required for crash kernel
> > > > > reservation depends on the devices present on the system. Is it possible to
> > > > detect how much memory is required at late stages of boot?
> > > 
> > > Here is my thinking:
> > > 
> > > There seems to be some kind of formula we can roughly use to come up with
> > > the final crashkernel size. Baoquan for sure knows all the dirty details, I
> > > assume it's roughly "core kernel + drivers + user space".
> > > 
> > > In the kernel, we can only come up with "core kernel + drivers" expecting
> > > that we will run
> > > 
> > > a) roughly the same kernel
> > > b) with roughly the same drivers
> > 
> > As replied to Mike, kernel size is undecided for different kernel with
> > different configs. We can define a default minimal size to cover kernel
> > and driver on systems with not many devices, but hardcoding the size
> 
> I never talked about hardcoding, did I?

Sorry, I didn't make it clear. By hardcoding I meant a hardcoded
MIN value. No matter what formula we take, it needs a default MIN value
to bound the lowest size, right? That MIN value is the hardcoding I
meant. With it properly chosen, most systems have no need to shrink
or adjust the crashkernel, given that most systems have less than
64G of memory. Let alone that the later estimation is done in the 1st
kernel, so very likely it will get a bigger value than really needed.
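
A minimal sketch of the "x% of system RAM, clamped by
CRASH_KERNEL_MIN/MAX" formula discussed above, in Python for brevity.
The percentage and the bounds are parameters here because no concrete
values were agreed on in this thread; the function name is illustrative:

```python
# Sketch only: percentage and MIN/MAX bounds are caller-supplied
# assumptions, not values proposed by any patch.

MIB = 1 << 20


def auto_size(system_ram, percent, crash_kernel_min, crash_kernel_max):
    """Take percent of system RAM, then clamp the result to [min, max]."""
    wanted = system_ram * percent // 100
    return max(crash_kernel_min, min(wanted, crash_kernel_max))
```

With a MIN that is large enough for the common small systems, the
percentage only matters on bigger machines, which is the point made
above about systems with less than 64G.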

> 
> > into upstream is not helpful. If the size is big, users will be asked to
> > check and shrink always. If the size is too small, a new value need be
> > got and added to cmdline and reboot.
> 
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 


^ permalink raw reply	[flat|nested] 119+ messages in thread

* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-12 14:51                           ` Baoquan He
  2021-05-12 15:07                             ` David Hildenbrand
  2021-05-12 19:03                             ` Kairui Song
@ 2021-05-17  8:22                             ` David Hildenbrand
  2021-05-18  8:49                               ` Baoquan He
  2 siblings, 1 reply; 119+ messages in thread
From: David Hildenbrand @ 2021-05-17  8:22 UTC (permalink / raw)
  To: Baoquan He
  Cc: Mike Rapoport, Dave Young, Andrew Morton, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong

On 12.05.21 16:51, Baoquan He wrote:
> On 05/11/21 at 07:07pm, David Hildenbrand wrote:
>>>> If the way adding default value into kernel config is disliked,
>>>> this a) option looks good. We can get value with x% of system RAM, but
>>>> clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
>>>> defined with a default value for different ARCHes. It's very close to
>>>> our current implementation, and handling 'auto' in kernel.
>>>>
>>>> And kernel config provided so that people can tune the MIN/MAX value,
>>>> but no need to post patch to do the tuning each time if have to?
>>> Maybe I'm missing something, but the whole point is to avoid kernel
>>> configuration option at all. If the crashkernel=auto works good for 99% of
>>> the cases, there is no need to provide build time configuration along with
>>> it. There are plenty of ways users can control crashkernel reservations
>>> with the existing 2-4 (depending on architecture) command line options.
>>>
>>> Simply hard coding a reasonable defaults (e.g.
>>> "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
>>> crashkernel=auto is set would cover the same 99% of users you referred to.
>>
>> Right, and we can easily allocate a bit more as a safety net temporarily
>> when we can actually shrink the area later.
>>
>>>
>>> If we can resize the reservation later during boot this will also address
>>> David's concern about the wasted memory.
>>>
>>
>> Yes.
>>
>>> You mentioned that amount of memory that is required for crash kernel
>>> reservation depends on the devices present on the system. Is it possible to
>>> detect how much memory is required at late stages of boot?
>>
>> Here is my thinking:
>>
>> There seems to be some kind of formula we can roughly use to come up with
>> the final crashkernel size. Baoquan for sure knows all the dirty details, I
>> assume it's roughly "core kernel + drivers + user space".
>>
>> In the kernel, we can only come up with "core kernel + drivers" expecting
>> that we will run
>>
>> a) roughly the same kernel
>> b) with roughly the same drivers
> 
> As replied to Mike, kernel size is undecided for different kernel with
> different configs. We can define a default minimal size to cover kernel
> and driver on systems with not many devices, but hardcoding the size
> into upstream is not helpful. If the size is big, users will be asked to
> check and shrink always. If the size is too small, a new value need be
> got and added to cmdline and reboot.
> 

Hi Baoquan, Kairui, Dave,

so IIUC now, our "old" kernel cannot actually tell us any reliable 
"crashkernel area size" because

a) it has no idea which cmdline parameters the crashkernel will be
    started with, and these can have a big impact.
b) it has no idea which drivers will be loaded in the crashkernel.
c) it has no idea what will be running in the crashkernel user space.


AFAIKS, best we can do without further information is, therefore, use 
some heuristic to a) allocate some memory early during boot in the 
kernel and b) later refine our allocation, triggered by user space (-> 
shrink the crashkernel area).

I dislike calling a) "auto". It provides a default based on some 
heuristic (boot memory size), and that default might be very unfortunate 
in some scenarios (-> waste memory).

While we could discuss calling the current approach ( a) ) 
"crashkernel=default", whereby the default is encoded at compile time 
as determined by a distributor, I still don't quite like it 
because it feels like this is not necessary. We have a way to pass 
something like that via the cmdline, so it's just a matter of properly 
using that feature from user space.


AFAIKS, all you want is most probably a more dynamic way to construct a 
kernel cmdline, with some properties specific to a kernel.

Let's assume the following:

a) When a distributor ships a kernel, he also ships some kind of 
defaults file. Let's assume for simplicity

/lib/modules/5.11.19-200.fc33.x86_64/defaults.conf

The file might contain

CRASHKERNEL_DEFAULT=WHATEVER


b) When generating the cmdline for, e.g.,
/boot/loader/entries/XXX-5.11.19-200.fc33.x86_64.conf, we run some script
that consults that file in addition to /etc/default/grub. For example, if
the kdump service was installed and /etc/default/grub does not contain
"crashkernel=" (except when we encounter "crashkernel=auto" for compat
handling), we add "crashkernel=WHATEVER". Of course, we might do more
involved stuff based on the current setup, user config, etc.


c) When we install the kdump service, all we have to do is re-generate 
the boot entries AFAIKS. Just like we would when adding 
"crashkernel=auto" right now.


The end result would also allow for having per-kernel defaults and
changing them on kernel updates. It would require some thought on how to
make it fly in user space, how to "ship" the defaults, etc.
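
Steps a) to c) could be sketched roughly as follows. This is only an
illustration under assumptions: the defaults file is taken to be plain
KEY=VALUE text, CRASHKERNEL_DEFAULT is the only key looked at, and the
function names (read_default, build_cmdline) are hypothetical. Treating
"crashkernel=auto" as "not configured" and replacing it is one possible
reading of the compat handling mentioned in b).

```python
def read_default(defaults_text):
    """Return the CRASHKERNEL_DEFAULT value from KEY=VALUE text, or None."""
    for line in defaults_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "CRASHKERNEL_DEFAULT":
            return value.strip().strip('"')
    return None


def build_cmdline(cmdline, defaults_text, kdump_installed):
    """Add crashkernel=<default> unless the user already chose a value.

    cmdline         -- arguments taken from the user's config
    defaults_text   -- contents of the per-kernel defaults file
    kdump_installed -- whether the kdump service is present
    """
    args = cmdline.split()
    # An explicit user setting always wins; "auto" is the legacy value.
    explicit = [a for a in args
                if a.startswith("crashkernel=") and a != "crashkernel=auto"]
    default = read_default(defaults_text)
    if not kdump_installed or explicit or default is None:
        return " ".join(args)
    # Compat handling (assumed): replace crashkernel=auto with the default.
    args = [a for a in args if a != "crashkernel=auto"]
    args.append(f"crashkernel={default}")
    return " ".join(args)


print(build_cmdline("ro quiet crashkernel=auto",
                    "CRASHKERNEL_DEFAULT=256M\n", True))
```

Because the default lives next to the kernel it belongs to, re-running
this on a kernel update picks up a new value without touching the user's
own configuration.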

-- 
Thanks,

David / dhildenb



* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-17  8:22                             ` David Hildenbrand
@ 2021-05-18  8:49                               ` Baoquan He
  2021-05-18  8:51                                 ` David Hildenbrand
  0 siblings, 1 reply; 119+ messages in thread
From: Baoquan He @ 2021-05-18  8:49 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Mike Rapoport, Dave Young, Andrew Morton, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong, hbathini

On 05/17/21 at 10:22am, David Hildenbrand wrote:
> On 12.05.21 16:51, Baoquan He wrote:
> > On 05/11/21 at 07:07pm, David Hildenbrand wrote:
> > > > > If the way adding default value into kernel config is disliked,
> > > > > this a) option looks good. We can get value with x% of system RAM, but
> > > > > clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
> > > > > defined with a default value for different ARCHes. It's very close to
> > > > > our current implementation, and handling 'auto' in kernel.
> > > > > 
> > > > > And kernel config provided so that people can tune the MIN/MAX value,
> > > > > but no need to post patch to do the tuning each time if have to?
> > > > Maybe I'm missing something, but the whole point is to avoid kernel
> > > > configuration option at all. If the crashkernel=auto works good for 99% of
> > > > the cases, there is no need to provide build time configuration along with
> > > > it. There are plenty of ways users can control crashkernel reservations
> > > > with the existing 2-4 (depending on architecture) command line options.
> > > > 
> > > > Simply hard coding a reasonable defaults (e.g.
> > > > "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
> > > > crashkernel=auto is set would cover the same 99% of users you referred to.
> > > 
> > > Right, and we can easily allocate a bit more as a safety net temporarily
> > > when we can actually shrink the area later.
> > > 
> > > > 
> > > > If we can resize the reservation later during boot this will also address
> > > > David's concern about the wasted memory.
> > > > 
> > > 
> > > Yes.
> > > 
> > > > You mentioned that amount of memory that is required for crash kernel
> > > > reservation depends on the devices present on the system. Is is possible to
> > > > detect how much memory is required at late stages of boot?
> > > 
> > > Here is my thinking:
> > > 
> > > There seems to be some kind of formula we can roughly use to come up with
> > > the final crashkernel size. Baoquan for sure knows all the dirty details, I
> > > assume it's roughly "core kernel + drivers + user space".
> > > 
> > > In the kernel, we can only come up with "core kernel + drivers" expecting
> > > that we will run
> > > 
> > > a) roughly the same kernel
> > > b) with roughly the same drivers
> > 
> > As replied to Mike, kernel size is undecided for different kernel with
> > different configs. We can define a default minimal size to cover kernel
> > and driver on systems with not many devices, but hardcoding the size
> > into upstream is not helpful. If the size is big, users will be asked to
> > check and shrink always. If the size is too small, a new value need be
> > got and added to cmdline and reboot.
> > 
> 
> Hi Baoquan, Kairui, Dave,
> 
> so IIUC now, our "old" kernel cannot actually tell us any reliable
> "crashkernel area size" because
> 
> a) it has no idea with which cmdline parameters the crashkernel will be
>    started with, and these can have a big impact.
> b) it has no idea which driver will be loaded in the crashkernel.
> c) It has no idea what will be running in the crashkernel user space.
> 
> 
> AFAIKS, best we can do without further information is, therefore, use some
> heuristic to a) allocate some memory early during boot in the kernel and b)
> later refine our allocation, triggered by user space (-> shrink the
> crashkernel area).
> 
> I dislike calling a) "auto". It provides a default based on some heuristic
> (boot memory size), and that default might be very unfortunate in some
> scenarios (-> waste memory).
> 
> While we could discuss calling the current approach ( a)
> )"crashkernel=default", whereby the default is encoded at compile time as
> determined by a distributor, I still still quite don't like it because it
> feels like this is not necessary. We have a way to pass something like that
> via the cmdline, so it's just a matter of properly using that feature from
> user space.
> 
> 
> AFAIKS, all you want is most probably a more dynamic way to construct a
> kernel cmdline, with some properties specific to a kernel.
> 
> Let's assume the following:
> 
> a) When a distributor ships a kernel, he also ships some kind of defaults
> file. Let's assume for simplicity
> 
> /lib/modules/5.11.19-200.fc33.x86_64/defaults.conf
> 
> The file might contain
> 
> CRASHKERNEL_DEFAULT=WHATEVER
> 
> 
> b) When generating the cmdline for e.g.,
> /boot/loader/entries/XXX-5.11.19-200.fc33.x86_64.conf we run some script
> that consult that file in addition to /etc/default/grub. For example, if the
> kdump service was installed and /etc/default/grub does not contain
> "crashkernel=" (except when we encounter "crashkernel=auto" for compat
> handling), we add "crashkernel=WHATEVER". Of course, we might do more
> involved stuff based on the current setup, user config, etc.
> 
> 
> c) When we install the kdump service, all we have to do is re-generate the
> boot entries AFAIKS. Just like we would when adding "crashkernel=auto" right
> now.
> 
> 
> The end result would also allow for having per-kernel defaults and change
> them on kernel updates. Would require some thought on how to make it fly in
> user space, how to "ship" the defaults etc.

Thanks for looking into this; I really appreciate your insight,
comments and patience.

We had a team sync the other day about various viable solutions, and
also discussed one similar to what you suggest here, since it seems
able to resolve the concerns we have about a replacement for
crashkernel=auto. We will try this in userspace on our side, and hope it
doesn't introduce risk and can replace crashkernel=auto completely.



* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-18  8:49                               ` Baoquan He
@ 2021-05-18  8:51                                 ` David Hildenbrand
  2021-05-18  9:24                                   ` Dave Young
  0 siblings, 1 reply; 119+ messages in thread
From: David Hildenbrand @ 2021-05-18  8:51 UTC (permalink / raw)
  To: Baoquan He
  Cc: Mike Rapoport, Dave Young, Andrew Morton, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong, hbathini

On 18.05.21 10:49, Baoquan He wrote:
> On 05/17/21 at 10:22am, David Hildenbrand wrote:
>> On 12.05.21 16:51, Baoquan He wrote:
>>> On 05/11/21 at 07:07pm, David Hildenbrand wrote:
>>>>>> If the way adding default value into kernel config is disliked,
>>>>>> this a) option looks good. We can get value with x% of system RAM, but
>>>>>> clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
>>>>>> defined with a default value for different ARCHes. It's very close to
>>>>>> our current implementation, and handling 'auto' in kernel.
>>>>>>
>>>>>> And kernel config provided so that people can tune the MIN/MAX value,
>>>>>> but no need to post patch to do the tuning each time if have to?
>>>>> Maybe I'm missing something, but the whole point is to avoid kernel
>>>>> configuration option at all. If the crashkernel=auto works good for 99% of
>>>>> the cases, there is no need to provide build time configuration along with
>>>>> it. There are plenty of ways users can control crashkernel reservations
>>>>> with the existing 2-4 (depending on architecture) command line options.
>>>>>
>>>>> Simply hard coding a reasonable defaults (e.g.
>>>>> "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
>>>>> crashkernel=auto is set would cover the same 99% of users you referred to.
>>>>
>>>> Right, and we can easily allocate a bit more as a safety net temporarily
>>>> when we can actually shrink the area later.
>>>>
>>>>>
>>>>> If we can resize the reservation later during boot this will also address
>>>>> David's concern about the wasted memory.
>>>>>
>>>>
>>>> Yes.
>>>>
>>>>> You mentioned that amount of memory that is required for crash kernel
>>>>> reservation depends on the devices present on the system. Is is possible to
>>>>> detect how much memory is required at late stages of boot?
>>>>
>>>> Here is my thinking:
>>>>
>>>> There seems to be some kind of formula we can roughly use to come up with
>>>> the final crashkernel size. Baoquan for sure knows all the dirty details, I
>>>> assume it's roughly "core kernel + drivers + user space".
>>>>
>>>> In the kernel, we can only come up with "core kernel + drivers" expecting
>>>> that we will run
>>>>
>>>> a) roughly the same kernel
>>>> b) with roughly the same drivers
>>>
>>> As replied to Mike, kernel size is undecided for different kernel with
>>> different configs. We can define a default minimal size to cover kernel
>>> and driver on systems with not many devices, but hardcoding the size
>>> into upstream is not helpful. If the size is big, users will be asked to
>>> check and shrink always. If the size is too small, a new value need be
>>> got and added to cmdline and reboot.
>>>
>>
>> Hi Baoquan, Kairui, Dave,
>>
>> so IIUC now, our "old" kernel cannot actually tell us any reliable
>> "crashkernel area size" because
>>
>> a) it has no idea with which cmdline parameters the crashkernel will be
>>     started with, and these can have a big impact.
>> b) it has no idea which driver will be loaded in the crashkernel.
>> c) It has no idea what will be running in the crashkernel user space.
>>
>>
>> AFAIKS, best we can do without further information is, therefore, use some
>> heuristic to a) allocate some memory early during boot in the kernel and b)
>> later refine our allocation, triggered by user space (-> shrink the
>> crashkernel area).
>>
>> I dislike calling a) "auto". It provides a default based on some heuristic
>> (boot memory size), and that default might be very unfortunate in some
>> scenarios (-> waste memory).
>>
>> While we could discuss calling the current approach ( a)
>> )"crashkernel=default", whereby the default is encoded at compile time as
>> determined by a distributor, I still still quite don't like it because it
>> feels like this is not necessary. We have a way to pass something like that
>> via the cmdline, so it's just a matter of properly using that feature from
>> user space.
>>
>>
>> AFAIKS, all you want is most probably a more dynamic way to construct a
>> kernel cmdline, with some properties specific to a kernel.
>>
>> Let's assume the following:
>>
>> a) When a distributor ships a kernel, he also ships some kind of defaults
>> file. Let's assume for simplicity
>>
>> /lib/modules/5.11.19-200.fc33.x86_64/defaults.conf
>>
>> The file might contain
>>
>> CRASHKERNEL_DEFAULT=WHATEVER
>>
>>
>> b) When generating the cmdline for e.g.,
>> /boot/loader/entries/XXX-5.11.19-200.fc33.x86_64.conf we run some script
>> that consult that file in addition to /etc/default/grub. For example, if the
>> kdump service was installed and /etc/default/grub does not contain
>> "crashkernel=" (except when we encounter "crashkernel=auto" for compat
>> handling), we add "crashkernel=WHATEVER". Of course, we might do more
>> involved stuff based on the current setup, user config, etc.
>>
>>
>> c) When we install the kdump service, all we have to do is re-generate the
>> boot entries AFAIKS. Just like we would when adding "crashkernel=auto" right
>> now.
>>
>>
>> The end result would also allow for having per-kernel defaults and change
>> them on kernel updates. Would require some thought on how to make it fly in
>> user space, how to "ship" the defaults etc.
> 
> Thanks for looking into this, and really appreciate your insight,
> comments and patience.

Thanks for being patient with me :)

> 
> We had a sync in team about various viable solutions the other day,
> and also talked about the similar one as you suggested here since
> it seems to be able to resolve the concerns we have for a replacement
> of crashkernel=auto. We will try these in userspace in our side, hope it
> won't introduce risk and can replace crashkernel=auto perfectly.

Sure, and as I said, if we want to look into shrinking of the 
crashkernel area triggered by user space, I'm happy to help.
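
For reference, a minimal sketch of what user-space-triggered shrinking
could look like with the existing /sys/kernel/kexec_crash_size file. As
far as I know, reading it gives the current reservation in bytes and
writing a smaller value releases the difference; it needs root, and the
kernel is expected to reject the write once a crash kernel has been
loaded. The function name and the "path" parameter are only there for
illustration and for trying the logic against an ordinary file.

```python
from pathlib import Path

# Existing sysfs file; reading yields the reserved size in bytes.
CRASH_SIZE = Path("/sys/kernel/kexec_crash_size")


def shrink_crashkernel(new_size, path=CRASH_SIZE):
    """Shrink the crashkernel reservation to new_size bytes.

    Returns the size reported after the attempt. Growing is not
    attempted here, so a larger or equal request is a no-op.
    """
    current = int(path.read_text().strip())
    if new_size >= current:
        return current
    # Errors from the kernel (e.g. a crash kernel is already loaded)
    # surface as OSError from the write.
    path.write_text(str(new_size))
    return int(path.read_text().strip())
```

A kdump service would call something like this once it knows how much
memory its initramfs and cmdline actually need, and before loading the
crash kernel.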

-- 
Thanks,

David / dhildenb



* Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
  2021-05-18  8:51                                 ` David Hildenbrand
@ 2021-05-18  9:24                                   ` Dave Young
  0 siblings, 0 replies; 119+ messages in thread
From: Dave Young @ 2021-05-18  9:24 UTC (permalink / raw)
  To: David Hildenbrand, hbathini
  Cc: Baoquan He, Mike Rapoport, Andrew Morton, christian.brauner,
	colin.king, corbet, frederic, gpiccoli, john.p.donnelly,
	jpoimboe, keescook, linux-mm, masahiroy, mchehab+huawei,
	mike.kravetz, mingo, mm-commits, paulmck, peterz, rdunlap,
	rostedt, saeed.mirzamohammadi, samitolvanen, sboyd, tglx,
	torvalds, vgoyal, yifeifz2, Michal Hocko, kasong, kexec

[Adding the kexec list; anyone interested in the earlier replies can find them in the linux-mm archive.]
On 05/18/21 at 10:51am, David Hildenbrand wrote:
> On 18.05.21 10:49, Baoquan He wrote:
> > On 05/17/21 at 10:22am, David Hildenbrand wrote:
> > > On 12.05.21 16:51, Baoquan He wrote:
> > > > On 05/11/21 at 07:07pm, David Hildenbrand wrote:
> > > > > > > If the way adding default value into kernel config is disliked,
> > > > > > > this a) option looks good. We can get value with x% of system RAM, but
> > > > > > > clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
> > > > > > > defined with a default value for different ARCHes. It's very close to
> > > > > > > our current implementation, and handling 'auto' in kernel.
> > > > > > > 
> > > > > > > And kernel config provided so that people can tune the MIN/MAX value,
> > > > > > > but no need to post patch to do the tuning each time if have to?
> > > > > > Maybe I'm missing something, but the whole point is to avoid kernel
> > > > > > configuration option at all. If the crashkernel=auto works good for 99% of
> > > > > > the cases, there is no need to provide build time configuration along with
> > > > > > it. There are plenty of ways users can control crashkernel reservations
> > > > > > with the existing 2-4 (depending on architecture) command line options.
> > > > > > 
> > > > > > Simply hard coding a reasonable defaults (e.g.
> > > > > > "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
> > > > > > crashkernel=auto is set would cover the same 99% of users you referred to.
> > > > > 
> > > > > Right, and we can easily allocate a bit more as a safety net temporarily
> > > > > when we can actually shrink the area later.
> > > > > 
> > > > > > 
> > > > > > If we can resize the reservation later during boot this will also address
> > > > > > David's concern about the wasted memory.
> > > > > > 
> > > > > 
> > > > > Yes.
> > > > > 
> > > > > > You mentioned that amount of memory that is required for crash kernel
> > > > > > reservation depends on the devices present on the system. Is is possible to
> > > > > > detect how much memory is required at late stages of boot?
> > > > > 
> > > > > Here is my thinking:
> > > > > 
> > > > > There seems to be some kind of formula we can roughly use to come up with
> > > > > the final crashkernel size. Baoquan for sure knows all the dirty details, I
> > > > > assume it's roughly "core kernel + drivers + user space".
> > > > > 
> > > > > In the kernel, we can only come up with "core kernel + drivers" expecting
> > > > > that we will run
> > > > > 
> > > > > a) roughly the same kernel
> > > > > b) with roughly the same drivers
> > > > 
> > > > As replied to Mike, kernel size is undecided for different kernel with
> > > > different configs. We can define a default minimal size to cover kernel
> > > > and driver on systems with not many devices, but hardcoding the size
> > > > into upstream is not helpful. If the size is big, users will be asked to
> > > > check and shrink always. If the size is too small, a new value need be
> > > > got and added to cmdline and reboot.
> > > > 
> > > 
> > > Hi Baoquan, Kairui, Dave,
> > > 
> > > so IIUC now, our "old" kernel cannot actually tell us any reliable
> > > "crashkernel area size" because
> > > 
> > > a) it has no idea with which cmdline parameters the crashkernel will be
> > >     started with, and these can have a big impact.
> > > b) it has no idea which driver will be loaded in the crashkernel.
> > > c) It has no idea what will be running in the crashkernel user space.
> > > 
> > > 
> > > AFAIKS, best we can do without further information is, therefore, use some
> > > heuristic to a) allocate some memory early during boot in the kernel and b)
> > > later refine our allocation, triggered by user space (-> shrink the
> > > crashkernel area).
> > > 
> > > I dislike calling a) "auto". It provides a default based on some heuristic
> > > (boot memory size), and that default might be very unfortunate in some
> > > scenarios (-> waste memory).
> > > 
> > > While we could discuss calling the current approach ( a)
> > > )"crashkernel=default", whereby the default is encoded at compile time as
> > > determined by a distributor, I still still quite don't like it because it
> > > feels like this is not necessary. We have a way to pass something like that
> > > via the cmdline, so it's just a matter of properly using that feature from
> > > user space.
> > > 
> > > 
> > > AFAIKS, all you want is most probably a more dynamic way to construct a
> > > kernel cmdline, with some properties specific to a kernel.
> > > 
> > > Let's assume the following:
> > > 
> > > a) When a distributor ships a kernel, he also ships some kind of defaults
> > > file. Let's assume for simplicity
> > > 
> > > /lib/modules/5.11.19-200.fc33.x86_64/defaults.conf
> > > 
> > > The file might contain
> > > 
> > > CRASHKERNEL_DEFAULT=WHATEVER
> > > 
> > > 
> > > b) When generating the cmdline for e.g.,
> > > /boot/loader/entries/XXX-5.11.19-200.fc33.x86_64.conf we run some script
> > > that consult that file in addition to /etc/default/grub. For example, if the
> > > kdump service was installed and /etc/default/grub does not contain
> > > "crashkernel=" (except when we encounter "crashkernel=auto" for compat
> > > handling), we add "crashkernel=WHATEVER". Of course, we might do more
> > > involved stuff based on the current setup, user config, etc.
> > > 
> > > 
> > > c) When we install the kdump service, all we have to do is re-generate the
> > > boot entries AFAIKS. Just like we would when adding "crashkernel=auto" right
> > > now.
> > > 
> > > 
> > > The end result would also allow for having per-kernel defaults and change
> > > them on kernel updates. Would require some thought on how to make it fly in
> > > user space, how to "ship" the defaults etc.
> > 
> > Thanks for looking into this, and really appreciate your insight,
> > comments and patience.
> 
> Thanks for being patient with me :)
> 
> > 
> > We had a sync in team about various viable solutions the other day,
> > and also talked about the similar one as you suggested here since
> > it seems to be able to resolve the concerns we have for a replacement
> > of crashkernel=auto. We will try these in userspace in our side, hope it
> > won't introduce risk and can replace crashkernel=auto perfectly.
> 
> Sure, and as I said, if we want to look into shrinking of the crashkernel
> area triggered by user space, I'm happy to help.
> 

David, Baoquan, thank you both for exploring the issue.  Let's try to do
it like this downstream.

The kdump initramfs is built with only what kdump needs, so it has lower
memory requirements, whereas fadump depends on the normal kernel
initramfs and therefore needs more memory than kdump.

Hari, with this new no-auto approach, another thing we need to consider is how
fadump will use the same value if you do not introduce a new param.  As you
are working in dracut to pack the kdump initramfs into the 1st kernel
initramfs, it is possible that kdump and fadump can use the same value,
maybe the kdump crashkernel value plus some static number for powerpc
only. Anyway, just a thought.  Please provide your comments, if any.

Thanks
Dave



end of thread, other threads:[~2021-05-18  9:24 UTC | newest]

Thread overview: 119+ messages
2021-05-07  1:01 incoming Andrew Morton
2021-05-07  1:02 ` [patch 01/91] alpha: eliminate old-style function definitions Andrew Morton
2021-05-07  1:02 ` [patch 02/91] alpha: csum_partial_copy.c: add function prototypes from <net/checksum.h> Andrew Morton
2021-05-07  1:02 ` [patch 03/91] fs/proc/generic.c: fix incorrect pde_is_permanent check Andrew Morton
2021-05-07  1:02 ` [patch 04/91] proc: save LOC in __xlate_proc_name() Andrew Morton
2021-05-07  2:24   ` Linus Torvalds
2021-05-07  1:02 ` [patch 05/91] proc: mandate ->proc_lseek in "struct proc_ops" Andrew Morton
2021-05-07  1:02 ` [patch 06/91] proc: delete redundant subset=pid check Andrew Morton
2021-05-07  1:02 ` [patch 07/91] selftests: proc: test subset=pid Andrew Morton
2021-05-07  1:02 ` [patch 08/91] proc/sysctl: fix function name error in comments Andrew Morton
2021-05-07  1:02 ` [patch 09/91] include: remove pagemap.h from blkdev.h Andrew Morton
2021-05-07  1:02 ` [patch 10/91] kernel.h: drop inclusion in bitmap.h Andrew Morton
2021-05-07  1:02 ` [patch 11/91] linux/profile.h: remove unnecessary declaration Andrew Morton
2021-05-07  1:02 ` [patch 12/91] kernel/async.c: fix pr_debug statement Andrew Morton
2021-05-07  1:02 ` [patch 13/91] kernel/cred.c: make init_groups static Andrew Morton
2021-05-07  1:02 ` [patch 14/91] tools: disable -Wno-type-limits Andrew Morton
2021-05-07  1:02 ` [patch 15/91] tools: bitmap: sync function declarations with the kernel Andrew Morton
2021-05-07  1:02 ` [patch 16/91] tools: sync BITMAP_LAST_WORD_MASK() macro " Andrew Morton
2021-05-07  1:02 ` [patch 17/91] arch: rearrange headers inclusion order in asm/bitops for m68k, sh and h8300 Andrew Morton
2021-05-07  1:02 ` [patch 18/91] lib: extend the scope of small_const_nbits() macro Andrew Morton
2021-05-07  1:03 ` [patch 19/91] tools: sync small_const_nbits() macro with the kernel Andrew Morton
2021-05-07  1:03 ` [patch 20/91] lib: inline _find_next_bit() wrappers Andrew Morton
2021-05-07  1:03 ` [patch 21/91] tools: sync find_next_bit implementation Andrew Morton
2021-05-07  1:03 ` [patch 22/91] lib: add fast path for find_next_*_bit() Andrew Morton
2021-05-07  1:03 ` [patch 23/91] lib: add fast path for find_first_*_bit() and find_last_bit() Andrew Morton
2021-05-07  1:03 ` [patch 24/91] tools: sync lib/find_bit implementation Andrew Morton
2021-05-07  1:03 ` [patch 25/91] MAINTAINERS: add entry for the bitmap API Andrew Morton
2021-05-07  1:03 ` [patch 26/91] lib/bch.c: fix a typo in the file bch.c Andrew Morton
2021-05-07  1:03 ` [patch 27/91] lib: fix inconsistent indenting in process_bit1() Andrew Morton
2021-05-07  1:03 ` [patch 28/91] lib/list_sort.c: fix typo in function description Andrew Morton
2021-05-07  1:03 ` [patch 29/91] lib/genalloc.c: fix a typo Andrew Morton
2021-05-07  1:03 ` [patch 30/91] lib: crc8: pointer to data block should be const Andrew Morton
2021-05-07  1:03 ` [patch 31/91] lib: stackdepot: turn depot_lock spinlock to raw_spinlock Andrew Morton
2021-05-07  1:03 ` [patch 32/91] lib/percpu_counter: tame kernel-doc compile warning Andrew Morton
2021-05-07  1:03 ` [patch 33/91] lib/genalloc: add parameter description to fix doc " Andrew Morton
2021-05-07  1:03 ` [patch 34/91] lib: parser: clean up kernel-doc Andrew Morton
2021-05-07  1:03 ` [patch 35/91] include/linux/compat.h: remove unneeded declaration from COMPAT_SYSCALL_DEFINEx() Andrew Morton
2021-05-07  1:03 ` [patch 36/91] checkpatch: warn when missing newline in return sysfs_emit() formats Andrew Morton
2021-05-07  1:03 ` [patch 37/91] checkpatch: exclude four preprocessor sub-expressions from MACRO_ARG_REUSE Andrew Morton
2021-05-07  1:04 ` [patch 38/91] checkpatch: improve ALLOC_ARRAY_ARGS test Andrew Morton
2021-05-07  1:04 ` [patch 39/91] kselftest: introduce new epoll test case Andrew Morton
2021-05-07  1:04 ` [patch 40/91] fs/epoll: restore waking from ep_done_scan() Andrew Morton
2021-05-07  1:04 ` [patch 41/91] isofs: fix fall-through warnings for Clang Andrew Morton
2021-05-07  1:04 ` [patch 42/91] fs/nilfs2: fix misspellings using codespell tool Andrew Morton
2021-05-07  1:04 ` [patch 43/91] nilfs2: fix typos in comments Andrew Morton
2021-05-07  1:04 ` [patch 44/91] hpfs: replace one-element array with flexible-array member Andrew Morton
2021-05-07  1:04 ` [patch 45/91] do_wait: make PIDTYPE_PID case O(1) instead of O(n) Andrew Morton
2021-05-07  1:04 ` [patch 46/91] kernel/fork.c: simplify copy_mm() Andrew Morton
2021-05-07  1:04 ` [patch 47/91] kernel/fork.c: fix typos Andrew Morton
2021-05-07  1:04 ` [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation Andrew Morton
2021-05-07  7:25   ` Linus Torvalds
2021-05-08  3:13     ` Baoquan He
2021-05-08  3:29       ` Baoquan He
2021-05-07  8:16   ` David Hildenbrand
2021-05-08  8:51     ` Baoquan He
2021-05-08  9:22       ` David Hildenbrand
2021-05-10  4:53         ` Baoquan He
2021-05-10  8:32           ` David Hildenbrand
2021-05-10 10:43             ` Baoquan He
2021-05-10 11:01               ` David Hildenbrand
2021-05-10 11:44                 ` Dave Young
2021-05-10 11:56                   ` David Hildenbrand
2021-05-11 13:36                     ` Baoquan He
2021-05-11 16:31                       ` Mike Rapoport
2021-05-11 17:07                         ` David Hildenbrand
2021-05-12 14:51                           ` Baoquan He
2021-05-12 15:07                             ` David Hildenbrand
2021-05-13  5:04                               ` Baoquan He
2021-05-12 19:03                             ` Kairui Song
2021-05-17  8:22                             ` David Hildenbrand
2021-05-18  8:49                               ` Baoquan He
2021-05-18  8:51                                 ` David Hildenbrand
2021-05-18  9:24                                   ` Dave Young
2021-05-12 14:13                         ` Baoquan He
2021-05-12  7:42                     ` Dave Young
2021-05-07  1:04 ` [patch 49/91] kexec: add kexec reboot string Andrew Morton
2021-05-07  1:04 ` [patch 50/91] kernel: kexec_file: fix error return code of kexec_calculate_store_digests() Andrew Morton
2021-05-07  1:04 ` [patch 51/91] kexec: dump kmessage before machine_kexec Andrew Morton
2021-05-07  1:04 ` [patch 52/91] gcov: combine common code Andrew Morton
2021-05-07  1:04 ` [patch 53/91] gcov: simplify buffer allocation Andrew Morton
2021-05-07  1:04 ` [patch 54/91] gcov: use kvmalloc() Andrew Morton
2021-05-07  1:04 ` [patch 55/91] gcov: clang: drop support for clang-10 and older Andrew Morton
2021-05-07  1:04 ` [patch 56/91] smp: kernel/panic.c - silence warnings Andrew Morton
2021-05-07  1:05 ` [patch 57/91] delayacct: clear right task's flag after blkio completes Andrew Morton
2021-05-07  1:05 ` [patch 58/91] gdb: lx-symbols: store the abspath() Andrew Morton
2021-05-07  1:05 ` [patch 59/91] scripts/gdb: document lx_current is only supported by x86 Andrew Morton
2021-05-07  1:05 ` [patch 60/91] scripts/gdb: add lx_current support for arm64 Andrew Morton
2021-05-07  1:05 ` [patch 61/91] kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources Andrew Morton
2021-05-07  1:05 ` [patch 62/91] kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources Andrew Morton
2021-05-07  1:05 ` [patch 63/91] kernel/resource: remove first_lvl / siblings_only logic Andrew Morton
2021-05-07  1:05 ` [patch 64/91] kernel/resource: allow region_intersects users to hold resource_lock Andrew Morton
2021-05-07  1:05 ` [patch 65/91] kernel/resource: refactor __request_region to allow external locking Andrew Morton
2021-05-07  1:05 ` [patch 66/91] kernel/resource: fix locking in request_free_mem_region Andrew Morton
2021-05-07  1:05 ` [patch 67/91] selftests: remove duplicate include Andrew Morton
2021-05-07  1:05 ` [patch 68/91] kernel/async.c: stop guarding pr_debug() statements Andrew Morton
2021-05-07  1:05 ` [patch 69/91] kernel/async.c: remove async_unregister_domain() Andrew Morton
2021-05-07  1:05 ` [patch 70/91] init/initramfs.c: do unpacking asynchronously Andrew Morton
2021-05-07  1:05 ` [patch 71/91] modules: add CONFIG_MODPROBE_PATH Andrew Morton
2021-05-07  1:05 ` [patch 72/91] ipc/sem.c: mundane typo fixes Andrew Morton
2021-05-07  1:05 ` [patch 73/91] mm: fix some typos and code style problems Andrew Morton
2021-05-07  1:05 ` [patch 74/91] drivers/char: remove /dev/kmem for good Andrew Morton
2021-05-07  1:06 ` [patch 75/91] mm: remove xlate_dev_kmem_ptr() Andrew Morton
2021-05-07  1:06 ` [patch 76/91] mm/vmalloc: remove vwrite() Andrew Morton
2021-05-07  1:06 ` [patch 77/91] arm: print alloc free paths for address in registers Andrew Morton
2021-05-07  1:06 ` [patch 78/91] scripts/spelling.txt: add "overlfow" Andrew Morton
2021-05-07  1:06 ` [patch 79/91] scripts/spelling.txt: add "diabled" typo Andrew Morton
2021-05-07  1:06 ` [patch 80/91] scripts/spelling.txt: add "overflw" Andrew Morton
2021-05-07  1:06 ` [patch 81/91] mm/slab.c: fix spelling mistake "disired" -> "desired" Andrew Morton
2021-05-07  1:06 ` [patch 82/91] include/linux/pgtable.h: few spelling fixes Andrew Morton
2021-05-07  1:06 ` [patch 83/91] kernel/umh.c: fix some spelling mistakes Andrew Morton
2021-05-07  1:06 ` [patch 84/91] kernel/user_namespace.c: fix typos Andrew Morton
2021-05-07  1:06 ` [patch 85/91] kernel/up.c: fix typo Andrew Morton
2021-05-07  1:06 ` [patch 86/91] kernel/sys.c: " Andrew Morton
2021-05-07  1:06 ` [patch 87/91] fs: fat: fix spelling typo of values Andrew Morton
2021-05-07  1:06 ` [patch 88/91] ipc/sem.c: spelling fix Andrew Morton
2021-05-07  1:06 ` [patch 89/91] treewide: remove editor modelines and cruft Andrew Morton
2021-05-07  1:06 ` [patch 90/91] mm: fix typos in comments Andrew Morton
2021-05-07  1:06 ` [patch 91/91] " Andrew Morton
2021-05-07  7:12 ` incoming Linus Torvalds
