linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* incoming
@ 2021-11-05 20:34 Andrew Morton
  2021-11-05 20:34 ` [patch 001/262] scripts/spelling.txt: add more spellings to spelling.txt Andrew Morton
                   ` (261 more replies)
  0 siblings, 262 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

262 patches, based on 8bb7eca972ad531c9b149c0a51ab43a417385813

Subsystems affected by this patch series:

  scripts
  ocfs2
  vfs
  mm/slab-generic
  mm/slab
  mm/slub
  mm/kconfig
  mm/dax
  mm/kasan
  mm/debug
  mm/pagecache
  mm/gup
  mm/swap
  mm/memcg
  mm/pagemap
  mm/mprotect
  mm/mremap
  mm/iomap
  mm/tracing
  mm/vmalloc
  mm/pagealloc
  mm/memory-failure
  mm/hugetlb
  mm/userfaultfd
  mm/vmscan
  mm/tools
  mm/memblock
  mm/oom-kill
  mm/hugetlbfs
  mm/migration
  mm/thp
  mm/readahead
  mm/nommu
  mm/ksm
  mm/vmstat
  mm/madvise
  mm/memory-hotplug
  mm/rmap
  mm/zsmalloc
  mm/highmem
  mm/zram
  mm/cleanups
  mm/kfence
  mm/damon

Subsystem: scripts

    Colin Ian King <colin.king@canonical.com>:
      scripts/spelling.txt: add more spellings to spelling.txt

    Sven Eckelmann <sven@narfation.org>:
      scripts/spelling.txt: fix "mistake" version of "synchronization"

    weidonghui <weidonghui@allwinnertech.com>:
      scripts/decodecode: fix faulting instruction no print when opps.file is DOS format

Subsystem: ocfs2

    Chenyuan Mi <cymi20@fudan.edu.cn>:
      ocfs2: fix handle refcount leak in two exception handling paths

    Valentin Vidic <vvidic@valentin-vidic.from.hr>:
      ocfs2: cleanup journal init and shutdown

    Colin Ian King <colin.king@canonical.com>:
      ocfs2/dlm: remove redundant assignment of variable ret

    Jan Kara <jack@suse.cz>:
    Patch series "ocfs2: Truncate data corruption fix":
      ocfs2: fix data corruption on truncate
      ocfs2: do not zero pages beyond i_size

Subsystem: vfs

    Arnd Bergmann <arnd@arndb.de>:
      fs/posix_acl.c: avoid -Wempty-body warning

    Jia He <justin.he@arm.com>:
      d_path: fix Kernel doc validator complaining

Subsystem: mm/slab-generic

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: move kvmalloc-related functions to slab.h

Subsystem: mm/slab

    Shi Lei <shi_lei@massclouds.com>:
      mm/slab.c: remove useless lines in enable_cpucache()

Subsystem: mm/slub

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      slub: add back check for free nonslab objects

    Vlastimil Babka <vbabka@suse.cz>:
      mm, slub: change percpu partial accounting from objects to pages
      mm/slub: increase default cpu partial list sizes

    Hyeonggon Yoo <42.hyeyoo@gmail.com>:
      mm, slub: use prefetchw instead of prefetch

Subsystem: mm/kconfig

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm: disable NUMA_BALANCING_DEFAULT_ENABLED and TRANSPARENT_HUGEPAGE on PREEMPT_RT

Subsystem: mm/dax

    Christoph Hellwig <hch@lst.de>:
      mm: don't include <linux/dax.h> in <linux/mempolicy.h>

Subsystem: mm/kasan

    Marco Elver <elver@google.com>:
    Patch series "stackdepot, kasan, workqueue: Avoid expanding stackdepot slabs when holding raw_spin_lock", v2:
      lib/stackdepot: include gfp.h
      lib/stackdepot: remove unused function argument
      lib/stackdepot: introduce __stack_depot_save()
      kasan: common: provide can_alloc in kasan_save_stack()
      kasan: generic: introduce kasan_record_aux_stack_noalloc()
      workqueue, kasan: avoid alloc_pages() when recording stack

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      kasan: fix tag for large allocations when using CONFIG_SLAB

    Peter Collingbourne <pcc@google.com>:
      kasan: test: add memcpy test that avoids out-of-bounds write

Subsystem: mm/debug

    Peter Xu <peterx@redhat.com>:
    Patch series "mm/smaps: Fixes and optimizations on shmem swap handling":
      mm/smaps: fix shmem pte hole swap calculation
      mm/smaps: use vma->vm_pgoff directly when counting partial swap
      mm/smaps: simplify shmem handling of pte holes

    Guo Ren <guoren@linux.alibaba.com>:
      mm: debug_vm_pgtable: don't use __P000 directly

    Kees Cook <keescook@chromium.org>:
      kasan: test: bypass __alloc_size checks
    Patch series "Add __alloc_size()", v3:
      rapidio: avoid bogus __alloc_size warning
      Compiler Attributes: add __alloc_size() for better bounds checking
      slab: clean up function prototypes
      slab: add __alloc_size attributes for better bounds checking
      mm/kvmalloc: add __alloc_size attributes for better bounds checking
      mm/vmalloc: add __alloc_size attributes for better bounds checking
      mm/page_alloc: add __alloc_size attributes for better bounds checking
      percpu: add __alloc_size attributes for better bounds checking

    Yinan Zhang <zhangyinan2019@email.szu.edu.cn>:
      mm/page_ext.c: fix a comment

Subsystem: mm/pagecache

    David Howells <dhowells@redhat.com>:
      mm: stop filemap_read() from grabbing a superfluous page

    Christoph Hellwig <hch@lst.de>:
    Patch series "simplify bdi unregistation":
      mm: export bdi_unregister
      mtd: call bdi_unregister explicitly
      fs: explicitly unregister per-superblock BDIs
      mm: don't automatically unregister bdis
      mm: simplify bdi refcounting

    Jens Axboe <axboe@kernel.dk>:
      mm: don't read i_size of inode unless we need it

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/filemap.c: remove bogus VM_BUG_ON

    Jens Axboe <axboe@kernel.dk>:
      mm: move more expensive part of XA setup out of mapping check

Subsystem: mm/gup

    John Hubbard <jhubbard@nvidia.com>:
      mm/gup: further simplify __gup_device_huge()

Subsystem: mm/swap

    Xu Wang <vulab@iscas.ac.cn>:
      mm/swapfile: remove needless request_queue NULL pointer check

    Rafael Aquini <aquini@redhat.com>:
      mm/swapfile: fix an integer overflow in swap_show()

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: optimise put_pages_list()

Subsystem: mm/memcg

    Peter Xu <peterx@redhat.com>:
      mm/memcg: drop swp_entry_t* in mc_handle_file_pte()

    Shakeel Butt <shakeelb@google.com>:
      memcg: flush stats only if updated
      memcg: unify memcg stat flushing

    Waiman Long <longman@redhat.com>:
      mm/memcg: remove obsolete memcg_free_kmem()

    Len Baker <len.baker@gmx.com>:
      mm/list_lru.c: prefer struct_size over open coded arithmetic

    Shakeel Butt <shakeelb@google.com>:
      memcg, kmem: further deprecate kmem.limit_in_bytes

    Muchun Song <songmuchun@bytedance.com>:
      mm: list_lru: remove holding lru lock
      mm: list_lru: fix the return value of list_lru_count_one()
      mm: memcontrol: remove kmemcg_id reparenting
      mm: memcontrol: remove the kmem states
      mm: list_lru: only add memcg-aware lrus to the global lru list

    Vasily Averin <vvs@virtuozzo.com>:
    Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3:
      mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks

    Michal Hocko <mhocko@suse.com>:
      mm, oom: do not trigger out_of_memory from the #PF

    Vasily Averin <vvs@virtuozzo.com>:
      memcg: prohibit unconditional exceeding the limit of dying tasks

Subsystem: mm/pagemap

    Peng Liu <liupeng256@huawei.com>:
      mm/mmap.c: fix a data race of mm->total_vm

    Rolf Eike Beer <eb@emlix.com>:
      mm: use __pfn_to_section() instead of open coding it

    Amit Daniel Kachhap <amit.kachhap@arm.com>:
      mm/memory.c: avoid unnecessary kernel/user pointer conversion

    Nadav Amit <namit@vmware.com>:
      mm/memory.c: use correct VMA flags when freeing page-tables

    Peter Xu <peterx@redhat.com>:
    Patch series "mm: A few cleanup patches around zap, shmem and uffd", v4:
      mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte
      mm: clear vmf->pte after pte_unmap_same() returns
      mm: drop first_index/last_index in zap_details
      mm: add zap_skip_check_mapping() helper

    Qi Zheng <zhengqi.arch@bytedance.com>:
    Patch series "Do some code cleanups related to mm", v3:
      mm: introduce pmd_install() helper
      mm: remove redundant smp_wmb()

    Tiberiu A Georgescu <tiberiu.georgescu@nutanix.com>:
      Documentation: update pagemap with shmem exceptions

    Nicholas Piggin <npiggin@gmail.com>:
    Patch series "shoot lazy tlbs", v4:
      lazy tlb: introduce lazy mm refcount helper functions
      lazy tlb: allow lazy tlb mm refcounting to be configurable
      lazy tlb: shoot lazies, a non-refcounting lazy tlb option
      powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
      memory: remove unused CONFIG_MEM_BLOCK_SIZE

Subsystem: mm/mprotect

    Liu Song <liu.song11@zte.com.cn>:
      mm/mprotect.c: avoid repeated assignment in do_mprotect_pkey()

Subsystem: mm/mremap

    Dmitry Safonov <dima@arista.com>:
      mm/mremap: don't account pages in vma_to_resize()

Subsystem: mm/iomap

    Lucas De Marchi <lucas.demarchi@intel.com>:
      include/linux/io-mapping.h: remove fallback for writecombine

Subsystem: mm/tracing

    Gang Li <ligang.bdlg@bytedance.com>:
      mm: mmap_lock: remove redundant newline  in TP_printk
      mm: mmap_lock: use DECLARE_EVENT_CLASS and DEFINE_EVENT_FN

Subsystem: mm/vmalloc

    Vasily Averin <vvs@virtuozzo.com>:
      mm/vmalloc: repair warn_alloc()s in __vmalloc_area_node()

    Peter Zijlstra <peterz@infradead.org>:
      mm/vmalloc: don't allow VM_NO_GUARD on vmap()

    Eric Dumazet <edumazet@google.com>:
      mm/vmalloc: make show_numa_info() aware of hugepage mappings
      mm/vmalloc: make sure to dump unpurged areas in /proc/vmallocinfo

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc: do not adjust the search size for alignment overhead
      mm/vmalloc: check various alignments when debugging

    Vasily Averin <vvs@virtuozzo.com>:
      vmalloc: back off when the current task is OOM-killed

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      vmalloc: choose a better start address in vm_area_register_early()
      arm64: support page mapping percpu first chunk allocator
      kasan: arm64: fix pcpu_page_first_chunk crash with KASAN_VMALLOC

    Michal Hocko <mhocko@suse.com>:
      mm/vmalloc: be more explicit about supported gfp flags

    Chen Wandun <chenwandun@huawei.com>:
      mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation

    Changcheng Deng <deng.changcheng@zte.com.cn>:
      lib/test_vmalloc.c: use swap() to make code cleaner

Subsystem: mm/pagealloc

    Eric Dumazet <edumazet@google.com>:
      mm/large system hash: avoid possible NULL deref in alloc_large_system_hash

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanups and fixup for page_alloc", v2:
      mm/page_alloc.c: remove meaningless VM_BUG_ON() in pindex_to_order()
      mm/page_alloc.c: simplify the code by using macro K()
      mm/page_alloc.c: fix obsolete comment in free_pcppages_bulk()
      mm/page_alloc.c: use helper function zone_spans_pfn()
      mm/page_alloc.c: avoid allocating highmem pages via alloc_pages_exact[_nid]

    Bharata B Rao <bharata@amd.com>:
    Patch series "Fix NUMA nodes fallback list ordering":
      mm/page_alloc: print node fallback order

    Krupa Ramakrishnan <krupa.ramakrishnan@amd.com>:
      mm/page_alloc: use accumulated load when building node fallback list

    Geert Uytterhoeven <geert+renesas@glider.be>:
    Patch series "Fix NUMA without SMP":
      mm: move node_reclaim_distance to fix NUMA without SMP
      mm: move fold_vm_numa_events() to fix NUMA without SMP

    Eric Dumazet <edumazet@google.com>:
      mm/page_alloc.c: do not acquire zone lock in is_free_buddy_page()

    Feng Tang <feng.tang@intel.com>:
      mm/page_alloc: detect allocation forbidden by cpuset and bail out early

    Liangcai Fan <liangcaifan19@gmail.com>:
      mm/page_alloc.c: show watermark_boost of zone in zoneinfo

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      mm: create a new system state and fix core_kernel_text()
      mm: make generic arch_is_kernel_initmem_freed() do what it says
      powerpc: use generic version of arch_is_kernel_initmem_freed()
      s390: use generic version of arch_is_kernel_initmem_freed()

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm: page_alloc: use migrate_disable() in drain_local_pages_wq()

    Wang ShaoBo <bobo.shaobowang@huawei.com>:
      mm/page_alloc: use clamp() to simplify code

Subsystem: mm/memory-failure

    Marco Elver <elver@google.com>:
      mm: fix data race in PagePoisoned()

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      mm/memory_failure: constify static mm_walk_ops

    Yang Shi <shy828301@gmail.com>:
    Patch series "Solve silent data loss caused by poisoned page cache (shmem/tmpfs)", v5:
      mm: filemap: coding style cleanup for filemap_map_pmd()
      mm: hwpoison: refactor refcount check handling
      mm: shmem: don't truncate page if memory failure happens
      mm: hwpoison: handle non-anonymous THP correctly

Subsystem: mm/hugetlb

    Peter Xu <peterx@redhat.com>:
      mm/hugetlb: drop __unmap_hugepage_range definition from hugetlb.h

    Mike Kravetz <mike.kravetz@oracle.com>:
    Patch series "hugetlb: add demote/split page functionality", v4:
      hugetlb: add demote hugetlb page sysfs interfaces
      mm/cma: add cma_pages_valid to determine if pages are in CMA
      hugetlb: be sure to free demoted CMA pages to CMA
      hugetlb: add demote bool to gigantic page routines
      hugetlb: add hugetlb demote page support

    Liangcai Fan <liangcaifan19@gmail.com>:
      mm: khugepaged: recalculate min_free_kbytes after stopping khugepaged

    Mina Almasry <almasrymina@google.com>:
      mm, hugepages: add mremap() support for hugepage backed vma
      mm, hugepages: add hugetlb vma mremap() test

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      hugetlb: support node specified when using cma for gigantic hugepages

    Ran Jianping <ran.jianping@zte.com.cn>:
      mm: remove duplicate include in hugepage-mremap.c

    Baolin Wang <baolin.wang@linux.alibaba.com>:
    Patch series "Some cleanups and improvements for hugetlb":
      hugetlb_cgroup: remove unused hugetlb_cgroup_from_counter macro
      hugetlb: replace the obsolete hugetlb_instantiation_mutex in the comments
      hugetlb: remove redundant validation in has_same_uncharge_info()
      hugetlb: remove redundant VM_BUG_ON() in add_reservation_in_range()

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlb: remove unnecessary set_page_count in prep_compound_gigantic_page

Subsystem: mm/userfaultfd

    Axel Rasmussen <axelrasmussen@google.com>:
    Patch series "Small userfaultfd selftest fixups", v2:
      userfaultfd/selftests: don't rely on GNU extensions for random numbers
      userfaultfd/selftests: fix feature support detection
      userfaultfd/selftests: fix calculation of expected ioctls

Subsystem: mm/vmscan

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/page_isolation: fix potential missing call to unset_migratetype_isolate()
      mm/page_isolation: guard against possible putback unisolated page

    Kai Song <songkai01@inspur.com>:
      mm/vmscan.c: fix -Wunused-but-set-variable warning

    Mel Gorman <mgorman@techsingularity.net>:
    Patch series "Remove dependency on congestion_wait in mm/", v5. Patch series:
      mm/vmscan: throttle reclaim until some writeback completes if congested
      mm/vmscan: throttle reclaim and compaction when too may pages are isolated
      mm/vmscan: throttle reclaim when no progress is being made
      mm/writeback: throttle based on page writeback instead of congestion
      mm/page_alloc: remove the throttling logic from the page allocator
      mm/vmscan: centralise timeout values for reclaim_throttle
      mm/vmscan: increase the timeout if page reclaim is not making progress
      mm/vmscan: delay waking of tasks throttled on NOPROGRESS

    Yuanzheng Song <songyuanzheng@huawei.com>:
      mm/vmpressure: fix data-race with memcg->socket_pressure

Subsystem: mm/tools

    Zhenliang Wei <weizhenliang@huawei.com>:
      tools/vm/page_owner_sort.c: count and sort by mem

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
    Patch series "tools/vm/page-types.c: a few improvements":
      tools/vm/page-types.c: make walk_file() aware of address range option
      tools/vm/page-types.c: move show_file() to summary output
      tools/vm/page-types.c: print file offset in hexadecimal

Subsystem: mm/memblock

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "memblock: cleanup memblock_free interface", v2:
      arch_numa: simplify numa_distance allocation
      xen/x86: free_p2m_page: use memblock_free_ptr() to free a virtual pointer
      memblock: drop memblock_free_early_nid() and memblock_free_early()
      memblock: stop aliasing __memblock_free_late with memblock_free_late
      memblock: rename memblock_free to memblock_phys_free
      memblock: use memblock_free for freeing virtual pointers

Subsystem: mm/oom-kill

    Sultan Alsawaf <sultan@kerneltoast.com>:
      mm: mark the OOM reaper thread as freezable

Subsystem: mm/hugetlbfs

    Zhenguo Yao <yaozhenguo1@gmail.com>:
      hugetlbfs: extend the definition of hugepages parameter to support node allocation

Subsystem: mm/migration

    John Hubbard <jhubbard@nvidia.com>:
      mm/migrate: de-duplicate migrate_reason strings

    Yang Shi <shy828301@gmail.com>:
      mm: migrate: make demotion knob depend on migration

Subsystem: mm/thp

    "George G. Davis" <davis.george@siemens.com>:
      selftests/vm/transhuge-stress: fix ram size thinko

    Rongwei Wang <rongwei.wang@linux.alibaba.com>:
    Patch series "fix two bugs for file THP":
      mm, thp: lock filemap when truncating page cache
      mm, thp: fix incorrect unmap behavior for private pages

Subsystem: mm/readahead

    Lin Feng <linf@wangsu.com>:
      mm/readahead.c: fix incorrect comments for get_init_ra_size

Subsystem: mm/nommu

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      mm: nommu: kill arch_get_unmapped_area()

Subsystem: mm/ksm

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      selftest/vm: fix ksm selftest to run with different NUMA topologies

    Pedro Demarchi Gomes <pedrodemargomes@gmail.com>:
      selftests: vm: add KSM huge pages merging time test

Subsystem: mm/vmstat

    Liu Shixin <liushixin2@huawei.com>:
      mm/vmstat: annotate data race for zone->free_area[order].nr_free

    Lin Feng <linf@wangsu.com>:
      mm: vmstat.c: make extfrag_index show more pretty

Subsystem: mm/madvise

    David Hildenbrand <david@redhat.com>:
      selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers

Subsystem: mm/memory-hotplug

    Tang Yizhou <tangyizhou@huawei.com>:
      mm/memory_hotplug: add static qualifier for online_policy_to_str()

    David Hildenbrand <david@redhat.com>:
    Patch series "memory-hotplug.rst: document the "auto-movable" online policy":
      memory-hotplug.rst: fix two instances of "movablecore" that should be "movable_node"
      memory-hotplug.rst: fix wrong /sys/module/memory_hotplug/parameters/ path
      memory-hotplug.rst: document the "auto-movable" online policy
    Patch series "mm/memory_hotplug: Kconfig and 32 bit cleanups":
      mm/memory_hotplug: remove CONFIG_X86_64_ACPI_NUMA dependency from CONFIG_MEMORY_HOTPLUG
      mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE
      mm/memory_hotplug: restrict CONFIG_MEMORY_HOTPLUG to 64 bit
      mm/memory_hotplug: remove HIGHMEM leftovers
      mm/memory_hotplug: remove stale function declarations
      x86: remove memory hotplug support on X86_32
    Patch series "mm/memory_hotplug: full support for add_memory_driver_managed() with CONFIG_ARCH_KEEP_MEMBLOCK", v2:
      mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource()
      memblock: improve MEMBLOCK_HOTPLUG documentation
      memblock: allow to specify flags with memblock_add_node()
      memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED
      mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED

Subsystem: mm/rmap

    Alistair Popple <apopple@nvidia.com>:
      mm/rmap.c: avoid double faults migrating device private pages

Subsystem: mm/zsmalloc

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()

Subsystem: mm/highmem

    Ira Weiny <ira.weiny@intel.com>:
      mm/highmem: remove deprecated kmap_atomic

Subsystem: mm/zram

    Jaewon Kim <jaewon31.kim@samsung.com>:
      zram_drv: allow reclaim on bio_alloc

    Dan Carpenter <dan.carpenter@oracle.com>:
      zram: off by one in read_block_state()

    Brian Geffon <bgeffon@google.com>:
      zram: introduce an aged idle interface

Subsystem: mm/cleanups

    Stephen Kitt <steve@sk2.org>:
      mm: remove HARDENED_USERCOPY_FALLBACK

    Mianhan Liu <liumh1@shanghaitech.edu.cn>:
      include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      stacktrace: move filter_irq_stacks() to kernel/stacktrace.c
      kfence: count unexpectedly skipped allocations
      kfence: move saving stack trace of allocations into __kfence_alloc()
      kfence: limit currently covered allocations when pool nearly full
      kfence: add note to documentation about skipping covered allocations
      kfence: test: use kunit_skip() to skip tests
      kfence: shorten critical sections of alloc/free
      kfence: always use static branches to guard kfence_alloc()
      kfence: default to dynamic branch instead of static keys mode

Subsystem: mm/damon

    Geert Uytterhoeven <geert@linux-m68k.org>:
      mm/damon: grammar s/works/work/

    SeongJae Park <sjpark@amazon.de>:
      Documentation/vm: move user guides to admin-guide/mm/

    SeongJae Park <sj@kernel.org>:
      MAINTAINERS: update SeongJae's email address

    SeongJae Park <sjpark@amazon.de>:
      docs/vm/damon: remove broken reference
      include/linux/damon.h: fix kernel-doc comments for 'damon_callback'

    SeongJae Park <sj@kernel.org>:
      mm/damon/core: print kdamond start log in debug mode only

    Changbin Du <changbin.du@gmail.com>:
      mm/damon: remove unnecessary do_exit() from kdamond
      mm/damon: needn't hold kdamond_lock to print pid of kdamond

    Colin Ian King <colin.king@canonical.com>:
      mm/damon/core: nullify pointer ctx->kdamond with a NULL

    SeongJae Park <sj@kernel.org>:
    Patch series "Implement Data Access Monitoring-based Memory Operation Schemes":
      mm/damon/core: account age of target regions
      mm/damon/core: implement DAMON-based Operation Schemes (DAMOS)
      mm/damon/vaddr: support DAMON-based Operation Schemes
      mm/damon/dbgfs: support DAMON-based Operation Schemes
      mm/damon/schemes: implement statistics feature
      selftests/damon: add 'schemes' debugfs tests
      Docs/admin-guide/mm/damon: document DAMON-based Operation Schemes
    Patch series "DAMON: Support Physical Memory Address Space Monitoring::
      mm/damon/dbgfs: allow users to set initial monitoring target regions
      mm/damon/dbgfs-test: add a unit test case for 'init_regions'
      Docs/admin-guide/mm/damon: document 'init_regions' feature
      mm/damon/vaddr: separate commonly usable functions
      mm/damon: implement primitives for physical address space monitoring
      mm/damon/dbgfs: support physical memory monitoring
      Docs/DAMON: document physical memory monitoring support

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      mm/damon/vaddr: constify static mm_walk_ops

    Rongwei Wang <rongwei.wang@linux.alibaba.com>:
      mm/damon/dbgfs: remove unnecessary variables

    SeongJae Park <sj@kernel.org>:
      mm/damon/paddr: support the pageout scheme
      mm/damon/schemes: implement size quota for schemes application speed control
      mm/damon/schemes: skip already charged targets and regions
      mm/damon/schemes: implement time quota
      mm/damon/dbgfs: support quotas of schemes
      mm/damon/selftests: support schemes quotas
      mm/damon/schemes: prioritize regions within the quotas
      mm/damon/vaddr,paddr: support pageout prioritization
      mm/damon/dbgfs: support prioritization weights
      tools/selftests/damon: update for regions prioritization of schemes
      mm/damon/schemes: activate schemes based on a watermarks mechanism
      mm/damon/dbgfs: support watermarks
      selftests/damon: support watermarks
      mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
      Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM

    Xin Hao <xhao@linux.alibaba.com>:
    Patch series "mm/damon: Fix some small bugs", v4:
      mm/damon: remove unnecessary variable initialization
      mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on

    SeongJae Park <sj@kernel.org>:
    Patch series "Fix trivial nits in Documentation/admin-guide/mm":
      Docs/admin-guide/mm/damon/start: fix wrong example commands
      Docs/admin-guide/mm/damon/start: fix a wrong link
      Docs/admin-guide/mm/damon/start: simplify the content
      Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions

    Changbin Du <changbin.du@gmail.com>:
      mm/damon: simplify stop mechanism

    Colin Ian King <colin.i.king@googlemail.com>:
      mm/damon: fix a few spelling mistakes in comments and a pr_debug message

    Changbin Du <changbin.du@gmail.com>:
      mm/damon: remove return value from before_terminate callback

 a/Documentation/admin-guide/blockdev/zram.rst                  |    8 
 a/Documentation/admin-guide/cgroup-v1/memory.rst               |   11 
 a/Documentation/admin-guide/kernel-parameters.txt              |   14 
 a/Documentation/admin-guide/mm/damon/index.rst                 |    1 
 a/Documentation/admin-guide/mm/damon/reclaim.rst               |  235 +++
 a/Documentation/admin-guide/mm/damon/start.rst                 |  140 +
 a/Documentation/admin-guide/mm/damon/usage.rst                 |  117 +
 a/Documentation/admin-guide/mm/hugetlbpage.rst                 |   42 
 a/Documentation/admin-guide/mm/memory-hotplug.rst              |  147 +-
 a/Documentation/admin-guide/mm/pagemap.rst                     |   75 -
 a/Documentation/core-api/memory-hotplug.rst                    |    3 
 a/Documentation/dev-tools/kfence.rst                           |   23 
 a/Documentation/translations/zh_CN/core-api/memory-hotplug.rst |    4 
 a/Documentation/vm/damon/design.rst                            |   29 
 a/Documentation/vm/damon/faq.rst                               |    5 
 a/Documentation/vm/damon/index.rst                             |    1 
 a/Documentation/vm/page_owner.rst                              |   23 
 a/MAINTAINERS                                                  |    2 
 a/Makefile                                                     |   15 
 a/arch/Kconfig                                                 |   28 
 a/arch/alpha/kernel/core_irongate.c                            |    6 
 a/arch/arc/mm/init.c                                           |    6 
 a/arch/arm/mach-hisi/platmcpm.c                                |    2 
 a/arch/arm/mach-rpc/ecard.c                                    |    2 
 a/arch/arm/mm/init.c                                           |    2 
 a/arch/arm64/Kconfig                                           |    4 
 a/arch/arm64/mm/kasan_init.c                                   |   16 
 a/arch/arm64/mm/mmu.c                                          |    4 
 a/arch/ia64/mm/contig.c                                        |    2 
 a/arch/ia64/mm/init.c                                          |    2 
 a/arch/m68k/mm/mcfmmu.c                                        |    3 
 a/arch/m68k/mm/motorola.c                                      |    6 
 a/arch/mips/loongson64/init.c                                  |    4 
 a/arch/mips/mm/init.c                                          |    6 
 a/arch/mips/sgi-ip27/ip27-memory.c                             |    3 
 a/arch/mips/sgi-ip30/ip30-setup.c                              |    6 
 a/arch/powerpc/Kconfig                                         |    1 
 a/arch/powerpc/configs/skiroot_defconfig                       |    1 
 a/arch/powerpc/include/asm/machdep.h                           |    2 
 a/arch/powerpc/include/asm/sections.h                          |   13 
 a/arch/powerpc/kernel/dt_cpu_ftrs.c                            |    8 
 a/arch/powerpc/kernel/paca.c                                   |    8 
 a/arch/powerpc/kernel/setup-common.c                           |    4 
 a/arch/powerpc/kernel/setup_64.c                               |    6 
 a/arch/powerpc/kernel/smp.c                                    |    2 
 a/arch/powerpc/mm/book3s64/radix_tlb.c                         |    4 
 a/arch/powerpc/mm/hugetlbpage.c                                |    9 
 a/arch/powerpc/platforms/powernv/pci-ioda.c                    |    4 
 a/arch/powerpc/platforms/powernv/setup.c                       |    4 
 a/arch/powerpc/platforms/pseries/setup.c                       |    2 
 a/arch/powerpc/platforms/pseries/svm.c                         |    9 
 a/arch/riscv/kernel/setup.c                                    |   10 
 a/arch/s390/include/asm/sections.h                             |   12 
 a/arch/s390/kernel/setup.c                                     |   11 
 a/arch/s390/kernel/smp.c                                       |    6 
 a/arch/s390/kernel/uv.c                                        |    2 
 a/arch/s390/mm/init.c                                          |    3 
 a/arch/s390/mm/kasan_init.c                                    |    2 
 a/arch/sh/boards/mach-ap325rxa/setup.c                         |    2 
 a/arch/sh/boards/mach-ecovec24/setup.c                         |    4 
 a/arch/sh/boards/mach-kfr2r09/setup.c                          |    2 
 a/arch/sh/boards/mach-migor/setup.c                            |    2 
 a/arch/sh/boards/mach-se/7724/setup.c                          |    4 
 a/arch/sparc/kernel/smp_64.c                                   |    4 
 a/arch/um/kernel/mem.c                                         |    4 
 a/arch/x86/Kconfig                                             |    6 
 a/arch/x86/kernel/setup.c                                      |    4 
 a/arch/x86/kernel/setup_percpu.c                               |    2 
 a/arch/x86/mm/init.c                                           |    2 
 a/arch/x86/mm/init_32.c                                        |   31 
 a/arch/x86/mm/kasan_init_64.c                                  |    4 
 a/arch/x86/mm/numa.c                                           |    2 
 a/arch/x86/mm/numa_emulation.c                                 |    2 
 a/arch/x86/xen/mmu_pv.c                                        |    8 
 a/arch/x86/xen/p2m.c                                           |    4 
 a/arch/x86/xen/setup.c                                         |    6 
 a/drivers/base/Makefile                                        |    2 
 a/drivers/base/arch_numa.c                                     |   96 +
 a/drivers/base/node.c                                          |    9 
 a/drivers/block/zram/zram_drv.c                                |   66 
 a/drivers/firmware/efi/memmap.c                                |    2 
 a/drivers/hwmon/occ/p9_sbe.c                                   |    1 
 a/drivers/macintosh/smu.c                                      |    2 
 a/drivers/mmc/core/mmc_test.c                                  |    1 
 a/drivers/mtd/mtdcore.c                                        |    1 
 a/drivers/of/kexec.c                                           |    4 
 a/drivers/of/of_reserved_mem.c                                 |    5 
 a/drivers/rapidio/devices/rio_mport_cdev.c                     |    9 
 a/drivers/s390/char/sclp_early.c                               |    4 
 a/drivers/usb/early/xhci-dbc.c                                 |   10 
 a/drivers/virtio/Kconfig                                       |    2 
 a/drivers/xen/swiotlb-xen.c                                    |    4 
 a/fs/d_path.c                                                  |    8 
 a/fs/exec.c                                                    |    4 
 a/fs/ocfs2/alloc.c                                             |   21 
 a/fs/ocfs2/dlm/dlmrecovery.c                                   |    1 
 a/fs/ocfs2/file.c                                              |    8 
 a/fs/ocfs2/inode.c                                             |    4 
 a/fs/ocfs2/journal.c                                           |   28 
 a/fs/ocfs2/journal.h                                           |    3 
 a/fs/ocfs2/super.c                                             |   40 
 a/fs/open.c                                                    |   16 
 a/fs/posix_acl.c                                               |    3 
 a/fs/proc/task_mmu.c                                           |   28 
 a/fs/super.c                                                   |    3 
 a/include/asm-generic/sections.h                               |   14 
 a/include/linux/backing-dev-defs.h                             |    3 
 a/include/linux/backing-dev.h                                  |    1 
 a/include/linux/cma.h                                          |    1 
 a/include/linux/compiler-gcc.h                                 |    8 
 a/include/linux/compiler_attributes.h                          |   10 
 a/include/linux/compiler_types.h                               |   12 
 a/include/linux/cpuset.h                                       |   17 
 a/include/linux/damon.h                                        |  258 +++
 a/include/linux/fs.h                                           |    1 
 a/include/linux/gfp.h                                          |    8 
 a/include/linux/highmem.h                                      |   28 
 a/include/linux/hugetlb.h                                      |   36 
 a/include/linux/io-mapping.h                                   |    6 
 a/include/linux/kasan.h                                        |    8 
 a/include/linux/kernel.h                                       |    1 
 a/include/linux/kfence.h                                       |   21 
 a/include/linux/memblock.h                                     |   48 
 a/include/linux/memcontrol.h                                   |    9 
 a/include/linux/memory.h                                       |   26 
 a/include/linux/memory_hotplug.h                               |    3 
 a/include/linux/mempolicy.h                                    |    5 
 a/include/linux/migrate.h                                      |   23 
 a/include/linux/migrate_mode.h                                 |   13 
 a/include/linux/mm.h                                           |   57 
 a/include/linux/mm_types.h                                     |    2 
 a/include/linux/mmzone.h                                       |   41 
 a/include/linux/node.h                                         |    4 
 a/include/linux/page-flags.h                                   |    2 
 a/include/linux/percpu.h                                       |    6 
 a/include/linux/sched/mm.h                                     |   25 
 a/include/linux/slab.h                                         |  181 +-
 a/include/linux/slub_def.h                                     |   13 
 a/include/linux/stackdepot.h                                   |    8 
 a/include/linux/stacktrace.h                                   |    1 
 a/include/linux/swap.h                                         |    1 
 a/include/linux/vmalloc.h                                      |   24 
 a/include/trace/events/mmap_lock.h                             |   50 
 a/include/trace/events/vmscan.h                                |   42 
 a/include/trace/events/writeback.h                             |    7 
 a/init/Kconfig                                                 |    2 
 a/init/initramfs.c                                             |    4 
 a/init/main.c                                                  |    6 
 a/kernel/cgroup/cpuset.c                                       |   23 
 a/kernel/cpu.c                                                 |    2 
 a/kernel/dma/swiotlb.c                                         |    6 
 a/kernel/exit.c                                                |    2 
 a/kernel/extable.c                                             |    2 
 a/kernel/fork.c                                                |   51 
 a/kernel/kexec_file.c                                          |    5 
 a/kernel/kthread.c                                             |   21 
 a/kernel/locking/lockdep.c                                     |   15 
 a/kernel/printk/printk.c                                       |    4 
 a/kernel/sched/core.c                                          |   37 
 a/kernel/sched/sched.h                                         |    4 
 a/kernel/sched/topology.c                                      |    1 
 a/kernel/stacktrace.c                                          |   30 
 a/kernel/tsacct.c                                              |    2 
 a/kernel/workqueue.c                                           |    2 
 a/lib/Kconfig.debug                                            |    2 
 a/lib/Kconfig.kfence                                           |   26 
 a/lib/bootconfig.c                                             |    2 
 a/lib/cpumask.c                                                |    6 
 a/lib/stackdepot.c                                             |   76 -
 a/lib/test_kasan.c                                             |   26 
 a/lib/test_kasan_module.c                                      |    2 
 a/lib/test_vmalloc.c                                           |    6 
 a/mm/Kconfig                                                   |   10 
 a/mm/backing-dev.c                                             |   65 
 a/mm/cma.c                                                     |   26 
 a/mm/compaction.c                                              |   12 
 a/mm/damon/Kconfig                                             |   24 
 a/mm/damon/Makefile                                            |    4 
 a/mm/damon/core.c                                              |  500 ++++++-
 a/mm/damon/dbgfs-test.h                                        |   56 
 a/mm/damon/dbgfs.c                                             |  486 +++++-
 a/mm/damon/paddr.c                                             |  275 +++
 a/mm/damon/prmtv-common.c                                      |  133 +
 a/mm/damon/prmtv-common.h                                      |   20 
 a/mm/damon/reclaim.c                                           |  356 ++++
 a/mm/damon/vaddr-test.h                                        |    2 
 a/mm/damon/vaddr.c                                             |  167 +-
 a/mm/debug.c                                                   |   20 
 a/mm/debug_vm_pgtable.c                                        |    7 
 a/mm/filemap.c                                                 |   78 -
 a/mm/gup.c                                                     |    5 
 a/mm/highmem.c                                                 |    6 
 a/mm/hugetlb.c                                                 |  713 +++++++++-
 a/mm/hugetlb_cgroup.c                                          |    3 
 a/mm/internal.h                                                |   26 
 a/mm/kasan/common.c                                            |    8 
 a/mm/kasan/generic.c                                           |   16 
 a/mm/kasan/kasan.h                                             |    2 
 a/mm/kasan/shadow.c                                            |    5 
 a/mm/kfence/core.c                                             |  214 ++-
 a/mm/kfence/kfence.h                                           |    2 
 a/mm/kfence/kfence_test.c                                      |   14 
 a/mm/khugepaged.c                                              |   10 
 a/mm/list_lru.c                                                |   58 
 a/mm/memblock.c                                                |   35 
 a/mm/memcontrol.c                                              |  217 +--
 a/mm/memory-failure.c                                          |  117 +
 a/mm/memory.c                                                  |  166 +-
 a/mm/memory_hotplug.c                                          |   57 
 a/mm/mempolicy.c                                               |  143 +-
 a/mm/migrate.c                                                 |   61 
 a/mm/mmap.c                                                    |    2 
 a/mm/mprotect.c                                                |    5 
 a/mm/mremap.c                                                  |   86 -
 a/mm/nommu.c                                                   |    6 
 a/mm/oom_kill.c                                                |   27 
 a/mm/page-writeback.c                                          |   13 
 a/mm/page_alloc.c                                              |  119 -
 a/mm/page_ext.c                                                |    2 
 a/mm/page_isolation.c                                          |   29 
 a/mm/percpu.c                                                  |   24 
 a/mm/readahead.c                                               |    2 
 a/mm/rmap.c                                                    |    8 
 a/mm/shmem.c                                                   |   44 
 a/mm/slab.c                                                    |   16 
 a/mm/slab_common.c                                             |    8 
 a/mm/slub.c                                                    |  117 -
 a/mm/sparse-vmemmap.c                                          |    2 
 a/mm/sparse.c                                                  |    6 
 a/mm/swap.c                                                    |   23 
 a/mm/swapfile.c                                                |    6 
 a/mm/userfaultfd.c                                             |    8 
 a/mm/vmalloc.c                                                 |  107 +
 a/mm/vmpressure.c                                              |    2 
 a/mm/vmscan.c                                                  |  194 ++
 a/mm/vmstat.c                                                  |   76 -
 a/mm/zsmalloc.c                                                |    7 
 a/net/ipv4/tcp.c                                               |    1 
 a/net/ipv4/udp.c                                               |    1 
 a/net/netfilter/ipvs/ip_vs_ctl.c                               |    1 
 a/net/openvswitch/meter.c                                      |    1 
 a/net/sctp/protocol.c                                          |    1 
 a/scripts/checkpatch.pl                                        |    3 
 a/scripts/decodecode                                           |    2 
 a/scripts/spelling.txt                                         |   18 
 a/security/Kconfig                                             |   14 
 a/tools/testing/selftests/damon/debugfs_attrs.sh               |   25 
 a/tools/testing/selftests/memory-hotplug/config                |    1 
 a/tools/testing/selftests/vm/.gitignore                        |    1 
 a/tools/testing/selftests/vm/Makefile                          |    1 
 a/tools/testing/selftests/vm/hugepage-mremap.c                 |  161 ++
 a/tools/testing/selftests/vm/ksm_tests.c                       |  154 ++
 a/tools/testing/selftests/vm/madv_populate.c                   |   15 
 a/tools/testing/selftests/vm/run_vmtests.sh                    |   11 
 a/tools/testing/selftests/vm/transhuge-stress.c                |    2 
 a/tools/testing/selftests/vm/userfaultfd.c                     |  157 +-
 a/tools/vm/page-types.c                                        |   38 
 a/tools/vm/page_owner_sort.c                                   |   94 +
 b/Documentation/admin-guide/mm/index.rst                       |    2 
 b/Documentation/vm/index.rst                                   |   26 
 260 files changed, 6448 insertions(+), 2327 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 001/262] scripts/spelling.txt: add more spellings to spelling.txt
  2021-11-05 20:34 incoming Andrew Morton
@ 2021-11-05 20:34 ` Andrew Morton
  2021-11-05 20:34 ` [patch 002/262] scripts/spelling.txt: fix "mistake" version of "synchronization" Andrew Morton
                   ` (260 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:34 UTC (permalink / raw)
  To: akpm, colin.king, linux-mm, mm-commits, torvalds

From: Colin Ian King <colin.king@canonical.com>
Subject: scripts/spelling.txt: add more spellings to spelling.txt

Some of the more common spelling mistakes and typos that I've found
while fixing up spelling mistakes in the kernel in the past few months.

Link: https://lkml.kernel.org/r/20210907072941.7033-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/spelling.txt |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- a/scripts/spelling.txt~scripts-spellingtxt-add-more-spellings-to-spellingtxt
+++ a/scripts/spelling.txt
@@ -178,6 +178,7 @@ assum||assume
 assumtpion||assumption
 asuming||assuming
 asycronous||asynchronous
+asychronous||asynchronous
 asynchnous||asynchronous
 asynchromous||asynchronous
 asymetric||asymmetric
@@ -241,6 +242,7 @@ beter||better
 betweeen||between
 bianries||binaries
 bitmast||bitmask
+bitwiedh||bitwidth
 boardcast||broadcast
 borad||board
 boundry||boundary
@@ -265,7 +267,10 @@ calucate||calculate
 calulate||calculate
 cancelation||cancellation
 cancle||cancel
+cant||can't
+cant'||can't
 canot||cannot
+cann't||can't
 capabilites||capabilities
 capabilties||capabilities
 capabilty||capability
@@ -501,6 +506,7 @@ disble||disable
 disgest||digest
 disired||desired
 dispalying||displaying
+dissable||disable
 diplay||display
 directon||direction
 direcly||directly
@@ -595,6 +601,7 @@ exceded||exceeded
 exceds||exceeds
 exceeed||exceed
 excellant||excellent
+exchnage||exchange
 execeeded||exceeded
 execeeds||exceeds
 exeed||exceed
@@ -938,6 +945,7 @@ migrateable||migratable
 milliseonds||milliseconds
 minium||minimum
 minimam||minimum
+minimun||minimum
 miniumum||minimum
 minumum||minimum
 misalinged||misaligned
@@ -956,6 +964,7 @@ mmnemonic||mnemonic
 mnay||many
 modfiy||modify
 modifer||modifier
+modul||module
 modulues||modules
 momery||memory
 memomry||memory
@@ -1154,6 +1163,7 @@ programable||programmable
 programers||programmers
 programm||program
 programms||programs
+progres||progress
 progresss||progress
 prohibitted||prohibited
 prohibitting||prohibiting
@@ -1328,6 +1338,7 @@ servive||service
 setts||sets
 settting||setting
 shapshot||snapshot
+shoft||shift
 shotdown||shutdown
 shoud||should
 shouldnt||shouldn't
@@ -1439,6 +1450,7 @@ syfs||sysfs
 symetric||symmetric
 synax||syntax
 synchonized||synchronized
+synchronization||synchronization
 synchronuously||synchronously
 syncronize||synchronize
 syncronized||synchronized
@@ -1521,6 +1533,7 @@ unexpexted||unexpected
 unfortunatelly||unfortunately
 unifiy||unify
 uniterrupted||uninterrupted
+uninterruptable||uninterruptible
 unintialized||uninitialized
 unitialized||uninitialized
 unkmown||unknown
@@ -1553,6 +1566,7 @@ unuseful||useless
 unvalid||invalid
 upate||update
 upsupported||unsupported
+useable||usable
 usefule||useful
 usefull||useful
 usege||usage
@@ -1574,6 +1588,7 @@ varient||variant
 vaule||value
 verbse||verbose
 veify||verify
+verfication||verification
 veriosn||version
 verisons||versions
 verison||version
@@ -1586,6 +1601,7 @@ visiters||visitors
 vitual||virtual
 vunerable||vulnerable
 wakeus||wakeups
+was't||wasn't
 wathdog||watchdog
 wating||waiting
 wiat||wait
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 002/262] scripts/spelling.txt: fix "mistake" version of "synchronization"
  2021-11-05 20:34 incoming Andrew Morton
  2021-11-05 20:34 ` [patch 001/262] scripts/spelling.txt: add more spellings to spelling.txt Andrew Morton
@ 2021-11-05 20:34 ` Andrew Morton
  2021-11-05 20:34 ` [patch 003/262] scripts/decodecode: fix faulting instruction no print when opps.file is DOS format Andrew Morton
                   ` (259 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:34 UTC (permalink / raw)
  To: akpm, colin.king, linux-mm, mm-commits, sven, torvalds

From: Sven Eckelmann <sven@narfation.org>
Subject: scripts/spelling.txt: fix "mistake" version of "synchronization"

If both "mistake" version and "correction" version are the same, a warning
message is created by checkpatch which is impossible to fix.  But it was
noticed that Colan Ian King created a commit e6c0a0889b80 ("ALSA: aloop:
Fix spelling mistake "synchronization" -> "synchronization"") which
suggests that this spelling mistake was fixed by replacing the word
"synchronization" with itself.  But the actual diff shows that the mistake
in the code was "sychronization".  It is rather likely that the "mistake"
in spelling.txt should have been the latter.

Link: https://lkml.kernel.org/r/20210926065529.6880-1-sven@narfation.org
Fixes: 2e74c9433ba8 ("scripts/spelling.txt: add more spellings to spelling.txt")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Reviewed-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/spelling.txt |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/scripts/spelling.txt~scripts-spellingtxt-fix-mistake-version-of-synchronization
+++ a/scripts/spelling.txt
@@ -1450,7 +1450,7 @@ syfs||sysfs
 symetric||symmetric
 synax||syntax
 synchonized||synchronized
-synchronization||synchronization
+sychronization||synchronization
 synchronuously||synchronously
 syncronize||synchronize
 syncronized||synchronized
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 003/262] scripts/decodecode: fix faulting instruction no print when opps.file is DOS format
  2021-11-05 20:34 incoming Andrew Morton
  2021-11-05 20:34 ` [patch 001/262] scripts/spelling.txt: add more spellings to spelling.txt Andrew Morton
  2021-11-05 20:34 ` [patch 002/262] scripts/spelling.txt: fix "mistake" version of "synchronization" Andrew Morton
@ 2021-11-05 20:34 ` Andrew Morton
  2021-11-05 20:34 ` [patch 004/262] ocfs2: fix handle refcount leak in two exception handling paths Andrew Morton
                   ` (258 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:34 UTC (permalink / raw)
  To: akpm, bp, linux-mm, maz, mm-commits, rabin, torvalds, weidonghui, will

From: weidonghui <weidonghui@allwinnertech.com>
Subject: scripts/decodecode: fix faulting instruction no print when opps.file is DOS format

If opps.file is in DOS format, faulting instruction cannot be printed:
/ # ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
/ # ./scripts/decodecode < oops.file
[ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)
aarch64-linux-gnu-strip: '/tmp/tmp.5Y9eybnnSi.o': No such file
aarch64-linux-gnu-objdump: '/tmp/tmp.5Y9eybnnSi.o': No such file
All code
========
   0:   d0002881        adrp    x1, 0x512000
   4:   912f9c21        add     x1, x1, #0xbe7
   8:   94067e68        bl      0x19f9a8
   c:   d2800001        mov     x1, #0x0                        // #0
  10:   b900003f        str     wzr, [x1]

Code starting with the faulting instruction
===========================================

Background: The compilation environment is Ubuntu, and the test
environment is Windows.  Most logs are generated in the Windows
environment.  In this way, CR (carriage return) will inevitably appear,
which will affect the use of decodecode in the Ubuntu environment.

The repaired effect is as follows:

/ # ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
/ # ./scripts/decodecode < oops.file
[ 0.734345] Code: d0002881 912f9c21 94067e68 d2800001 (b900003f)
All code
========
   0:   d0002881        adrp    x1, 0x512000
   4:   912f9c21        add     x1, x1, #0xbe7
   8:   94067e68        bl      0x19f9a8
   c:   d2800001        mov     x1, #0x0                        // #0
  10:*  b900003f        str     wzr, [x1]               <-- trapping instruction

Code starting with the faulting instruction
===========================================
   0:   b900003f        str     wzr, [x1]

Link: https://lkml.kernel.org/r/20211008064712.926-1-weidonghui@allwinnertech.com
Signed-off-by: weidonghui <weidonghui@allwinnertech.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Marc Zyngier <maz@misterjones.org>
Cc: Will Deacon <will@kernel.org>
Cc: Rabin Vincent <rabin@rab.in>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 scripts/decodecode |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/scripts/decodecode~scripts-decodecode-fix-faulting-instruction-no-print-when-oppsfile-is-dos-format
+++ a/scripts/decodecode
@@ -126,7 +126,7 @@ if [ $marker -ne 0 ]; then
 fi
 echo Code starting with the faulting instruction  > $T.aa
 echo =========================================== >> $T.aa
-code=`echo $code | sed -e 's/ [<(]/ /;s/[>)] / /;s/ /,0x/g; s/[>)]$//'`
+code=`echo $code | sed -e 's/\r//;s/ [<(]/ /;s/[>)] / /;s/ /,0x/g; s/[>)]$//'`
 echo -n "	.$type 0x" > $T.s
 echo $code >> $T.s
 disas $T 0
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 004/262] ocfs2: fix handle refcount leak in two exception handling paths
  2021-11-05 20:34 incoming Andrew Morton
                   ` (2 preceding siblings ...)
  2021-11-05 20:34 ` [patch 003/262] scripts/decodecode: fix faulting instruction no print when opps.file is DOS format Andrew Morton
@ 2021-11-05 20:34 ` Andrew Morton
  2021-11-05 20:34 ` [patch 005/262] ocfs2: cleanup journal init and shutdown Andrew Morton
                   ` (257 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:34 UTC (permalink / raw)
  To: akpm, cymi20, gechangwei, ghe, jlbec, joseph.qi, junxiao.bi,
	linux-mm, mark, mm-commits, piaojun, tanxin.ctf, torvalds,
	wen.gang.wang, xiyuyang19

From: Chenyuan Mi <cymi20@fudan.edu.cn>
Subject: ocfs2: fix handle refcount leak in two exception handling paths

The reference counting issue happens in two exception handling paths
of ocfs2_replay_truncate_records(). When executing these two exception
handling paths, the function forgets to decrease the refcount of handle
increased by ocfs2_start_trans(), causing a refcount leak.

Fix this issue by using ocfs2_commit_trans() to decrease the refcount
of handle in two handling paths.

Link: https://lkml.kernel.org/r/20210908102055.10168-1-cymi20@fudan.edu.cn
Signed-off-by: Chenyuan Mi <cymi20@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Wengang Wang <wen.gang.wang@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/alloc.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/fs/ocfs2/alloc.c~ocfs2-fix-handle-refcount-leak-in-two-exception-handling-paths
+++ a/fs/ocfs2/alloc.c
@@ -5940,6 +5940,7 @@ static int ocfs2_replay_truncate_records
 		status = ocfs2_journal_access_di(handle, INODE_CACHE(tl_inode), tl_bh,
 						 OCFS2_JOURNAL_ACCESS_WRITE);
 		if (status < 0) {
+			ocfs2_commit_trans(osb, handle);
 			mlog_errno(status);
 			goto bail;
 		}
@@ -5964,6 +5965,7 @@ static int ocfs2_replay_truncate_records
 						     data_alloc_bh, start_blk,
 						     num_clusters);
 			if (status < 0) {
+				ocfs2_commit_trans(osb, handle);
 				mlog_errno(status);
 				goto bail;
 			}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 005/262] ocfs2: cleanup journal init and shutdown
  2021-11-05 20:34 incoming Andrew Morton
                   ` (3 preceding siblings ...)
  2021-11-05 20:34 ` [patch 004/262] ocfs2: fix handle refcount leak in two exception handling paths Andrew Morton
@ 2021-11-05 20:34 ` Andrew Morton
  2021-11-05 20:34 ` [patch 006/262] ocfs2/dlm: remove redundant assignment of variable ret Andrew Morton
                   ` (256 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:34 UTC (permalink / raw)
  To: akpm, gechangwei, ghe, jlbec, joseph.qi, junxiao.bi, linux-mm,
	mark, mm-commits, piaojun, torvalds, vvidic

From: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Subject: ocfs2: cleanup journal init and shutdown

Allocate and free struct ocfs2_journal in ocfs2_journal_init and
ocfs2_journal_shutdown.  Init and release of system inodes references the
journal so reorder calls to make sure they work correctly.

Link: https://lkml.kernel.org/r/20211009145006.3478-1-vvidic@valentin-vidic.from.hr
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/inode.c   |    4 ++--
 fs/ocfs2/journal.c |   28 ++++++++++++++++++++++------
 fs/ocfs2/journal.h |    3 +--
 fs/ocfs2/super.c   |   40 +++-------------------------------------
 4 files changed, 28 insertions(+), 47 deletions(-)

--- a/fs/ocfs2/inode.c~ocfs2-cleanup-journal-init-and-shutdown
+++ a/fs/ocfs2/inode.c
@@ -125,7 +125,6 @@ struct inode *ocfs2_iget(struct ocfs2_su
 	struct inode *inode = NULL;
 	struct super_block *sb = osb->sb;
 	struct ocfs2_find_inode_args args;
-	journal_t *journal = OCFS2_SB(sb)->journal->j_journal;
 
 	trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
 			       sysfile_type);
@@ -172,10 +171,11 @@ struct inode *ocfs2_iget(struct ocfs2_su
 	 * part of the transaction - the inode could have been reclaimed and
 	 * now it is reread from disk.
 	 */
-	if (journal) {
+	if (osb->journal) {
 		transaction_t *transaction;
 		tid_t tid;
 		struct ocfs2_inode_info *oi = OCFS2_I(inode);
+		journal_t *journal = osb->journal->j_journal;
 
 		read_lock(&journal->j_state_lock);
 		if (journal->j_running_transaction)
--- a/fs/ocfs2/journal.c~ocfs2-cleanup-journal-init-and-shutdown
+++ a/fs/ocfs2/journal.c
@@ -810,19 +810,34 @@ void ocfs2_set_journal_params(struct ocf
 	write_unlock(&journal->j_state_lock);
 }
 
-int ocfs2_journal_init(struct ocfs2_journal *journal, int *dirty)
+int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
 {
 	int status = -1;
 	struct inode *inode = NULL; /* the journal inode */
 	journal_t *j_journal = NULL;
+	struct ocfs2_journal *journal = NULL;
 	struct ocfs2_dinode *di = NULL;
 	struct buffer_head *bh = NULL;
-	struct ocfs2_super *osb;
 	int inode_lock = 0;
 
-	BUG_ON(!journal);
-
-	osb = journal->j_osb;
+	/* initialize our journal structure */
+	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
+	if (!journal) {
+		mlog(ML_ERROR, "unable to alloc journal\n");
+		status = -ENOMEM;
+		goto done;
+	}
+	osb->journal = journal;
+	journal->j_osb = osb;
+
+	atomic_set(&journal->j_num_trans, 0);
+	init_rwsem(&journal->j_trans_barrier);
+	init_waitqueue_head(&journal->j_checkpointed);
+	spin_lock_init(&journal->j_lock);
+	journal->j_trans_id = 1UL;
+	INIT_LIST_HEAD(&journal->j_la_cleanups);
+	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
+	journal->j_state = OCFS2_JOURNAL_FREE;
 
 	/* already have the inode for our journal */
 	inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
@@ -1028,9 +1043,10 @@ void ocfs2_journal_shutdown(struct ocfs2
 
 	journal->j_state = OCFS2_JOURNAL_FREE;
 
-//	up_write(&journal->j_trans_barrier);
 done:
 	iput(inode);
+	kfree(journal);
+	osb->journal = NULL;
 }
 
 static void ocfs2_clear_journal_error(struct super_block *sb,
--- a/fs/ocfs2/journal.h~ocfs2-cleanup-journal-init-and-shutdown
+++ a/fs/ocfs2/journal.h
@@ -167,8 +167,7 @@ int ocfs2_compute_replay_slots(struct oc
  *  ocfs2_start_checkpoint - Kick the commit thread to do a checkpoint.
  */
 void   ocfs2_set_journal_params(struct ocfs2_super *osb);
-int    ocfs2_journal_init(struct ocfs2_journal *journal,
-			  int *dirty);
+int    ocfs2_journal_init(struct ocfs2_super *osb, int *dirty);
 void   ocfs2_journal_shutdown(struct ocfs2_super *osb);
 int    ocfs2_journal_wipe(struct ocfs2_journal *journal,
 			  int full);
--- a/fs/ocfs2/super.c~ocfs2-cleanup-journal-init-and-shutdown
+++ a/fs/ocfs2/super.c
@@ -1894,8 +1894,6 @@ static void ocfs2_dismount_volume(struct
 	/* This will disable recovery and flush any recovery work. */
 	ocfs2_recovery_exit(osb);
 
-	ocfs2_journal_shutdown(osb);
-
 	ocfs2_sync_blockdev(sb);
 
 	ocfs2_purge_refcount_trees(osb);
@@ -1918,6 +1916,8 @@ static void ocfs2_dismount_volume(struct
 
 	ocfs2_release_system_inodes(osb);
 
+	ocfs2_journal_shutdown(osb);
+
 	/*
 	 * If we're dismounting due to mount error, mount.ocfs2 will clean
 	 * up heartbeat.  If we're a local mount, there is no heartbeat.
@@ -2016,7 +2016,6 @@ static int ocfs2_initialize_super(struct
 	int i, cbits, bbits;
 	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
 	struct inode *inode = NULL;
-	struct ocfs2_journal *journal;
 	struct ocfs2_super *osb;
 	u64 total_blocks;
 
@@ -2197,33 +2196,6 @@ static int ocfs2_initialize_super(struct
 
 	get_random_bytes(&osb->s_next_generation, sizeof(u32));
 
-	/* FIXME
-	 * This should be done in ocfs2_journal_init(), but unknown
-	 * ordering issues will cause the filesystem to crash.
-	 * If anyone wants to figure out what part of the code
-	 * refers to osb->journal before ocfs2_journal_init() is run,
-	 * be my guest.
-	 */
-	/* initialize our journal structure */
-
-	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
-	if (!journal) {
-		mlog(ML_ERROR, "unable to alloc journal\n");
-		status = -ENOMEM;
-		goto bail;
-	}
-	osb->journal = journal;
-	journal->j_osb = osb;
-
-	atomic_set(&journal->j_num_trans, 0);
-	init_rwsem(&journal->j_trans_barrier);
-	init_waitqueue_head(&journal->j_checkpointed);
-	spin_lock_init(&journal->j_lock);
-	journal->j_trans_id = (unsigned long) 1;
-	INIT_LIST_HEAD(&journal->j_la_cleanups);
-	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
-	journal->j_state = OCFS2_JOURNAL_FREE;
-
 	INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
 	init_llist_head(&osb->dquot_drop_list);
 
@@ -2404,7 +2376,7 @@ static int ocfs2_check_volume(struct ocf
 						  * ourselves. */
 
 	/* Init our journal object. */
-	status = ocfs2_journal_init(osb->journal, &dirty);
+	status = ocfs2_journal_init(osb, &dirty);
 	if (status < 0) {
 		mlog(ML_ERROR, "Could not initialize journal!\n");
 		goto finally;
@@ -2513,12 +2485,6 @@ static void ocfs2_delete_osb(struct ocfs
 
 	kfree(osb->osb_orphan_wipes);
 	kfree(osb->slot_recovery_generations);
-	/* FIXME
-	 * This belongs in journal shutdown, but because we have to
-	 * allocate osb->journal at the start of ocfs2_initialize_osb(),
-	 * we free it here.
-	 */
-	kfree(osb->journal);
 	kfree(osb->local_alloc_copy);
 	kfree(osb->uuid_str);
 	kfree(osb->vol_label);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 006/262] ocfs2/dlm: remove redundant assignment of variable ret
  2021-11-05 20:34 incoming Andrew Morton
                   ` (4 preceding siblings ...)
  2021-11-05 20:34 ` [patch 005/262] ocfs2: cleanup journal init and shutdown Andrew Morton
@ 2021-11-05 20:34 ` Andrew Morton
  2021-11-05 20:34 ` [patch 007/262] ocfs2: fix data corruption on truncate Andrew Morton
                   ` (255 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:34 UTC (permalink / raw)
  To: akpm, colin.king, gechangwei, ghe, jlbec, joseph.qi, junxiao.bi,
	linux-mm, mark, mm-commits, piaojun, torvalds

From: Colin Ian King <colin.king@canonical.com>
Subject: ocfs2/dlm: remove redundant assignment of variable ret

The variable ret is being assigned a value that is never read, it is
updated later on with a different value.  The assignment is redundant and
can be removed.

Addresses-Coverity: ("Unused value")
Link: https://lkml.kernel.org/r/20211007233452.30815-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/dlm/dlmrecovery.c |    1 -
 1 file changed, 1 deletion(-)

--- a/fs/ocfs2/dlm/dlmrecovery.c~ocfs2-dlm-remove-redundant-assignment-of-variable-ret
+++ a/fs/ocfs2/dlm/dlmrecovery.c
@@ -2698,7 +2698,6 @@ static int dlm_send_begin_reco_message(s
 			continue;
 		}
 retry:
-		ret = -EINVAL;
 		mlog(0, "attempting to send begin reco msg to %d\n",
 			  nodenum);
 		ret = o2net_send_message(DLM_BEGIN_RECO_MSG, dlm->key,
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 007/262] ocfs2: fix data corruption on truncate
  2021-11-05 20:34 incoming Andrew Morton
                   ` (5 preceding siblings ...)
  2021-11-05 20:34 ` [patch 006/262] ocfs2/dlm: remove redundant assignment of variable ret Andrew Morton
@ 2021-11-05 20:34 ` Andrew Morton
  2021-11-05 20:34 ` [patch 008/262] ocfs2: do not zero pages beyond i_size Andrew Morton
                   ` (254 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:34 UTC (permalink / raw)
  To: akpm, gechangwei, ghe, jack, jlbec, joseph.qi, junxiao.bi,
	linux-mm, mark, mm-commits, piaojun, stable, torvalds

From: Jan Kara <jack@suse.cz>
Subject: ocfs2: fix data corruption on truncate

Patch series "ocfs2: Truncate data corruption fix".

As further testing has shown, commit 5314454ea3f ("ocfs2: fix data
corruption after conversion from inline format") didn't fix all the data
corruption issues the customer started observing after 6dbf7bb55598 ("fs:
Don't invalidate page buffers in block_write_full_page()") This time I
have tracked them down to two bugs in ocfs2 truncation code.

One bug (truncating page cache before clearing tail cluster and setting
i_size) could cause data corruption even before 6dbf7bb55598, but before
that commit it needed a race with page fault, after 6dbf7bb55598 it
started to be pretty deterministic.

Another bug (zeroing pages beyond old i_size) used to be harmless
inefficiency before commit 6dbf7bb55598.  But after commit 6dbf7bb55598 in
combination with the first bug it resulted in deterministic data
corruption.

Although fixing only the first problem is needed to stop data corruption,
I've fixed both issues to make the code more robust.


This patch (of 2):

ocfs2_truncate_file() did unmap invalidate page cache pages before zeroing
partial tail cluster and setting i_size.  Thus some pages could be left
(and likely have left if the cluster zeroing happened) in the page cache
beyond i_size after truncate finished letting user possibly see stale data
once the file was extended again.  Also the tail cluster zeroing was not
guaranteed to finish before truncate finished causing possible stale data
exposure.  The problem started to be particularly easy to hit after commit
6dbf7bb55598 "fs: Don't invalidate page buffers in
block_write_full_page()" stopped invalidation of pages beyond i_size from
page writeback path.

Fix these problems by unmapping and invalidating pages in the page cache
after the i_size is reduced and tail cluster is zeroed out.

Link: https://lkml.kernel.org/r/20211025150008.29002-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20211025151332.11301-1-jack@suse.cz
Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/file.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/fs/ocfs2/file.c~ocfs2-fix-data-corruption-on-truncate
+++ a/fs/ocfs2/file.c
@@ -476,10 +476,11 @@ int ocfs2_truncate_file(struct inode *in
 	 * greater than page size, so we have to truncate them
 	 * anyway.
 	 */
-	unmap_mapping_range(inode->i_mapping, new_i_size + PAGE_SIZE - 1, 0, 1);
-	truncate_inode_pages(inode->i_mapping, new_i_size);
 
 	if (OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL) {
+		unmap_mapping_range(inode->i_mapping,
+				    new_i_size + PAGE_SIZE - 1, 0, 1);
+		truncate_inode_pages(inode->i_mapping, new_i_size);
 		status = ocfs2_truncate_inline(inode, di_bh, new_i_size,
 					       i_size_read(inode), 1);
 		if (status)
@@ -498,6 +499,9 @@ int ocfs2_truncate_file(struct inode *in
 		goto bail_unlock_sem;
 	}
 
+	unmap_mapping_range(inode->i_mapping, new_i_size + PAGE_SIZE - 1, 0, 1);
+	truncate_inode_pages(inode->i_mapping, new_i_size);
+
 	status = ocfs2_commit_truncate(osb, inode, di_bh);
 	if (status < 0) {
 		mlog_errno(status);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 008/262] ocfs2: do not zero pages beyond i_size
  2021-11-05 20:34 incoming Andrew Morton
                   ` (6 preceding siblings ...)
  2021-11-05 20:34 ` [patch 007/262] ocfs2: fix data corruption on truncate Andrew Morton
@ 2021-11-05 20:34 ` Andrew Morton
  2021-11-05 20:35 ` [patch 009/262] fs/posix_acl.c: avoid -Wempty-body warning Andrew Morton
                   ` (253 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:34 UTC (permalink / raw)
  To: akpm, gechangwei, ghe, jack, jlbec, joseph.qi, junxiao.bi,
	linux-mm, mark, mm-commits, piaojun, torvalds

From: Jan Kara <jack@suse.cz>
Subject: ocfs2: do not zero pages beyond i_size

ocfs2_zero_range_for_truncate() can try to zero pages beyond current inode
size despite the fact that underlying blocks should be already zeroed out
and writeback will skip writing such pages anyway.  Avoid the pointless
work.

Link: https://lkml.kernel.org/r/20211025151332.11301-2-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/alloc.c |   19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

--- a/fs/ocfs2/alloc.c~ocfs2-do-not-zero-pages-beyond-i_size
+++ a/fs/ocfs2/alloc.c
@@ -6923,13 +6923,12 @@ static int ocfs2_grab_eof_pages(struct i
 }
 
 /*
- * Zero the area past i_size but still within an allocated
- * cluster. This avoids exposing nonzero data on subsequent file
- * extends.
+ * Zero partial cluster for a hole punch or truncate. This avoids exposing
+ * nonzero data on subsequent file extends.
  *
  * We need to call this before i_size is updated on the inode because
  * otherwise block_write_full_page() will skip writeout of pages past
- * i_size. The new_i_size parameter is passed for this reason.
+ * i_size.
  */
 int ocfs2_zero_range_for_truncate(struct inode *inode, handle_t *handle,
 				  u64 range_start, u64 range_end)
@@ -6947,6 +6946,15 @@ int ocfs2_zero_range_for_truncate(struct
 	if (!ocfs2_sparse_alloc(OCFS2_SB(sb)))
 		return 0;
 
+	/*
+	 * Avoid zeroing pages fully beyond current i_size. It is pointless as
+	 * underlying blocks of those pages should be already zeroed out and
+	 * page writeback will skip them anyway.
+	 */
+	range_end = min_t(u64, range_end, i_size_read(inode));
+	if (range_start >= range_end)
+		return 0;
+
 	pages = kcalloc(ocfs2_pages_per_cluster(sb),
 			sizeof(struct page *), GFP_NOFS);
 	if (pages == NULL) {
@@ -6955,9 +6963,6 @@ int ocfs2_zero_range_for_truncate(struct
 		goto out;
 	}
 
-	if (range_start == range_end)
-		goto out;
-
 	ret = ocfs2_extent_map_get_blocks(inode,
 					  range_start >> sb->s_blocksize_bits,
 					  &phys, NULL, &ext_flags);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 009/262] fs/posix_acl.c: avoid -Wempty-body warning
  2021-11-05 20:34 incoming Andrew Morton
                   ` (7 preceding siblings ...)
  2021-11-05 20:34 ` [patch 008/262] ocfs2: do not zero pages beyond i_size Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 010/262] d_path: fix Kernel doc validator complaining Andrew Morton
                   ` (252 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, arnd, christian.brauner, jamorris, linux-mm, mm-commits,
	mszeredi, serge, torvalds, viro

From: Arnd Bergmann <arnd@arndb.de>
Subject: fs/posix_acl.c: avoid -Wempty-body warning

The fallthrough comment for an ignored cmpxchg() return value produces a
harmless warning with 'make W=1':

fs/posix_acl.c: In function 'get_acl':
fs/posix_acl.c:127:36: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  127 |                 /* fall through */ ;
      |                                    ^

Simplify it as a step towards a clean W=1 build.  As all architectures
define cmpxchg() as a statement expression these days, it is no longer
necessary to evaluate its return code, and the if() can just be droped.

Link: https://lkml.kernel.org/r/20210927102410.1863853-1-arnd@kernel.org
Link: https://lore.kernel.org/all/20210322132103.qiun2rjilnlgztxe@wittgenstein/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: James Morris <jamorris@linux.microsoft.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/posix_acl.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/posix_acl.c~posix-acl-avoid-wempty-body-warning
+++ a/fs/posix_acl.c
@@ -134,8 +134,7 @@ struct posix_acl *get_acl(struct inode *
 	 * to just call ->get_acl to fetch the ACL ourself.  (This is going to
 	 * be an unlikely race.)
 	 */
-	if (cmpxchg(p, ACL_NOT_CACHED, sentinel) != ACL_NOT_CACHED)
-		/* fall through */ ;
+	cmpxchg(p, ACL_NOT_CACHED, sentinel);
 
 	/*
 	 * Normally, the ACL returned by ->get_acl will be cached.
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 010/262] d_path: fix Kernel doc validator complaining
  2021-11-05 20:34 incoming Andrew Morton
                   ` (8 preceding siblings ...)
  2021-11-05 20:35 ` [patch 009/262] fs/posix_acl.c: avoid -Wempty-body warning Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 011/262] mm: move kvmalloc-related functions to slab.h Andrew Morton
                   ` (251 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, andriy.shevchenko, justin.he, linux-mm, mm-commits,
	rdunlap, torvalds, viro

From: Jia He <justin.he@arm.com>
Subject: d_path: fix Kernel doc validator complaining

Kernel doc validator complains:
  Function parameter or member 'p' not described in 'prepend_name'
  Excess function parameter 'buffer' description in 'prepend_name'

Link: https://lkml.kernel.org/r/20211011005614.26189-1-justin.he@arm.com
Fixes: ad08ae586586 ("d_path: introduce struct prepend_buffer")
Signed-off-by: Jia He <justin.he@arm.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/d_path.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

--- a/fs/d_path.c~d_path-fix-kernel-doc-validator-complaining
+++ a/fs/d_path.c
@@ -77,9 +77,8 @@ static bool prepend(struct prepend_buffe
 
 /**
  * prepend_name - prepend a pathname in front of current buffer pointer
- * @buffer: buffer pointer
- * @buflen: allocated length of the buffer
- * @name:   name string and length qstr structure
+ * @p: prepend buffer which contains buffer pointer and allocated length
+ * @name: name string and length qstr structure
  *
  * With RCU path tracing, it may race with d_move(). Use READ_ONCE() to
  * make sure that either the old or the new name pointer and length are
@@ -141,8 +140,7 @@ static int __prepend_path(const struct d
  * prepend_path - Prepend path string to a buffer
  * @path: the dentry/vfsmount to report
  * @root: root vfsmnt/dentry
- * @buffer: pointer to the end of the buffer
- * @buflen: pointer to buffer length
+ * @p: prepend buffer which contains buffer pointer and allocated length
  *
  * The function will first try to write out the pathname without taking any
  * lock other than the RCU read lock to make sure that dentries won't go away.
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 011/262] mm: move kvmalloc-related functions to slab.h
  2021-11-05 20:34 incoming Andrew Morton
                   ` (9 preceding siblings ...)
  2021-11-05 20:35 ` [patch 010/262] d_path: fix Kernel doc validator complaining Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 012/262] mm/slab.c: remove useless lines in enable_cpucache() Andrew Morton
                   ` (250 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, cl, iamjoonsoo.kim, linux-mm, mm-commits, penberg,
	rientjes, torvalds, vbabka, willy

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm: move kvmalloc-related functions to slab.h

Not all files in the kernel should include mm.h.  Migrating callers from
kmalloc to kvmalloc is easier if the kvmalloc functions are in slab.h.

[akpm@linux-foundation.org: move the new kvrealloc() also]
[akpm@linux-foundation.org: drivers/hwmon/occ/p9_sbe.c needs slab.h]
Link: https://lkml.kernel.org/r/20210622215757.3525604-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/hwmon/occ/p9_sbe.c |    1 +
 drivers/of/kexec.c         |    1 +
 include/linux/mm.h         |   34 ----------------------------------
 include/linux/slab.h       |   34 ++++++++++++++++++++++++++++++++++
 4 files changed, 36 insertions(+), 34 deletions(-)

--- a/drivers/hwmon/occ/p9_sbe.c~mm-move-kvmalloc-related-functions-to-slabh
+++ a/drivers/hwmon/occ/p9_sbe.c
@@ -3,6 +3,7 @@
 
 #include <linux/device.h>
 #include <linux/errno.h>
+#include <linux/slab.h>
 #include <linux/fsi-occ.h>
 #include <linux/module.h>
 #include <linux/platform_device.h>
--- a/drivers/of/kexec.c~mm-move-kvmalloc-related-functions-to-slabh
+++ a/drivers/of/kexec.c
@@ -16,6 +16,7 @@
 #include <linux/of.h>
 #include <linux/of_fdt.h>
 #include <linux/random.h>
+#include <linux/slab.h>
 #include <linux/types.h>
 
 #define RNG_SEED_SIZE		128
--- a/include/linux/mm.h~mm-move-kvmalloc-related-functions-to-slabh
+++ a/include/linux/mm.h
@@ -799,40 +799,6 @@ static inline int is_vmalloc_or_module_a
 }
 #endif
 
-extern void *kvmalloc_node(size_t size, gfp_t flags, int node);
-static inline void *kvmalloc(size_t size, gfp_t flags)
-{
-	return kvmalloc_node(size, flags, NUMA_NO_NODE);
-}
-static inline void *kvzalloc_node(size_t size, gfp_t flags, int node)
-{
-	return kvmalloc_node(size, flags | __GFP_ZERO, node);
-}
-static inline void *kvzalloc(size_t size, gfp_t flags)
-{
-	return kvmalloc(size, flags | __GFP_ZERO);
-}
-
-static inline void *kvmalloc_array(size_t n, size_t size, gfp_t flags)
-{
-	size_t bytes;
-
-	if (unlikely(check_mul_overflow(n, size, &bytes)))
-		return NULL;
-
-	return kvmalloc(bytes, flags);
-}
-
-static inline void *kvcalloc(size_t n, size_t size, gfp_t flags)
-{
-	return kvmalloc_array(n, size, flags | __GFP_ZERO);
-}
-
-extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize,
-		gfp_t flags);
-extern void kvfree(const void *addr);
-extern void kvfree_sensitive(const void *addr, size_t len);
-
 static inline int head_compound_mapcount(struct page *head)
 {
 	return atomic_read(compound_mapcount_ptr(head)) + 1;
--- a/include/linux/slab.h~mm-move-kvmalloc-related-functions-to-slabh
+++ a/include/linux/slab.h
@@ -732,6 +732,40 @@ static inline void *kzalloc_node(size_t
 	return kmalloc_node(size, flags | __GFP_ZERO, node);
 }
 
+extern void *kvmalloc_node(size_t size, gfp_t flags, int node);
+static inline void *kvmalloc(size_t size, gfp_t flags)
+{
+	return kvmalloc_node(size, flags, NUMA_NO_NODE);
+}
+static inline void *kvzalloc_node(size_t size, gfp_t flags, int node)
+{
+	return kvmalloc_node(size, flags | __GFP_ZERO, node);
+}
+static inline void *kvzalloc(size_t size, gfp_t flags)
+{
+	return kvmalloc(size, flags | __GFP_ZERO);
+}
+
+static inline void *kvmalloc_array(size_t n, size_t size, gfp_t flags)
+{
+	size_t bytes;
+
+	if (unlikely(check_mul_overflow(n, size, &bytes)))
+		return NULL;
+
+	return kvmalloc(bytes, flags);
+}
+
+static inline void *kvcalloc(size_t n, size_t size, gfp_t flags)
+{
+	return kvmalloc_array(n, size, flags | __GFP_ZERO);
+}
+
+extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize,
+		gfp_t flags);
+extern void kvfree(const void *addr);
+extern void kvfree_sensitive(const void *addr, size_t len);
+
 unsigned int kmem_cache_size(struct kmem_cache *s);
 void __init kmem_cache_init_late(void);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 012/262] mm/slab.c: remove useless lines in enable_cpucache()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (10 preceding siblings ...)
  2021-11-05 20:35 ` [patch 011/262] mm: move kvmalloc-related functions to slab.h Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 013/262] slub: add back check for free nonslab objects Andrew Morton
                   ` (249 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, cl, iamjoonsoo.kim, linux-mm, mm-commits, penberg,
	rientjes, shi_lei, torvalds, vbabka

From: Shi Lei <shi_lei@massclouds.com>
Subject: mm/slab.c: remove useless lines in enable_cpucache()

These lines are useless, so remove them.

Link: https://lkml.kernel.org/r/20210930034845.2539-1-shi_lei@massclouds.com
Fixes: 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations")
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/mm/slab.c~mm-remove-useless-lines-in-enable_cpucache
+++ a/mm/slab.c
@@ -3900,8 +3900,6 @@ static int enable_cpucache(struct kmem_c
 	if (err)
 		goto end;
 
-	if (limit && shared && batchcount)
-		goto skip_setup;
 	/*
 	 * The head array serves three purposes:
 	 * - create a LIFO ordering, i.e. return objects that are cache-warm
@@ -3944,7 +3942,6 @@ static int enable_cpucache(struct kmem_c
 		limit = 32;
 #endif
 	batchcount = (limit + 1) / 2;
-skip_setup:
 	err = do_tune_cpucache(cachep, limit, batchcount, shared, gfp);
 end:
 	if (err)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 013/262] slub: add back check for free nonslab objects
  2021-11-05 20:34 incoming Andrew Morton
                   ` (11 preceding siblings ...)
  2021-11-05 20:35 ` [patch 012/262] mm/slab.c: remove useless lines in enable_cpucache() Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 014/262] mm, slub: change percpu partial accounting from objects to pages Andrew Morton
                   ` (248 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, cl, iamjoonsoo.kim, linux-mm, mm-commits, penberg,
	rientjes, shakeelb, torvalds, vbabka, wangkefeng.wang, willy

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: slub: add back check for free nonslab objects

After commit ("f227f0faf63b slub: fix unreclaimable slab stat for bulk
free"), the check for free nonslab page is replaced by VM_BUG_ON_PAGE,
which only check with CONFIG_DEBUG_VM enabled, but this config may impact
performance, so it only for debug.

Commit ("0937502af7c9 slub: Add check for kfree() of non slab objects.")
add the ability, which should be needed in any configs to catch the
invalid free, they even could be potential issue, eg, memory corruption,
use after free and double free, so replace VM_BUG_ON_PAGE to WARN_ON_ONCE,
add object address printing to help use to debug the issue.

Link: https://lkml.kernel.org/r/20210930070214.61499-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rienjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/slub.c~slub-add-back-check-for-free-nonslab-objects
+++ a/mm/slub.c
@@ -3522,7 +3522,9 @@ static inline void free_nonslab_page(str
 {
 	unsigned int order = compound_order(page);
 
-	VM_BUG_ON_PAGE(!PageCompound(page), page);
+	if (WARN_ON_ONCE(!PageCompound(page)))
+		pr_warn_once("object pointer: 0x%p\n", object);
+
 	kfree_hook(object);
 	mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B, -(PAGE_SIZE << order));
 	__free_pages(page, order);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 014/262] mm, slub: change percpu partial accounting from objects to pages
  2021-11-05 20:34 incoming Andrew Morton
                   ` (12 preceding siblings ...)
  2021-11-05 20:35 ` [patch 013/262] slub: add back check for free nonslab objects Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 015/262] mm/slub: increase default cpu partial list sizes Andrew Morton
                   ` (247 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, cl, guro, iamjoonsoo.kim, jannh, linux-mm, mm-commits,
	penberg, rientjes, torvalds, vbabka

From: Vlastimil Babka <vbabka@suse.cz>
Subject: mm, slub: change percpu partial accounting from objects to pages

With CONFIG_SLUB_CPU_PARTIAL enabled, SLUB keeps a percpu list of partial
slabs that can be promoted to cpu slab when the previous one is depleted,
without accessing the shared partial list.  A slab can be added to this
list by 1) refill of an empty list from get_partial_node() - once we
really have to access the shared partial list, we acquire multiple slabs
to amortize the cost of locking, and 2) first free to a previously full
slab - instead of putting the slab on a shared partial list, we can more
cheaply freeze it and put it on the per-cpu list.

To control how large a percpu partial list can grow for a kmem cache,
set_cpu_partial() calculates a target number of free objects on each cpu's
percpu partial list, and this can be also set by the sysfs file
cpu_partial.

However, the tracking of actual number of objects is imprecise, in order
to limit overhead from cpu X freeing an objects to a slab on percpu
partial list of cpu Y.  Basically, the percpu partial slabs form a single
linked list, and when we add a new slab to the list with current head
"oldpage", we set in the struct page of the slab we're adding:

page->pages = oldpage->pages + 1; // this is precise
page->pobjects = oldpage->pobjects + (page->objects - page->inuse);
page->next = oldpage;

Thus the real number of free objects in the slab (objects - inuse) is only
determined at the moment of adding the slab to the percpu partial list,
and further freeing doesn't update the pobjects counter nor propagate it
to the current list head.  As Jann reports [1], this can easily lead to
large inaccuracies, where the target number of objects (up to 30 by
default) can translate to the same number of (empty) slab pages on the
list.  In case 2) above, we put a slab with 1 free object on the list,
thus only increase page->pobjects by 1, even if there are subsequent frees
on the same slab.  Jann has noticed this in practice and so did we [2]
when investigating significant increase of kmemcg usage after switching
from SLAB to SLUB.

While this is no longer a problem in kmemcg context thanks to the
accounting rewrite in 5.9, the memory waste is still not ideal and it's
questionable whether it makes sense to perform free object count based
control when object counts can easily become so much inaccurate.  So this
patch converts the accounting to be based on number of pages only (which
is precise) and removes the page->pobjects field completely.  This is also
ultimately simpler.

To retain the existing set_cpu_partial() heuristic, first calculate the
target number of objects as previously, but then convert it to target
number of pages by assuming the pages will be half-filled on average. 
This assumption might obviously also be inaccurate in practice, but cannot
degrade to actual number of pages being equal to the target number of
objects.

We could also skip the intermediate step with target number of objects and
rewrite the heuristic in terms of pages.  However we still have the sysfs
file cpu_partial which uses number of objects and could break existing
users if it suddenly becomes number of pages, so this patch doesn't do
that.

In practice, after this patch the heuristics limit the size of percpu
partial list up to 2 pages.  In case of a reported regression (which would
mean some workload has benefited from the previous imprecise object based
counting), we can tune the heuristics to get a better compromise within
the new scheme, while still avoid the unexpectedly long percpu partial
lists.

[1] https://lore.kernel.org/linux-mm/CAG48ez2Qx5K1Cab-m8BdSibp6wLTip6ro4=-umR7BLsEgjEYzA@mail.gmail.com/
[2] https://lore.kernel.org/all/2f0f46e8-2535-410a-1859-e9cfa4e57c18@suse.cz/

==========
Evaluation
==========

Mel was kind enough to run v1 through mmtests machinery for netperf
(localhost) and hackbench and, for most significant results see below.  So
there are some apparent regressions, especially with hackbench, which I
think ultimately boils down to having shorter percpu partial lists on
average and some benchmarks benefiting from longer ones.  Monitoring slab
usage also indicated less memory usage by slab.  Based on that, the
following patch will bump the defaults to allow longer percpu partial
lists than after this patch.

However the goal is certainly not such that we would limit the percpu
partial lists to 30 pages just because previously a specific alloc/free
pattern could lead to the limit of 30 objects translate to a limit to 30
pages - that would make little sense.  This is a correctness patch, and if
a workload benefits from larger lists, the sysfs tuning knobs are still
there to allow that.

Netperf

2-socket Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz (20 cores, 40 threads
per socket), 384GB RAM
TCP-RR:
hmean before 127045.79 after 121092.94 (-4.69%, worse)
stddev before  2634.37 after   1254.08
UDP-RR:
hmean before 166985.45 after 160668.94 ( -3.78%, worse)
stddev before 4059.69 after 1943.63

2-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz (20 cores, 40 threads
per socket), 512GB RAM
TCP-RR:
hmean before 84173.25 after 76914.72 ( -8.62%, worse)
UDP-RR:
hmean before 93571.12 after 96428.69 ( 3.05%, better)
stddev before 23118.54 after 16828.14

2-socket Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz (12 cores, 24 threads
per socket), 64GB RAM
TCP-RR:
hmean before 49984.92 after 48922.27 ( -2.13%, worse)
stddev before 6248.15 after 4740.51
UDP-RR:
hmean before 61854.31 after 68761.81 ( 11.17%, better)
stddev before 4093.54 after 5898.91

other machines - within 2%

Hackbench

(results before and after the patch, negative % means worse)

2-socket AMD EPYC 7713 (64 cores, 128 threads per core), 256GB RAM
hackbench-process-sockets
Amean 	1 	0.5380	0.5583	( -3.78%)
Amean 	4 	0.7510	0.8150	( -8.52%)
Amean 	7 	0.7930	0.9533	( -20.22%)
Amean 	12 	0.7853	1.1313	( -44.06%)
Amean 	21 	1.1520	1.4993	( -30.15%)
Amean 	30 	1.6223	1.9237	( -18.57%)
Amean 	48 	2.6767	2.9903	( -11.72%)
Amean 	79 	4.0257	5.1150	( -27.06%)
Amean 	110	5.5193	7.4720	( -35.38%)
Amean 	141	7.2207	9.9840	( -38.27%)
Amean 	172	8.4770	12.1963	( -43.88%)
Amean 	203	9.6473	14.3137	( -48.37%)
Amean 	234	11.3960	18.7917	( -64.90%)
Amean 	265	13.9627	22.4607	( -60.86%)
Amean 	296	14.9163	26.0483	( -74.63%)

hackbench-thread-sockets
Amean 	1 	0.5597	0.5877	( -5.00%)
Amean 	4 	0.7913	0.8960	( -13.23%)
Amean 	7 	0.8190	1.0017	( -22.30%)
Amean 	12 	0.9560	1.1727	( -22.66%)
Amean 	21 	1.7587	1.5660	( 10.96%)
Amean 	30 	2.4477	1.9807	( 19.08%)
Amean 	48 	3.4573	3.0630	( 11.41%)
Amean 	79 	4.7903	5.1733	( -8.00%)
Amean 	110	6.1370	7.4220	( -20.94%)
Amean 	141	7.5777	9.2617	( -22.22%)
Amean 	172	9.2280	11.0907	( -20.18%)
Amean 	203	10.2793	13.3470	( -29.84%)
Amean 	234	11.2410	17.1070	( -52.18%)
Amean 	265	12.5970	23.3323	( -85.22%)
Amean 	296	17.1540	24.2857	( -41.57%)

2-socket Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz (20 cores, 40 threads
per socket), 384GB RAM
hackbench-process-sockets
Amean 	1 	0.5760	0.4793	( 16.78%)
Amean 	4 	0.9430	0.9707	( -2.93%)
Amean 	7 	1.5517	1.8843	( -21.44%)
Amean 	12 	2.4903	2.7267	( -9.49%)
Amean 	21 	3.9560	4.2877	( -8.38%)
Amean 	30 	5.4613	5.8343	( -6.83%)
Amean 	48 	8.5337	9.2937	( -8.91%)
Amean 	79 	14.0670	15.2630	( -8.50%)
Amean 	110	19.2253	21.2467	( -10.51%)
Amean 	141	23.7557	25.8550	( -8.84%)
Amean 	172	28.4407	29.7603	( -4.64%)
Amean 	203	33.3407	33.9927	( -1.96%)
Amean 	234	38.3633	39.1150	( -1.96%)
Amean 	265	43.4420	43.8470	( -0.93%)
Amean 	296	48.3680	48.9300	( -1.16%)

hackbench-thread-sockets
Amean 	1 	0.6080	0.6493	( -6.80%)
Amean 	4 	1.0000	1.0513	( -5.13%)
Amean 	7 	1.6607	2.0260	( -22.00%)
Amean 	12 	2.7637	2.9273	( -5.92%)
Amean 	21 	5.0613	4.5153	( 10.79%)
Amean 	30 	6.3340	6.1140	( 3.47%)
Amean 	48 	9.0567	9.5577	( -5.53%)
Amean 	79 	14.5657	15.7983	( -8.46%)
Amean 	110	19.6213	21.6333	( -10.25%)
Amean 	141	24.1563	26.2697	( -8.75%)
Amean 	172	28.9687	30.2187	( -4.32%)
Amean 	203	33.9763	34.6970	( -2.12%)
Amean 	234	38.8647	39.3207	( -1.17%)
Amean 	265	44.0813	44.1507	( -0.16%)
Amean 	296	49.2040	49.4330	( -0.47%)

2-socket Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz (20 cores, 40 threads
per socket), 512GB RAM
hackbench-process-sockets
Amean 	1 	0.5027	0.5017	( 0.20%)
Amean 	4 	1.1053	1.2033	( -8.87%)
Amean 	7 	1.8760	2.1820	( -16.31%)
Amean 	12 	2.9053	3.1810	( -9.49%)
Amean 	21 	4.6777	4.9920	( -6.72%)
Amean 	30 	6.5180	6.7827	( -4.06%)
Amean 	48 	10.0710	10.5227	( -4.48%)
Amean 	79 	16.4250	17.5053	( -6.58%)
Amean 	110	22.6203	24.4617	( -8.14%)
Amean 	141	28.0967	31.0363	( -10.46%)
Amean 	172	34.4030	36.9233	( -7.33%)
Amean 	203	40.5933	43.0850	( -6.14%)
Amean 	234	46.6477	48.7220	( -4.45%)
Amean 	265	53.0530	53.9597	( -1.71%)
Amean 	296	59.2760	59.9213	( -1.09%)

hackbench-thread-sockets
Amean 	1 	0.5363	0.5330	( 0.62%)
Amean 	4 	1.1647	1.2157	( -4.38%)
Amean 	7 	1.9237	2.2833	( -18.70%)
Amean 	12 	2.9943	3.3110	( -10.58%)
Amean 	21 	4.9987	5.1880	( -3.79%)
Amean 	30 	6.7583	7.0043	( -3.64%)
Amean 	48 	10.4547	10.8353	( -3.64%)
Amean 	79 	16.6707	17.6790	( -6.05%)
Amean 	110	22.8207	24.4403	( -7.10%)
Amean 	141	28.7090	31.0533	( -8.17%)
Amean 	172	34.9387	36.8260	( -5.40%)
Amean 	203	41.1567	43.0450	( -4.59%)
Amean 	234	47.3790	48.5307	( -2.43%)
Amean 	265	53.9543	54.6987	( -1.38%)
Amean 	296	60.0820	60.2163	( -0.22%)

1-socket Intel(R) Xeon(R) CPU E3-1240 v5 @ 3.50GHz (4 cores, 8 threads),
32 GB RAM
hackbench-process-sockets
Amean 	1 	1.4760	1.5773	( -6.87%)
Amean 	3 	3.9370	4.0910	( -3.91%)
Amean 	5 	6.6797	6.9357	( -3.83%)
Amean 	7 	9.3367	9.7150	( -4.05%)
Amean 	12	15.7627	16.1400	( -2.39%)
Amean 	18	23.5360	23.6890	( -0.65%)
Amean 	24	31.0663	31.3137	( -0.80%)
Amean 	30	38.7283	39.0037	( -0.71%)
Amean 	32	41.3417	41.6097	( -0.65%)

hackbench-thread-sockets
Amean 	1 	1.5250	1.6043	( -5.20%)
Amean 	3 	4.0897	4.2603	( -4.17%)
Amean 	5 	6.7760	7.0933	( -4.68%)
Amean 	7 	9.4817	9.9157	( -4.58%)
Amean 	12	15.9610	16.3937	( -2.71%)
Amean 	18	23.9543	24.3417	( -1.62%)
Amean 	24	31.4400	31.7217	( -0.90%)
Amean 	30	39.2457	39.5467	( -0.77%)
Amean 	32	41.8267	42.1230	( -0.71%)

2-socket Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz (12 cores, 24 threads
per socket), 64GB RAM
hackbench-process-sockets
Amean 	1 	1.0347	1.0880	( -5.15%)
Amean 	4 	1.7267	1.8527	( -7.30%)
Amean 	7 	2.6707	2.8110	( -5.25%)
Amean 	12 	4.1617	4.3383	( -4.25%)
Amean 	21 	7.0070	7.2600	( -3.61%)
Amean 	30 	9.9187	10.2397	( -3.24%)
Amean 	48 	15.6710	16.3923	( -4.60%)
Amean 	79 	24.7743	26.1247	( -5.45%)
Amean 	110	34.3000	35.9307	( -4.75%)
Amean 	141	44.2043	44.8010	( -1.35%)
Amean 	172	54.2430	54.7260	( -0.89%)
Amean 	192	60.6557	60.9777	( -0.53%)

hackbench-thread-sockets
Amean 	1 	1.0610	1.1353	( -7.01%)
Amean 	4 	1.7543	1.9140	( -9.10%)
Amean 	7 	2.7840	2.9573	( -6.23%)
Amean 	12 	4.3813	4.4937	( -2.56%)
Amean 	21 	7.3460	7.5350	( -2.57%)
Amean 	30 	10.2313	10.5190	( -2.81%)
Amean 	48 	15.9700	16.5940	( -3.91%)
Amean 	79 	25.3973	26.6637	( -4.99%)
Amean 	110	35.1087	36.4797	( -3.91%)
Amean 	141	45.8220	46.3053	( -1.05%)
Amean 	172	55.4917	55.7320	( -0.43%)
Amean 	192	62.7490	62.5410	( 0.33%)

Link: https://lkml.kernel.org/r/20211012134651.11258-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Jann Horn <jannh@google.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm_types.h |    2 
 include/linux/slub_def.h |   13 -----
 mm/slub.c                |   89 ++++++++++++++++++++++++-------------
 3 files changed, 61 insertions(+), 43 deletions(-)

--- a/include/linux/mm_types.h~mm-slub-change-percpu-partial-accounting-from-objects-to-pages
+++ a/include/linux/mm_types.h
@@ -124,10 +124,8 @@ struct page {
 					struct page *next;
 #ifdef CONFIG_64BIT
 					int pages;	/* Nr of pages left */
-					int pobjects;	/* Approximate count */
 #else
 					short int pages;
-					short int pobjects;
 #endif
 				};
 			};
--- a/include/linux/slub_def.h~mm-slub-change-percpu-partial-accounting-from-objects-to-pages
+++ a/include/linux/slub_def.h
@@ -99,6 +99,8 @@ struct kmem_cache {
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 	/* Number of per cpu partial objects to keep around */
 	unsigned int cpu_partial;
+	/* Number of per cpu partial pages to keep around */
+	unsigned int cpu_partial_pages;
 #endif
 	struct kmem_cache_order_objects oo;
 
@@ -141,17 +143,6 @@ struct kmem_cache {
 	struct kmem_cache_node *node[MAX_NUMNODES];
 };
 
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-#define slub_cpu_partial(s)		((s)->cpu_partial)
-#define slub_set_cpu_partial(s, n)		\
-({						\
-	slub_cpu_partial(s) = (n);		\
-})
-#else
-#define slub_cpu_partial(s)		(0)
-#define slub_set_cpu_partial(s, n)
-#endif /* CONFIG_SLUB_CPU_PARTIAL */
-
 #ifdef CONFIG_SYSFS
 #define SLAB_SUPPORTS_SYSFS
 void sysfs_slab_unlink(struct kmem_cache *);
--- a/mm/slub.c~mm-slub-change-percpu-partial-accounting-from-objects-to-pages
+++ a/mm/slub.c
@@ -414,6 +414,29 @@ static inline unsigned int oo_objects(st
 	return x.x & OO_MASK;
 }
 
+#ifdef CONFIG_SLUB_CPU_PARTIAL
+static void slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects)
+{
+	unsigned int nr_pages;
+
+	s->cpu_partial = nr_objects;
+
+	/*
+	 * We take the number of objects but actually limit the number of
+	 * pages on the per cpu partial list, in order to limit excessive
+	 * growth of the list. For simplicity we assume that the pages will
+	 * be half-full.
+	 */
+	nr_pages = DIV_ROUND_UP(nr_objects * 2, oo_objects(s->oo));
+	s->cpu_partial_pages = nr_pages;
+}
+#else
+static inline void
+slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects)
+{
+}
+#endif /* CONFIG_SLUB_CPU_PARTIAL */
+
 /*
  * Per slab locking using the pagelock
  */
@@ -2052,7 +2075,7 @@ static inline void remove_partial(struct
  */
 static inline void *acquire_slab(struct kmem_cache *s,
 		struct kmem_cache_node *n, struct page *page,
-		int mode, int *objects)
+		int mode)
 {
 	void *freelist;
 	unsigned long counters;
@@ -2068,7 +2091,6 @@ static inline void *acquire_slab(struct
 	freelist = page->freelist;
 	counters = page->counters;
 	new.counters = counters;
-	*objects = new.objects - new.inuse;
 	if (mode) {
 		new.inuse = page->objects;
 		new.freelist = NULL;
@@ -2106,9 +2128,8 @@ static void *get_partial_node(struct kme
 {
 	struct page *page, *page2;
 	void *object = NULL;
-	unsigned int available = 0;
 	unsigned long flags;
-	int objects;
+	unsigned int partial_pages = 0;
 
 	/*
 	 * Racy check. If we mistakenly see no partial slabs then we
@@ -2126,11 +2147,10 @@ static void *get_partial_node(struct kme
 		if (!pfmemalloc_match(page, gfpflags))
 			continue;
 
-		t = acquire_slab(s, n, page, object == NULL, &objects);
+		t = acquire_slab(s, n, page, object == NULL);
 		if (!t)
 			break;
 
-		available += objects;
 		if (!object) {
 			*ret_page = page;
 			stat(s, ALLOC_FROM_PARTIAL);
@@ -2138,10 +2158,15 @@ static void *get_partial_node(struct kme
 		} else {
 			put_cpu_partial(s, page, 0);
 			stat(s, CPU_PARTIAL_NODE);
+			partial_pages++;
 		}
+#ifdef CONFIG_SLUB_CPU_PARTIAL
 		if (!kmem_cache_has_cpu_partial(s)
-			|| available > slub_cpu_partial(s) / 2)
+			|| partial_pages > s->cpu_partial_pages / 2)
 			break;
+#else
+		break;
+#endif
 
 	}
 	spin_unlock_irqrestore(&n->list_lock, flags);
@@ -2546,14 +2571,13 @@ static void put_cpu_partial(struct kmem_
 	struct page *page_to_unfreeze = NULL;
 	unsigned long flags;
 	int pages = 0;
-	int pobjects = 0;
 
 	local_lock_irqsave(&s->cpu_slab->lock, flags);
 
 	oldpage = this_cpu_read(s->cpu_slab->partial);
 
 	if (oldpage) {
-		if (drain && oldpage->pobjects > slub_cpu_partial(s)) {
+		if (drain && oldpage->pages >= s->cpu_partial_pages) {
 			/*
 			 * Partial array is full. Move the existing set to the
 			 * per node partial list. Postpone the actual unfreezing
@@ -2562,16 +2586,13 @@ static void put_cpu_partial(struct kmem_
 			page_to_unfreeze = oldpage;
 			oldpage = NULL;
 		} else {
-			pobjects = oldpage->pobjects;
 			pages = oldpage->pages;
 		}
 	}
 
 	pages++;
-	pobjects += page->objects - page->inuse;
 
 	page->pages = pages;
-	page->pobjects = pobjects;
 	page->next = oldpage;
 
 	this_cpu_write(s->cpu_slab->partial, page);
@@ -3991,6 +4012,8 @@ static void set_min_partial(struct kmem_
 static void set_cpu_partial(struct kmem_cache *s)
 {
 #ifdef CONFIG_SLUB_CPU_PARTIAL
+	unsigned int nr_objects;
+
 	/*
 	 * cpu_partial determined the maximum number of objects kept in the
 	 * per cpu partial lists of a processor.
@@ -4000,24 +4023,22 @@ static void set_cpu_partial(struct kmem_
 	 * filled up again with minimal effort. The slab will never hit the
 	 * per node partial lists and therefore no locking will be required.
 	 *
-	 * This setting also determines
-	 *
-	 * A) The number of objects from per cpu partial slabs dumped to the
-	 *    per node list when we reach the limit.
-	 * B) The number of objects in cpu partial slabs to extract from the
-	 *    per node list when we run out of per cpu objects. We only fetch
-	 *    50% to keep some capacity around for frees.
+	 * For backwards compatibility reasons, this is determined as number
+	 * of objects, even though we now limit maximum number of pages, see
+	 * slub_set_cpu_partial()
 	 */
 	if (!kmem_cache_has_cpu_partial(s))
-		slub_set_cpu_partial(s, 0);
+		nr_objects = 0;
 	else if (s->size >= PAGE_SIZE)
-		slub_set_cpu_partial(s, 2);
+		nr_objects = 2;
 	else if (s->size >= 1024)
-		slub_set_cpu_partial(s, 6);
+		nr_objects = 6;
 	else if (s->size >= 256)
-		slub_set_cpu_partial(s, 13);
+		nr_objects = 13;
 	else
-		slub_set_cpu_partial(s, 30);
+		nr_objects = 30;
+
+	slub_set_cpu_partial(s, nr_objects);
 #endif
 }
 
@@ -5392,7 +5413,12 @@ SLAB_ATTR(min_partial);
 
 static ssize_t cpu_partial_show(struct kmem_cache *s, char *buf)
 {
-	return sysfs_emit(buf, "%u\n", slub_cpu_partial(s));
+	unsigned int nr_partial = 0;
+#ifdef CONFIG_SLUB_CPU_PARTIAL
+	nr_partial = s->cpu_partial;
+#endif
+
+	return sysfs_emit(buf, "%u\n", nr_partial);
 }
 
 static ssize_t cpu_partial_store(struct kmem_cache *s, const char *buf,
@@ -5463,12 +5489,12 @@ static ssize_t slabs_cpu_partial_show(st
 
 		page = slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu));
 
-		if (page) {
+		if (page)
 			pages += page->pages;
-			objects += page->pobjects;
-		}
 	}
 
+	/* Approximate half-full pages , see slub_set_cpu_partial() */
+	objects = (pages * oo_objects(s->oo)) / 2;
 	len += sysfs_emit_at(buf, len, "%d(%d)", objects, pages);
 
 #ifdef CONFIG_SMP
@@ -5476,9 +5502,12 @@ static ssize_t slabs_cpu_partial_show(st
 		struct page *page;
 
 		page = slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu));
-		if (page)
+		if (page) {
+			pages = READ_ONCE(page->pages);
+			objects = (pages * oo_objects(s->oo)) / 2;
 			len += sysfs_emit_at(buf, len, " C%d=%d(%d)",
-					     cpu, page->pobjects, page->pages);
+					     cpu, objects, pages);
+		}
 	}
 #endif
 	len += sysfs_emit_at(buf, len, "\n");
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 015/262] mm/slub: increase default cpu partial list sizes
  2021-11-05 20:34 incoming Andrew Morton
                   ` (13 preceding siblings ...)
  2021-11-05 20:35 ` [patch 014/262] mm, slub: change percpu partial accounting from objects to pages Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 016/262] mm, slub: use prefetchw instead of prefetch Andrew Morton
                   ` (246 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, cl, guro, iamjoonsoo.kim, jannh, linux-mm, mm-commits,
	penberg, rientjes, torvalds, vbabka

From: Vlastimil Babka <vbabka@suse.cz>
Subject: mm/slub: increase default cpu partial list sizes

The defaults are determined based on object size and can go up to 30 for
objects smaller than 256 bytes.  Before the previous patch changed the
accounting, this could have made cpu partial list contain up to 30 pages. 
After that patch, only up to 2 pages with default allocation order.

Very short lists limit the usefulness of the whole concept of cpu partial
lists, so this patch aims at a more reasonable default under the new
accounting.  The defaults are quadrupled, except for object size >=
PAGE_SIZE where it's doubled.  This makes the lists grow up to 10 pages in
practice.

A quick test of booting a kernel under virtme with 4GB RAM and 8 vcpus
shows the following slab memory usage after boot:

Before previous patch (using page->pobjects):
Slab:              36732 kB
SReclaimable:      14836 kB
SUnreclaim:        21896 kB

After previous patch (using page->pages):
Slab:              34720 kB
SReclaimable:      13716 kB
SUnreclaim:        21004 kB

After this patch (using page->pages, higher defaults):
Slab:              35252 kB
SReclaimable:      13944 kB
SUnreclaim:        21308 kB

In the same setup, I also ran 5 times:
hackbench -l 16000 -g 16

Differences in time were in the noise, we can compare slub stats as given
by slabinfo -r skbuff_head_cache (the other cache heavily used by
hackbench, kmalloc-cg-512 looks similar).  Negligible stats left out for
brevity.

Before previous patch (using page->pobjects):

Objects: 1408, Memory Total:  401408 Used :  304128

Slab Perf Counter       Alloc     Free %Al %Fr
--------------------------------------------------
Fastpath             469952498  5946606  91   1
Slowpath             42053573 506059465   8  98
Page Alloc              41093    41044   0   0
Add partial                18 21229327   0   4
Remove partial       20039522    36051   3   0
Cpu partial list      4686640 24767229   0   4
RemoteObj/SlabFrozen       16 124027841   0  24
Total                512006071 512006071
Flushes       18

Slab Deactivation             Occurrences %
-------------------------------------------------
Slab empty                       4993    0%
Deactivation bypass           24767229   99%
Refilled from foreign frees   21972674   88%

After previous patch (using page->pages):

Objects: 480, Memory Total:  131072 Used :  103680

Slab Perf Counter       Alloc     Free %Al %Fr
--------------------------------------------------
Fastpath             473016294  5405653  92   1
Slowpath             38989777 506600418   7  98
Page Alloc              32717    32701   0   0
Add partial                 3 22749164   0   4
Remove partial       11371127    32474   2   0
Cpu partial list     11686226 23090059   2   4
RemoteObj/SlabFrozen        2 67541803   0  13
Total                512006071 512006071
Flushes        3

Slab Deactivation             Occurrences %
-------------------------------------------------
Slab empty                        227    0%
Deactivation bypass           23090059   99%
Refilled from foreign frees   27585695  119%

After this patch (using page->pages, higher defaults):

Objects: 896, Memory Total:  229376 Used :  193536

Slab Perf Counter       Alloc     Free %Al %Fr
--------------------------------------------------
Fastpath             473799295  4980278  92   0
Slowpath             38206776 507025793   7  99
Page Alloc              32295    32267   0   0
Add partial                11 23291143   0   4
Remove partial        5815764    31278   1   0
Cpu partial list     18119280 23967320   3   4
RemoteObj/SlabFrozen       10 76974794   0  15
Total                512006071 512006071
Flushes       11

Slab Deactivation             Occurrences %
-------------------------------------------------
Slab empty                        989    0%
Deactivation bypass           23967320   99%
Refilled from foreign frees   32358473  135%

As expected, memory usage dropped significantly with change of accounting,
increasing the defaults increased it, but not as much.  The number of page
allocation/frees dropped significantly with the new accounting, but didn't
increase with the higher defaults.  Interestingly, the number of fasthpath
allocations increased, as well as allocations from the cpu partial list,
even though it's shorter.

Link: https://lkml.kernel.org/r/20211012134651.11258-2-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/mm/slub.c~mm-slub-increase-default-cpu-partial-list-sizes
+++ a/mm/slub.c
@@ -4030,13 +4030,13 @@ static void set_cpu_partial(struct kmem_
 	if (!kmem_cache_has_cpu_partial(s))
 		nr_objects = 0;
 	else if (s->size >= PAGE_SIZE)
-		nr_objects = 2;
-	else if (s->size >= 1024)
 		nr_objects = 6;
+	else if (s->size >= 1024)
+		nr_objects = 24;
 	else if (s->size >= 256)
-		nr_objects = 13;
+		nr_objects = 52;
 	else
-		nr_objects = 30;
+		nr_objects = 120;
 
 	slub_set_cpu_partial(s, nr_objects);
 #endif
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 016/262] mm, slub: use prefetchw instead of prefetch
  2021-11-05 20:34 incoming Andrew Morton
                   ` (14 preceding siblings ...)
  2021-11-05 20:35 ` [patch 015/262] mm/slub: increase default cpu partial list sizes Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 017/262] mm: disable NUMA_BALANCING_DEFAULT_ENABLED and TRANSPARENT_HUGEPAGE on PREEMPT_RT Andrew Morton
                   ` (245 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: 42.hyeyoo, akpm, cl, iamjoonsoo.kim, linux-mm, mm-commits,
	penberg, rientjes, torvalds, vbabka

From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Subject: mm, slub: use prefetchw instead of prefetch

commit 0ad9500e16fe ("slub: prefetch next freelist pointer in
slab_alloc()") introduced prefetch_freepointer() because when other cpu(s)
freed objects into a page that current cpu owns, the freelist link is hot
on cpu(s) which freed objects and possibly very cold on current cpu.

But if freelist link chain is hot on cpu(s) which freed objects,
it's better to invalidate that chain because they're not going to access
again within a short time.

So use prefetchw instead of prefetch. On supported architectures like x86
and arm, it invalidates other copied instances of a cache line when
prefetching it.

Before:

Time: 91.677

 Performance counter stats for 'hackbench -g 100 -l 10000':
        1462938.07 msec cpu-clock                 #   15.908 CPUs utilized
          18072550      context-switches          #   12.354 K/sec
           1018814      cpu-migrations            #  696.416 /sec
            104558      page-faults               #   71.471 /sec
     1580035699271      cycles                    #    1.080 GHz                      (54.51%)
     2003670016013      instructions              #    1.27  insn per cycle           (54.31%)
        5702204863      branch-misses                                                 (54.28%)
      643368500985      cache-references          #  439.778 M/sec                    (54.26%)
       18475582235      cache-misses              #    2.872 % of all cache refs      (54.28%)
      642206796636      L1-dcache-loads           #  438.984 M/sec                    (46.87%)
       18215813147      L1-dcache-load-misses     #    2.84% of all L1-dcache accesses  (46.83%)
      653842996501      dTLB-loads                #  446.938 M/sec                    (46.63%)
        3227179675      dTLB-load-misses          #    0.49% of all dTLB cache accesses  (46.85%)
      537531951350      iTLB-loads                #  367.433 M/sec                    (54.33%)
         114750630      iTLB-load-misses          #    0.02% of all iTLB cache accesses  (54.37%)
      630135543177      L1-icache-loads           #  430.733 M/sec                    (46.80%)
       22923237620      L1-icache-load-misses     #    3.64% of all L1-icache accesses  (46.76%)

      91.964452802 seconds time elapsed

      43.416742000 seconds user
    1422.441123000 seconds sys

After:

Time: 90.220

 Performance counter stats for 'hackbench -g 100 -l 10000':
        1437418.48 msec cpu-clock                 #   15.880 CPUs utilized
          17694068      context-switches          #   12.310 K/sec
            958257      cpu-migrations            #  666.651 /sec
            100604      page-faults               #   69.989 /sec
     1583259429428      cycles                    #    1.101 GHz                      (54.57%)
     2004002484935      instructions              #    1.27  insn per cycle           (54.37%)
        5594202389      branch-misses                                                 (54.36%)
      643113574524      cache-references          #  447.409 M/sec                    (54.39%)
       18233791870      cache-misses              #    2.835 % of all cache refs      (54.37%)
      640205852062      L1-dcache-loads           #  445.386 M/sec                    (46.75%)
       17968160377      L1-dcache-load-misses     #    2.81% of all L1-dcache accesses  (46.79%)
      651747432274      dTLB-loads                #  453.415 M/sec                    (46.59%)
        3127124271      dTLB-load-misses          #    0.48% of all dTLB cache accesses  (46.75%)
      535395273064      iTLB-loads                #  372.470 M/sec                    (54.38%)
         113500056      iTLB-load-misses          #    0.02% of all iTLB cache accesses  (54.35%)
      628871845924      L1-icache-loads           #  437.501 M/sec                    (46.80%)
       22585641203      L1-icache-load-misses     #    3.59% of all L1-icache accesses  (46.79%)

      90.514819303 seconds time elapsed

      43.877656000 seconds user
    1397.176001000 seconds sys

Link: https://lkml.org/lkml/2021/10/8/598=20
Link: https://lkml.kernel.org/r/20211011144331.70084-1-42.hyeyoo@gmail.com
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slub.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/slub.c~mm-slub-use-prefetchw-instead-of-prefetch
+++ a/mm/slub.c
@@ -354,7 +354,7 @@ static inline void *get_freepointer(stru
 
 static void prefetch_freepointer(const struct kmem_cache *s, void *object)
 {
-	prefetch(object + s->offset);
+	prefetchw(object + s->offset);
 }
 
 static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 017/262] mm: disable NUMA_BALANCING_DEFAULT_ENABLED and TRANSPARENT_HUGEPAGE on PREEMPT_RT
  2021-11-05 20:34 incoming Andrew Morton
                   ` (15 preceding siblings ...)
  2021-11-05 20:35 ` [patch 016/262] mm, slub: use prefetchw instead of prefetch Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 018/262] mm: don't include <linux/dax.h> in <linux/mempolicy.h> Andrew Morton
                   ` (244 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, bigeasy, david, linux-mm, mgorman, mm-commits, peterz,
	tglx, torvalds, vbabka

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: mm: disable NUMA_BALANCING_DEFAULT_ENABLED and TRANSPARENT_HUGEPAGE on PREEMPT_RT

TRANSPARENT_HUGEPAGE:
There are potential non-deterministic delays to an RT thread if a critical
memory region is not THP-aligned and a non-RT buffer is located in the same
hugepage-aligned region. It's also possible for an unrelated thread to migrate
pages belonging to an RT task incurring unexpected page faults due to memory
defragmentation even if khugepaged is disabled.

Regular HUGEPAGEs are not affected by this can be used.

NUMA_BALANCING:
There is a non-deterministic delay to mark PTEs PROT_NONE to gather NUMA fault
samples, increased page faults of regions even if mlocked and non-deterministic
delays when migrating pages.

[Mel Gorman worded 99% of the commit description].

Link: https://lore.kernel.org/all/20200304091159.GN3818@techsingularity.net/
Link: https://lore.kernel.org/all/20211026165100.ahz5bkx44lrrw5pt@linutronix.de/
Link: https://lkml.kernel.org/r/20211028143327.hfbxjze7palrpfgp@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 init/Kconfig |    2 +-
 mm/Kconfig   |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/init/Kconfig~mm-disable-numa_balancing_default_enabled-and-transparent_hugepage-on-preempt_rt
+++ a/init/Kconfig
@@ -901,7 +901,7 @@ config NUMA_BALANCING
 	bool "Memory placement aware NUMA scheduler"
 	depends on ARCH_SUPPORTS_NUMA_BALANCING
 	depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
-	depends on SMP && NUMA && MIGRATION
+	depends on SMP && NUMA && MIGRATION && !PREEMPT_RT
 	help
 	  This option adds support for automatic NUMA aware memory/task placement.
 	  The mechanism is quite primitive and is based on migrating memory when
--- a/mm/Kconfig~mm-disable-numa_balancing_default_enabled-and-transparent_hugepage-on-preempt_rt
+++ a/mm/Kconfig
@@ -371,7 +371,7 @@ config NOMMU_INITIAL_TRIM_EXCESS
 
 config TRANSPARENT_HUGEPAGE
 	bool "Transparent Hugepage Support"
-	depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE
+	depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE && !PREEMPT_RT
 	select COMPACTION
 	select XARRAY_MULTI
 	help
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 018/262] mm: don't include <linux/dax.h> in <linux/mempolicy.h>
  2021-11-05 20:34 incoming Andrew Morton
                   ` (16 preceding siblings ...)
  2021-11-05 20:35 ` [patch 017/262] mm: disable NUMA_BALANCING_DEFAULT_ENABLED and TRANSPARENT_HUGEPAGE on PREEMPT_RT Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 019/262] lib/stackdepot: include gfp.h Andrew Morton
                   ` (243 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, dan.j.williams, hch, linux-mm, mm-commits, naoya.horiguchi,
	torvalds

From: Christoph Hellwig <hch@lst.de>
Subject: mm: don't include <linux/dax.h> in <linux/mempolicy.h>

Not required at all, and having this causes a huge kernel rebuild as soon
as something in dax.h changes.

Link: https://lkml.kernel.org/r/20210921082253.1859794-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mempolicy.h |    1 -
 mm/memory-failure.c       |    1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/mempolicy.h~mm-dont-include-linux-daxh-in-linux-mempolicyh
+++ a/include/linux/mempolicy.h
@@ -8,7 +8,6 @@
 
 #include <linux/sched.h>
 #include <linux/mmzone.h>
-#include <linux/dax.h>
 #include <linux/slab.h>
 #include <linux/rbtree.h>
 #include <linux/spinlock.h>
--- a/mm/memory-failure.c~mm-dont-include-linux-daxh-in-linux-mempolicyh
+++ a/mm/memory-failure.c
@@ -39,6 +39,7 @@
 #include <linux/kernel-page-flags.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/task.h>
+#include <linux/dax.h>
 #include <linux/ksm.h>
 #include <linux/rmap.h>
 #include <linux/export.h>
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 019/262] lib/stackdepot: include gfp.h
  2021-11-05 20:34 incoming Andrew Morton
                   ` (17 preceding siblings ...)
  2021-11-05 20:35 ` [patch 018/262] mm: don't include <linux/dax.h> in <linux/mempolicy.h> Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 020/262] lib/stackdepot: remove unused function argument Andrew Morton
                   ` (242 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, glider, gustavoars,
	jiangshanlai, linux-mm, mm-commits, ryabinin.a.a, skhan,
	tarasmadan, tglx, tj, torvalds, vinmenon, vjitta, walter-zh.wu

From: Marco Elver <elver@google.com>
Subject: lib/stackdepot: include gfp.h

Patch series "stackdepot, kasan, workqueue: Avoid expanding stackdepot slabs when holding raw_spin_lock", v2.

Shuah Khan reported [1]:

 | When CONFIG_PROVE_RAW_LOCK_NESTING=y and CONFIG_KASAN are enabled,
 | kasan_record_aux_stack() runs into "BUG: Invalid wait context" when
 | it tries to allocate memory attempting to acquire spinlock in page
 | allocation code while holding workqueue pool raw_spinlock.
 |
 | There are several instances of this problem when block layer tries
 | to __queue_work(). Call trace from one of these instances is below:
 |
 |     kblockd_mod_delayed_work_on()
 |       mod_delayed_work_on()
 |         __queue_delayed_work()
 |           __queue_work() (rcu_read_lock, raw_spin_lock pool->lock held)
 |             insert_work()
 |               kasan_record_aux_stack()
 |                 kasan_save_stack()
 |                   stack_depot_save()
 |                     alloc_pages()
 |                       __alloc_pages()
 |                         get_page_from_freelist()
 |                           rm_queue()
 |                             rm_queue_pcplist()
 |                               local_lock_irqsave(&pagesets.lock, flags);
 |                               [ BUG: Invalid wait context triggered ]

PROVE_RAW_LOCK_NESTING is pointing out that (on RT kernels) the locking
rules are being violated.  More generally, memory is being allocated from
a non-preemptive context (raw_spin_lock'd c-s) where it is not allowed.

To properly fix this, we must prevent stackdepot from replenishing its
"stack slab" pool if memory allocations cannot be done in the current
context: it's a bug to use either GFP_ATOMIC nor GFP_NOWAIT in certain
non-preemptive contexts, including raw_spin_locks (see gfp.h and
ab00db216c9c7).

The only downside is that saving a stack trace may fail if: stackdepot
runs out of space AND the same stack trace has not been recorded before. 
I expect this to be unlikely, and a simple experiment (boot the kernel)
didn't result in any failure to record stack trace from insert_work().

The series includes a few minor fixes to stackdepot that I noticed in
preparing the series.  It then introduces __stack_depot_save(), which
exposes the option to force stackdepot to not allocate any memory. 
Finally, KASAN is changed to use the new stackdepot interface and provide
kasan_record_aux_stack_noalloc(), which is then used by workqueue code.

[1] https://lkml.kernel.org/r/20210902200134.25603-1-skhan@linuxfoundation.org


This patch (of 6):

<linux/stackdepot.h> refers to gfp_t, but doesn't include gfp.h.

Fix it by including <linux/gfp.h>.

Link: https://lkml.kernel.org/r/20210913112609.2651084-1-elver@google.com
Link: https://lkml.kernel.org/r/20210913112609.2651084-2-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Taras Madan <tarasmadan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/stackdepot.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/include/linux/stackdepot.h~lib-stackdepot-include-gfph
+++ a/include/linux/stackdepot.h
@@ -11,6 +11,8 @@
 #ifndef _LINUX_STACKDEPOT_H
 #define _LINUX_STACKDEPOT_H
 
+#include <linux/gfp.h>
+
 typedef u32 depot_stack_handle_t;
 
 depot_stack_handle_t stack_depot_save(unsigned long *entries,
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 020/262] lib/stackdepot: remove unused function argument
  2021-11-05 20:34 incoming Andrew Morton
                   ` (18 preceding siblings ...)
  2021-11-05 20:35 ` [patch 019/262] lib/stackdepot: include gfp.h Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 021/262] lib/stackdepot: introduce __stack_depot_save() Andrew Morton
                   ` (241 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, glider, gustavoars,
	jiangshanlai, linux-mm, mm-commits, ryabinin.a.a, skhan,
	tarasmadan, tglx, tj, torvalds, vinmenon, vjitta, walter-zh.wu

From: Marco Elver <elver@google.com>
Subject: lib/stackdepot: remove unused function argument

alloc_flags in depot_alloc_stack() is no longer used; remove it.

Link: https://lkml.kernel.org/r/20210913112609.2651084-3-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Taras Madan <tarasmadan@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/stackdepot.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/lib/stackdepot.c~lib-stackdepot-remove-unused-function-argument
+++ a/lib/stackdepot.c
@@ -102,8 +102,8 @@ static bool init_stack_slab(void **preal
 }
 
 /* Allocation of a new stack in raw storage */
-static struct stack_record *depot_alloc_stack(unsigned long *entries, int size,
-		u32 hash, void **prealloc, gfp_t alloc_flags)
+static struct stack_record *
+depot_alloc_stack(unsigned long *entries, int size, u32 hash, void **prealloc)
 {
 	struct stack_record *stack;
 	size_t required_size = struct_size(stack, entries, size);
@@ -309,9 +309,8 @@ depot_stack_handle_t stack_depot_save(un
 
 	found = find_stack(*bucket, entries, nr_entries, hash);
 	if (!found) {
-		struct stack_record *new =
-			depot_alloc_stack(entries, nr_entries,
-					  hash, &prealloc, alloc_flags);
+		struct stack_record *new = depot_alloc_stack(entries, nr_entries, hash, &prealloc);
+
 		if (new) {
 			new->next = *bucket;
 			/*
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 021/262] lib/stackdepot: introduce __stack_depot_save()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (19 preceding siblings ...)
  2021-11-05 20:35 ` [patch 020/262] lib/stackdepot: remove unused function argument Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 022/262] kasan: common: provide can_alloc in kasan_save_stack() Andrew Morton
                   ` (240 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, glider, gustavoars,
	jiangshanlai, linux-mm, mm-commits, ryabinin.a.a, skhan,
	tarasmadan, tglx, tj, torvalds, vinmenon, vjitta, walter-zh.wu

From: Marco Elver <elver@google.com>
Subject: lib/stackdepot: introduce __stack_depot_save()

Add __stack_depot_save(), which provides more fine-grained control over
stackdepot's memory allocation behaviour, in case stackdepot runs out of
"stack slabs".

Normally stackdepot uses alloc_pages() in case it runs out of space;
passing can_alloc==false to __stack_depot_save() prohibits this, at the
cost of more likely failure to record a stack trace.

Link: https://lkml.kernel.org/r/20210913112609.2651084-4-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Taras Madan <tarasmadan@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/stackdepot.h |    4 +++
 lib/stackdepot.c           |   43 ++++++++++++++++++++++++++++++-----
 2 files changed, 41 insertions(+), 6 deletions(-)

--- a/include/linux/stackdepot.h~lib-stackdepot-introduce-__stack_depot_save
+++ a/include/linux/stackdepot.h
@@ -15,6 +15,10 @@
 
 typedef u32 depot_stack_handle_t;
 
+depot_stack_handle_t __stack_depot_save(unsigned long *entries,
+					unsigned int nr_entries,
+					gfp_t gfp_flags, bool can_alloc);
+
 depot_stack_handle_t stack_depot_save(unsigned long *entries,
 				      unsigned int nr_entries, gfp_t gfp_flags);
 
--- a/lib/stackdepot.c~lib-stackdepot-introduce-__stack_depot_save
+++ a/lib/stackdepot.c
@@ -248,17 +248,28 @@ unsigned int stack_depot_fetch(depot_sta
 EXPORT_SYMBOL_GPL(stack_depot_fetch);
 
 /**
- * stack_depot_save - Save a stack trace from an array
+ * __stack_depot_save - Save a stack trace from an array
  *
  * @entries:		Pointer to storage array
  * @nr_entries:		Size of the storage array
  * @alloc_flags:	Allocation gfp flags
+ * @can_alloc:		Allocate stack slabs (increased chance of failure if false)
+ *
+ * Saves a stack trace from @entries array of size @nr_entries. If @can_alloc is
+ * %true, is allowed to replenish the stack slab pool in case no space is left
+ * (allocates using GFP flags of @alloc_flags). If @can_alloc is %false, avoids
+ * any allocations and will fail if no space is left to store the stack trace.
+ *
+ * Context: Any context, but setting @can_alloc to %false is required if
+ *          alloc_pages() cannot be used from the current context. Currently
+ *          this is the case from contexts where neither %GFP_ATOMIC nor
+ *          %GFP_NOWAIT can be used (NMI, raw_spin_lock).
  *
- * Return: The handle of the stack struct stored in depot
+ * Return: The handle of the stack struct stored in depot, 0 on failure.
  */
-depot_stack_handle_t stack_depot_save(unsigned long *entries,
-				      unsigned int nr_entries,
-				      gfp_t alloc_flags)
+depot_stack_handle_t __stack_depot_save(unsigned long *entries,
+					unsigned int nr_entries,
+					gfp_t alloc_flags, bool can_alloc)
 {
 	struct stack_record *found = NULL, **bucket;
 	depot_stack_handle_t retval = 0;
@@ -291,7 +302,7 @@ depot_stack_handle_t stack_depot_save(un
 	 * The smp_load_acquire() here pairs with smp_store_release() to
 	 * |next_slab_inited| in depot_alloc_stack() and init_stack_slab().
 	 */
-	if (unlikely(!smp_load_acquire(&next_slab_inited))) {
+	if (unlikely(can_alloc && !smp_load_acquire(&next_slab_inited))) {
 		/*
 		 * Zero out zone modifiers, as we don't have specific zone
 		 * requirements. Keep the flags related to allocation in atomic
@@ -339,6 +350,26 @@ exit:
 fast_exit:
 	return retval;
 }
+EXPORT_SYMBOL_GPL(__stack_depot_save);
+
+/**
+ * stack_depot_save - Save a stack trace from an array
+ *
+ * @entries:		Pointer to storage array
+ * @nr_entries:		Size of the storage array
+ * @alloc_flags:	Allocation gfp flags
+ *
+ * Context: Contexts where allocations via alloc_pages() are allowed.
+ *          See __stack_depot_save() for more details.
+ *
+ * Return: The handle of the stack struct stored in depot, 0 on failure.
+ */
+depot_stack_handle_t stack_depot_save(unsigned long *entries,
+				      unsigned int nr_entries,
+				      gfp_t alloc_flags)
+{
+	return __stack_depot_save(entries, nr_entries, alloc_flags, true);
+}
 EXPORT_SYMBOL_GPL(stack_depot_save);
 
 static inline int in_irqentry_text(unsigned long ptr)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 022/262] kasan: common: provide can_alloc in kasan_save_stack()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (20 preceding siblings ...)
  2021-11-05 20:35 ` [patch 021/262] lib/stackdepot: introduce __stack_depot_save() Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 023/262] kasan: generic: introduce kasan_record_aux_stack_noalloc() Andrew Morton
                   ` (239 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, glider, gustavoars,
	jiangshanlai, linux-mm, mm-commits, ryabinin.a.a, skhan,
	tarasmadan, tglx, tj, torvalds, vinmenon, vjitta, walter-zh.wu

From: Marco Elver <elver@google.com>
Subject: kasan: common: provide can_alloc in kasan_save_stack()

Add another argument, can_alloc, to kasan_save_stack() which is passed
as-is to __stack_depot_save().

No functional change intended.

Link: https://lkml.kernel.org/r/20210913112609.2651084-5-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Taras Madan <tarasmadan@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kasan/common.c  |    6 +++---
 mm/kasan/generic.c |    2 +-
 mm/kasan/kasan.h   |    2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

--- a/mm/kasan/common.c~kasan-common-provide-can_alloc-in-kasan_save_stack
+++ a/mm/kasan/common.c
@@ -30,20 +30,20 @@
 #include "kasan.h"
 #include "../slab.h"
 
-depot_stack_handle_t kasan_save_stack(gfp_t flags)
+depot_stack_handle_t kasan_save_stack(gfp_t flags, bool can_alloc)
 {
 	unsigned long entries[KASAN_STACK_DEPTH];
 	unsigned int nr_entries;
 
 	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
 	nr_entries = filter_irq_stacks(entries, nr_entries);
-	return stack_depot_save(entries, nr_entries, flags);
+	return __stack_depot_save(entries, nr_entries, flags, can_alloc);
 }
 
 void kasan_set_track(struct kasan_track *track, gfp_t flags)
 {
 	track->pid = current->pid;
-	track->stack = kasan_save_stack(flags);
+	track->stack = kasan_save_stack(flags, true);
 }
 
 #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
--- a/mm/kasan/generic.c~kasan-common-provide-can_alloc-in-kasan_save_stack
+++ a/mm/kasan/generic.c
@@ -345,7 +345,7 @@ void kasan_record_aux_stack(void *addr)
 		return;
 
 	alloc_meta->aux_stack[1] = alloc_meta->aux_stack[0];
-	alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT);
+	alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT, true);
 }
 
 void kasan_set_free_info(struct kmem_cache *cache,
--- a/mm/kasan/kasan.h~kasan-common-provide-can_alloc-in-kasan_save_stack
+++ a/mm/kasan/kasan.h
@@ -251,7 +251,7 @@ void kasan_report_invalid_free(void *obj
 
 struct page *kasan_addr_to_page(const void *addr);
 
-depot_stack_handle_t kasan_save_stack(gfp_t flags);
+depot_stack_handle_t kasan_save_stack(gfp_t flags, bool can_alloc);
 void kasan_set_track(struct kasan_track *track, gfp_t flags);
 void kasan_set_free_info(struct kmem_cache *cache, void *object, u8 tag);
 struct kasan_track *kasan_get_free_track(struct kmem_cache *cache,
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 023/262] kasan: generic: introduce kasan_record_aux_stack_noalloc()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (21 preceding siblings ...)
  2021-11-05 20:35 ` [patch 022/262] kasan: common: provide can_alloc in kasan_save_stack() Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 024/262] workqueue, kasan: avoid alloc_pages() when recording stack Andrew Morton
                   ` (238 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, glider, gustavoars,
	jiangshanlai, linux-mm, mm-commits, ryabinin.a.a, skhan,
	tarasmadan, tglx, tj, torvalds, vinmenon, vjitta, walter-zh.wu

From: Marco Elver <elver@google.com>
Subject: kasan: generic: introduce kasan_record_aux_stack_noalloc()

Introduce a variant of kasan_record_aux_stack() that does not do any
memory allocation through stackdepot.  This will permit using it in
contexts that cannot allocate any memory.

Link: https://lkml.kernel.org/r/20210913112609.2651084-6-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Taras Madan <tarasmadan@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/kasan.h |    2 ++
 mm/kasan/generic.c    |   14 ++++++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

--- a/include/linux/kasan.h~kasan-generic-introduce-kasan_record_aux_stack_noalloc
+++ a/include/linux/kasan.h
@@ -370,12 +370,14 @@ static inline void kasan_unpoison_task_s
 void kasan_cache_shrink(struct kmem_cache *cache);
 void kasan_cache_shutdown(struct kmem_cache *cache);
 void kasan_record_aux_stack(void *ptr);
+void kasan_record_aux_stack_noalloc(void *ptr);
 
 #else /* CONFIG_KASAN_GENERIC */
 
 static inline void kasan_cache_shrink(struct kmem_cache *cache) {}
 static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}
 static inline void kasan_record_aux_stack(void *ptr) {}
+static inline void kasan_record_aux_stack_noalloc(void *ptr) {}
 
 #endif /* CONFIG_KASAN_GENERIC */
 
--- a/mm/kasan/generic.c~kasan-generic-introduce-kasan_record_aux_stack_noalloc
+++ a/mm/kasan/generic.c
@@ -328,7 +328,7 @@ DEFINE_ASAN_SET_SHADOW(f3);
 DEFINE_ASAN_SET_SHADOW(f5);
 DEFINE_ASAN_SET_SHADOW(f8);
 
-void kasan_record_aux_stack(void *addr)
+static void __kasan_record_aux_stack(void *addr, bool can_alloc)
 {
 	struct page *page = kasan_addr_to_page(addr);
 	struct kmem_cache *cache;
@@ -345,7 +345,17 @@ void kasan_record_aux_stack(void *addr)
 		return;
 
 	alloc_meta->aux_stack[1] = alloc_meta->aux_stack[0];
-	alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT, true);
+	alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT, can_alloc);
+}
+
+void kasan_record_aux_stack(void *addr)
+{
+	return __kasan_record_aux_stack(addr, true);
+}
+
+void kasan_record_aux_stack_noalloc(void *addr)
+{
+	return __kasan_record_aux_stack(addr, false);
 }
 
 void kasan_set_free_info(struct kmem_cache *cache,
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 024/262] workqueue, kasan: avoid alloc_pages() when recording stack
  2021-11-05 20:34 incoming Andrew Morton
                   ` (22 preceding siblings ...)
  2021-11-05 20:35 ` [patch 023/262] kasan: generic: introduce kasan_record_aux_stack_noalloc() Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 025/262] kasan: fix tag for large allocations when using CONFIG_SLAB Andrew Morton
                   ` (237 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, andreyknvl, bigeasy, dvyukov, elver, glider, gustavoars,
	jiangshanlai, linux-mm, mm-commits, ryabinin.a.a, skhan,
	tarasmadan, tglx, tj, torvalds, vinmenon, vjitta, walter-zh.wu

From: Marco Elver <elver@google.com>
Subject: workqueue, kasan: avoid alloc_pages() when recording stack

Shuah Khan reported:

 | When CONFIG_PROVE_RAW_LOCK_NESTING=y and CONFIG_KASAN are enabled,
 | kasan_record_aux_stack() runs into "BUG: Invalid wait context" when
 | it tries to allocate memory attempting to acquire spinlock in page
 | allocation code while holding workqueue pool raw_spinlock.
 |
 | There are several instances of this problem when block layer tries
 | to __queue_work(). Call trace from one of these instances is below:
 |
 |     kblockd_mod_delayed_work_on()
 |       mod_delayed_work_on()
 |         __queue_delayed_work()
 |           __queue_work() (rcu_read_lock, raw_spin_lock pool->lock held)
 |             insert_work()
 |               kasan_record_aux_stack()
 |                 kasan_save_stack()
 |                   stack_depot_save()
 |                     alloc_pages()
 |                       __alloc_pages()
 |                         get_page_from_freelist()
 |                           rm_queue()
 |                             rm_queue_pcplist()
 |                               local_lock_irqsave(&pagesets.lock, flags);
 |                               [ BUG: Invalid wait context triggered ]

The default kasan_record_aux_stack() calls stack_depot_save() with
GFP_NOWAIT, which in turn can then call alloc_pages(GFP_NOWAIT, ...).  In
general, however, it is not even possible to use either GFP_ATOMIC nor
GFP_NOWAIT in certain non-preemptive contexts, including raw_spin_locks
(see gfp.h and ab00db216c9c7).

Fix it by instructing stackdepot to not expand stack storage via
alloc_pages() in case it runs out by using
kasan_record_aux_stack_noalloc().

While there is an increased risk of failing to insert the stack trace,
this is typically unlikely, especially if the same insertion had already
succeeded previously (stack depot hit).  For frequent calls from the same
location, it therefore becomes extremely unlikely that
kasan_record_aux_stack_noalloc() fails.

Link: https://lkml.kernel.org/r/20210902200134.25603-1-skhan@linuxfoundation.org
Link: https://lkml.kernel.org/r/20210913112609.2651084-7-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Taras Madan <tarasmadan@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/workqueue.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/workqueue.c~workqueue-kasan-avoid-alloc_pages-when-recording-stack
+++ a/kernel/workqueue.c
@@ -1350,7 +1350,7 @@ static void insert_work(struct pool_work
 	struct worker_pool *pool = pwq->pool;
 
 	/* record the work call stack in order to print it in KASAN reports */
-	kasan_record_aux_stack(work);
+	kasan_record_aux_stack_noalloc(work);
 
 	/* we own @work, set data and link */
 	set_work_pwq(work, pwq, extra_flags);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 025/262] kasan: fix tag for large allocations when using CONFIG_SLAB
  2021-11-05 20:34 incoming Andrew Morton
                   ` (23 preceding siblings ...)
  2021-11-05 20:35 ` [patch 024/262] workqueue, kasan: avoid alloc_pages() when recording stack Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 026/262] kasan: test: add memcpy test that avoids out-of-bounds write Andrew Morton
                   ` (236 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, andreyknvl, dvyukov, elver, glider, linux-mm, mm-commits,
	ryabinin.a.a, torvalds, willy

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: kasan: fix tag for large allocations when using CONFIG_SLAB

If an object is allocated on a tail page of a multi-page slab, kasan will
get the wrong tag because page->s_mem is NULL for tail pages.  I'm not
quite sure what the user-visible effect of this might be.

Link: https://lkml.kernel.org/r/20211001024105.3217339-1-willy@infradead.org
Fixes: 7f94ffbc4c6a ("kasan: add hooks implementation for tag-based mode")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kasan/common.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/kasan/common.c~kasan-fix-tag-for-large-allocations-when-using-config_slab
+++ a/mm/kasan/common.c
@@ -298,7 +298,7 @@ static inline u8 assign_tag(struct kmem_
 	/* For caches that either have a constructor or SLAB_TYPESAFE_BY_RCU: */
 #ifdef CONFIG_SLAB
 	/* For SLAB assign tags based on the object index in the freelist. */
-	return (u8)obj_to_index(cache, virt_to_page(object), (void *)object);
+	return (u8)obj_to_index(cache, virt_to_head_page(object), (void *)object);
 #else
 	/*
 	 * For SLUB assign a random tag during slab creation, otherwise reuse
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 026/262] kasan: test: add memcpy test that avoids out-of-bounds write
  2021-11-05 20:34 incoming Andrew Morton
                   ` (24 preceding siblings ...)
  2021-11-05 20:35 ` [patch 025/262] kasan: fix tag for large allocations when using CONFIG_SLAB Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:35 ` [patch 027/262] mm/smaps: fix shmem pte hole swap calculation Andrew Morton
                   ` (235 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: akpm, andreyknvl, catalin.marinas, elver, eugenis, glider,
	linux-mm, mark.rutland, mm-commits, pcc, robin.murphy, torvalds,
	will

From: Peter Collingbourne <pcc@google.com>
Subject: kasan: test: add memcpy test that avoids out-of-bounds write

With HW tag-based KASAN, error checks are performed implicitly by the load
and store instructions in the memcpy implementation.  A failed check
results in tag checks being disabled and execution will keep going.  As a
result, under HW tag-based KASAN, prior to commit 1b0668be62cf ("kasan:
test: disable kmalloc_memmove_invalid_size for HW_TAGS"), this memcpy
would end up corrupting memory until it hits an inaccessible page and
causes a kernel panic.

This is a pre-existing issue that was revealed by commit 285133040e6c
("arm64: Import latest memcpy()/memmove() implementation") which changed
the memcpy implementation from using signed comparisons (incorrectly,
resulting in the memcpy being terminated early for negative sizes) to
using unsigned comparisons.

It is unclear how this could be handled by memcpy itself in a reasonable
way.  One possibility would be to add an exception handler that would
force memcpy to return if a tag check fault is detected -- this would make
the behavior roughly similar to generic and SW tag-based KASAN.  However,
this wouldn't solve the problem for asynchronous mode and also makes
memcpy behavior inconsistent with manually copying data.

This test was added as a part of a series that taught KASAN to detect
negative sizes in memory operations, see commit 8cceeff48f23 ("kasan:
detect negative size in memory operation function").  Therefore we should
keep testing for negative sizes with generic and SW tag-based KASAN.  But
there is some value in testing small memcpy overflows, so let's add
another test with memcpy that does not destabilize the kernel by
performing out-of-bounds writes, and run it in all modes.

Link: https://linux-review.googlesource.com/id/I048d1e6a9aff766c4a53f989fb0c83de68923882
Link: https://lkml.kernel.org/r/20210910211356.3603758-1-pcc@google.com
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/test_kasan.c |   18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

--- a/lib/test_kasan.c~kasan-test-add-memcpy-test-that-avoids-out-of-bounds-write
+++ a/lib/test_kasan.c
@@ -493,7 +493,7 @@ static void kmalloc_oob_in_memset(struct
 	kfree(ptr);
 }
 
-static void kmalloc_memmove_invalid_size(struct kunit *test)
+static void kmalloc_memmove_negative_size(struct kunit *test)
 {
 	char *ptr;
 	size_t size = 64;
@@ -515,6 +515,21 @@ static void kmalloc_memmove_invalid_size
 	kfree(ptr);
 }
 
+static void kmalloc_memmove_invalid_size(struct kunit *test)
+{
+	char *ptr;
+	size_t size = 64;
+	volatile size_t invalid_size = size;
+
+	ptr = kmalloc(size, GFP_KERNEL);
+	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+
+	memset((char *)ptr, 0, 64);
+	KUNIT_EXPECT_KASAN_FAIL(test,
+		memmove((char *)ptr, (char *)ptr + 4, invalid_size));
+	kfree(ptr);
+}
+
 static void kmalloc_uaf(struct kunit *test)
 {
 	char *ptr;
@@ -1129,6 +1144,7 @@ static struct kunit_case kasan_kunit_tes
 	KUNIT_CASE(kmalloc_oob_memset_4),
 	KUNIT_CASE(kmalloc_oob_memset_8),
 	KUNIT_CASE(kmalloc_oob_memset_16),
+	KUNIT_CASE(kmalloc_memmove_negative_size),
 	KUNIT_CASE(kmalloc_memmove_invalid_size),
 	KUNIT_CASE(kmalloc_uaf),
 	KUNIT_CASE(kmalloc_uaf_memset),
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 027/262] mm/smaps: fix shmem pte hole swap calculation
  2021-11-05 20:34 incoming Andrew Morton
                   ` (25 preceding siblings ...)
  2021-11-05 20:35 ` [patch 026/262] kasan: test: add memcpy test that avoids out-of-bounds write Andrew Morton
@ 2021-11-05 20:35 ` Andrew Morton
  2021-11-05 20:36 ` [patch 028/262] mm/smaps: use vma->vm_pgoff directly when counting partial swap Andrew Morton
                   ` (234 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:35 UTC (permalink / raw)
  To: aarcange, akpm, hughd, linux-mm, mm-commits, peterx, torvalds,
	vbabka, willy

From: Peter Xu <peterx@redhat.com>
Subject: mm/smaps: fix shmem pte hole swap calculation

Patch series "mm/smaps: Fixes and optimizations on shmem swap handling".


This patch (of 3):

The shmem swap calculation on the privately writable mappings are using
wrong parameters as spotted by Vlastimil.  Fix them.  That's introduced in
commit 48131e03ca4e, when rework shmem_swap_usage to
shmem_partial_swap_usage.

Test program:

==================

void main(void)
{
    char *buffer, *p;
    int i, fd;

    fd = memfd_create("test", 0);
    assert(fd > 0);

    /* isize==2M*3, fill in pages, swap them out */
    ftruncate(fd, SIZE_2M * 3);
    buffer = mmap(NULL, SIZE_2M * 3, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    assert(buffer);
    for (i = 0, p = buffer; i < SIZE_2M * 3 / 4096; i++) {
        *p = 1;
        p += 4096;
    }
    madvise(buffer, SIZE_2M * 3, MADV_PAGEOUT);
    munmap(buffer, SIZE_2M * 3);

    /*
     * Remap with private+writtable mappings on partial of the inode (<= 2M*3),
     * while the size must also be >= 2M*2 to make sure there's a none pmd so
     * smaps_pte_hole will be triggered.
     */
    buffer = mmap(NULL, SIZE_2M * 2, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    printf("pid=%d, buffer=%p
", getpid(), buffer);

    /* Check /proc/$PID/smap_rollup, should see 4MB swap */
    sleep(1000000);
}
==================

Before the patch, smaps_rollup shows <4MB swap and the number will be
random depending on the alignment of the buffer of mmap() allocated. 
After this patch, it'll show 4MB.

Link: https://lkml.kernel.org/r/20210917164756.8586-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20210917164756.8586-2-peterx@redhat.com
Fixes: 48131e03ca4e ("mm, proc: reduce cost of /proc/pid/smaps for unpopulated shmem mappings")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/task_mmu.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/fs/proc/task_mmu.c~mm-smaps-fix-shmem-pte-hole-swap-calculation
+++ a/fs/proc/task_mmu.c
@@ -478,9 +478,11 @@ static int smaps_pte_hole(unsigned long
 			  __always_unused int depth, struct mm_walk *walk)
 {
 	struct mem_size_stats *mss = walk->private;
+	struct vm_area_struct *vma = walk->vma;
 
-	mss->swap += shmem_partial_swap_usage(
-			walk->vma->vm_file->f_mapping, addr, end);
+	mss->swap += shmem_partial_swap_usage(walk->vma->vm_file->f_mapping,
+					      linear_page_index(vma, addr),
+					      linear_page_index(vma, end));
 
 	return 0;
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 028/262] mm/smaps: use vma->vm_pgoff directly when counting partial swap
  2021-11-05 20:34 incoming Andrew Morton
                   ` (26 preceding siblings ...)
  2021-11-05 20:35 ` [patch 027/262] mm/smaps: fix shmem pte hole swap calculation Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 029/262] mm/smaps: simplify shmem handling of pte holes Andrew Morton
                   ` (233 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: aarcange, akpm, hughd, linux-mm, mm-commits, peterx, torvalds,
	vbabka, willy

From: Peter Xu <peterx@redhat.com>
Subject: mm/smaps: use vma->vm_pgoff directly when counting partial swap

As it's trying to cover the whole vma anyways, use direct vm_pgoff value
and vma_pages() rather than linear_page_index.

Link: https://lkml.kernel.org/r/20210917164756.8586-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/shmem.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/mm/shmem.c~mm-smaps-use-vma-vm_pgoff-directly-when-counting-partial-swap
+++ a/mm/shmem.c
@@ -856,9 +856,8 @@ unsigned long shmem_swap_usage(struct vm
 		return swapped << PAGE_SHIFT;
 
 	/* Here comes the more involved part */
-	return shmem_partial_swap_usage(mapping,
-			linear_page_index(vma, vma->vm_start),
-			linear_page_index(vma, vma->vm_end));
+	return shmem_partial_swap_usage(mapping, vma->vm_pgoff,
+					vma->vm_pgoff + vma_pages(vma));
 }
 
 /*
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 029/262] mm/smaps: simplify shmem handling of pte holes
  2021-11-05 20:34 incoming Andrew Morton
                   ` (27 preceding siblings ...)
  2021-11-05 20:36 ` [patch 028/262] mm/smaps: use vma->vm_pgoff directly when counting partial swap Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 030/262] mm: debug_vm_pgtable: don't use __P000 directly Andrew Morton
                   ` (232 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: aarcange, akpm, hughd, linux-mm, mm-commits, peterx, torvalds,
	vbabka, willy

From: Peter Xu <peterx@redhat.com>
Subject: mm/smaps: simplify shmem handling of pte holes

Firstly, check_shmem_swap variable is actually not necessary, because it's
always set with pte_hole hook; checking each would work.

Meanwhile, the check within smaps_pte_entry is not easy to follow.  E.g.,
pte_none() check is not needed as "!pte_present && !is_swap_pte" is the
same.  Since at it, use the pte_hole() helper rather than dup the page
cache lookup.

Still keep the CONFIG_SHMEM part so the code can be optimized to nop for !SHMEM.

There will be a very slight functional change in smaps_pte_entry(), that
for !SHMEM we'll return early for pte_none (before checking page==NULL),
but that's even nicer.

Link: https://lkml.kernel.org/r/20210917164756.8586-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/proc/task_mmu.c |   22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

--- a/fs/proc/task_mmu.c~mm-smaps-simplify-shmem-handling-of-pte-holes
+++ a/fs/proc/task_mmu.c
@@ -397,7 +397,6 @@ struct mem_size_stats {
 	u64 pss_shmem;
 	u64 pss_locked;
 	u64 swap_pss;
-	bool check_shmem_swap;
 };
 
 static void smaps_page_accumulate(struct mem_size_stats *mss,
@@ -490,6 +489,16 @@ static int smaps_pte_hole(unsigned long
 #define smaps_pte_hole		NULL
 #endif /* CONFIG_SHMEM */
 
+static void smaps_pte_hole_lookup(unsigned long addr, struct mm_walk *walk)
+{
+#ifdef CONFIG_SHMEM
+	if (walk->ops->pte_hole) {
+		/* depth is not used */
+		smaps_pte_hole(addr, addr + PAGE_SIZE, 0, walk);
+	}
+#endif
+}
+
 static void smaps_pte_entry(pte_t *pte, unsigned long addr,
 		struct mm_walk *walk)
 {
@@ -518,12 +527,8 @@ static void smaps_pte_entry(pte_t *pte,
 			}
 		} else if (is_pfn_swap_entry(swpent))
 			page = pfn_swap_entry_to_page(swpent);
-	} else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap
-							&& pte_none(*pte))) {
-		page = xa_load(&vma->vm_file->f_mapping->i_pages,
-						linear_page_index(vma, addr));
-		if (xa_is_value(page))
-			mss->swap += PAGE_SIZE;
+	} else {
+		smaps_pte_hole_lookup(addr, walk);
 		return;
 	}
 
@@ -737,8 +742,6 @@ static void smap_gather_stats(struct vm_
 		return;
 
 #ifdef CONFIG_SHMEM
-	/* In case of smaps_rollup, reset the value from previous vma */
-	mss->check_shmem_swap = false;
 	if (vma->vm_file && shmem_mapping(vma->vm_file->f_mapping)) {
 		/*
 		 * For shared or readonly shmem mappings we know that all
@@ -756,7 +759,6 @@ static void smap_gather_stats(struct vm_
 					!(vma->vm_flags & VM_WRITE))) {
 			mss->swap += shmem_swapped;
 		} else {
-			mss->check_shmem_swap = true;
 			ops = &smaps_shmem_walk_ops;
 		}
 	}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 030/262] mm: debug_vm_pgtable: don't use __P000 directly
  2021-11-05 20:34 incoming Andrew Morton
                   ` (28 preceding siblings ...)
  2021-11-05 20:36 ` [patch 029/262] mm/smaps: simplify shmem handling of pte holes Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 031/262] kasan: test: bypass __alloc_size checks Andrew Morton
                   ` (231 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, anshuman.khandual, christophe.leroy, gerald.schaefer,
	gshan, guoren, linux-mm, mm-commits, torvalds

From: Guo Ren <guoren@linux.alibaba.com>
Subject: mm: debug_vm_pgtable: don't use __P000 directly

The __Pxxx/__Sxxx macros are only for protection_map[] init.  All usage of
them in linux should come from protection_map array.

Because a lot of architectures would re-initilize protection_map[] array,
eg: x86-mem_encrypt, m68k-motorola, mips, arm, sparc.

Using __P000 is not rigorous.

Link: https://lkml.kernel.org/r/20210924060821.1138281-1-guoren@kernel.org
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/debug_vm_pgtable.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/mm/debug_vm_pgtable.c~mm-debug_vm_pgtable-dont-use-__p000-directly
+++ a/mm/debug_vm_pgtable.c
@@ -1104,13 +1104,14 @@ static int __init init_args(struct pgtab
 	/*
 	 * Initialize the debugging data.
 	 *
-	 * __P000 (or even __S000) will help create page table entries with
-	 * PROT_NONE permission as required for pxx_protnone_tests().
+	 * protection_map[0] (or even protection_map[8]) will help create
+	 * page table entries with PROT_NONE permission as required for
+	 * pxx_protnone_tests().
 	 */
 	memset(args, 0, sizeof(*args));
 	args->vaddr              = get_random_vaddr();
 	args->page_prot          = vm_get_page_prot(VMFLAGS);
-	args->page_prot_none     = __P000;
+	args->page_prot_none     = protection_map[0];
 	args->is_contiguous_page = false;
 	args->pud_pfn            = ULONG_MAX;
 	args->pmd_pfn            = ULONG_MAX;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 031/262] kasan: test: bypass __alloc_size checks
  2021-11-05 20:34 incoming Andrew Morton
                   ` (29 preceding siblings ...)
  2021-11-05 20:36 ` [patch 030/262] mm: debug_vm_pgtable: don't use __P000 directly Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 032/262] rapidio: avoid bogus __alloc_size warning Andrew Morton
                   ` (230 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, andreyknvl, dvyukov, glider, keescook, linux-mm,
	mm-commits, ryabinin.a.a, torvalds

From: Kees Cook <keescook@chromium.org>
Subject: kasan: test: bypass __alloc_size checks

Intentional overflows, as performed by the KASAN tests, are detected at
compile time[1] (instead of only at run-time) with the addition of
__alloc_size.  Fix this by forcing the compiler into not being able to
trust the size used following the kmalloc()s.

[1] https://lore.kernel.org/lkml/20211005184717.65c6d8eb39350395e387b71f@linux-foundation.org

Link: https://lkml.kernel.org/r/20211006181544.1670992-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/test_kasan.c        |    8 +++++++-
 lib/test_kasan_module.c |    2 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

--- a/lib/test_kasan.c~kasan-test-bypass-__alloc_size-checks
+++ a/lib/test_kasan.c
@@ -440,6 +440,7 @@ static void kmalloc_oob_memset_2(struct
 	ptr = kmalloc(size, GFP_KERNEL);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
 
+	OPTIMIZER_HIDE_VAR(size);
 	KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 1, 0, 2));
 	kfree(ptr);
 }
@@ -452,6 +453,7 @@ static void kmalloc_oob_memset_4(struct
 	ptr = kmalloc(size, GFP_KERNEL);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
 
+	OPTIMIZER_HIDE_VAR(size);
 	KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 3, 0, 4));
 	kfree(ptr);
 }
@@ -464,6 +466,7 @@ static void kmalloc_oob_memset_8(struct
 	ptr = kmalloc(size, GFP_KERNEL);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
 
+	OPTIMIZER_HIDE_VAR(size);
 	KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 7, 0, 8));
 	kfree(ptr);
 }
@@ -476,6 +479,7 @@ static void kmalloc_oob_memset_16(struct
 	ptr = kmalloc(size, GFP_KERNEL);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
 
+	OPTIMIZER_HIDE_VAR(size);
 	KUNIT_EXPECT_KASAN_FAIL(test, memset(ptr + size - 15, 0, 16));
 	kfree(ptr);
 }
@@ -488,6 +492,7 @@ static void kmalloc_oob_in_memset(struct
 	ptr = kmalloc(size, GFP_KERNEL);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
 
+	OPTIMIZER_HIDE_VAR(size);
 	KUNIT_EXPECT_KASAN_FAIL(test,
 				memset(ptr, 0, size + KASAN_GRANULE_SIZE));
 	kfree(ptr);
@@ -497,7 +502,7 @@ static void kmalloc_memmove_negative_siz
 {
 	char *ptr;
 	size_t size = 64;
-	volatile size_t invalid_size = -2;
+	size_t invalid_size = -2;
 
 	/*
 	 * Hardware tag-based mode doesn't check memmove for negative size.
@@ -510,6 +515,7 @@ static void kmalloc_memmove_negative_siz
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
 
 	memset((char *)ptr, 0, 64);
+	OPTIMIZER_HIDE_VAR(invalid_size);
 	KUNIT_EXPECT_KASAN_FAIL(test,
 		memmove((char *)ptr, (char *)ptr + 4, invalid_size));
 	kfree(ptr);
--- a/lib/test_kasan_module.c~kasan-test-bypass-__alloc_size-checks
+++ a/lib/test_kasan_module.c
@@ -35,6 +35,8 @@ static noinline void __init copy_user_te
 		return;
 	}
 
+	OPTIMIZER_HIDE_VAR(size);
+
 	pr_info("out-of-bounds in copy_from_user()\n");
 	unused = copy_from_user(kmem, usermem, size + 1);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 032/262] rapidio: avoid bogus __alloc_size warning
  2021-11-05 20:34 incoming Andrew Morton
                   ` (30 preceding siblings ...)
  2021-11-05 20:36 ` [patch 031/262] kasan: test: bypass __alloc_size checks Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 033/262] Compiler Attributes: add __alloc_size() for better bounds checking Andrew Morton
                   ` (229 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, alex.bou9, apw, cl, danielmicay, dennis, dwaipayanray1,
	gustavoars, iamjoonsoo.kim, ira.weiny, jhubbard, jingxiangfeng,
	joe, jrdr.linux, keescook, linux-mm, lkp, lukas.bulwahn,
	mm-commits, mporter, nathan, ndesaulniers, ojeda, penberg,
	rdunlap, rientjes, tj, torvalds, vbabka

From: Kees Cook <keescook@chromium.org>
Subject: rapidio: avoid bogus __alloc_size warning

Patch series "Add __alloc_size()", v3.

GCC and Clang both use the "alloc_size" attribute to assist with bounds
checking around the use of allocation functions.  Add the attribute,
adjust the Makefile to silence needless warnings, and add the hints to the
allocators where possible.  These changes have been in use for a while now
in GrapheneOS.


This patch (of 8):

After adding __alloc_size attributes to the allocators, GCC 9.3 (but not
later) may incorrectly evaluate the arguments to check_copy_size(),
getting seemingly confused by the size being returned from array_size(). 
Instead, perform the calculation once, which both makes the code more
readable and avoids the bug in GCC.

   In file included from arch/x86/include/asm/preempt.h:7,
                    from include/linux/preempt.h:78,
                    from include/linux/spinlock.h:55,
                    from include/linux/mm_types.h:9,
                    from include/linux/buildid.h:5,
                    from include/linux/module.h:14,
                    from drivers/rapidio/devices/rio_mport_cdev.c:13:
   In function 'check_copy_size',
       inlined from 'copy_from_user' at include/linux/uaccess.h:191:6,
       inlined from 'rio_mport_transfer_ioctl' at drivers/rapidio/devices/rio_mport_cdev.c:983:6:
   include/linux/thread_info.h:213:4: error: call to '__bad_copy_to' declared with attribute error: copy destination size is too small
     213 |    __bad_copy_to();
         |    ^~~~~~~~~~~~~~~

But the allocation size and the copy size are identical:

	transfer = vmalloc(array_size(sizeof(*transfer), transaction.count));
	if (!transfer)
		return -ENOMEM;

	if (unlikely(copy_from_user(transfer,
				    (void __user *)(uintptr_t)transaction.block,
				    array_size(sizeof(*transfer), transaction.count)))) {

Link: https://lkml.kernel.org/r/20210930222704.2631604-1-keescook@chromium.org
Link: https://lkml.kernel.org/r/20210930222704.2631604-2-keescook@chromium.org
Link: https://lore.kernel.org/linux-mm/202109091134.FHnRmRxu-lkp@intel.com/
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/rapidio/devices/rio_mport_cdev.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

--- a/drivers/rapidio/devices/rio_mport_cdev.c~rapidio-avoid-bogus-__alloc_size-warning
+++ a/drivers/rapidio/devices/rio_mport_cdev.c
@@ -965,6 +965,7 @@ static int rio_mport_transfer_ioctl(stru
 	struct rio_transfer_io *transfer;
 	enum dma_data_direction dir;
 	int i, ret = 0;
+	size_t size;
 
 	if (unlikely(copy_from_user(&transaction, arg, sizeof(transaction))))
 		return -EFAULT;
@@ -976,13 +977,14 @@ static int rio_mport_transfer_ioctl(stru
 	     priv->md->properties.transfer_mode) == 0)
 		return -ENODEV;
 
-	transfer = vmalloc(array_size(sizeof(*transfer), transaction.count));
+	size = array_size(sizeof(*transfer), transaction.count);
+	transfer = vmalloc(size);
 	if (!transfer)
 		return -ENOMEM;
 
 	if (unlikely(copy_from_user(transfer,
 				    (void __user *)(uintptr_t)transaction.block,
-				    array_size(sizeof(*transfer), transaction.count)))) {
+				    size))) {
 		ret = -EFAULT;
 		goto out_free;
 	}
@@ -994,8 +996,7 @@ static int rio_mport_transfer_ioctl(stru
 			transaction.sync, dir, &transfer[i]);
 
 	if (unlikely(copy_to_user((void __user *)(uintptr_t)transaction.block,
-				  transfer,
-				  array_size(sizeof(*transfer), transaction.count))))
+				  transfer, size)))
 		ret = -EFAULT;
 
 out_free:
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 033/262] Compiler Attributes: add __alloc_size() for better bounds checking
  2021-11-05 20:34 incoming Andrew Morton
                   ` (31 preceding siblings ...)
  2021-11-05 20:36 ` [patch 032/262] rapidio: avoid bogus __alloc_size warning Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 034/262] slab: clean up function prototypes Andrew Morton
                   ` (228 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, alex.bou9, apw, cl, danielmicay, dennis, dwaipayanray1,
	gustavoars, iamjoonsoo.kim, ira.weiny, jhubbard, jingxiangfeng,
	joe, jrdr.linux, keescook, linux-mm, lkp, lukas.bulwahn,
	mm-commits, mporter, nathan, ndesaulniers, ojeda, penberg,
	rdunlap, rientjes, tj, torvalds, vbabka

From: Kees Cook <keescook@chromium.org>
Subject: Compiler Attributes: add __alloc_size() for better bounds checking

GCC and Clang can use the "alloc_size" attribute to better inform the
results of __builtin_object_size() (for compile-time constant values). 
Clang can additionally use alloc_size to inform the results of
__builtin_dynamic_object_size() (for run-time values).

Because GCC sees the frequent use of struct_size() as an allocator size
argument, and notices it can return SIZE_MAX (the overflow indication), it
complains about these call sites overflowing (since SIZE_MAX is greater
than the default -Walloc-size-larger-than=PTRDIFF_MAX).  This isn't
helpful since we already know a SIZE_MAX will be caught at run-time (this
was an intentional design).  To deal with this, we must disable this check
as it is both a false positive and redundant.  (Clang does not have this
warning option.)

Unfortunately, just checking the -Wno-alloc-size-larger-than is not
sufficient to make the __alloc_size attribute behave correctly under older
GCC versions.  The attribute itself must be disabled in those situations
too, as there appears to be no way to reliably silence the SIZE_MAX
constant expression cases for GCC versions less than 9.1:

In file included from ./include/linux/resource_ext.h:11,
                 from ./include/linux/pci.h:40,
                 from drivers/net/ethernet/intel/ixgbe/ixgbe.h:9,
                 from drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c:4:
In function 'kmalloc_node',
    inlined from 'ixgbe_alloc_q_vector' at ./include/linux/slab.h:743:9:
./include/linux/slab.h:618:9: error: argument 1 value '18446744073709551615' exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=]
  return __kmalloc_node(size, flags, node);
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/slab.h: In function 'ixgbe_alloc_q_vector':
./include/linux/slab.h:455:7: note: in a call to allocation function '__kmalloc_node' declared here
 void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_slab_alignment __malloc;
       ^~~~~~~~~~~~~~

Specifically:
-Wno-alloc-size-larger-than is not correctly handled by GCC < 9.1
  https://godbolt.org/z/hqsfG7q84 (doesn't disable)
  https://godbolt.org/z/P9jdrPTYh (doesn't admit to not knowing about option)
  https://godbolt.org/z/465TPMWKb (only warns when other warnings appear)

-Walloc-size-larger-than=18446744073709551615 is not handled by GCC < 8.2
  https://godbolt.org/z/73hh1EPxz (ignores numeric value)

Since anything marked with __alloc_size would also qualify for marking
with __malloc, just include __malloc along with it to avoid redundant
markings. (Suggested by Linus Torvalds.)

Finally, make sure checkpatch.pl doesn't get confused about finding the
__alloc_size attribute on functions. (Thanks to Joe Perches.)

Link: https://lkml.kernel.org/r/20210930222704.2631604-3-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Makefile                            |   15 +++++++++++++++
 include/linux/compiler-gcc.h        |    8 ++++++++
 include/linux/compiler_attributes.h |   10 ++++++++++
 include/linux/compiler_types.h      |   12 ++++++++++++
 scripts/checkpatch.pl               |    3 ++-
 5 files changed, 47 insertions(+), 1 deletion(-)

--- a/include/linux/compiler_attributes.h~compiler-attributes-add-__alloc_size-for-better-bounds-checking
+++ a/include/linux/compiler_attributes.h
@@ -34,6 +34,15 @@
 #define __aligned_largest               __attribute__((__aligned__))
 
 /*
+ * Note: do not use this directly. Instead, use __alloc_size() since it is conditionally
+ * available and includes other attributes.
+ *
+ *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
+ */
+#define __alloc_size__(x, ...)		__attribute__((__alloc_size__(x, ## __VA_ARGS__)))
+
+/*
  * Note: users of __always_inline currently do not write "inline" themselves,
  * which seems to be required by gcc to apply the attribute according
  * to its docs (and also "warning: always_inline function might not be
@@ -153,6 +162,7 @@
 
 /*
  *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#malloc
  */
 #define __malloc                        __attribute__((__malloc__))
 
--- a/include/linux/compiler-gcc.h~compiler-attributes-add-__alloc_size-for-better-bounds-checking
+++ a/include/linux/compiler-gcc.h
@@ -144,3 +144,11 @@
 #else
 #define __diag_GCC_8(s)
 #endif
+
+/*
+ * Prior to 9.1, -Wno-alloc-size-larger-than (and therefore the "alloc_size"
+ * attribute) do not work, and must be disabled.
+ */
+#if GCC_VERSION < 90100
+#undef __alloc_size__
+#endif
--- a/include/linux/compiler_types.h~compiler-attributes-add-__alloc_size-for-better-bounds-checking
+++ a/include/linux/compiler_types.h
@@ -250,6 +250,18 @@ struct ftrace_likely_data {
 # define __cficanonical
 #endif
 
+/*
+ * Any place that could be marked with the "alloc_size" attribute is also
+ * a place to be marked with the "malloc" attribute. Do this as part of the
+ * __alloc_size macro to avoid redundant attributes and to avoid missing a
+ * __malloc marking.
+ */
+#ifdef __alloc_size__
+# define __alloc_size(x, ...)	__alloc_size__(x, ## __VA_ARGS__) __malloc
+#else
+# define __alloc_size(x, ...)	__malloc
+#endif
+
 #ifndef asm_volatile_goto
 #define asm_volatile_goto(x...) asm goto(x)
 #endif
--- a/Makefile~compiler-attributes-add-__alloc_size-for-better-bounds-checking
+++ a/Makefile
@@ -1008,6 +1008,21 @@ ifdef CONFIG_CC_IS_GCC
 KBUILD_CFLAGS += -Wno-maybe-uninitialized
 endif
 
+ifdef CONFIG_CC_IS_GCC
+# The allocators already balk at large sizes, so silence the compiler
+# warnings for bounds checks involving those possible values. While
+# -Wno-alloc-size-larger-than would normally be used here, earlier versions
+# of gcc (<9.1) weirdly don't handle the option correctly when _other_
+# warnings are produced (?!). Using -Walloc-size-larger-than=SIZE_MAX
+# doesn't work (as it is documented to), silently resolving to "0" prior to
+# version 9.1 (and producing an error more recently). Numeric values larger
+# than PTRDIFF_MAX also don't work prior to version 9.1, which are silently
+# ignored, continuing to default to PTRDIFF_MAX. So, left with no other
+# choice, we must perform a versioned check to disable this warning.
+# https://lore.kernel.org/lkml/20210824115859.187f272f@canb.auug.org.au
+KBUILD_CFLAGS += $(call cc-ifversion, -ge, 0901, -Wno-alloc-size-larger-than)
+endif
+
 # disable invalid "can't wrap" optimizations for signed / pointers
 KBUILD_CFLAGS	+= -fno-strict-overflow
 
--- a/scripts/checkpatch.pl~compiler-attributes-add-__alloc_size-for-better-bounds-checking
+++ a/scripts/checkpatch.pl
@@ -489,7 +489,8 @@ our $Attribute	= qr{
 			____cacheline_aligned|
 			____cacheline_aligned_in_smp|
 			____cacheline_internodealigned_in_smp|
-			__weak
+			__weak|
+			__alloc_size\s*\(\s*\d+\s*(?:,\s*\d+\s*)?\)
 		  }x;
 our $Modifier;
 our $Inline	= qr{inline|__always_inline|noinline|__inline|__inline__};
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 034/262] slab: clean up function prototypes
  2021-11-05 20:34 incoming Andrew Morton
                   ` (32 preceding siblings ...)
  2021-11-05 20:36 ` [patch 033/262] Compiler Attributes: add __alloc_size() for better bounds checking Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 035/262] slab: add __alloc_size attributes for better bounds checking Andrew Morton
                   ` (227 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, alex.bou9, apw, cl, danielmicay, dennis, dwaipayanray1,
	gustavoars, iamjoonsoo.kim, ira.weiny, jhubbard, jingxiangfeng,
	joe, jrdr.linux, keescook, linux-mm, lkp, lukas.bulwahn,
	mm-commits, mporter, nathan, ndesaulniers, ojeda, penberg,
	rdunlap, rientjes, tj, torvalds, vbabka

From: Kees Cook <keescook@chromium.org>
Subject: slab: clean up function prototypes

Based on feedback from Joe Perches and Linus Torvalds, regularize the slab
function prototypes before making attribute changes.

Link: https://lkml.kernel.org/r/20210930222704.2631604-4-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/slab.h |   68 ++++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 34 deletions(-)

--- a/include/linux/slab.h~slab-clean-up-function-prototypes
+++ a/include/linux/slab.h
@@ -152,8 +152,8 @@ struct kmem_cache *kmem_cache_create_use
 			slab_flags_t flags,
 			unsigned int useroffset, unsigned int usersize,
 			void (*ctor)(void *));
-void kmem_cache_destroy(struct kmem_cache *);
-int kmem_cache_shrink(struct kmem_cache *);
+void kmem_cache_destroy(struct kmem_cache *s);
+int kmem_cache_shrink(struct kmem_cache *s);
 
 /*
  * Please use this macro to create slab caches. Simply specify the
@@ -181,11 +181,11 @@ int kmem_cache_shrink(struct kmem_cache
 /*
  * Common kmalloc functions provided by all allocators
  */
-void * __must_check krealloc(const void *, size_t, gfp_t);
-void kfree(const void *);
-void kfree_sensitive(const void *);
-size_t __ksize(const void *);
-size_t ksize(const void *);
+void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags);
+void kfree(const void *objp);
+void kfree_sensitive(const void *objp);
+size_t __ksize(const void *objp);
+size_t ksize(const void *objp);
 #ifdef CONFIG_PRINTK
 bool kmem_valid_obj(void *object);
 void kmem_dump_obj(void *object);
@@ -426,8 +426,8 @@ static __always_inline unsigned int __km
 #endif /* !CONFIG_SLOB */
 
 void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc;
-void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) __assume_slab_alignment __malloc;
-void kmem_cache_free(struct kmem_cache *, void *);
+void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) __assume_slab_alignment __malloc;
+void kmem_cache_free(struct kmem_cache *s, void *objp);
 
 /*
  * Bulk allocation and freeing operations. These are accelerated in an
@@ -436,8 +436,8 @@ void kmem_cache_free(struct kmem_cache *
  *
  * Note that interrupts must be enabled when calling these functions.
  */
-void kmem_cache_free_bulk(struct kmem_cache *, size_t, void **);
-int kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
+void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p);
+int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, void **p);
 
 /*
  * Caller must not use kfree_bulk() on memory not originally allocated
@@ -450,7 +450,8 @@ static __always_inline void kfree_bulk(s
 
 #ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment __malloc;
-void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment __malloc;
+void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node) __assume_slab_alignment
+									 __malloc;
 #else
 static __always_inline void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
@@ -464,25 +465,24 @@ static __always_inline void *kmem_cache_
 #endif
 
 #ifdef CONFIG_TRACING
-extern void *kmem_cache_alloc_trace(struct kmem_cache *, gfp_t, size_t) __assume_slab_alignment __malloc;
+extern void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
+				   __assume_slab_alignment __malloc;
 
 #ifdef CONFIG_NUMA
-extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
-					   gfp_t gfpflags,
-					   int node, size_t size) __assume_slab_alignment __malloc;
+extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
+					 int node, size_t size) __assume_slab_alignment __malloc;
 #else
-static __always_inline void *
-kmem_cache_alloc_node_trace(struct kmem_cache *s,
-			      gfp_t gfpflags,
-			      int node, size_t size)
+static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
+							 gfp_t gfpflags, int node,
+							 size_t size)
 {
 	return kmem_cache_alloc_trace(s, gfpflags, size);
 }
 #endif /* CONFIG_NUMA */
 
 #else /* CONFIG_TRACING */
-static __always_inline void *kmem_cache_alloc_trace(struct kmem_cache *s,
-		gfp_t flags, size_t size)
+static __always_inline void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags,
+						    size_t size)
 {
 	void *ret = kmem_cache_alloc(s, flags);
 
@@ -490,10 +490,8 @@ static __always_inline void *kmem_cache_
 	return ret;
 }
 
-static __always_inline void *
-kmem_cache_alloc_node_trace(struct kmem_cache *s,
-			      gfp_t gfpflags,
-			      int node, size_t size)
+static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
+							 int node, size_t size)
 {
 	void *ret = kmem_cache_alloc_node(s, gfpflags, node);
 
@@ -502,13 +500,14 @@ kmem_cache_alloc_node_trace(struct kmem_
 }
 #endif /* CONFIG_TRACING */
 
-extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __malloc;
+extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment
+									 __malloc;
 
 #ifdef CONFIG_TRACING
-extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __malloc;
+extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
+				__assume_page_alignment __malloc;
 #else
-static __always_inline void *
-kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
+static __always_inline void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
 {
 	return kmalloc_order(size, flags, order);
 }
@@ -638,8 +637,8 @@ static inline void *kmalloc_array(size_t
  * @new_size: new size of a single member of the array
  * @flags: the type of memory to allocate (see kmalloc)
  */
-static __must_check inline void *
-krealloc_array(void *p, size_t new_n, size_t new_size, gfp_t flags)
+static inline void * __must_check krealloc_array(void *p, size_t new_n, size_t new_size,
+						 gfp_t flags)
 {
 	size_t bytes;
 
@@ -668,7 +667,7 @@ static inline void *kcalloc(size_t n, si
  * allocator where we care about the real place the memory allocation
  * request comes from.
  */
-extern void *__kmalloc_track_caller(size_t, gfp_t, unsigned long);
+extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller);
 #define kmalloc_track_caller(size, flags) \
 	__kmalloc_track_caller(size, flags, _RET_IP_)
 
@@ -691,7 +690,8 @@ static inline void *kcalloc_node(size_t
 
 
 #ifdef CONFIG_NUMA
-extern void *__kmalloc_node_track_caller(size_t, gfp_t, int, unsigned long);
+extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
+					 unsigned long caller);
 #define kmalloc_node_track_caller(size, flags, node) \
 	__kmalloc_node_track_caller(size, flags, node, \
 			_RET_IP_)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 035/262] slab: add __alloc_size attributes for better bounds checking
  2021-11-05 20:34 incoming Andrew Morton
                   ` (33 preceding siblings ...)
  2021-11-05 20:36 ` [patch 034/262] slab: clean up function prototypes Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 036/262] mm/kvmalloc: " Andrew Morton
                   ` (226 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, alex.bou9, apw, cl, danielmicay, dennis, dwaipayanray1,
	gustavoars, iamjoonsoo.kim, ira.weiny, jhubbard, jingxiangfeng,
	joe, jrdr.linux, keescook, linux-mm, lkp, lukas.bulwahn,
	mm-commits, mporter, nathan, ndesaulniers, ojeda, penberg,
	rdunlap, rientjes, tj, torvalds, vbabka

From: Kees Cook <keescook@chromium.org>
Subject: slab: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for regular
kmalloc interfaces, to provide additional hinting for better bounds
checking, assisting CONFIG_FORTIFY_SOURCE and other compiler
optimizations.

Link: https://lkml.kernel.org/r/20210930222704.2631604-5-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/slab.h |   61 ++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 28 deletions(-)

--- a/include/linux/slab.h~slab-add-__alloc_size-attributes-for-better-bounds-checking
+++ a/include/linux/slab.h
@@ -181,7 +181,7 @@ int kmem_cache_shrink(struct kmem_cache
 /*
  * Common kmalloc functions provided by all allocators
  */
-void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags);
+void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2);
 void kfree(const void *objp);
 void kfree_sensitive(const void *objp);
 size_t __ksize(const void *objp);
@@ -425,7 +425,7 @@ static __always_inline unsigned int __km
 #define kmalloc_index(s) __kmalloc_index(s, true)
 #endif /* !CONFIG_SLOB */
 
-void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc;
+void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __alloc_size(1);
 void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) __assume_slab_alignment __malloc;
 void kmem_cache_free(struct kmem_cache *s, void *objp);
 
@@ -449,11 +449,12 @@ static __always_inline void kfree_bulk(s
 }
 
 #ifdef CONFIG_NUMA
-void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment __malloc;
+void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment
+							 __alloc_size(1);
 void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node) __assume_slab_alignment
 									 __malloc;
 #else
-static __always_inline void *__kmalloc_node(size_t size, gfp_t flags, int node)
+static __always_inline __alloc_size(1) void *__kmalloc_node(size_t size, gfp_t flags, int node)
 {
 	return __kmalloc(size, flags);
 }
@@ -466,23 +467,23 @@ static __always_inline void *kmem_cache_
 
 #ifdef CONFIG_TRACING
 extern void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
-				   __assume_slab_alignment __malloc;
+				   __assume_slab_alignment __alloc_size(3);
 
 #ifdef CONFIG_NUMA
 extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
-					 int node, size_t size) __assume_slab_alignment __malloc;
+					 int node, size_t size) __assume_slab_alignment
+								__alloc_size(4);
 #else
-static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
-							 gfp_t gfpflags, int node,
-							 size_t size)
+static __always_inline __alloc_size(4) void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
+						 gfp_t gfpflags, int node, size_t size)
 {
 	return kmem_cache_alloc_trace(s, gfpflags, size);
 }
 #endif /* CONFIG_NUMA */
 
 #else /* CONFIG_TRACING */
-static __always_inline void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags,
-						    size_t size)
+static __always_inline __alloc_size(3) void *kmem_cache_alloc_trace(struct kmem_cache *s,
+								    gfp_t flags, size_t size)
 {
 	void *ret = kmem_cache_alloc(s, flags);
 
@@ -501,19 +502,20 @@ static __always_inline void *kmem_cache_
 #endif /* CONFIG_TRACING */
 
 extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment
-									 __malloc;
+									 __alloc_size(1);
 
 #ifdef CONFIG_TRACING
 extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
-				__assume_page_alignment __malloc;
+				__assume_page_alignment __alloc_size(1);
 #else
-static __always_inline void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
+static __always_inline __alloc_size(1) void *kmalloc_order_trace(size_t size, gfp_t flags,
+								 unsigned int order)
 {
 	return kmalloc_order(size, flags, order);
 }
 #endif
 
-static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
+static __always_inline __alloc_size(1) void *kmalloc_large(size_t size, gfp_t flags)
 {
 	unsigned int order = get_order(size);
 	return kmalloc_order_trace(size, flags, order);
@@ -573,7 +575,7 @@ static __always_inline void *kmalloc_lar
  *	Try really hard to succeed the allocation but fail
  *	eventually.
  */
-static __always_inline void *kmalloc(size_t size, gfp_t flags)
+static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags)
 {
 	if (__builtin_constant_p(size)) {
 #ifndef CONFIG_SLOB
@@ -595,7 +597,7 @@ static __always_inline void *kmalloc(siz
 	return __kmalloc(size, flags);
 }
 
-static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
+static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node)
 {
 #ifndef CONFIG_SLOB
 	if (__builtin_constant_p(size) &&
@@ -619,7 +621,7 @@ static __always_inline void *kmalloc_nod
  * @size: element size.
  * @flags: the type of memory to allocate (see kmalloc).
  */
-static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags)
+static inline __alloc_size(1, 2) void *kmalloc_array(size_t n, size_t size, gfp_t flags)
 {
 	size_t bytes;
 
@@ -637,8 +639,10 @@ static inline void *kmalloc_array(size_t
  * @new_size: new size of a single member of the array
  * @flags: the type of memory to allocate (see kmalloc)
  */
-static inline void * __must_check krealloc_array(void *p, size_t new_n, size_t new_size,
-						 gfp_t flags)
+static inline __alloc_size(2, 3) void * __must_check krealloc_array(void *p,
+								    size_t new_n,
+								    size_t new_size,
+								    gfp_t flags)
 {
 	size_t bytes;
 
@@ -654,7 +658,7 @@ static inline void * __must_check kreall
  * @size: element size.
  * @flags: the type of memory to allocate (see kmalloc).
  */
-static inline void *kcalloc(size_t n, size_t size, gfp_t flags)
+static inline __alloc_size(1, 2) void *kcalloc(size_t n, size_t size, gfp_t flags)
 {
 	return kmalloc_array(n, size, flags | __GFP_ZERO);
 }
@@ -667,12 +671,13 @@ static inline void *kcalloc(size_t n, si
  * allocator where we care about the real place the memory allocation
  * request comes from.
  */
-extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller);
+extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller)
+				   __alloc_size(1);
 #define kmalloc_track_caller(size, flags) \
 	__kmalloc_track_caller(size, flags, _RET_IP_)
 
-static inline void *kmalloc_array_node(size_t n, size_t size, gfp_t flags,
-				       int node)
+static inline __alloc_size(1, 2) void *kmalloc_array_node(size_t n, size_t size, gfp_t flags,
+							  int node)
 {
 	size_t bytes;
 
@@ -683,7 +688,7 @@ static inline void *kmalloc_array_node(s
 	return __kmalloc_node(bytes, flags, node);
 }
 
-static inline void *kcalloc_node(size_t n, size_t size, gfp_t flags, int node)
+static inline __alloc_size(1, 2) void *kcalloc_node(size_t n, size_t size, gfp_t flags, int node)
 {
 	return kmalloc_array_node(n, size, flags | __GFP_ZERO, node);
 }
@@ -691,7 +696,7 @@ static inline void *kcalloc_node(size_t
 
 #ifdef CONFIG_NUMA
 extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
-					 unsigned long caller);
+					 unsigned long caller) __alloc_size(1);
 #define kmalloc_node_track_caller(size, flags, node) \
 	__kmalloc_node_track_caller(size, flags, node, \
 			_RET_IP_)
@@ -716,7 +721,7 @@ static inline void *kmem_cache_zalloc(st
  * @size: how many bytes of memory are required.
  * @flags: the type of memory to allocate (see kmalloc).
  */
-static inline void *kzalloc(size_t size, gfp_t flags)
+static inline __alloc_size(1) void *kzalloc(size_t size, gfp_t flags)
 {
 	return kmalloc(size, flags | __GFP_ZERO);
 }
@@ -727,7 +732,7 @@ static inline void *kzalloc(size_t size,
  * @flags: the type of memory to allocate (see kmalloc).
  * @node: memory node from which to allocate
  */
-static inline void *kzalloc_node(size_t size, gfp_t flags, int node)
+static inline __alloc_size(1) void *kzalloc_node(size_t size, gfp_t flags, int node)
 {
 	return kmalloc_node(size, flags | __GFP_ZERO, node);
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 036/262] mm/kvmalloc: add __alloc_size attributes for better bounds checking
  2021-11-05 20:34 incoming Andrew Morton
                   ` (34 preceding siblings ...)
  2021-11-05 20:36 ` [patch 035/262] slab: add __alloc_size attributes for better bounds checking Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 037/262] mm/vmalloc: " Andrew Morton
                   ` (225 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, alex.bou9, apw, cl, danielmicay, dennis, dwaipayanray1,
	gustavoars, iamjoonsoo.kim, ira.weiny, jhubbard, jingxiangfeng,
	joe, jrdr.linux, keescook, linux-mm, lkp, lukas.bulwahn,
	mm-commits, mporter, nathan, ndesaulniers, ojeda, penberg,
	rdunlap, rientjes, tj, torvalds, vbabka

From: Kees Cook <keescook@chromium.org>
Subject: mm/kvmalloc: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for regular
kvmalloc interfaces, to provide additional hinting for better bounds
checking, assisting CONFIG_FORTIFY_SOURCE and other compiler
optimizations.

Link: https://lkml.kernel.org/r/20210930222704.2631604-6-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/slab.h |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

--- a/include/linux/slab.h~mm-kvmalloc-add-__alloc_size-attributes-for-better-bounds-checking
+++ a/include/linux/slab.h
@@ -737,21 +737,21 @@ static inline __alloc_size(1) void *kzal
 	return kmalloc_node(size, flags | __GFP_ZERO, node);
 }
 
-extern void *kvmalloc_node(size_t size, gfp_t flags, int node);
-static inline void *kvmalloc(size_t size, gfp_t flags)
+extern void *kvmalloc_node(size_t size, gfp_t flags, int node) __alloc_size(1);
+static inline __alloc_size(1) void *kvmalloc(size_t size, gfp_t flags)
 {
 	return kvmalloc_node(size, flags, NUMA_NO_NODE);
 }
-static inline void *kvzalloc_node(size_t size, gfp_t flags, int node)
+static inline __alloc_size(1) void *kvzalloc_node(size_t size, gfp_t flags, int node)
 {
 	return kvmalloc_node(size, flags | __GFP_ZERO, node);
 }
-static inline void *kvzalloc(size_t size, gfp_t flags)
+static inline __alloc_size(1) void *kvzalloc(size_t size, gfp_t flags)
 {
 	return kvmalloc(size, flags | __GFP_ZERO);
 }
 
-static inline void *kvmalloc_array(size_t n, size_t size, gfp_t flags)
+static inline __alloc_size(1, 2) void *kvmalloc_array(size_t n, size_t size, gfp_t flags)
 {
 	size_t bytes;
 
@@ -761,13 +761,13 @@ static inline void *kvmalloc_array(size_
 	return kvmalloc(bytes, flags);
 }
 
-static inline void *kvcalloc(size_t n, size_t size, gfp_t flags)
+static inline __alloc_size(1, 2) void *kvcalloc(size_t n, size_t size, gfp_t flags)
 {
 	return kvmalloc_array(n, size, flags | __GFP_ZERO);
 }
 
-extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize,
-		gfp_t flags);
+extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize, gfp_t flags)
+		      __alloc_size(3);
 extern void kvfree(const void *addr);
 extern void kvfree_sensitive(const void *addr, size_t len);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 037/262] mm/vmalloc: add __alloc_size attributes for better bounds checking
  2021-11-05 20:34 incoming Andrew Morton
                   ` (35 preceding siblings ...)
  2021-11-05 20:36 ` [patch 036/262] mm/kvmalloc: " Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 038/262] mm/page_alloc: " Andrew Morton
                   ` (224 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, alex.bou9, apw, cl, danielmicay, dennis, dwaipayanray1,
	gustavoars, iamjoonsoo.kim, ira.weiny, jhubbard, jingxiangfeng,
	joe, jrdr.linux, keescook, linux-mm, lkp, lukas.bulwahn,
	mm-commits, mporter, nathan, ndesaulniers, ojeda, penberg,
	rdunlap, rientjes, tj, torvalds, vbabka

From: Kees Cook <keescook@chromium.org>
Subject: mm/vmalloc: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for
appropriate vmalloc allocator interfaces, to provide additional hinting
for better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other
compiler optimizations.

Link: https://lkml.kernel.org/r/20210930222704.2631604-7-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/vmalloc.h |   22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

--- a/include/linux/vmalloc.h~mm-vmalloc-add-__alloc_size-attributes-for-better-bounds-checking
+++ a/include/linux/vmalloc.h
@@ -136,21 +136,21 @@ static inline void vmalloc_init(void)
 static inline unsigned long vmalloc_nr_pages(void) { return 0; }
 #endif
 
-extern void *vmalloc(unsigned long size);
-extern void *vzalloc(unsigned long size);
-extern void *vmalloc_user(unsigned long size);
-extern void *vmalloc_node(unsigned long size, int node);
-extern void *vzalloc_node(unsigned long size, int node);
-extern void *vmalloc_32(unsigned long size);
-extern void *vmalloc_32_user(unsigned long size);
-extern void *__vmalloc(unsigned long size, gfp_t gfp_mask);
+extern void *vmalloc(unsigned long size) __alloc_size(1);
+extern void *vzalloc(unsigned long size) __alloc_size(1);
+extern void *vmalloc_user(unsigned long size) __alloc_size(1);
+extern void *vmalloc_node(unsigned long size, int node) __alloc_size(1);
+extern void *vzalloc_node(unsigned long size, int node) __alloc_size(1);
+extern void *vmalloc_32(unsigned long size) __alloc_size(1);
+extern void *vmalloc_32_user(unsigned long size) __alloc_size(1);
+extern void *__vmalloc(unsigned long size, gfp_t gfp_mask) __alloc_size(1);
 extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			unsigned long start, unsigned long end, gfp_t gfp_mask,
 			pgprot_t prot, unsigned long vm_flags, int node,
-			const void *caller);
+			const void *caller) __alloc_size(1);
 void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask,
-		int node, const void *caller);
-void *vmalloc_no_huge(unsigned long size);
+		int node, const void *caller) __alloc_size(1);
+void *vmalloc_no_huge(unsigned long size) __alloc_size(1);
 
 extern void vfree(const void *addr);
 extern void vfree_atomic(const void *addr);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 038/262] mm/page_alloc: add __alloc_size attributes for better bounds checking
  2021-11-05 20:34 incoming Andrew Morton
                   ` (36 preceding siblings ...)
  2021-11-05 20:36 ` [patch 037/262] mm/vmalloc: " Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 039/262] percpu: " Andrew Morton
                   ` (223 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, alex.bou9, apw, cl, danielmicay, dennis, dwaipayanray1,
	gustavoars, iamjoonsoo.kim, ira.weiny, jhubbard, jingxiangfeng,
	joe, jrdr.linux, keescook, linux-mm, lkp, lukas.bulwahn,
	mm-commits, mporter, nathan, ndesaulniers, ojeda, penberg,
	rdunlap, rientjes, tj, torvalds, vbabka

From: Kees Cook <keescook@chromium.org>
Subject: mm/page_alloc: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for
appropriate page allocator interfaces, to provide additional hinting for
better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other compiler
optimizations.

Link: https://lkml.kernel.org/r/20210930222704.2631604-8-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/gfp.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/gfp.h~mm-page_alloc-add-__alloc_size-attributes-for-better-bounds-checking
+++ a/include/linux/gfp.h
@@ -608,9 +608,9 @@ static inline struct page *alloc_pages(g
 extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
 extern unsigned long get_zeroed_page(gfp_t gfp_mask);
 
-void *alloc_pages_exact(size_t size, gfp_t gfp_mask);
+void *alloc_pages_exact(size_t size, gfp_t gfp_mask) __alloc_size(1);
 void free_pages_exact(void *virt, size_t size);
-void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
+__meminit void *alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask) __alloc_size(1);
 
 #define __get_free_page(gfp_mask) \
 		__get_free_pages((gfp_mask), 0)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 039/262] percpu: add __alloc_size attributes for better bounds checking
  2021-11-05 20:34 incoming Andrew Morton
                   ` (37 preceding siblings ...)
  2021-11-05 20:36 ` [patch 038/262] mm/page_alloc: " Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 040/262] mm/page_ext.c: fix a comment Andrew Morton
                   ` (222 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, alex.bou9, apw, cl, danielmicay, dennis, dwaipayanray1,
	gustavoars, iamjoonsoo.kim, ira.weiny, jhubbard, jingxiangfeng,
	joe, jrdr.linux, keescook, linux-mm, lkp, lukas.bulwahn,
	mm-commits, mporter, nathan, ndesaulniers, ojeda, penberg,
	rdunlap, rientjes, tj, torvalds, vbabka

From: Kees Cook <keescook@chromium.org>
Subject: percpu: add __alloc_size attributes for better bounds checking

As already done in GrapheneOS, add the __alloc_size attribute for
appropriate percpu allocator interfaces, to provide additional hinting for
better bounds checking, assisting CONFIG_FORTIFY_SOURCE and other compiler
optimizations.

Note that due to the implementation of the percpu API, this is unlikely to
ever actually provide compile-time checking beyond very simple non-SMP
builds.  But, since they are technically allocators, mark them as such.

Link: https://lkml.kernel.org/r/20210930222704.2631604-9-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Acked-by: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/percpu.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/include/linux/percpu.h~percpu-add-__alloc_size-attributes-for-better-bounds-checking
+++ a/include/linux/percpu.h
@@ -123,7 +123,7 @@ extern int __init pcpu_page_first_chunk(
 				pcpu_fc_populate_pte_fn_t populate_pte_fn);
 #endif
 
-extern void __percpu *__alloc_reserved_percpu(size_t size, size_t align);
+extern void __percpu *__alloc_reserved_percpu(size_t size, size_t align) __alloc_size(1);
 extern bool __is_kernel_percpu_address(unsigned long addr, unsigned long *can_addr);
 extern bool is_kernel_percpu_address(unsigned long addr);
 
@@ -131,8 +131,8 @@ extern bool is_kernel_percpu_address(uns
 extern void __init setup_per_cpu_areas(void);
 #endif
 
-extern void __percpu *__alloc_percpu_gfp(size_t size, size_t align, gfp_t gfp);
-extern void __percpu *__alloc_percpu(size_t size, size_t align);
+extern void __percpu *__alloc_percpu_gfp(size_t size, size_t align, gfp_t gfp) __alloc_size(1);
+extern void __percpu *__alloc_percpu(size_t size, size_t align) __alloc_size(1);
 extern void free_percpu(void __percpu *__pdata);
 extern phys_addr_t per_cpu_ptr_to_phys(void *addr);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 040/262] mm/page_ext.c: fix a comment
  2021-11-05 20:34 incoming Andrew Morton
                   ` (38 preceding siblings ...)
  2021-11-05 20:36 ` [patch 039/262] percpu: " Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 041/262] mm: stop filemap_read() from grabbing a superfluous page Andrew Morton
                   ` (221 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, vbabka, zhangyinan2019

From: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Subject: mm/page_ext.c: fix a comment

I have noticed that the previous macro is #ifndef CONFIG_SPARSEMEM.  I
think the comment of #else should be CONFIG_SPARSEMEM.

Link: https://lkml.kernel.org/r/20211008140312.6492-1-zhangyinan2019@email.szu.edu.cn
Signed-off-by: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_ext.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_ext.c~mm-fix-a-comment
+++ a/mm/page_ext.c
@@ -201,7 +201,7 @@ fail:
 	panic("Out of memory");
 }
 
-#else /* CONFIG_FLATMEM */
+#else /* CONFIG_SPARSEMEM */
 
 struct page_ext *lookup_page_ext(const struct page *page)
 {
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 041/262] mm: stop filemap_read() from grabbing a superfluous page
  2021-11-05 20:34 incoming Andrew Morton
                   ` (39 preceding siblings ...)
  2021-11-05 20:36 ` [patch 040/262] mm/page_ext.c: fix a comment Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 042/262] mm: export bdi_unregister Andrew Morton
                   ` (220 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, dhowells, jlayton, kent.overstreet, linux-mm, mm-commits,
	torvalds, willy

From: David Howells <dhowells@redhat.com>
Subject: mm: stop filemap_read() from grabbing a superfluous page

Under some circumstances, filemap_read() will allocate sufficient pages to
read to the end of the file, call readahead/readpages on them and copy the
data over - and then it will allocate another page at the EOF and call
readpage on that and then ignore it.  This is unnecessary and a waste of
time and resources.

filemap_read() *does* check for this, but only after it has already done
the allocation and I/O.  Fix this by checking before calling
filemap_get_pages() also.

Link: https://lkml.kernel.org/r/163472463105.3126792.7056099385135786492.stgit@warthog.procyon.org.uk
Link: https://lore.kernel.org/r/160588481358.3465195.16552616179674485179.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/163456863216.2614702.6384850026368833133.stgit@warthog.procyon.org.uk/
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/filemap.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/mm/filemap.c~mm-stop-filemap_read-from-grabbing-a-superfluous-page
+++ a/mm/filemap.c
@@ -2625,6 +2625,9 @@ ssize_t filemap_read(struct kiocb *iocb,
 		if ((iocb->ki_flags & IOCB_WAITQ) && already_read)
 			iocb->ki_flags |= IOCB_NOWAIT;
 
+		if (unlikely(iocb->ki_pos >= i_size_read(inode)))
+			break;
+
 		error = filemap_get_pages(iocb, iter, &pvec);
 		if (error < 0)
 			break;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 042/262] mm: export bdi_unregister
  2021-11-05 20:34 incoming Andrew Morton
                   ` (40 preceding siblings ...)
  2021-11-05 20:36 ` [patch 041/262] mm: stop filemap_read() from grabbing a superfluous page Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 043/262] mtd: call bdi_unregister explicitly Andrew Morton
                   ` (219 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, hch, jack, linux-mm, miquel.raynal, mm-commits, richard,
	torvalds, vigneshr

From: Christoph Hellwig <hch@lst.de>
Subject: mm: export bdi_unregister

Patch series "simplify bdi unregistation".

This series simplifies the BDI code to get rid of the magic
auto-unregister feature that hid a recent block layer refcounting bug.


This patch (of 5):

To wind down the magic auto-unregister semantics we'll need to push this
into modular code.

Link: https://lkml.kernel.org/r/20211021124441.668816-1-hch@lst.de
Link: https://lkml.kernel.org/r/20211021124441.668816-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/backing-dev.c |    1 +
 1 file changed, 1 insertion(+)

--- a/mm/backing-dev.c~mm-export-bdi_unregister
+++ a/mm/backing-dev.c
@@ -958,6 +958,7 @@ void bdi_unregister(struct backing_dev_i
 		bdi->owner = NULL;
 	}
 }
+EXPORT_SYMBOL(bdi_unregister);
 
 static void release_bdi(struct kref *ref)
 {
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 043/262] mtd: call bdi_unregister explicitly
  2021-11-05 20:34 incoming Andrew Morton
                   ` (41 preceding siblings ...)
  2021-11-05 20:36 ` [patch 042/262] mm: export bdi_unregister Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:36 ` [patch 044/262] fs: explicitly unregister per-superblock BDIs Andrew Morton
                   ` (218 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, hch, jack, linux-mm, miquel.raynal, mm-commits, richard,
	torvalds, vigneshr

From: Christoph Hellwig <hch@lst.de>
Subject: mtd: call bdi_unregister explicitly

Call bdi_unregister explicitly instead of relying on the automatic
unregistration.

Link: https://lkml.kernel.org/r/20211021124441.668816-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/mtd/mtdcore.c |    1 +
 1 file changed, 1 insertion(+)

--- a/drivers/mtd/mtdcore.c~mtd-call-bdi_unregister-explicitly
+++ a/drivers/mtd/mtdcore.c
@@ -2409,6 +2409,7 @@ static void __exit cleanup_mtd(void)
 	if (proc_mtd)
 		remove_proc_entry("mtd", NULL);
 	class_unregister(&mtd_class);
+	bdi_unregister(mtd_bdi);
 	bdi_put(mtd_bdi);
 	idr_destroy(&mtd_idr);
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 044/262] fs: explicitly unregister per-superblock BDIs
  2021-11-05 20:34 incoming Andrew Morton
                   ` (42 preceding siblings ...)
  2021-11-05 20:36 ` [patch 043/262] mtd: call bdi_unregister explicitly Andrew Morton
@ 2021-11-05 20:36 ` Andrew Morton
  2021-11-05 20:37 ` [patch 045/262] mm: don't automatically unregister bdis Andrew Morton
                   ` (217 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:36 UTC (permalink / raw)
  To: akpm, hch, jack, linux-mm, miquel.raynal, mm-commits, richard,
	torvalds, vigneshr

From: Christoph Hellwig <hch@lst.de>
Subject: fs: explicitly unregister per-superblock BDIs

Add a new SB_I_ flag to mark superblocks that have an ephemeral bdi
associated with them, and unregister it when the superblock is shut down.

Link: https://lkml.kernel.org/r/20211021124441.668816-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/super.c         |    3 +++
 include/linux/fs.h |    1 +
 2 files changed, 4 insertions(+)

--- a/fs/super.c~fs-explicitly-unregister-per-superblock-bdis
+++ a/fs/super.c
@@ -476,6 +476,8 @@ void generic_shutdown_super(struct super
 	spin_unlock(&sb_lock);
 	up_write(&sb->s_umount);
 	if (sb->s_bdi != &noop_backing_dev_info) {
+		if (sb->s_iflags & SB_I_PERSB_BDI)
+			bdi_unregister(sb->s_bdi);
 		bdi_put(sb->s_bdi);
 		sb->s_bdi = &noop_backing_dev_info;
 	}
@@ -1562,6 +1564,7 @@ int super_setup_bdi_name(struct super_bl
 	}
 	WARN_ON(sb->s_bdi != &noop_backing_dev_info);
 	sb->s_bdi = bdi;
+	sb->s_iflags |= SB_I_PERSB_BDI;
 
 	return 0;
 }
--- a/include/linux/fs.h~fs-explicitly-unregister-per-superblock-bdis
+++ a/include/linux/fs.h
@@ -1443,6 +1443,7 @@ extern int send_sigurg(struct fown_struc
 #define SB_I_UNTRUSTED_MOUNTER		0x00000040
 
 #define SB_I_SKIP_SYNC	0x00000100	/* Skip superblock at global sync */
+#define SB_I_PERSB_BDI	0x00000200	/* has a per-sb bdi */
 
 /* Possible states of 'frozen' field */
 enum {
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 045/262] mm: don't automatically unregister bdis
  2021-11-05 20:34 incoming Andrew Morton
                   ` (43 preceding siblings ...)
  2021-11-05 20:36 ` [patch 044/262] fs: explicitly unregister per-superblock BDIs Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 046/262] mm: simplify bdi refcounting Andrew Morton
                   ` (216 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, hch, jack, linux-mm, miquel.raynal, mm-commits, richard,
	torvalds, vigneshr

From: Christoph Hellwig <hch@lst.de>
Subject: mm: don't automatically unregister bdis

All BDI users now unregister explicitly.

Link: https://lkml.kernel.org/r/20211021124441.668816-5-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/backing-dev.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/backing-dev.c~mm-dont-automatically-unregister-bdis
+++ a/mm/backing-dev.c
@@ -965,8 +965,7 @@ static void release_bdi(struct kref *ref
 	struct backing_dev_info *bdi =
 			container_of(ref, struct backing_dev_info, refcnt);
 
-	if (test_bit(WB_registered, &bdi->wb.state))
-		bdi_unregister(bdi);
+	WARN_ON_ONCE(test_bit(WB_registered, &bdi->wb.state));
 	WARN_ON_ONCE(bdi->dev);
 	wb_exit(&bdi->wb);
 	kfree(bdi);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 046/262] mm: simplify bdi refcounting
  2021-11-05 20:34 incoming Andrew Morton
                   ` (44 preceding siblings ...)
  2021-11-05 20:37 ` [patch 045/262] mm: don't automatically unregister bdis Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 047/262] mm: don't read i_size of inode unless we need it Andrew Morton
                   ` (215 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, hch, jack, linux-mm, miquel.raynal, mm-commits, richard,
	torvalds, vigneshr

From: Christoph Hellwig <hch@lst.de>
Subject: mm: simplify bdi refcounting

Move grabbing and releasing the bdi refcount out of the common
wb_init/wb_exit helpers into code that is only used for the non-default
memcg driven bdi_writeback structures.

[hch@lst.de: add comment]
  Link: https://lkml.kernel.org/r/20211027074207.GA12793@lst.de
[akpm@linux-foundation.org: fix typo]
Link: https://lkml.kernel.org/r/20211021124441.668816-6-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/backing-dev-defs.h |    3 +++
 mm/backing-dev.c                 |   13 +++++--------
 2 files changed, 8 insertions(+), 8 deletions(-)

--- a/include/linux/backing-dev-defs.h~mm-simplify-bdi-refcounting
+++ a/include/linux/backing-dev-defs.h
@@ -103,6 +103,9 @@ struct wb_completion {
  * change as blkcg is disabled and enabled higher up in the hierarchy, a wb
  * is tested for blkcg after lookup and removed from index on mismatch so
  * that a new wb for the combination can be created.
+ *
+ * Each bdi_writeback that is not embedded into the backing_dev_info must hold
+ * a reference to the parent backing_dev_info.  See cgwb_create() for details.
  */
 struct bdi_writeback {
 	struct backing_dev_info *bdi;	/* our parent bdi */
--- a/mm/backing-dev.c~mm-simplify-bdi-refcounting
+++ a/mm/backing-dev.c
@@ -291,8 +291,6 @@ static int wb_init(struct bdi_writeback
 
 	memset(wb, 0, sizeof(*wb));
 
-	if (wb != &bdi->wb)
-		bdi_get(bdi);
 	wb->bdi = bdi;
 	wb->last_old_flush = jiffies;
 	INIT_LIST_HEAD(&wb->b_dirty);
@@ -316,7 +314,7 @@ static int wb_init(struct bdi_writeback
 
 	err = fprop_local_init_percpu(&wb->completions, gfp);
 	if (err)
-		goto out_put_bdi;
+		return err;
 
 	for (i = 0; i < NR_WB_STAT_ITEMS; i++) {
 		err = percpu_counter_init(&wb->stat[i], 0, gfp);
@@ -330,9 +328,6 @@ out_destroy_stat:
 	while (i--)
 		percpu_counter_destroy(&wb->stat[i]);
 	fprop_local_destroy_percpu(&wb->completions);
-out_put_bdi:
-	if (wb != &bdi->wb)
-		bdi_put(bdi);
 	return err;
 }
 
@@ -373,8 +368,6 @@ static void wb_exit(struct bdi_writeback
 		percpu_counter_destroy(&wb->stat[i]);
 
 	fprop_local_destroy_percpu(&wb->completions);
-	if (wb != &wb->bdi->wb)
-		bdi_put(wb->bdi);
 }
 
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -397,6 +390,7 @@ static void cgwb_release_workfn(struct w
 	struct bdi_writeback *wb = container_of(work, struct bdi_writeback,
 						release_work);
 	struct blkcg *blkcg = css_to_blkcg(wb->blkcg_css);
+	struct backing_dev_info *bdi = wb->bdi;
 
 	mutex_lock(&wb->bdi->cgwb_release_mutex);
 	wb_shutdown(wb);
@@ -416,6 +410,7 @@ static void cgwb_release_workfn(struct w
 
 	percpu_ref_exit(&wb->refcnt);
 	wb_exit(wb);
+	bdi_put(bdi);
 	WARN_ON_ONCE(!list_empty(&wb->b_attached));
 	kfree_rcu(wb, rcu);
 }
@@ -497,6 +492,7 @@ static int cgwb_create(struct backing_de
 	INIT_LIST_HEAD(&wb->b_attached);
 	INIT_WORK(&wb->release_work, cgwb_release_workfn);
 	set_bit(WB_registered, &wb->state);
+	bdi_get(bdi);
 
 	/*
 	 * The root wb determines the registered state of the whole bdi and
@@ -528,6 +524,7 @@ static int cgwb_create(struct backing_de
 	goto out_put;
 
 err_fprop_exit:
+	bdi_put(bdi);
 	fprop_local_destroy_percpu(&wb->memcg_completions);
 err_ref_exit:
 	percpu_ref_exit(&wb->refcnt);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 047/262] mm: don't read i_size of inode unless we need it
  2021-11-05 20:34 incoming Andrew Morton
                   ` (45 preceding siblings ...)
  2021-11-05 20:37 ` [patch 046/262] mm: simplify bdi refcounting Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 048/262] mm/filemap.c: remove bogus VM_BUG_ON Andrew Morton
                   ` (214 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, asml.silence, axboe, clm, david, jack, josef, linux-mm,
	mm-commits, torvalds

From: Jens Axboe <axboe@kernel.dk>
Subject: mm: don't read i_size of inode unless we need it

We always go through i_size_read(), and we rarely end up needing it. Push
the read to down where we need to check it, which avoids it for most
cases.

It looks like we can even remove this check entirely, which might be
worth pursuing. But at least this takes it out of the hot path.

Link: https://lkml.kernel.org/r/6b67981f-57d4-c80e-bc07-6020aa601381@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/filemap.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/mm/filemap.c~mm-dont-read-i_size-of-inode-unless-we-need-it
+++ a/mm/filemap.c
@@ -2740,9 +2740,7 @@ generic_file_read_iter(struct kiocb *ioc
 		struct file *file = iocb->ki_filp;
 		struct address_space *mapping = file->f_mapping;
 		struct inode *inode = mapping->host;
-		loff_t size;
 
-		size = i_size_read(inode);
 		if (iocb->ki_flags & IOCB_NOWAIT) {
 			if (filemap_range_needs_writeback(mapping, iocb->ki_pos,
 						iocb->ki_pos + count - 1))
@@ -2774,8 +2772,9 @@ generic_file_read_iter(struct kiocb *ioc
 		 * the rest of the read.  Buffered reads will not work for
 		 * DAX files, so don't bother trying.
 		 */
-		if (retval < 0 || !count || iocb->ki_pos >= size ||
-		    IS_DAX(inode))
+		if (retval < 0 || !count || IS_DAX(inode))
+			return retval;
+		if (iocb->ki_pos >= i_size_read(inode))
 			return retval;
 	}
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 048/262] mm/filemap.c: remove bogus VM_BUG_ON
  2021-11-05 20:34 incoming Andrew Morton
                   ` (46 preceding siblings ...)
  2021-11-05 20:37 ` [patch 047/262] mm: don't read i_size of inode unless we need it Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 049/262] mm: move more expensive part of XA setup out of mapping check Andrew Morton
                   ` (213 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, hughd, linux-mm, mm-commits, stable,
	syzbot+c87be4f669d920c76330, torvalds, willy

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm/filemap.c: remove bogus VM_BUG_ON

It is not safe to check page->index without holding the page lock.  It can
be changed if the page is moved between the swap cache and the page cache
for a shmem file, for example.  There is a VM_BUG_ON below which checks
page->index is correct after taking the page lock.

Link: https://lkml.kernel.org/r/20210818144932.940640-1-willy@infradead.org
Fixes: 5c211ba29deb ("mm: add and use find_lock_entries")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: <syzbot+c87be4f669d920c76330@syzkaller.appspotmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/filemap.c |    1 -
 1 file changed, 1 deletion(-)

--- a/mm/filemap.c~mm-remove-bogus-vm_bug_on
+++ a/mm/filemap.c
@@ -2093,7 +2093,6 @@ unsigned find_lock_entries(struct addres
 		if (!xa_is_value(page)) {
 			if (page->index < start)
 				goto put;
-			VM_BUG_ON_PAGE(page->index != xas.xa_index, page);
 			if (page->index + thp_nr_pages(page) - 1 > end)
 				goto put;
 			if (!trylock_page(page))
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 049/262] mm: move more expensive part of XA setup out of mapping check
  2021-11-05 20:34 incoming Andrew Morton
                   ` (47 preceding siblings ...)
  2021-11-05 20:37 ` [patch 048/262] mm/filemap.c: remove bogus VM_BUG_ON Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 050/262] mm/gup: further simplify __gup_device_huge() Andrew Morton
                   ` (212 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, axboe, linux-mm, mm-commits, torvalds, willy

From: Jens Axboe <axboe@kernel.dk>
Subject: mm: move more expensive part of XA setup out of mapping check

The fast path here is not needing any writeback, yet we spend time setting
up the xarray lookup data upfront.  Move the part that actually needs to
iterate the address space mapping into a separate helper, saving ~30% of
the time here.

Link: https://lkml.kernel.org/r/49f67983-b802-8929-edab-d807f745c9ca@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/filemap.c |   43 +++++++++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 18 deletions(-)

--- a/mm/filemap.c~mm-move-more-expensive-part-of-xa-setup-out-of-mapping-check
+++ a/mm/filemap.c
@@ -639,6 +639,30 @@ static bool mapping_needs_writeback(stru
 	return mapping->nrpages;
 }
 
+static bool filemap_range_has_writeback(struct address_space *mapping,
+					loff_t start_byte, loff_t end_byte)
+{
+	XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
+	pgoff_t max = end_byte >> PAGE_SHIFT;
+	struct page *page;
+
+	if (end_byte < start_byte)
+		return false;
+
+	rcu_read_lock();
+	xas_for_each(&xas, page, max) {
+		if (xas_retry(&xas, page))
+			continue;
+		if (xa_is_value(page))
+			continue;
+		if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
+			break;
+	}
+	rcu_read_unlock();
+	return page != NULL;
+
+}
+
 /**
  * filemap_range_needs_writeback - check if range potentially needs writeback
  * @mapping:           address space within which to check
@@ -656,29 +680,12 @@ static bool mapping_needs_writeback(stru
 bool filemap_range_needs_writeback(struct address_space *mapping,
 				   loff_t start_byte, loff_t end_byte)
 {
-	XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
-	pgoff_t max = end_byte >> PAGE_SHIFT;
-	struct page *page;
-
 	if (!mapping_needs_writeback(mapping))
 		return false;
 	if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
 	    !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
 		return false;
-	if (end_byte < start_byte)
-		return false;
-
-	rcu_read_lock();
-	xas_for_each(&xas, page, max) {
-		if (xas_retry(&xas, page))
-			continue;
-		if (xa_is_value(page))
-			continue;
-		if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
-			break;
-	}
-	rcu_read_unlock();
-	return page != NULL;
+	return filemap_range_has_writeback(mapping, start_byte, end_byte);
 }
 EXPORT_SYMBOL_GPL(filemap_range_needs_writeback);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 050/262] mm/gup: further simplify __gup_device_huge()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (48 preceding siblings ...)
  2021-11-05 20:37 ` [patch 049/262] mm: move more expensive part of XA setup out of mapping check Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 051/262] mm/swapfile: remove needless request_queue NULL pointer check Andrew Morton
                   ` (211 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, imbrenda, jack, jhubbard, kirill.shutemov, linmiaohe,
	linux-mm, mm-commits, torvalds

From: John Hubbard <jhubbard@nvidia.com>
Subject: mm/gup: further simplify __gup_device_huge()

commit 6401c4eb57f9 ("mm: gup: fix potential pgmap refcnt leak in
__gup_device_huge()") simplified the return paths, but didn't go quite far
enough, as discussed in [1].

Remove the "ret" variable entirely, because there is enough information
already available to provide the return value.

[1] https://lore.kernel.org/r/CAHk-=wgQTRX=5SkCmS+zfmpqubGHGJvXX_HgnPG8JSpHKHBMeg@mail.gmail.com

Link: https://lkml.kernel.org/r/20210904004224.86391-1-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/gup.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/mm/gup.c~mm-gup-further-simplify-__gup_device_huge
+++ a/mm/gup.c
@@ -2228,7 +2228,6 @@ static int __gup_device_huge(unsigned lo
 {
 	int nr_start = *nr;
 	struct dev_pagemap *pgmap = NULL;
-	int ret = 1;
 
 	do {
 		struct page *page = pfn_to_page(pfn);
@@ -2236,14 +2235,12 @@ static int __gup_device_huge(unsigned lo
 		pgmap = get_dev_pagemap(pfn, pgmap);
 		if (unlikely(!pgmap)) {
 			undo_dev_pagemap(nr, nr_start, flags, pages);
-			ret = 0;
 			break;
 		}
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		if (unlikely(!try_grab_page(page, flags))) {
 			undo_dev_pagemap(nr, nr_start, flags, pages);
-			ret = 0;
 			break;
 		}
 		(*nr)++;
@@ -2251,7 +2248,7 @@ static int __gup_device_huge(unsigned lo
 	} while (addr += PAGE_SIZE, addr != end);
 
 	put_dev_pagemap(pgmap);
-	return ret;
+	return addr == end;
 }
 
 static int __gup_device_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 051/262] mm/swapfile: remove needless request_queue NULL pointer check
  2021-11-05 20:34 incoming Andrew Morton
                   ` (49 preceding siblings ...)
  2021-11-05 20:37 ` [patch 050/262] mm/gup: further simplify __gup_device_huge() Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 052/262] mm/swapfile: fix an integer overflow in swap_show() Andrew Morton
                   ` (210 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, david, linux-mm, mm-commits, torvalds, vulab

From: Xu Wang <vulab@iscas.ac.cn>
Subject: mm/swapfile: remove needless request_queue NULL pointer check

The request_queue pointer returned from bdev_get_queue() shall never be
NULL, so the null check is unnecessary, just remove it.

Link: https://lkml.kernel.org/r/20210917082111.33923-1-vulab@iscas.ac.cn
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/swapfile.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/swapfile.c~mm-swapfile-remove-needless-request_queue-null-pointer-check
+++ a/mm/swapfile.c
@@ -3118,7 +3118,7 @@ static bool swap_discardable(struct swap
 {
 	struct request_queue *q = bdev_get_queue(si->bdev);
 
-	if (!q || !blk_queue_discard(q))
+	if (!blk_queue_discard(q))
 		return false;
 
 	return true;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 052/262] mm/swapfile: fix an integer overflow in swap_show()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (50 preceding siblings ...)
  2021-11-05 20:37 ` [patch 051/262] mm/swapfile: remove needless request_queue NULL pointer check Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 053/262] mm: optimise put_pages_list() Andrew Morton
                   ` (209 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, aquini, hughd, linux-mm, mm-commits, torvalds

From: Rafael Aquini <aquini@redhat.com>
Subject: mm/swapfile: fix an integer overflow in swap_show()

This one is just a minor nuisance for people going through /proc/swaps if
any of their swapareas is bigger than, or equal to 1073741824 pages (4TB).

seq_printf() format string casts as uint the conversion from pages to KB,
and that will overflow in the aforementioned case.

Albeit being almost unthinkable that someone would actually set up such
big of a single swaparea, there is a ticket recently filed against RHEL:
https://bugzilla.redhat.com/show_bug.cgi?id=2008812

Given that all other codesites that use format strings for the same swap
pages-to-KB conversion do cast it as ulong, this patch just follows suit.

Link: https://lkml.kernel.org/r/20211006184011.2579054-1-aquini@redhat.com
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/swapfile.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/swapfile.c~mm-swapfile-fix-an-integer-overflow-in-swap_show
+++ a/mm/swapfile.c
@@ -2763,7 +2763,7 @@ static int swap_show(struct seq_file *sw
 	struct swap_info_struct *si = v;
 	struct file *file;
 	int len;
-	unsigned int bytes, inuse;
+	unsigned long bytes, inuse;
 
 	if (si == SEQ_START_TOKEN) {
 		seq_puts(swap, "Filename\t\t\t\tType\t\tSize\t\tUsed\t\tPriority\n");
@@ -2775,7 +2775,7 @@ static int swap_show(struct seq_file *sw
 
 	file = si->swap_file;
 	len = seq_file_path(swap, file, " \t\n\\");
-	seq_printf(swap, "%*s%s\t%u\t%s%u\t%s%d\n",
+	seq_printf(swap, "%*s%s\t%lu\t%s%lu\t%s%d\n",
 			len < 40 ? 40 - len : 1, " ",
 			S_ISBLK(file_inode(file)->i_mode) ?
 				"partition" : "file\t",
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 053/262] mm: optimise put_pages_list()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (51 preceding siblings ...)
  2021-11-05 20:37 ` [patch 052/262] mm/swapfile: fix an integer overflow in swap_show() Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 054/262] mm/memcg: drop swp_entry_t* in mc_handle_file_pte() Andrew Morton
                   ` (208 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, anthony.yznaga, linux-mm, mgorman, mm-commits, torvalds, willy

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: mm: optimise put_pages_list()

Instead of calling put_page() one page at a time, pop pages off the list
if their refcount was too high and pass the remainder to
put_unref_page_list().  This should be a speed improvement, but I have no
measurements to support that.  Current callers do not care about
performance, but I hope to add some which do.

Link: https://lkml.kernel.org/r/20211007192138.561673-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/swap.c |   23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

--- a/mm/swap.c~mm-optimise-put_pages_list
+++ a/mm/swap.c
@@ -134,18 +134,27 @@ EXPORT_SYMBOL(__put_page);
  * put_pages_list() - release a list of pages
  * @pages: list of pages threaded on page->lru
  *
- * Release a list of pages which are strung together on page.lru.  Currently
- * used by read_cache_pages() and related error recovery code.
+ * Release a list of pages which are strung together on page.lru.
  */
 void put_pages_list(struct list_head *pages)
 {
-	while (!list_empty(pages)) {
-		struct page *victim;
+	struct page *page, *next;
 
-		victim = lru_to_page(pages);
-		list_del(&victim->lru);
-		put_page(victim);
+	list_for_each_entry_safe(page, next, pages, lru) {
+		if (!put_page_testzero(page)) {
+			list_del(&page->lru);
+			continue;
+		}
+		if (PageHead(page)) {
+			list_del(&page->lru);
+			__put_compound_page(page);
+			continue;
+		}
+		/* Cannot be PageLRU because it's passed to us using the lru */
+		__ClearPageWaiters(page);
 	}
+
+	free_unref_page_list(pages);
 }
 EXPORT_SYMBOL(put_pages_list);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 054/262] mm/memcg: drop swp_entry_t* in mc_handle_file_pte()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (52 preceding siblings ...)
  2021-11-05 20:37 ` [patch 053/262] mm: optimise put_pages_list() Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 055/262] memcg: flush stats only if updated Andrew Morton
                   ` (207 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, david, hannes, linux-mm, mhocko, mm-commits, peterx,
	songmuchun, torvalds, willy

From: Peter Xu <peterx@redhat.com>
Subject: mm/memcg: drop swp_entry_t* in mc_handle_file_pte()

After the rework of f5df8635c5a3 ("mm: use find_get_incore_page in memcontrol",
2020-10-13) it's unused.

Link: https://lkml.kernel.org/r/20210916193014.80129-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-drop-swp_entry_t-in-mc_handle_file_pte
+++ a/mm/memcontrol.c
@@ -5545,7 +5545,7 @@ static struct page *mc_handle_swap_pte(s
 #endif
 
 static struct page *mc_handle_file_pte(struct vm_area_struct *vma,
-			unsigned long addr, pte_t ptent, swp_entry_t *entry)
+			unsigned long addr, pte_t ptent)
 {
 	if (!vma->vm_file) /* anonymous vma */
 		return NULL;
@@ -5718,7 +5718,7 @@ static enum mc_target_type get_mctgt_typ
 	else if (is_swap_pte(ptent))
 		page = mc_handle_swap_pte(vma, ptent, &ent);
 	else if (pte_none(ptent))
-		page = mc_handle_file_pte(vma, addr, ptent, &ent);
+		page = mc_handle_file_pte(vma, addr, ptent);
 
 	if (!page && !ent.val)
 		return ret;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 055/262] memcg: flush stats only if updated
  2021-11-05 20:34 incoming Andrew Morton
                   ` (53 preceding siblings ...)
  2021-11-05 20:37 ` [patch 054/262] mm/memcg: drop swp_entry_t* in mc_handle_file_pte() Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 056/262] memcg: unify memcg stat flushing Andrew Morton
                   ` (206 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, hannes, linux-mm, mhocko, mkoutny, mm-commits, shakeelb, torvalds

From: Shakeel Butt <shakeelb@google.com>
Subject: memcg: flush stats only if updated

At the moment, the kernel flushes the memcg stats on every refault and
also on every reclaim iteration.  Although rstat maintains per-cpu update
tree but on the flush the kernel still has to go through all the cpu rstat
update tree to check if there is anything to flush.  This patch adds the
tracking on the stats update side to make flush side more clever by
skipping the flush if there is no update.

The stats update codepath is very sensitive performance wise for many
workloads and benchmarks.  So, we can not follow what the commit
aa48e47e3906 ("memcg: infrastructure to flush memcg stats") did which was
triggering async flush through queue_work() and caused a lot performance
regression reports.  That got reverted by the commit 1f828223b799 ("memcg:
flush lruvec stats in the refault").

In this patch we kept the stats update codepath very minimal and let the
stats reader side to flush the stats only when the updates are over a
specific threshold.  For now the threshold is (nr_cpus * CHARGE_BATCH).

To evaluate the impact of this patch, an 8 GiB tmpfs file is created on a
system with swap-on-zram and the file was pushed to swap through
memory.force_empty interface.  On reading the whole file, the memcg stat
flush in the refault code path is triggered.  With this patch, we observed
63% reduction in the read time of 8 GiB file.

Link: https://lkml.kernel.org/r/20211001190040.48086-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Reviewed-by: "Michal Koutný" <mkoutny@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   78 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 55 insertions(+), 23 deletions(-)

--- a/mm/memcontrol.c~memcg-flush-stats-only-if-updated
+++ a/mm/memcontrol.c
@@ -103,11 +103,6 @@ static bool do_memsw_account(void)
 	return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_noswap;
 }
 
-/* memcg and lruvec stats flushing */
-static void flush_memcg_stats_dwork(struct work_struct *w);
-static DECLARE_DEFERRABLE_WORK(stats_flush_dwork, flush_memcg_stats_dwork);
-static DEFINE_SPINLOCK(stats_flush_lock);
-
 #define THRESHOLDS_EVENTS_TARGET 128
 #define SOFTLIMIT_EVENTS_TARGET 1024
 
@@ -635,6 +630,56 @@ mem_cgroup_largest_soft_limit_node(struc
 	return mz;
 }
 
+/*
+ * memcg and lruvec stats flushing
+ *
+ * Many codepaths leading to stats update or read are performance sensitive and
+ * adding stats flushing in such codepaths is not desirable. So, to optimize the
+ * flushing the kernel does:
+ *
+ * 1) Periodically and asynchronously flush the stats every 2 seconds to not let
+ *    rstat update tree grow unbounded.
+ *
+ * 2) Flush the stats synchronously on reader side only when there are more than
+ *    (MEMCG_CHARGE_BATCH * nr_cpus) update events. Though this optimization
+ *    will let stats be out of sync by atmost (MEMCG_CHARGE_BATCH * nr_cpus) but
+ *    only for 2 seconds due to (1).
+ */
+static void flush_memcg_stats_dwork(struct work_struct *w);
+static DECLARE_DEFERRABLE_WORK(stats_flush_dwork, flush_memcg_stats_dwork);
+static DEFINE_SPINLOCK(stats_flush_lock);
+static DEFINE_PER_CPU(unsigned int, stats_updates);
+static atomic_t stats_flush_threshold = ATOMIC_INIT(0);
+
+static inline void memcg_rstat_updated(struct mem_cgroup *memcg)
+{
+	cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
+	if (!(__this_cpu_inc_return(stats_updates) % MEMCG_CHARGE_BATCH))
+		atomic_inc(&stats_flush_threshold);
+}
+
+static void __mem_cgroup_flush_stats(void)
+{
+	if (!spin_trylock(&stats_flush_lock))
+		return;
+
+	cgroup_rstat_flush_irqsafe(root_mem_cgroup->css.cgroup);
+	atomic_set(&stats_flush_threshold, 0);
+	spin_unlock(&stats_flush_lock);
+}
+
+void mem_cgroup_flush_stats(void)
+{
+	if (atomic_read(&stats_flush_threshold) > num_online_cpus())
+		__mem_cgroup_flush_stats();
+}
+
+static void flush_memcg_stats_dwork(struct work_struct *w)
+{
+	mem_cgroup_flush_stats();
+	queue_delayed_work(system_unbound_wq, &stats_flush_dwork, 2UL*HZ);
+}
+
 /**
  * __mod_memcg_state - update cgroup memory statistics
  * @memcg: the memory cgroup
@@ -647,7 +692,7 @@ void __mod_memcg_state(struct mem_cgroup
 		return;
 
 	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
-	cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
+	memcg_rstat_updated(memcg);
 }
 
 /* idx can be of type enum memcg_stat_item or node_stat_item. */
@@ -675,10 +720,12 @@ void __mod_memcg_lruvec_state(struct lru
 	memcg = pn->memcg;
 
 	/* Update memcg */
-	__mod_memcg_state(memcg, idx, val);
+	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
 
 	/* Update lruvec */
 	__this_cpu_add(pn->lruvec_stats_percpu->state[idx], val);
+
+	memcg_rstat_updated(memcg);
 }
 
 /**
@@ -780,7 +827,7 @@ void __count_memcg_events(struct mem_cgr
 		return;
 
 	__this_cpu_add(memcg->vmstats_percpu->events[idx], count);
-	cgroup_rstat_updated(memcg->css.cgroup, smp_processor_id());
+	memcg_rstat_updated(memcg);
 }
 
 static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
@@ -5341,21 +5388,6 @@ static void mem_cgroup_css_reset(struct
 	memcg_wb_domain_size_changed(memcg);
 }
 
-void mem_cgroup_flush_stats(void)
-{
-	if (!spin_trylock(&stats_flush_lock))
-		return;
-
-	cgroup_rstat_flush_irqsafe(root_mem_cgroup->css.cgroup);
-	spin_unlock(&stats_flush_lock);
-}
-
-static void flush_memcg_stats_dwork(struct work_struct *w)
-{
-	mem_cgroup_flush_stats();
-	queue_delayed_work(system_unbound_wq, &stats_flush_dwork, 2UL*HZ);
-}
-
 static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 056/262] memcg: unify memcg stat flushing
  2021-11-05 20:34 incoming Andrew Morton
                   ` (54 preceding siblings ...)
  2021-11-05 20:37 ` [patch 055/262] memcg: flush stats only if updated Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 057/262] mm/memcg: remove obsolete memcg_free_kmem() Andrew Morton
                   ` (205 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, hannes, linux-mm, mhocko, mkoutny, mm-commits, shakeelb, torvalds

From: Shakeel Butt <shakeelb@google.com>
Subject: memcg: unify memcg stat flushing

The memcg stats can be flushed in multiple context and potentially in
parallel too.  For example multiple parallel user space readers for memcg
stats will contend on the rstat locks with each other.  There is no need
for that.  We just need one flusher and everyone else can benefit.  In
addition after aa48e47e3906 ("memcg: infrastructure to flush memcg stats")
the kernel periodically flush the memcg stats from the root, so, the other
flushers will potentially have much less work to do.

Link: https://lkml.kernel.org/r/20211001190040.48086-2-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Michal Koutný" <mkoutny@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

--- a/mm/memcontrol.c~memcg-unify-memcg-stat-flushing
+++ a/mm/memcontrol.c
@@ -660,12 +660,14 @@ static inline void memcg_rstat_updated(s
 
 static void __mem_cgroup_flush_stats(void)
 {
-	if (!spin_trylock(&stats_flush_lock))
+	unsigned long flag;
+
+	if (!spin_trylock_irqsave(&stats_flush_lock, flag))
 		return;
 
 	cgroup_rstat_flush_irqsafe(root_mem_cgroup->css.cgroup);
 	atomic_set(&stats_flush_threshold, 0);
-	spin_unlock(&stats_flush_lock);
+	spin_unlock_irqrestore(&stats_flush_lock, flag);
 }
 
 void mem_cgroup_flush_stats(void)
@@ -1461,7 +1463,7 @@ static char *memory_stat_format(struct m
 	 *
 	 * Current memory state:
 	 */
-	cgroup_rstat_flush(memcg->css.cgroup);
+	mem_cgroup_flush_stats();
 
 	for (i = 0; i < ARRAY_SIZE(memory_stats); i++) {
 		u64 size;
@@ -3565,8 +3567,7 @@ static unsigned long mem_cgroup_usage(st
 	unsigned long val;
 
 	if (mem_cgroup_is_root(memcg)) {
-		/* mem_cgroup_threshold() calls here from irqsafe context */
-		cgroup_rstat_flush_irqsafe(memcg->css.cgroup);
+		mem_cgroup_flush_stats();
 		val = memcg_page_state(memcg, NR_FILE_PAGES) +
 			memcg_page_state(memcg, NR_ANON_MAPPED);
 		if (swap)
@@ -3947,7 +3948,7 @@ static int memcg_numa_stat_show(struct s
 	int nid;
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 
-	cgroup_rstat_flush(memcg->css.cgroup);
+	mem_cgroup_flush_stats();
 
 	for (stat = stats; stat < stats + ARRAY_SIZE(stats); stat++) {
 		seq_printf(m, "%s=%lu", stat->name,
@@ -4019,7 +4020,7 @@ static int memcg_stat_show(struct seq_fi
 
 	BUILD_BUG_ON(ARRAY_SIZE(memcg1_stat_names) != ARRAY_SIZE(memcg1_stats));
 
-	cgroup_rstat_flush(memcg->css.cgroup);
+	mem_cgroup_flush_stats();
 
 	for (i = 0; i < ARRAY_SIZE(memcg1_stats); i++) {
 		unsigned long nr;
@@ -4522,7 +4523,7 @@ void mem_cgroup_wb_stats(struct bdi_writ
 	struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
 	struct mem_cgroup *parent;
 
-	cgroup_rstat_flush_irqsafe(memcg->css.cgroup);
+	mem_cgroup_flush_stats();
 
 	*pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
 	*pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
@@ -6405,7 +6406,7 @@ static int memory_numa_stat_show(struct
 	int i;
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 
-	cgroup_rstat_flush(memcg->css.cgroup);
+	mem_cgroup_flush_stats();
 
 	for (i = 0; i < ARRAY_SIZE(memory_stats); i++) {
 		int nid;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 057/262] mm/memcg: remove obsolete memcg_free_kmem()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (55 preceding siblings ...)
  2021-11-05 20:37 ` [patch 056/262] memcg: unify memcg stat flushing Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 058/262] mm/list_lru.c: prefer struct_size over open coded arithmetic Andrew Morton
                   ` (204 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, atomlin, guro, hannes, linux-mm, longman, mhocko,
	mm-commits, shakeelb, songmuchun, torvalds, vbabka, vdavydov.dev

From: Waiman Long <longman@redhat.com>
Subject: mm/memcg: remove obsolete memcg_free_kmem()

Since commit d648bcc7fe65 ("mm: kmem: make memcg_kmem_enabled()
irreversible"), the only thing memcg_free_kmem() does is to call
memcg_offline_kmem() when the memcg is still online which can happen when
online_css() fails due to -ENOMEM.  However, the name memcg_free_kmem() is
confusing and it is more clear and straight forward to call
memcg_offline_kmem() directly from mem_cgroup_css_free().

Link: https://lkml.kernel.org/r/20211005202450.11775-1-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Suggested-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

--- a/mm/memcontrol.c~mm-memcg-remove-obsolete-memcg_free_kmem
+++ a/mm/memcontrol.c
@@ -3704,13 +3704,6 @@ static void memcg_offline_kmem(struct me
 
 	memcg_free_cache_id(kmemcg_id);
 }
-
-static void memcg_free_kmem(struct mem_cgroup *memcg)
-{
-	/* css_alloc() failed, offlining didn't happen */
-	if (unlikely(memcg->kmem_state == KMEM_ONLINE))
-		memcg_offline_kmem(memcg);
-}
 #else
 static int memcg_online_kmem(struct mem_cgroup *memcg)
 {
@@ -3719,9 +3712,6 @@ static int memcg_online_kmem(struct mem_
 static void memcg_offline_kmem(struct mem_cgroup *memcg)
 {
 }
-static void memcg_free_kmem(struct mem_cgroup *memcg)
-{
-}
 #endif /* CONFIG_MEMCG_KMEM */
 
 static int memcg_update_kmem_max(struct mem_cgroup *memcg,
@@ -5356,7 +5346,9 @@ static void mem_cgroup_css_free(struct c
 	cancel_work_sync(&memcg->high_work);
 	mem_cgroup_remove_from_trees(memcg);
 	free_shrinker_info(memcg);
-	memcg_free_kmem(memcg);
+
+	/* Need to offline kmem if online_css() fails */
+	memcg_offline_kmem(memcg);
 	mem_cgroup_free(memcg);
 }
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 058/262] mm/list_lru.c: prefer struct_size over open coded arithmetic
  2021-11-05 20:34 incoming Andrew Morton
                   ` (56 preceding siblings ...)
  2021-11-05 20:37 ` [patch 057/262] mm/memcg: remove obsolete memcg_free_kmem() Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 059/262] memcg, kmem: further deprecate kmem.limit_in_bytes Andrew Morton
                   ` (203 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, gustavoars, keescook, len.baker, linux-mm, mm-commits, torvalds

From: Len Baker <len.baker@gmx.com>
Subject: mm/list_lru.c: prefer struct_size over open coded arithmetic

As noted in the "Deprecated Interfaces, Language Features, Attributes, and
Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing.  This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting.  Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.

So, use the struct_size() helper to do the arithmetic instead of the
argument "size + count * size" in the kvmalloc() functions.

Also, take the opportunity to refactor the memcpy() call to use the
flex_array_size() helper.

This code was detected with the help of Coccinelle and audited and fixed
manually.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments

Link: https://lkml.kernel.org/r/20211017105929.9284-1-len.baker@gmx.com
Signed-off-by: Len Baker <len.baker@gmx.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/list_lru.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/mm/list_lru.c~mm-list_lruc-prefer-struct_size-over-open-coded-arithmetic
+++ a/mm/list_lru.c
@@ -354,8 +354,7 @@ static int memcg_init_list_lru_node(stru
 	struct list_lru_memcg *memcg_lrus;
 	int size = memcg_nr_cache_ids;
 
-	memcg_lrus = kvmalloc(sizeof(*memcg_lrus) +
-			      size * sizeof(void *), GFP_KERNEL);
+	memcg_lrus = kvmalloc(struct_size(memcg_lrus, lru, size), GFP_KERNEL);
 	if (!memcg_lrus)
 		return -ENOMEM;
 
@@ -389,7 +388,7 @@ static int memcg_update_list_lru_node(st
 
 	old = rcu_dereference_protected(nlru->memcg_lrus,
 					lockdep_is_held(&list_lrus_mutex));
-	new = kvmalloc(sizeof(*new) + new_size * sizeof(void *), GFP_KERNEL);
+	new = kvmalloc(struct_size(new, lru, new_size), GFP_KERNEL);
 	if (!new)
 		return -ENOMEM;
 
@@ -398,7 +397,7 @@ static int memcg_update_list_lru_node(st
 		return -ENOMEM;
 	}
 
-	memcpy(&new->lru, &old->lru, old_size * sizeof(void *));
+	memcpy(&new->lru, &old->lru, flex_array_size(new, lru, old_size));
 
 	/*
 	 * The locking below allows readers that hold nlru->lock avoid taking
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 059/262] memcg, kmem: further deprecate kmem.limit_in_bytes
  2021-11-05 20:34 incoming Andrew Morton
                   ` (57 preceding siblings ...)
  2021-11-05 20:37 ` [patch 058/262] mm/list_lru.c: prefer struct_size over open coded arithmetic Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 060/262] mm: list_lru: remove holding lru lock Andrew Morton
                   ` (202 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, arnd, guro, hannes, linux-mm, mhocko, mm-commits, shakeelb,
	songmuchun, torvalds, vvs

From: Shakeel Butt <shakeelb@google.com>
Subject: memcg, kmem: further deprecate kmem.limit_in_bytes

The deprecation process of kmem.limit_in_bytes started with the commit
0158115f702 ("memcg, kmem: deprecate kmem.limit_in_bytes") which also
explains in detail the motivation behind the deprecation.  To summarize,
it is the unexpected behavior on hitting the kmem limit.  This patch moves
the deprecation process to the next stage by disallowing to set the kmem
limit.  In future we might just remove the kmem.limit_in_bytes file
completely.

[akpm@linux-foundation.org: s/ENOTSUPP/EOPNOTSUPP/]
[arnd@arndb.de: mark cancel_charge() inline]
  Link: https://lkml.kernel.org/r/20211022070542.679839-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20211019153408.2916808-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/cgroup-v1/memory.rst |   11 ----
 mm/memcontrol.c                                |   39 +--------------
 2 files changed, 7 insertions(+), 43 deletions(-)

--- a/Documentation/admin-guide/cgroup-v1/memory.rst~memcg-kmem-further-deprecate-kmemlimit_in_bytes
+++ a/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -87,10 +87,8 @@ Brief summary of control files.
  memory.oom_control		     set/show oom controls.
  memory.numa_stat		     show the number of memory usage per numa
 				     node
- memory.kmem.limit_in_bytes          set/show hard limit for kernel memory
-                                     This knob is deprecated and shouldn't be
-                                     used. It is planned that this be removed in
-                                     the foreseeable future.
+ memory.kmem.limit_in_bytes          This knob is deprecated and writing to
+                                     it will return -ENOTSUPP.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
 				     hits limits
@@ -518,11 +516,6 @@ will be charged as a new owner of it.
   charged file caches. Some out-of-use page caches may keep charged until
   memory pressure happens. If you want to avoid that, force_empty will be useful.
 
-  Also, note that when memory.kmem.limit_in_bytes is set the charges due to
-  kernel pages will still be seen. This is not considered a failure and the
-  write will still return success. In this case, it is expected that
-  memory.kmem.usage_in_bytes == memory.usage_in_bytes.
-
 5.2 stat file
 -------------
 
--- a/mm/memcontrol.c~memcg-kmem-further-deprecate-kmemlimit_in_bytes
+++ a/mm/memcontrol.c
@@ -2771,8 +2771,7 @@ static inline int try_charge(struct mem_
 	return try_charge_memcg(memcg, gfp_mask, nr_pages);
 }
 
-#if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MMU)
-static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
+static inline void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	if (mem_cgroup_is_root(memcg))
 		return;
@@ -2781,7 +2780,6 @@ static void cancel_charge(struct mem_cgr
 	if (do_memsw_account())
 		page_counter_uncharge(&memcg->memsw, nr_pages);
 }
-#endif
 
 static void commit_charge(struct page *page, struct mem_cgroup *memcg)
 {
@@ -3000,7 +2998,6 @@ static void obj_cgroup_uncharge_pages(st
 static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
 				   unsigned int nr_pages)
 {
-	struct page_counter *counter;
 	struct mem_cgroup *memcg;
 	int ret;
 
@@ -3010,21 +3007,8 @@ static int obj_cgroup_charge_pages(struc
 	if (ret)
 		goto out;
 
-	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
-	    !page_counter_try_charge(&memcg->kmem, nr_pages, &counter)) {
-
-		/*
-		 * Enforce __GFP_NOFAIL allocation because callers are not
-		 * prepared to see failures and likely do not have any failure
-		 * handling code.
-		 */
-		if (gfp & __GFP_NOFAIL) {
-			page_counter_charge(&memcg->kmem, nr_pages);
-			goto out;
-		}
-		cancel_charge(memcg, nr_pages);
-		ret = -ENOMEM;
-	}
+	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		page_counter_charge(&memcg->kmem, nr_pages);
 out:
 	css_put(&memcg->css);
 
@@ -3714,17 +3698,6 @@ static void memcg_offline_kmem(struct me
 }
 #endif /* CONFIG_MEMCG_KMEM */
 
-static int memcg_update_kmem_max(struct mem_cgroup *memcg,
-				 unsigned long max)
-{
-	int ret;
-
-	mutex_lock(&memcg_max_mutex);
-	ret = page_counter_set_max(&memcg->kmem, max);
-	mutex_unlock(&memcg_max_mutex);
-	return ret;
-}
-
 static int memcg_update_tcp_max(struct mem_cgroup *memcg, unsigned long max)
 {
 	int ret;
@@ -3790,10 +3763,8 @@ static ssize_t mem_cgroup_write(struct k
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
 		case _KMEM:
-			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
-				     "Please report your usecase to linux-mm@kvack.org if you "
-				     "depend on this functionality.\n");
-			ret = memcg_update_kmem_max(memcg, nr_pages);
+			/* kmem.limit_in_bytes is deprecated. */
+			ret = -EOPNOTSUPP;
 			break;
 		case _TCP:
 			ret = memcg_update_tcp_max(memcg, nr_pages);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 060/262] mm: list_lru: remove holding lru lock
  2021-11-05 20:34 incoming Andrew Morton
                   ` (58 preceding siblings ...)
  2021-11-05 20:37 ` [patch 059/262] memcg, kmem: further deprecate kmem.limit_in_bytes Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 061/262] mm: list_lru: fix the return value of list_lru_count_one() Andrew Morton
                   ` (201 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, guro, hannes, linux-mm, mhocko, mm-commits, shakeelb,
	songmuchun, torvalds, willy

From: Muchun Song <songmuchun@bytedance.com>
Subject: mm: list_lru: remove holding lru lock

Since commit e5bc3af7734f ("rcu: Consolidate PREEMPT and !PREEMPT
synchronize_rcu()"), the critical section of spin lock can serve as an
RCU read-side critical section which already allows readers that hold
nlru->lock to avoid taking rcu lock.  So just remove holding lock.

Link: https://lkml.kernel.org/r/20211025124534.56345-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/list_lru.c |   11 -----------
 1 file changed, 11 deletions(-)

--- a/mm/list_lru.c~mm-list_lru-remove-holding-lru-lock
+++ a/mm/list_lru.c
@@ -398,18 +398,7 @@ static int memcg_update_list_lru_node(st
 	}
 
 	memcpy(&new->lru, &old->lru, flex_array_size(new, lru, old_size));
-
-	/*
-	 * The locking below allows readers that hold nlru->lock avoid taking
-	 * rcu_read_lock (see list_lru_from_memcg_idx).
-	 *
-	 * Since list_lru_{add,del} may be called under an IRQ-safe lock,
-	 * we have to use IRQ-safe primitives here to avoid deadlock.
-	 */
-	spin_lock_irq(&nlru->lock);
 	rcu_assign_pointer(nlru->memcg_lrus, new);
-	spin_unlock_irq(&nlru->lock);
-
 	kvfree_rcu(old, rcu);
 	return 0;
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 061/262] mm: list_lru: fix the return value of list_lru_count_one()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (59 preceding siblings ...)
  2021-11-05 20:37 ` [patch 060/262] mm: list_lru: remove holding lru lock Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 062/262] mm: memcontrol: remove kmemcg_id reparenting Andrew Morton
                   ` (200 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, guro, hannes, linux-mm, mhocko, mm-commits, shakeelb,
	songmuchun, torvalds, willy

From: Muchun Song <songmuchun@bytedance.com>
Subject: mm: list_lru: fix the return value of list_lru_count_one()

Since commit 2788cf0c401c ("memcg: reparent list_lrus and free kmemcg_id
on css offline"), ->nr_items can be negative during memory cgroup
reparenting.  In this case, list_lru_count_one() will return an unusual
and huge value, which can surprise users.  At least for now it hasn't
affected any users.  But it is better to let list_lru_count_ont() returns
zero when ->nr_items is negative.

Link: https://lkml.kernel.org/r/20211025124910.56433-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/list_lru.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/mm/list_lru.c~mm-list_lru-fix-the-return-value-of-list_lru_count_one
+++ a/mm/list_lru.c
@@ -176,13 +176,16 @@ unsigned long list_lru_count_one(struct
 {
 	struct list_lru_node *nlru = &lru->node[nid];
 	struct list_lru_one *l;
-	unsigned long count;
+	long count;
 
 	rcu_read_lock();
 	l = list_lru_from_memcg_idx(nlru, memcg_cache_id(memcg));
 	count = READ_ONCE(l->nr_items);
 	rcu_read_unlock();
 
+	if (unlikely(count < 0))
+		count = 0;
+
 	return count;
 }
 EXPORT_SYMBOL_GPL(list_lru_count_one);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 062/262] mm: memcontrol: remove kmemcg_id reparenting
  2021-11-05 20:34 incoming Andrew Morton
                   ` (60 preceding siblings ...)
  2021-11-05 20:37 ` [patch 061/262] mm: list_lru: fix the return value of list_lru_count_one() Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 063/262] mm: memcontrol: remove the kmem states Andrew Morton
                   ` (199 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, guro, hannes, linux-mm, mhocko, mm-commits, shakeelb,
	songmuchun, torvalds, willy

From: Muchun Song <songmuchun@bytedance.com>
Subject: mm: memcontrol: remove kmemcg_id reparenting

Since slab objects and kmem pages are charged to object cgroup instead of
memory cgroup, memcg_reparent_objcgs() will reparent this cgroup and all
its descendants to its parent cgroup.  This already makes further
list_lru_add()'s add elements to the parent's list.  So it is unnecessary
to change kmemcg_id of an offline cgroup to its parent's id.  It just
wastes CPU cycles.  Just to remove those redundant code.

Link: https://lkml.kernel.org/r/20211025125102.56533-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

--- a/mm/memcontrol.c~mm-memcontrol-remove-kmemcg_id-reparenting
+++ a/mm/memcontrol.c
@@ -3650,8 +3650,7 @@ static int memcg_online_kmem(struct mem_
 
 static void memcg_offline_kmem(struct mem_cgroup *memcg)
 {
-	struct cgroup_subsys_state *css;
-	struct mem_cgroup *parent, *child;
+	struct mem_cgroup *parent;
 	int kmemcg_id;
 
 	if (memcg->kmem_state != KMEM_ONLINE)
@@ -3669,21 +3668,11 @@ static void memcg_offline_kmem(struct me
 	BUG_ON(kmemcg_id < 0);
 
 	/*
-	 * Change kmemcg_id of this cgroup and all its descendants to the
-	 * parent's id, and then move all entries from this cgroup's list_lrus
-	 * to ones of the parent. After we have finished, all list_lrus
-	 * corresponding to this cgroup are guaranteed to remain empty. The
-	 * ordering is imposed by list_lru_node->lock taken by
+	 * After we have finished memcg_reparent_objcgs(), all list_lrus
+	 * corresponding to this cgroup are guaranteed to remain empty.
+	 * The ordering is imposed by list_lru_node->lock taken by
 	 * memcg_drain_all_list_lrus().
 	 */
-	rcu_read_lock(); /* can be called from css_free w/o cgroup_mutex */
-	css_for_each_descendant_pre(css, &memcg->css) {
-		child = mem_cgroup_from_css(css);
-		BUG_ON(child->kmemcg_id != kmemcg_id);
-		child->kmemcg_id = parent->kmemcg_id;
-	}
-	rcu_read_unlock();
-
 	memcg_drain_all_list_lrus(kmemcg_id, parent);
 
 	memcg_free_cache_id(kmemcg_id);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 063/262] mm: memcontrol: remove the kmem states
  2021-11-05 20:34 incoming Andrew Morton
                   ` (61 preceding siblings ...)
  2021-11-05 20:37 ` [patch 062/262] mm: memcontrol: remove kmemcg_id reparenting Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:37 ` [patch 064/262] mm: list_lru: only add memcg-aware lrus to the global lru list Andrew Morton
                   ` (198 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, guro, hannes, linux-mm, mhocko, mm-commits, shakeelb,
	songmuchun, torvalds, willy

From: Muchun Song <songmuchun@bytedance.com>
Subject: mm: memcontrol: remove the kmem states

Now the kmem states is only used to indicate whether the kmem is offline. 
However, we can set ->kmemcg_id to -1 to indicate whether the kmem is
offline.  Finally, we can remove the kmem states to simplify the code.

Link: https://lkml.kernel.org/r/20211025125259.56624-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memcontrol.h |    7 -------
 mm/memcontrol.c            |    7 ++-----
 2 files changed, 2 insertions(+), 12 deletions(-)

--- a/include/linux/memcontrol.h~mm-memcontrol-remove-the-kmem-states
+++ a/include/linux/memcontrol.h
@@ -180,12 +180,6 @@ struct mem_cgroup_thresholds {
 	struct mem_cgroup_threshold_ary *spare;
 };
 
-enum memcg_kmem_state {
-	KMEM_NONE,
-	KMEM_ALLOCATED,
-	KMEM_ONLINE,
-};
-
 #if defined(CONFIG_SMP)
 struct memcg_padding {
 	char x[0];
@@ -318,7 +312,6 @@ struct mem_cgroup {
 
 #ifdef CONFIG_MEMCG_KMEM
 	int kmemcg_id;
-	enum memcg_kmem_state kmem_state;
 	struct obj_cgroup __rcu *objcg;
 	struct list_head objcg_list; /* list of inherited objcgs */
 #endif
--- a/mm/memcontrol.c~mm-memcontrol-remove-the-kmem-states
+++ a/mm/memcontrol.c
@@ -3626,7 +3626,6 @@ static int memcg_online_kmem(struct mem_
 		return 0;
 
 	BUG_ON(memcg->kmemcg_id >= 0);
-	BUG_ON(memcg->kmem_state);
 
 	memcg_id = memcg_alloc_cache_id();
 	if (memcg_id < 0)
@@ -3643,7 +3642,6 @@ static int memcg_online_kmem(struct mem_
 	static_branch_enable(&memcg_kmem_enabled_key);
 
 	memcg->kmemcg_id = memcg_id;
-	memcg->kmem_state = KMEM_ONLINE;
 
 	return 0;
 }
@@ -3653,11 +3651,9 @@ static void memcg_offline_kmem(struct me
 	struct mem_cgroup *parent;
 	int kmemcg_id;
 
-	if (memcg->kmem_state != KMEM_ONLINE)
+	if (memcg->kmemcg_id == -1)
 		return;
 
-	memcg->kmem_state = KMEM_ALLOCATED;
-
 	parent = parent_mem_cgroup(memcg);
 	if (!parent)
 		parent = root_mem_cgroup;
@@ -3676,6 +3672,7 @@ static void memcg_offline_kmem(struct me
 	memcg_drain_all_list_lrus(kmemcg_id, parent);
 
 	memcg_free_cache_id(kmemcg_id);
+	memcg->kmemcg_id = -1;
 }
 #else
 static int memcg_online_kmem(struct mem_cgroup *memcg)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 064/262] mm: list_lru: only add memcg-aware lrus to the global lru list
  2021-11-05 20:34 incoming Andrew Morton
                   ` (62 preceding siblings ...)
  2021-11-05 20:37 ` [patch 063/262] mm: memcontrol: remove the kmem states Andrew Morton
@ 2021-11-05 20:37 ` Andrew Morton
  2021-11-05 20:38 ` [patch 065/262] mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks Andrew Morton
                   ` (197 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:37 UTC (permalink / raw)
  To: akpm, guro, hannes, linux-mm, mhocko, mm-commits, shakeelb,
	songmuchun, torvalds, willy

From: Muchun Song <songmuchun@bytedance.com>
Subject: mm: list_lru: only add memcg-aware lrus to the global lru list

The non-memcg-aware lru is always skiped when traversing the global lru
list, which is not efficient.  We can only add the memcg-aware lru to the
global lru list instead to make traversing more efficient.

Link: https://lkml.kernel.org/r/20211025124353.55781-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/list_lru.c |   35 ++++++++++++++++-------------------
 1 file changed, 16 insertions(+), 19 deletions(-)

--- a/mm/list_lru.c~mm-list_lru-only-add-memcg-aware-lrus-to-the-global-lru-list
+++ a/mm/list_lru.c
@@ -15,18 +15,29 @@
 #include "slab.h"
 
 #ifdef CONFIG_MEMCG_KMEM
-static LIST_HEAD(list_lrus);
+static LIST_HEAD(memcg_list_lrus);
 static DEFINE_MUTEX(list_lrus_mutex);
 
+static inline bool list_lru_memcg_aware(struct list_lru *lru)
+{
+	return lru->memcg_aware;
+}
+
 static void list_lru_register(struct list_lru *lru)
 {
+	if (!list_lru_memcg_aware(lru))
+		return;
+
 	mutex_lock(&list_lrus_mutex);
-	list_add(&lru->list, &list_lrus);
+	list_add(&lru->list, &memcg_list_lrus);
 	mutex_unlock(&list_lrus_mutex);
 }
 
 static void list_lru_unregister(struct list_lru *lru)
 {
+	if (!list_lru_memcg_aware(lru))
+		return;
+
 	mutex_lock(&list_lrus_mutex);
 	list_del(&lru->list);
 	mutex_unlock(&list_lrus_mutex);
@@ -37,11 +48,6 @@ static int lru_shrinker_id(struct list_l
 	return lru->shrinker_id;
 }
 
-static inline bool list_lru_memcg_aware(struct list_lru *lru)
-{
-	return lru->memcg_aware;
-}
-
 static inline struct list_lru_one *
 list_lru_from_memcg_idx(struct list_lru_node *nlru, int idx)
 {
@@ -457,9 +463,6 @@ static int memcg_update_list_lru(struct
 {
 	int i;
 
-	if (!list_lru_memcg_aware(lru))
-		return 0;
-
 	for_each_node(i) {
 		if (memcg_update_list_lru_node(&lru->node[i],
 					       old_size, new_size))
@@ -482,9 +485,6 @@ static void memcg_cancel_update_list_lru
 {
 	int i;
 
-	if (!list_lru_memcg_aware(lru))
-		return;
-
 	for_each_node(i)
 		memcg_cancel_update_list_lru_node(&lru->node[i],
 						  old_size, new_size);
@@ -497,7 +497,7 @@ int memcg_update_all_list_lrus(int new_s
 	int old_size = memcg_nr_cache_ids;
 
 	mutex_lock(&list_lrus_mutex);
-	list_for_each_entry(lru, &list_lrus, list) {
+	list_for_each_entry(lru, &memcg_list_lrus, list) {
 		ret = memcg_update_list_lru(lru, old_size, new_size);
 		if (ret)
 			goto fail;
@@ -506,7 +506,7 @@ out:
 	mutex_unlock(&list_lrus_mutex);
 	return ret;
 fail:
-	list_for_each_entry_continue_reverse(lru, &list_lrus, list)
+	list_for_each_entry_continue_reverse(lru, &memcg_list_lrus, list)
 		memcg_cancel_update_list_lru(lru, old_size, new_size);
 	goto out;
 }
@@ -543,9 +543,6 @@ static void memcg_drain_list_lru(struct
 {
 	int i;
 
-	if (!list_lru_memcg_aware(lru))
-		return;
-
 	for_each_node(i)
 		memcg_drain_list_lru_node(lru, i, src_idx, dst_memcg);
 }
@@ -555,7 +552,7 @@ void memcg_drain_all_list_lrus(int src_i
 	struct list_lru *lru;
 
 	mutex_lock(&list_lrus_mutex);
-	list_for_each_entry(lru, &list_lrus, list)
+	list_for_each_entry(lru, &memcg_list_lrus, list)
 		memcg_drain_list_lru(lru, src_idx, dst_memcg);
 	mutex_unlock(&list_lrus_mutex);
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 065/262] mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks
  2021-11-05 20:34 incoming Andrew Morton
                   ` (63 preceding siblings ...)
  2021-11-05 20:37 ` [patch 064/262] mm: list_lru: only add memcg-aware lrus to the global lru list Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 066/262] mm, oom: do not trigger out_of_memory from the #PF Andrew Morton
                   ` (196 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, guro, hannes, linux-mm, mgorman, mhocko, mm-commits,
	penguin-kernel, shakeelb, stable, torvalds, urezki, vbabka,
	vdavydov.dev, vvs

From: Vasily Averin <vvs@virtuozzo.com>
Subject: mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks

Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3.

Memory cgroup charging allows killed or exiting tasks to exceed the hard
limit.  It can be misused and allowed to trigger global OOM from inside a
memcg-limited container.  On the other hand if memcg fails allocation,
called from inside #PF handler it triggers global OOM from inside
pagefault_out_of_memory().

To prevent these problems this patchset:
a) removes execution of out_of_memory() from pagefault_out_of_memory(),
   becasue nobody can explain why it is necessary.
b) allow memcg to fail allocation of dying/killed tasks.


This patch (of 3):

Any allocation failure during the #PF path will return with VM_FAULT_OOM
which in turn results in pagefault_out_of_memory which in turn executes
out_out_memory() and can kill a random task.

An allocation might fail when the current task is the oom victim and there
are no memory reserves left.  The OOM killer is already handled at the
page allocator level for the global OOM and at the charging level for the
memcg one.  Both have much more information about the scope of
allocation/charge request.  This means that either the OOM killer has been
invoked properly and didn't lead to the allocation success or it has been
skipped because it couldn't have been invoked.  In both cases triggering
it from here is pointless and even harmful.

It makes much more sense to let the killed task die rather than to wake up
an eternally hungry oom-killer and send him to choose a fatter victim for
breakfast.

Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/oom_kill.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/mm/oom_kill.c~mm-oom-pagefault_out_of_memory-dont-force-global-oom-for-dying-tasks
+++ a/mm/oom_kill.c
@@ -1137,6 +1137,9 @@ void pagefault_out_of_memory(void)
 	if (mem_cgroup_oom_synchronize(true))
 		return;
 
+	if (fatal_signal_pending(current))
+		return;
+
 	if (!mutex_trylock(&oom_lock))
 		return;
 	out_of_memory(&oc);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 066/262] mm, oom: do not trigger out_of_memory from the #PF
  2021-11-05 20:34 incoming Andrew Morton
                   ` (64 preceding siblings ...)
  2021-11-05 20:38 ` [patch 065/262] mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 067/262] memcg: prohibit unconditional exceeding the limit of dying tasks Andrew Morton
                   ` (195 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, guro, hannes, linux-mm, mgorman, mhocko, mm-commits,
	penguin-kernel, shakeelb, stable, torvalds, urezki, vbabka,
	vdavydov.dev, vvs

From: Michal Hocko <mhocko@suse.com>
Subject: mm, oom: do not trigger out_of_memory from the #PF

Any allocation failure during the #PF path will return with VM_FAULT_OOM
which in turn results in pagefault_out_of_memory.  This can happen for 2
different reasons.  a) Memcg is out of memory and we rely on
mem_cgroup_oom_synchronize to perform the memcg OOM handling or b) normal
allocation fails.

The latter is quite problematic because allocation paths already trigger
out_of_memory and the page allocator tries really hard to not fail
allocations.  Anyway, if the OOM killer has been already invoked there is
no reason to invoke it again from the #PF path.  Especially when the OOM
condition might be gone by that time and we have no way to find out other
than allocate.

Moreover if the allocation failed and the OOM killer hasn't been invoked
then we are unlikely to do the right thing from the #PF context because we
have already lost the allocation context and restictions and therefore
might oom kill a task from a different NUMA domain.

This all suggests that there is no legitimate reason to trigger
out_of_memory from pagefault_out_of_memory so drop it.  Just to be sure
that no #PF path returns with VM_FAULT_OOM without allocation print a
warning that this is happening before we restart the #PF.

[VvS: #PF allocation can hit into limit of cgroup v1 kmem controller. 
This is a local problem related to memcg, however, it causes unnecessary
global OOM kills that are repeated over and over again and escalate into a
real disaster.  This has been broken since kmem accounting has been
introduced for cgroup v1 (3.8).  There was no kmem specific reclaim for
the separate limit so the only way to handle kmem hard limit was to return
with ENOMEM.  In upstream the problem will be fixed by removing the
outdated kmem limit, however stable and LTS kernels cannot do it and are
still affected.  This patch fixes the problem and should be backported
into stable/LTS.]

Link: https://lkml.kernel.org/r/f5fd8dd8-0ad4-c524-5f65-920b01972a42@virtuozzo.com
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/oom_kill.c |   22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

--- a/mm/oom_kill.c~mm-oom-do-not-trigger-out_of_memory-from-the-pf
+++ a/mm/oom_kill.c
@@ -1120,19 +1120,15 @@ bool out_of_memory(struct oom_control *o
 }
 
 /*
- * The pagefault handler calls here because it is out of memory, so kill a
- * memory-hogging task. If oom_lock is held by somebody else, a parallel oom
- * killing is already in progress so do nothing.
+ * The pagefault handler calls here because some allocation has failed. We have
+ * to take care of the memcg OOM here because this is the only safe context without
+ * any locks held but let the oom killer triggered from the allocation context care
+ * about the global OOM.
  */
 void pagefault_out_of_memory(void)
 {
-	struct oom_control oc = {
-		.zonelist = NULL,
-		.nodemask = NULL,
-		.memcg = NULL,
-		.gfp_mask = 0,
-		.order = 0,
-	};
+	static DEFINE_RATELIMIT_STATE(pfoom_rs, DEFAULT_RATELIMIT_INTERVAL,
+				      DEFAULT_RATELIMIT_BURST);
 
 	if (mem_cgroup_oom_synchronize(true))
 		return;
@@ -1140,10 +1136,8 @@ void pagefault_out_of_memory(void)
 	if (fatal_signal_pending(current))
 		return;
 
-	if (!mutex_trylock(&oom_lock))
-		return;
-	out_of_memory(&oc);
-	mutex_unlock(&oom_lock);
+	if (__ratelimit(&pfoom_rs))
+		pr_warn("Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF\n");
 }
 
 SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 067/262] memcg: prohibit unconditional exceeding the limit of dying tasks
  2021-11-05 20:34 incoming Andrew Morton
                   ` (65 preceding siblings ...)
  2021-11-05 20:38 ` [patch 066/262] mm, oom: do not trigger out_of_memory from the #PF Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 068/262] mm/mmap.c: fix a data race of mm->total_vm Andrew Morton
                   ` (194 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, guro, hannes, linux-mm, mgorman, mhocko, mm-commits,
	penguin-kernel, shakeelb, stable, torvalds, urezki, vbabka,
	vdavydov.dev, vvs

From: Vasily Averin <vvs@virtuozzo.com>
Subject: memcg: prohibit unconditional exceeding the limit of dying tasks

Memory cgroup charging allows killed or exiting tasks to exceed the hard
limit.  It is assumed that the amount of the memory charged by those tasks
is bound and most of the memory will get released while the task is
exiting.  This is resembling a heuristic for the global OOM situation when
tasks get access to memory reserves.  There is no global memory shortage
at the memcg level so the memcg heuristic is more relieved.

The above assumption is overly optimistic though.  E.g.  vmalloc can scale
to really large requests and the heuristic would allow that.  We used to
have an early break in the vmalloc allocator for killed tasks but this has
been reverted by commit b8c8a338f75e ("Revert "vmalloc: back off when the
current task is killed"").  There are likely other similar code paths
which do not check for fatal signals in an allocation&charge loop.  Also
there are some kernel objects charged to a memcg which are not bound to a
process life time.

It has been observed that it is not really hard to trigger these bypasses
and cause global OOM situation.

One potential way to address these runaways would be to limit the amount
of excess (similar to the global OOM with limited oom reserves).  This is
certainly possible but it is not really clear how much of an excess is
desirable and still protects from global OOMs as that would have to
consider the overall memcg configuration.

This patch is addressing the problem by removing the heuristic altogether.
Bypass is only allowed for requests which either cannot fail or where the
failure is not desirable while excess should be still limited (e.g. 
atomic requests).  Implementation wise a killed or dying task fails to
charge if it has passed the OOM killer stage.  That should give all forms
of reclaim chance to restore the limit before the failure (ENOMEM) and
tell the caller to back off.

In addition, this patch renames should_force_charge() helper to
task_is_dying() because now its use is not associated witch forced
charging.

This patch depends on pagefault_out_of_memory() to not trigger
out_of_memory(), because then a memcg failure can unwind to VM_FAULT_OOM
and cause a global OOM killer.

Link: https://lkml.kernel.org/r/8f5cebbb-06da-4902-91f0-6566fc4b4203@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |   27 ++++++++-------------------
 1 file changed, 8 insertions(+), 19 deletions(-)

--- a/mm/memcontrol.c~memcg-prohibit-unconditional-exceeding-the-limit-of-dying-tasks
+++ a/mm/memcontrol.c
@@ -234,7 +234,7 @@ enum res_type {
 	     iter != NULL;				\
 	     iter = mem_cgroup_iter(NULL, iter, NULL))
 
-static inline bool should_force_charge(void)
+static inline bool task_is_dying(void)
 {
 	return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
 		(current->flags & PF_EXITING);
@@ -1624,7 +1624,7 @@ static bool mem_cgroup_out_of_memory(str
 	 * A few threads which were not waiting at mutex_lock_killable() can
 	 * fail to bail out. Therefore, check again after holding oom_lock.
 	 */
-	ret = should_force_charge() || out_of_memory(&oc);
+	ret = task_is_dying() || out_of_memory(&oc);
 
 unlock:
 	mutex_unlock(&oom_lock);
@@ -2579,6 +2579,7 @@ static int try_charge_memcg(struct mem_c
 	struct page_counter *counter;
 	enum oom_status oom_status;
 	unsigned long nr_reclaimed;
+	bool passed_oom = false;
 	bool may_swap = true;
 	bool drained = false;
 	unsigned long pflags;
@@ -2614,15 +2615,6 @@ retry:
 		goto force;
 
 	/*
-	 * Unlike in global OOM situations, memcg is not in a physical
-	 * memory shortage.  Allow dying and OOM-killed tasks to
-	 * bypass the last charges so that they can exit quickly and
-	 * free their memory.
-	 */
-	if (unlikely(should_force_charge()))
-		goto force;
-
-	/*
 	 * Prevent unbounded recursion when reclaim operations need to
 	 * allocate memory. This might exceed the limits temporarily,
 	 * but we prefer facilitating memory reclaim and getting back
@@ -2679,8 +2671,9 @@ retry:
 	if (gfp_mask & __GFP_RETRY_MAYFAIL)
 		goto nomem;
 
-	if (fatal_signal_pending(current))
-		goto force;
+	/* Avoid endless loop for tasks bypassed by the oom killer */
+	if (passed_oom && task_is_dying())
+		goto nomem;
 
 	/*
 	 * keep retrying as long as the memcg oom killer is able to make
@@ -2689,14 +2682,10 @@ retry:
 	 */
 	oom_status = mem_cgroup_oom(mem_over_limit, gfp_mask,
 		       get_order(nr_pages * PAGE_SIZE));
-	switch (oom_status) {
-	case OOM_SUCCESS:
+	if (oom_status == OOM_SUCCESS) {
+		passed_oom = true;
 		nr_retries = MAX_RECLAIM_RETRIES;
 		goto retry;
-	case OOM_FAILED:
-		goto force;
-	default:
-		goto nomem;
 	}
 nomem:
 	if (!(gfp_mask & __GFP_NOFAIL))
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 068/262] mm/mmap.c: fix a data race of mm->total_vm
  2021-11-05 20:34 incoming Andrew Morton
                   ` (66 preceding siblings ...)
  2021-11-05 20:38 ` [patch 067/262] memcg: prohibit unconditional exceeding the limit of dying tasks Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 069/262] mm: use __pfn_to_section() instead of open coding it Andrew Morton
                   ` (193 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, linux-mm, liupeng256, mm-commits, torvalds

From: Peng Liu <liupeng256@huawei.com>
Subject: mm/mmap.c: fix a data race of mm->total_vm

Variable mm->total_vm could be accessed concurrently during mmaping and
system accounting as noticed by KCSAN,

BUG: KCSAN: data-race in __acct_update_integrals / mmap_region

read-write to 0xffffa40267bd14c8 of 8 bytes by task 15609 on cpu 3:
 mmap_region+0x6dc/0x1400
 do_mmap+0x794/0xca0
 vm_mmap_pgoff+0xdf/0x150
 ksys_mmap_pgoff+0xe1/0x380
 do_syscall_64+0x37/0x50
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

read to 0xffffa40267bd14c8 of 8 bytes by interrupt on cpu 2:
 __acct_update_integrals+0x187/0x1d0
 acct_account_cputime+0x3c/0x40
 update_process_times+0x5c/0x150
 tick_sched_timer+0x184/0x210
 __run_hrtimer+0x119/0x3b0
 hrtimer_interrupt+0x350/0xaa0
 __sysvec_apic_timer_interrupt+0x7b/0x220
 asm_call_irq_on_stack+0x12/0x20
 sysvec_apic_timer_interrupt+0x4d/0x80
 asm_sysvec_apic_timer_interrupt+0x12/0x20
 smp_call_function_single+0x192/0x2b0
 perf_install_in_context+0x29b/0x4a0
 __se_sys_perf_event_open+0x1a98/0x2550
 __x64_sys_perf_event_open+0x63/0x70
 do_syscall_64+0x37/0x50
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported by Kernel Concurrency Sanitizer on:
CPU: 2 PID: 15610 Comm: syz-executor.3 Not tainted 5.10.0+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014

In vm_stat_account which called by mmap_region, increase total_vm, and
__acct_update_integrals may read total_vm at the same time.  This will
cause a data race which lead to undefined behaviour.  To avoid potential
bad read/write, volatile property and barrier are both used to avoid
undefined behaviour.

Link: https://lkml.kernel.org/r/20210913105550.1569419-1-liupeng256@huawei.com
Signed-off-by: Peng Liu <liupeng256@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/tsacct.c |    2 +-
 mm/mmap.c       |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/tsacct.c~mm-mmapc-fix-a-data-race-of-mm-total_vm
+++ a/kernel/tsacct.c
@@ -137,7 +137,7 @@ static void __acct_update_integrals(stru
 	 * the rest of the math is done in xacct_add_tsk.
 	 */
 	tsk->acct_rss_mem1 += delta * get_mm_rss(tsk->mm) >> 10;
-	tsk->acct_vm_mem1 += delta * tsk->mm->total_vm >> 10;
+	tsk->acct_vm_mem1 += delta * READ_ONCE(tsk->mm->total_vm) >> 10;
 }
 
 /**
--- a/mm/mmap.c~mm-mmapc-fix-a-data-race-of-mm-total_vm
+++ a/mm/mmap.c
@@ -3332,7 +3332,7 @@ bool may_expand_vm(struct mm_struct *mm,
 
 void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages)
 {
-	mm->total_vm += npages;
+	WRITE_ONCE(mm->total_vm, READ_ONCE(mm->total_vm)+npages);
 
 	if (is_exec_mapping(flags))
 		mm->exec_vm += npages;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 069/262] mm: use __pfn_to_section() instead of open coding it
  2021-11-05 20:34 incoming Andrew Morton
                   ` (67 preceding siblings ...)
  2021-11-05 20:38 ` [patch 068/262] mm/mmap.c: fix a data race of mm->total_vm Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 070/262] mm/memory.c: avoid unnecessary kernel/user pointer conversion Andrew Morton
                   ` (192 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, eb, linux-mm, mm-commits, torvalds

From: Rolf Eike Beer <eb@emlix.com>
Subject: mm: use __pfn_to_section() instead of open coding it

It is defined in the same file just a few lines above.

Link: https://lkml.kernel.org/r/4598487.Rc0NezkW7i@mobilepool36.emlix.com
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mmzone.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/mmzone.h~mm-use-__pfn_to_section-instead-of-open-coding-it
+++ a/include/linux/mmzone.h
@@ -1481,7 +1481,7 @@ static inline int pfn_valid(unsigned lon
 
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
-	ms = __nr_to_section(pfn_to_section_nr(pfn));
+	ms = __pfn_to_section(pfn);
 	if (!valid_section(ms))
 		return 0;
 	/*
@@ -1496,7 +1496,7 @@ static inline int pfn_in_present_section
 {
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
-	return present_section(__nr_to_section(pfn_to_section_nr(pfn)));
+	return present_section(__pfn_to_section(pfn));
 }
 
 static inline unsigned long next_present_section_nr(unsigned long section_nr)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 070/262] mm/memory.c: avoid unnecessary kernel/user pointer conversion
  2021-11-05 20:34 incoming Andrew Morton
                   ` (68 preceding siblings ...)
  2021-11-05 20:38 ` [patch 069/262] mm: use __pfn_to_section() instead of open coding it Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 071/262] mm/memory.c: use correct VMA flags when freeing page-tables Andrew Morton
                   ` (191 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, amit.kachhap, kirill.shutemov, linux-mm, mm-commits,
	torvalds, Vincenzo.Frascino

From: Amit Daniel Kachhap <amit.kachhap@arm.com>
Subject: mm/memory.c: avoid unnecessary kernel/user pointer conversion

Annotating a pointer from __user to kernel and then back again might
confuse sparse.  In copy_huge_page_from_user() it can be avoided by
removing the intermediate variable since it is never used.

Link: https://lkml.kernel.org/r/20210914150820.19326-1-amit.kachhap@arm.com
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/mm/memory.c~mm-memory-avoid-unnecessary-kernel-user-pointer-conversion
+++ a/mm/memory.c
@@ -5421,7 +5421,6 @@ long copy_huge_page_from_user(struct pag
 				unsigned int pages_per_huge_page,
 				bool allow_pagefault)
 {
-	void *src = (void *)usr_src;
 	void *page_kaddr;
 	unsigned long i, rc = 0;
 	unsigned long ret_val = pages_per_huge_page * PAGE_SIZE;
@@ -5434,8 +5433,7 @@ long copy_huge_page_from_user(struct pag
 		else
 			page_kaddr = kmap_atomic(subpage);
 		rc = copy_from_user(page_kaddr,
-				(const void __user *)(src + i * PAGE_SIZE),
-				PAGE_SIZE);
+				usr_src + i * PAGE_SIZE, PAGE_SIZE);
 		if (allow_pagefault)
 			kunmap(subpage);
 		else
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 071/262] mm/memory.c: use correct VMA flags when freeing page-tables
  2021-11-05 20:34 incoming Andrew Morton
                   ` (69 preceding siblings ...)
  2021-11-05 20:38 ` [patch 070/262] mm/memory.c: avoid unnecessary kernel/user pointer conversion Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:57   ` Nadav Amit
  2021-11-05 20:38 ` [patch 072/262] mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte Andrew Morton
                   ` (190 subsequent siblings)
  261 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: aarcange, akpm, andrew.cooper3, dave.hansen, linux-mm, luto,
	mm-commits, namit, npiggin, peterz, tglx, torvalds, will, yuzhao

From: Nadav Amit <namit@vmware.com>
Subject: mm/memory.c: use correct VMA flags when freeing page-tables

Consistent use of the mmu_gather interface requires a call to
tlb_start_vma() and tlb_end_vma() for each VMA.  free_pgtables() does not
follow this pattern.

Certain architectures need tlb_start_vma() to be called in order for
tlb_update_vma_flags() to update the VMA flags (tlb->vma_exec and
tlb->vma_huge), which are later used for the proper TLB flush to be
issued.  Since tlb_start_vma() is not called, this can lead to the wrong
VMA flags being used when the flush is performed.

Specifically, the munmap syscall would call unmap_region(), which unmaps
the VMAs and then frees the page-tables.  A flush is needed after the
page-tables are removed to prevent page-walk caches from holding stale
entries, but this flush would use the flags of the VMA flags of the last
VMA that was flushed.  This does not appear to be right.

Use tlb_start_vma() and tlb_end_vma() to prevent this from happening. 
This might lead to unnecessary calls to flush_cache_range() on certain
arch's.  If needed, a new flag can be added to mmu_gather to indicate that
the flush is not needed.

Link: https://lkml.kernel.org/r/20211021122322.592822-1-namit@vmware.com
Signed-off-by: Nadav Amit <namit@vmware.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/mm/memory.c~mm-use-correct-vma-flags-when-freeing-page-tables
+++ a/mm/memory.c
@@ -412,6 +412,8 @@ void free_pgtables(struct mmu_gather *tl
 		unlink_anon_vmas(vma);
 		unlink_file_vma(vma);
 
+		tlb_start_vma(tlb, vma);
+
 		if (is_vm_hugetlb_page(vma)) {
 			hugetlb_free_pgd_range(tlb, addr, vma->vm_end,
 				floor, next ? next->vm_start : ceiling);
@@ -429,6 +431,8 @@ void free_pgtables(struct mmu_gather *tl
 			free_pgd_range(tlb, addr, vma->vm_end,
 				floor, next ? next->vm_start : ceiling);
 		}
+
+		tlb_end_vma(tlb, vma);
 		vma = next;
 	}
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 072/262] mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte
  2021-11-05 20:34 incoming Andrew Morton
                   ` (70 preceding siblings ...)
  2021-11-05 20:38 ` [patch 071/262] mm/memory.c: use correct VMA flags when freeing page-tables Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 073/262] mm: clear vmf->pte after pte_unmap_same() returns Andrew Morton
                   ` (189 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: aarcange, akpm, apopple, axelrasmussen, david, hughd, jglisse,
	kirill, liam.howlett, linmiaohe, linux-mm, mm-commits, peterx,
	rppt, shy828301, torvalds, willy

From: Peter Xu <peterx@redhat.com>
Subject: mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte

Patch series "mm: A few cleanup patches around zap, shmem and uffd", v4.

IMHO all of them are very nice cleanups to existing code already, they're
all small and self-contained.  They'll be needed by uffd-wp coming series.


This patch (of 4):

It was conditionally done previously, as there's one shmem special case
that we use SetPageDirty() instead.  However that's not necessary and it
should be easier and cleaner to do it unconditionally in
mfill_atomic_install_pte().

The most recent discussion about this is here, where Hugh explained the
history of SetPageDirty() and why it's possible that it's not required at
all:

https://lore.kernel.org/lkml/alpine.LSU.2.11.2104121657050.1097@eggly.anvils/

Currently mfill_atomic_install_pte() has three callers:

        1. shmem_mfill_atomic_pte
        2. mcopy_atomic_pte
        3. mcontinue_atomic_pte

After the change: case (1) should have its SetPageDirty replaced by the
dirty bit on pte (so we unify them together, finally), case (2) should
have no functional change at all as it has page_in_cache==false, case (3)
may add a dirty bit to the pte.  However since case (3) is UFFDIO_CONTINUE
for shmem, it's merely 100% sure the page is dirty after all because
UFFDIO_CONTINUE normally requires another process to modify the page cache
and kick the faulted thread, so should not make a real difference either.

This should make it much easier to follow on which case will set dirty for
uffd, as we'll simply set it all now for all uffd related ioctls. 
Meanwhile, no special handling of SetPageDirty() if there's no need.

Link: https://lkml.kernel.org/r/20210915181456.10739-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20210915181456.10739-2-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/shmem.c       |    1 -
 mm/userfaultfd.c |    3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

--- a/mm/shmem.c~mm-shmem-unconditionally-set-pte-dirty-in-mfill_atomic_install_pte
+++ a/mm/shmem.c
@@ -2423,7 +2423,6 @@ int shmem_mfill_atomic_pte(struct mm_str
 	shmem_recalc_inode(inode);
 	spin_unlock_irq(&info->lock);
 
-	SetPageDirty(page);
 	unlock_page(page);
 	return 0;
 out_delete_from_cache:
--- a/mm/userfaultfd.c~mm-shmem-unconditionally-set-pte-dirty-in-mfill_atomic_install_pte
+++ a/mm/userfaultfd.c
@@ -69,10 +69,9 @@ int mfill_atomic_install_pte(struct mm_s
 	pgoff_t offset, max_off;
 
 	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
+	_dst_pte = pte_mkdirty(_dst_pte);
 	if (page_in_cache && !vm_shared)
 		writable = false;
-	if (writable || !page_in_cache)
-		_dst_pte = pte_mkdirty(_dst_pte);
 	if (writable) {
 		if (wp_copy)
 			_dst_pte = pte_mkuffd_wp(_dst_pte);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 073/262] mm: clear vmf->pte after pte_unmap_same() returns
  2021-11-05 20:34 incoming Andrew Morton
                   ` (71 preceding siblings ...)
  2021-11-05 20:38 ` [patch 072/262] mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 074/262] mm: drop first_index/last_index in zap_details Andrew Morton
                   ` (188 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: aarcange, akpm, apopple, axelrasmussen, david, hughd, jglisse,
	kirill, liam.howlett, linmiaohe, linux-mm, mm-commits, peterx,
	rppt, shy828301, torvalds, willy

From: Peter Xu <peterx@redhat.com>
Subject: mm: clear vmf->pte after pte_unmap_same() returns

pte_unmap_same() will always unmap the pte pointer.  After the unmap,
vmf->pte will not be valid any more, we should clear it.

It was safe only because no one is accessing vmf->pte after
pte_unmap_same() returns, since the only caller of pte_unmap_same() (so
far) is do_swap_page(), where vmf->pte will in most cases be overwritten
very soon.

Directly pass in vmf into pte_unmap_same() and then we can also avoid the
long parameter list too, which should be a nice cleanup.

Link: https://lkml.kernel.org/r/20210915181533.11188-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Liam Howlett <liam.howlett@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/mm/memory.c~mm-clear-vmf-pte-after-pte_unmap_same-returns
+++ a/mm/memory.c
@@ -2728,19 +2728,19 @@ EXPORT_SYMBOL_GPL(apply_to_existing_page
  * proceeding (but do_wp_page is only called after already making such a check;
  * and do_anonymous_page can safely check later on).
  */
-static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
-				pte_t *page_table, pte_t orig_pte)
+static inline int pte_unmap_same(struct vm_fault *vmf)
 {
 	int same = 1;
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPTION)
 	if (sizeof(pte_t) > sizeof(unsigned long)) {
-		spinlock_t *ptl = pte_lockptr(mm, pmd);
+		spinlock_t *ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
 		spin_lock(ptl);
-		same = pte_same(*page_table, orig_pte);
+		same = pte_same(*vmf->pte, vmf->orig_pte);
 		spin_unlock(ptl);
 	}
 #endif
-	pte_unmap(page_table);
+	pte_unmap(vmf->pte);
+	vmf->pte = NULL;
 	return same;
 }
 
@@ -3492,7 +3492,7 @@ vm_fault_t do_swap_page(struct vm_fault
 	vm_fault_t ret = 0;
 	void *shadow = NULL;
 
-	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
+	if (!pte_unmap_same(vmf))
 		goto out;
 
 	entry = pte_to_swp_entry(vmf->orig_pte);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 074/262] mm: drop first_index/last_index in zap_details
  2021-11-05 20:34 incoming Andrew Morton
                   ` (72 preceding siblings ...)
  2021-11-05 20:38 ` [patch 073/262] mm: clear vmf->pte after pte_unmap_same() returns Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 075/262] mm: add zap_skip_check_mapping() helper Andrew Morton
                   ` (187 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: aarcange, akpm, apopple, axelrasmussen, david, hughd, jglisse,
	kirill, liam.howlett, linmiaohe, linux-mm, mm-commits, peterx,
	rppt, shy828301, torvalds, willy

From: Peter Xu <peterx@redhat.com>
Subject: mm: drop first_index/last_index in zap_details

The first_index/last_index parameters in zap_details are actually only
used in unmap_mapping_range_tree().  At the meantime, this function is
only called by unmap_mapping_pages() once.  Instead of passing these two
variables through the whole stack of page zapping code, remove them from
zap_details and let them simply be parameters of
unmap_mapping_range_tree(), which is inlined.

Link: https://lkml.kernel.org/r/20210915181535.11238-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Liam Howlett <liam.howlett@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h |    2 --
 mm/memory.c        |   31 ++++++++++++++++++-------------
 2 files changed, 18 insertions(+), 15 deletions(-)

--- a/include/linux/mm.h~mm-drop-first_index-last_index-in-zap_details
+++ a/include/linux/mm.h
@@ -1688,8 +1688,6 @@ extern void user_shm_unlock(size_t, stru
  */
 struct zap_details {
 	struct address_space *check_mapping;	/* Check page->mapping if set */
-	pgoff_t	first_index;			/* Lowest page->index to unmap */
-	pgoff_t last_index;			/* Highest page->index to unmap */
 	struct page *single_page;		/* Locked page to be unmapped */
 };
 
--- a/mm/memory.c~mm-drop-first_index-last_index-in-zap_details
+++ a/mm/memory.c
@@ -3325,20 +3325,20 @@ static void unmap_mapping_range_vma(stru
 }
 
 static inline void unmap_mapping_range_tree(struct rb_root_cached *root,
+					    pgoff_t first_index,
+					    pgoff_t last_index,
 					    struct zap_details *details)
 {
 	struct vm_area_struct *vma;
 	pgoff_t vba, vea, zba, zea;
 
-	vma_interval_tree_foreach(vma, root,
-			details->first_index, details->last_index) {
-
+	vma_interval_tree_foreach(vma, root, first_index, last_index) {
 		vba = vma->vm_pgoff;
 		vea = vba + vma_pages(vma) - 1;
-		zba = details->first_index;
+		zba = first_index;
 		if (zba < vba)
 			zba = vba;
-		zea = details->last_index;
+		zea = last_index;
 		if (zea > vea)
 			zea = vea;
 
@@ -3364,18 +3364,22 @@ void unmap_mapping_page(struct page *pag
 {
 	struct address_space *mapping = page->mapping;
 	struct zap_details details = { };
+	pgoff_t	first_index;
+	pgoff_t	last_index;
 
 	VM_BUG_ON(!PageLocked(page));
 	VM_BUG_ON(PageTail(page));
 
+	first_index = page->index;
+	last_index = page->index + thp_nr_pages(page) - 1;
+
 	details.check_mapping = mapping;
-	details.first_index = page->index;
-	details.last_index = page->index + thp_nr_pages(page) - 1;
 	details.single_page = page;
 
 	i_mmap_lock_write(mapping);
 	if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)))
-		unmap_mapping_range_tree(&mapping->i_mmap, &details);
+		unmap_mapping_range_tree(&mapping->i_mmap, first_index,
+					 last_index, &details);
 	i_mmap_unlock_write(mapping);
 }
 
@@ -3395,16 +3399,17 @@ void unmap_mapping_pages(struct address_
 		pgoff_t nr, bool even_cows)
 {
 	struct zap_details details = { };
+	pgoff_t	first_index = start;
+	pgoff_t	last_index = start + nr - 1;
 
 	details.check_mapping = even_cows ? NULL : mapping;
-	details.first_index = start;
-	details.last_index = start + nr - 1;
-	if (details.last_index < details.first_index)
-		details.last_index = ULONG_MAX;
+	if (last_index < first_index)
+		last_index = ULONG_MAX;
 
 	i_mmap_lock_write(mapping);
 	if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)))
-		unmap_mapping_range_tree(&mapping->i_mmap, &details);
+		unmap_mapping_range_tree(&mapping->i_mmap, first_index,
+					 last_index, &details);
 	i_mmap_unlock_write(mapping);
 }
 EXPORT_SYMBOL_GPL(unmap_mapping_pages);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 075/262] mm: add zap_skip_check_mapping() helper
  2021-11-05 20:34 incoming Andrew Morton
                   ` (73 preceding siblings ...)
  2021-11-05 20:38 ` [patch 074/262] mm: drop first_index/last_index in zap_details Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 076/262] mm: introduce pmd_install() helper Andrew Morton
                   ` (186 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: aarcange, akpm, apopple, axelrasmussen, david, hughd, jglisse,
	kirill, liam.howlett, linmiaohe, linux-mm, mm-commits, peterx,
	rppt, shy828301, torvalds, willy

From: Peter Xu <peterx@redhat.com>
Subject: mm: add zap_skip_check_mapping() helper

Use the helper for the checks.  Rename "check_mapping" into "zap_mapping"
because "check_mapping" looks like a bool but in fact it stores the
mapping itself.  When it's set, we check the mapping (it must be
non-NULL).  When it's cleared we skip the check, which works like the old
way.

Move the duplicated comments to the helper too.

Link: https://lkml.kernel.org/r/20210915181538.11288-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h |   16 +++++++++++++++-
 mm/memory.c        |   29 ++++++-----------------------
 2 files changed, 21 insertions(+), 24 deletions(-)

--- a/include/linux/mm.h~mm-add-zap_skip_check_mapping-helper
+++ a/include/linux/mm.h
@@ -1687,10 +1687,24 @@ extern void user_shm_unlock(size_t, stru
  * Parameter block passed down to zap_pte_range in exceptional cases.
  */
 struct zap_details {
-	struct address_space *check_mapping;	/* Check page->mapping if set */
+	struct address_space *zap_mapping;	/* Check page->mapping if set */
 	struct page *single_page;		/* Locked page to be unmapped */
 };
 
+/*
+ * We set details->zap_mappings when we want to unmap shared but keep private
+ * pages. Return true if skip zapping this page, false otherwise.
+ */
+static inline bool
+zap_skip_check_mapping(struct zap_details *details, struct page *page)
+{
+	if (!details || !page)
+		return false;
+
+	return details->zap_mapping &&
+	    (details->zap_mapping != page_rmapping(page));
+}
+
 struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 			     pte_t pte);
 struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr,
--- a/mm/memory.c~mm-add-zap_skip_check_mapping-helper
+++ a/mm/memory.c
@@ -1337,16 +1337,8 @@ again:
 			struct page *page;
 
 			page = vm_normal_page(vma, addr, ptent);
-			if (unlikely(details) && page) {
-				/*
-				 * unmap_shared_mapping_pages() wants to
-				 * invalidate cache without truncating:
-				 * unmap shared but keep private pages.
-				 */
-				if (details->check_mapping &&
-				    details->check_mapping != page_rmapping(page))
-					continue;
-			}
+			if (unlikely(zap_skip_check_mapping(details, page)))
+				continue;
 			ptent = ptep_get_and_clear_full(mm, addr, pte,
 							tlb->fullmm);
 			tlb_remove_tlb_entry(tlb, pte, addr);
@@ -1379,17 +1371,8 @@ again:
 		    is_device_exclusive_entry(entry)) {
 			struct page *page = pfn_swap_entry_to_page(entry);
 
-			if (unlikely(details && details->check_mapping)) {
-				/*
-				 * unmap_shared_mapping_pages() wants to
-				 * invalidate cache without truncating:
-				 * unmap shared but keep private pages.
-				 */
-				if (details->check_mapping !=
-				    page_rmapping(page))
-					continue;
-			}
-
+			if (unlikely(zap_skip_check_mapping(details, page)))
+				continue;
 			pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
 			rss[mm_counter(page)]--;
 
@@ -3373,7 +3356,7 @@ void unmap_mapping_page(struct page *pag
 	first_index = page->index;
 	last_index = page->index + thp_nr_pages(page) - 1;
 
-	details.check_mapping = mapping;
+	details.zap_mapping = mapping;
 	details.single_page = page;
 
 	i_mmap_lock_write(mapping);
@@ -3402,7 +3385,7 @@ void unmap_mapping_pages(struct address_
 	pgoff_t	first_index = start;
 	pgoff_t	last_index = start + nr - 1;
 
-	details.check_mapping = even_cows ? NULL : mapping;
+	details.zap_mapping = even_cows ? NULL : mapping;
 	if (last_index < first_index)
 		last_index = ULONG_MAX;
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 076/262] mm: introduce pmd_install() helper
  2021-11-05 20:34 incoming Andrew Morton
                   ` (74 preceding siblings ...)
  2021-11-05 20:38 ` [patch 075/262] mm: add zap_skip_check_mapping() helper Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 077/262] mm: remove redundant smp_wmb() Andrew Morton
                   ` (185 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, david, hannes, kirill.shutemov, linux-mm, mhocko,
	mika.penttila, mm-commits, songmuchun, tglx, torvalds, vbabka,
	vdavydov.dev, zhengqi.arch

From: Qi Zheng <zhengqi.arch@bytedance.com>
Subject: mm: introduce pmd_install() helper

Patch series "Do some code cleanups related to mm", v3.


This patch (of 2):

Currently we have three times the same few lines repeated in the code. 
Deduplicate them by newly introduced pmd_install() helper.

Link: https://lkml.kernel.org/r/20210901102722.47686-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20210901102722.47686-2-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Mika Penttila <mika.penttila@nextfour.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/filemap.c  |   11 ++---------
 mm/internal.h |    1 +
 mm/memory.c   |   34 ++++++++++++++++------------------
 3 files changed, 19 insertions(+), 27 deletions(-)

--- a/mm/filemap.c~mm-introduce-pmd_install-helper
+++ a/mm/filemap.c
@@ -3211,15 +3211,8 @@ static bool filemap_map_pmd(struct vm_fa
 	    }
 	}
 
-	if (pmd_none(*vmf->pmd)) {
-		vmf->ptl = pmd_lock(mm, vmf->pmd);
-		if (likely(pmd_none(*vmf->pmd))) {
-			mm_inc_nr_ptes(mm);
-			pmd_populate(mm, vmf->pmd, vmf->prealloc_pte);
-			vmf->prealloc_pte = NULL;
-		}
-		spin_unlock(vmf->ptl);
-	}
+	if (pmd_none(*vmf->pmd))
+		pmd_install(mm, vmf->pmd, &vmf->prealloc_pte);
 
 	/* See comment in handle_pte_fault() */
 	if (pmd_devmap_trans_unstable(vmf->pmd)) {
--- a/mm/internal.h~mm-introduce-pmd_install-helper
+++ a/mm/internal.h
@@ -38,6 +38,7 @@ vm_fault_t do_swap_page(struct vm_fault
 
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
 		unsigned long floor, unsigned long ceiling);
+void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte);
 
 static inline bool can_madv_lru_vma(struct vm_area_struct *vma)
 {
--- a/mm/memory.c~mm-introduce-pmd_install-helper
+++ a/mm/memory.c
@@ -437,9 +437,20 @@ void free_pgtables(struct mmu_gather *tl
 	}
 }
 
+void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
+{
+	spinlock_t *ptl = pmd_lock(mm, pmd);
+
+	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
+		mm_inc_nr_ptes(mm);
+		pmd_populate(mm, pmd, *pte);
+		*pte = NULL;
+	}
+	spin_unlock(ptl);
+}
+
 int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
 {
-	spinlock_t *ptl;
 	pgtable_t new = pte_alloc_one(mm);
 	if (!new)
 		return -ENOMEM;
@@ -459,13 +470,7 @@ int __pte_alloc(struct mm_struct *mm, pm
 	 */
 	smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
 
-	ptl = pmd_lock(mm, pmd);
-	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
-		mm_inc_nr_ptes(mm);
-		pmd_populate(mm, pmd, new);
-		new = NULL;
-	}
-	spin_unlock(ptl);
+	pmd_install(mm, pmd, &new);
 	if (new)
 		pte_free(mm, new);
 	return 0;
@@ -4028,17 +4033,10 @@ vm_fault_t finish_fault(struct vm_fault
 				return ret;
 		}
 
-		if (vmf->prealloc_pte) {
-			vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
-			if (likely(pmd_none(*vmf->pmd))) {
-				mm_inc_nr_ptes(vma->vm_mm);
-				pmd_populate(vma->vm_mm, vmf->pmd, vmf->prealloc_pte);
-				vmf->prealloc_pte = NULL;
-			}
-			spin_unlock(vmf->ptl);
-		} else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) {
+		if (vmf->prealloc_pte)
+			pmd_install(vma->vm_mm, vmf->pmd, &vmf->prealloc_pte);
+		else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd)))
 			return VM_FAULT_OOM;
-		}
 	}
 
 	/* See comment in handle_pte_fault() */
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 077/262] mm: remove redundant smp_wmb()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (75 preceding siblings ...)
  2021-11-05 20:38 ` [patch 076/262] mm: introduce pmd_install() helper Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 078/262] Documentation: update pagemap with shmem exceptions Andrew Morton
                   ` (184 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, david, hannes, kirill.shutemov, linux-mm, mhocko,
	mika.penttila, mm-commits, songmuchun, tglx, torvalds, vbabka,
	vdavydov.dev, zhengqi.arch

From: Qi Zheng <zhengqi.arch@bytedance.com>
Subject: mm: remove redundant smp_wmb()

The smp_wmb() which is in the __pte_alloc() is used to ensure all ptes
setup is visible before the pte is made visible to other CPUs by being put
into page tables.  We only need this when the pte is actually populated,
so move it to pmd_install().  __pte_alloc_kernel(), __p4d_alloc(),
__pud_alloc() and __pmd_alloc() are similar to this case.

We can also defer smp_wmb() to the place where the pmd entry is really
populated by preallocated pte.  There are two kinds of user of
preallocated pte, one is filemap & finish_fault(), another is THP.  The
former does not need another smp_wmb() because the smp_wmb() has been done
by pmd_install().  Fortunately, the latter also does not need another
smp_wmb() because there is already a smp_wmb() before populating the new
pte when the THP uses a preallocated pte to split a huge pmd.

Link: https://lkml.kernel.org/r/20210901102722.47686-3-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mika Penttila <mika.penttila@nextfour.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c         |   52 ++++++++++++++++++------------------------
 mm/sparse-vmemmap.c |    2 -
 2 files changed, 24 insertions(+), 30 deletions(-)

--- a/mm/memory.c~mm-remove-redundant-smp_wmb
+++ a/mm/memory.c
@@ -443,6 +443,20 @@ void pmd_install(struct mm_struct *mm, p
 
 	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
 		mm_inc_nr_ptes(mm);
+		/*
+		 * Ensure all pte setup (eg. pte page lock and page clearing) are
+		 * visible before the pte is made visible to other CPUs by being
+		 * put into page tables.
+		 *
+		 * The other side of the story is the pointer chasing in the page
+		 * table walking code (when walking the page table without locking;
+		 * ie. most of the time). Fortunately, these data accesses consist
+		 * of a chain of data-dependent loads, meaning most CPUs (alpha
+		 * being the notable exception) will already guarantee loads are
+		 * seen in-order. See the alpha page table accessors for the
+		 * smp_rmb() barriers in page table walking code.
+		 */
+		smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
 		pmd_populate(mm, pmd, *pte);
 		*pte = NULL;
 	}
@@ -455,21 +469,6 @@ int __pte_alloc(struct mm_struct *mm, pm
 	if (!new)
 		return -ENOMEM;
 
-	/*
-	 * Ensure all pte setup (eg. pte page lock and page clearing) are
-	 * visible before the pte is made visible to other CPUs by being
-	 * put into page tables.
-	 *
-	 * The other side of the story is the pointer chasing in the page
-	 * table walking code (when walking the page table without locking;
-	 * ie. most of the time). Fortunately, these data accesses consist
-	 * of a chain of data-dependent loads, meaning most CPUs (alpha
-	 * being the notable exception) will already guarantee loads are
-	 * seen in-order. See the alpha page table accessors for the
-	 * smp_rmb() barriers in page table walking code.
-	 */
-	smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
-
 	pmd_install(mm, pmd, &new);
 	if (new)
 		pte_free(mm, new);
@@ -482,10 +481,9 @@ int __pte_alloc_kernel(pmd_t *pmd)
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	spin_lock(&init_mm.page_table_lock);
 	if (likely(pmd_none(*pmd))) {	/* Has another populated it ? */
+		smp_wmb(); /* See comment in pmd_install() */
 		pmd_populate_kernel(&init_mm, pmd, new);
 		new = NULL;
 	}
@@ -3849,7 +3847,6 @@ static vm_fault_t __do_fault(struct vm_f
 		vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
 		if (!vmf->prealloc_pte)
 			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
 	ret = vma->vm_ops->fault(vmf);
@@ -3920,7 +3917,6 @@ vm_fault_t do_set_pmd(struct vm_fault *v
 		vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
 		if (!vmf->prealloc_pte)
 			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
 	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
@@ -4145,7 +4141,6 @@ static vm_fault_t do_fault_around(struct
 		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
 		if (!vmf->prealloc_pte)
 			return VM_FAULT_OOM;
-		smp_wmb(); /* See comment in __pte_alloc() */
 	}
 
 	return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
@@ -4819,13 +4814,13 @@ int __p4d_alloc(struct mm_struct *mm, pg
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	spin_lock(&mm->page_table_lock);
-	if (pgd_present(*pgd))		/* Another has populated it */
+	if (pgd_present(*pgd)) {	/* Another has populated it */
 		p4d_free(mm, new);
-	else
+	} else {
+		smp_wmb(); /* See comment in pmd_install() */
 		pgd_populate(mm, pgd, new);
+	}
 	spin_unlock(&mm->page_table_lock);
 	return 0;
 }
@@ -4842,11 +4837,10 @@ int __pud_alloc(struct mm_struct *mm, p4
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	spin_lock(&mm->page_table_lock);
 	if (!p4d_present(*p4d)) {
 		mm_inc_nr_puds(mm);
+		smp_wmb(); /* See comment in pmd_install() */
 		p4d_populate(mm, p4d, new);
 	} else	/* Another has populated it */
 		pud_free(mm, new);
@@ -4867,14 +4861,14 @@ int __pmd_alloc(struct mm_struct *mm, pu
 	if (!new)
 		return -ENOMEM;
 
-	smp_wmb(); /* See comment in __pte_alloc */
-
 	ptl = pud_lock(mm, pud);
 	if (!pud_present(*pud)) {
 		mm_inc_nr_pmds(mm);
+		smp_wmb(); /* See comment in pmd_install() */
 		pud_populate(mm, pud, new);
-	} else	/* Another has populated it */
+	} else {	/* Another has populated it */
 		pmd_free(mm, new);
+	}
 	spin_unlock(ptl);
 	return 0;
 }
--- a/mm/sparse-vmemmap.c~mm-remove-redundant-smp_wmb
+++ a/mm/sparse-vmemmap.c
@@ -76,7 +76,7 @@ static int split_vmemmap_huge_pmd(pmd_t
 		set_pte_at(&init_mm, addr, pte, entry);
 	}
 
-	/* Make pte visible before pmd. See comment in __pte_alloc(). */
+	/* Make pte visible before pmd. See comment in pmd_install(). */
 	smp_wmb();
 	pmd_populate_kernel(&init_mm, pmd, pgtable);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 078/262] Documentation: update pagemap with shmem exceptions
  2021-11-05 20:34 incoming Andrew Morton
                   ` (76 preceding siblings ...)
  2021-11-05 20:38 ` [patch 077/262] mm: remove redundant smp_wmb() Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 079/262] lazy tlb: introduce lazy mm refcount helper functions Andrew Morton
                   ` (183 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, carl.waldspurger, corbet, david, florian.schmidt,
	ivan.teterevkov, jonathan.davies, linux-mm, mm-commits, peterx,
	tiberiu.georgescu, torvalds

From: Tiberiu A Georgescu <tiberiu.georgescu@nutanix.com>
Subject: Documentation: update pagemap with shmem exceptions

This patch follows the discussions on previous documentation patch threads
[1][2].  It presents the exception case of shared memory management from
the pagemap's point of view.  It briefly describes what is missing, why it
is missing and alternatives to the pagemap for page info retrieval in user
space.

In short, the kernel does not keep track of PTEs for swapped out shared
pages within the processes that references them.  Thus, the
proc/pid/pagemap tool cannot print the swap destination of the shared
memory pages, instead setting the pagemap entry to zero for both
non-allocated and swapped out pages.  This can create confusion for users
who need information on swapped out pages.

The reasons why maintaining the PTEs of all swapped out shared pages among
all processes while maintaining similar performance is not a trivial task,
or a desirable change, have been discussed extensively [1][3][4][5]. 
There are also arguments for why this arguably missing information should
eventually be exposed to the user in either a future pagemap patch, or by
an alternative tool.

[1]: https://marc.info/?m=162878395426774
[2]: https://lore.kernel.org/lkml/20210920164931.175411-1-tiberiu.georgescu@nutanix.com/
[3]: https://lore.kernel.org/lkml/20210730160826.63785-1-tiberiu.georgescu@nutanix.com/
[4]: https://lore.kernel.org/lkml/20210807032521.7591-1-peterx@redhat.com/
[5]: https://lore.kernel.org/lkml/20210715201651.212134-1-peterx@redhat.com/


Mention the current missing information in the pagemap and alternatives on
how to retrieve it, in case someone stumbles upon unexpected behaviour.

Link: https://lkml.kernel.org/r/20210923064618.157046-1-tiberiu.georgescu@nutanix.com
Link: https://lkml.kernel.org/r/20210923064618.157046-2-tiberiu.georgescu@nutanix.com
Signed-off-by: Tiberiu A Georgescu <tiberiu.georgescu@nutanix.com>
Reviewed-by: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
Reviewed-by: Florian Schmidt <florian.schmidt@nutanix.com>
Reviewed-by: Carl Waldspurger <carl.waldspurger@nutanix.com>
Reviewed-by: Jonathan Davies <jonathan.davies@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/pagemap.rst |   22 +++++++++++++++++++++
 1 file changed, 22 insertions(+)

--- a/Documentation/admin-guide/mm/pagemap.rst~documentation-update-pagemap-with-shmem-exceptions
+++ a/Documentation/admin-guide/mm/pagemap.rst
@@ -196,6 +196,28 @@ you can go through every map in the proc
 in kpagecount, and tally up the number of pages that are only referenced
 once.
 
+Exceptions for Shared Memory
+============================
+
+Page table entries for shared pages are cleared when the pages are zapped or
+swapped out. This makes swapped out pages indistinguishable from never-allocated
+ones.
+
+In kernel space, the swap location can still be retrieved from the page cache.
+However, values stored only on the normal PTE get lost irretrievably when the
+page is swapped out (i.e. SOFT_DIRTY).
+
+In user space, whether the page is present, swapped or none can be deduced with
+the help of lseek and/or mincore system calls.
+
+lseek() can differentiate between accessed pages (present or swapped out) and
+holes (none/non-allocated) by specifying the SEEK_DATA flag on the file where
+the pages are backed. For anonymous shared pages, the file can be found in
+``/proc/pid/map_files/``.
+
+mincore() can differentiate between pages in memory (present, including swap
+cache) and out of memory (swapped out or none/non-allocated).
+
 Other notes
 ===========
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 079/262] lazy tlb: introduce lazy mm refcount helper functions
  2021-11-05 20:34 incoming Andrew Morton
                   ` (77 preceding siblings ...)
  2021-11-05 20:38 ` [patch 078/262] Documentation: update pagemap with shmem exceptions Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 080/262] lazy tlb: allow lazy tlb mm refcounting to be configurable Andrew Morton
                   ` (182 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, anton, benh, linux-mm, luto, mm-commits, npiggin, paulus,
	rdunlap, torvalds

From: Nicholas Piggin <npiggin@gmail.com>
Subject: lazy tlb: introduce lazy mm refcount helper functions

Patch series "shoot lazy tlbs", v4.

On a 16-socket 192-core POWER8 system, a context switching benchmark with
as many software threads as CPUs (so each switch will go in and out of
idle), upstream can achieve a rate of about 1 million context switches per
second.  After this series it goes up to 118 million.


This patch (of 4):

Add explicit _lazy_tlb annotated functions for lazy mm refcounting.  This
makes lazy mm references more obvious, and allows explicit refcounting to
be removed if it is not used.

If a kernel thread's current lazy tlb mm happens to be the one it wants to
use, then kthread_use_mm() cleverly transfers the mm refcount from the
lazy tlb mm reference to the returned reference.  If the lazy tlb mm
reference is no longer identical to a normal reference, this trick does
not work, so that is changed to be explicit about the two references.

[npiggin@gmail.com: fix a refcounting bug in kthread_use_mm]
  Link: https://lkml.kernel.org/r/1623125298.bx63h3mopj.astroid@bobo.none
Link: https://lkml.kernel.org/r/20210605014216.446867-1-npiggin@gmail.com
Link: https://lkml.kernel.org/r/20210605014216.446867-2-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm/mach-rpc/ecard.c            |    2 +-
 arch/powerpc/kernel/smp.c            |    2 +-
 arch/powerpc/mm/book3s64/radix_tlb.c |    4 ++--
 fs/exec.c                            |    4 ++--
 include/linux/sched/mm.h             |   11 +++++++++++
 kernel/cpu.c                         |    2 +-
 kernel/exit.c                        |    2 +-
 kernel/kthread.c                     |   21 +++++++++++++--------
 kernel/sched/core.c                  |   15 ++++++++-------
 9 files changed, 40 insertions(+), 23 deletions(-)

--- a/arch/arm/mach-rpc/ecard.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions
+++ a/arch/arm/mach-rpc/ecard.c
@@ -253,7 +253,7 @@ static int ecard_init_mm(void)
 	current->mm = mm;
 	current->active_mm = mm;
 	activate_mm(active_mm, mm);
-	mmdrop(active_mm);
+	mmdrop_lazy_tlb(active_mm);
 	ecard_init_pgtables(mm);
 	return 0;
 }
--- a/arch/powerpc/kernel/smp.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions
+++ a/arch/powerpc/kernel/smp.c
@@ -1582,7 +1582,7 @@ void start_secondary(void *unused)
 	if (IS_ENABLED(CONFIG_PPC32))
 		setup_kup();
 
-	mmgrab(&init_mm);
+	mmgrab_lazy_tlb(&init_mm);
 	current->active_mm = &init_mm;
 
 	smp_store_cpu_info(cpu);
--- a/arch/powerpc/mm/book3s64/radix_tlb.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions
+++ a/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -786,10 +786,10 @@ void exit_lazy_flush_tlb(struct mm_struc
 	if (current->active_mm == mm) {
 		WARN_ON_ONCE(current->mm != NULL);
 		/* Is a kernel thread and is using mm as the lazy tlb */
-		mmgrab(&init_mm);
+		mmgrab_lazy_tlb(&init_mm);
 		current->active_mm = &init_mm;
 		switch_mm_irqs_off(mm, &init_mm, current);
-		mmdrop(mm);
+		mmdrop_lazy_tlb(mm);
 	}
 
 	/*
--- a/fs/exec.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions
+++ a/fs/exec.c
@@ -1028,9 +1028,9 @@ static int exec_mmap(struct mm_struct *m
 		setmax_mm_hiwater_rss(&tsk->signal->maxrss, old_mm);
 		mm_update_next_owner(old_mm);
 		mmput(old_mm);
-		return 0;
+	} else {
+		mmdrop_lazy_tlb(active_mm);
 	}
-	mmdrop(active_mm);
 	return 0;
 }
 
--- a/include/linux/sched/mm.h~lazy-tlb-introduce-lazy-mm-refcount-helper-functions
+++ a/include/linux/sched/mm.h
@@ -49,6 +49,17 @@ static inline void mmdrop(struct mm_stru
 		__mmdrop(mm);
 }
 
+/* Helpers for lazy TLB mm refcounting */
+static inline void mmgrab_lazy_tlb(struct mm_struct *mm)
+{
+	mmgrab(mm);
+}
+
+static inline void mmdrop_lazy_tlb(struct mm_struct *mm)
+{
+	mmdrop(mm);
+}
+
 /**
  * mmget() - Pin the address space associated with a &struct mm_struct.
  * @mm: The address space to pin.
--- a/kernel/cpu.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions
+++ a/kernel/cpu.c
@@ -613,7 +613,7 @@ static int finish_cpu(unsigned int cpu)
 	 */
 	if (mm != &init_mm)
 		idle->active_mm = &init_mm;
-	mmdrop(mm);
+	mmdrop_lazy_tlb(mm);
 	return 0;
 }
 
--- a/kernel/exit.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions
+++ a/kernel/exit.c
@@ -475,7 +475,7 @@ static void exit_mm(void)
 		__set_current_state(TASK_RUNNING);
 		mmap_read_lock(mm);
 	}
-	mmgrab(mm);
+	mmgrab_lazy_tlb(mm);
 	BUG_ON(mm != current->active_mm);
 	/* more a memory barrier than a real lock */
 	task_lock(current);
--- a/kernel/kthread.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions
+++ a/kernel/kthread.c
@@ -1350,14 +1350,19 @@ void kthread_use_mm(struct mm_struct *mm
 	WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
 	WARN_ON_ONCE(tsk->mm);
 
+	/*
+	 * It's possible that tsk->active_mm == mm here, but we must
+	 * still mmgrab(mm) and mmdrop_lazy_tlb(active_mm), because lazy
+	 * mm may not have its own refcount (see mmgrab/drop_lazy_tlb()).
+	 */
+	mmgrab(mm);
+
 	task_lock(tsk);
 	/* Hold off tlb flush IPIs while switching mm's */
 	local_irq_disable();
 	active_mm = tsk->active_mm;
-	if (active_mm != mm) {
-		mmgrab(mm);
+	if (active_mm != mm)
 		tsk->active_mm = mm;
-	}
 	tsk->mm = mm;
 	membarrier_update_current_mm(mm);
 	switch_mm_irqs_off(active_mm, mm, tsk);
@@ -1374,12 +1379,9 @@ void kthread_use_mm(struct mm_struct *mm
 	 * memory barrier after storing to tsk->mm, before accessing
 	 * user-space memory. A full memory barrier for membarrier
 	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
-	 * mmdrop(), or explicitly with smp_mb().
+	 * mmdrop_lazy_tlb().
 	 */
-	if (active_mm != mm)
-		mmdrop(active_mm);
-	else
-		smp_mb();
+	mmdrop_lazy_tlb(active_mm);
 
 	to_kthread(tsk)->oldfs = force_uaccess_begin();
 }
@@ -1411,10 +1413,13 @@ void kthread_unuse_mm(struct mm_struct *
 	local_irq_disable();
 	tsk->mm = NULL;
 	membarrier_update_current_mm(NULL);
+	mmgrab_lazy_tlb(mm);
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
 	local_irq_enable();
 	task_unlock(tsk);
+
+	mmdrop(mm);
 }
 EXPORT_SYMBOL_GPL(kthread_unuse_mm);
 
--- a/kernel/sched/core.c~lazy-tlb-introduce-lazy-mm-refcount-helper-functions
+++ a/kernel/sched/core.c
@@ -4831,13 +4831,14 @@ static struct rq *finish_task_switch(str
 	 * rq->curr, before returning to userspace, so provide them here:
 	 *
 	 * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
-	 *   provided by mmdrop(),
+	 *   provided by mmdrop_lazy_tlb(),
 	 * - a sync_core for SYNC_CORE.
 	 */
 	if (mm) {
 		membarrier_mm_sync_core_before_usermode(mm);
-		mmdrop(mm);
+		mmdrop_lazy_tlb(mm);
 	}
+
 	if (unlikely(prev_state == TASK_DEAD)) {
 		if (prev->sched_class->task_dead)
 			prev->sched_class->task_dead(prev);
@@ -4900,9 +4901,9 @@ context_switch(struct rq *rq, struct tas
 
 	/*
 	 * kernel -> kernel   lazy + transfer active
-	 *   user -> kernel   lazy + mmgrab() active
+	 *   user -> kernel   lazy + mmgrab_lazy_tlb() active
 	 *
-	 * kernel ->   user   switch + mmdrop() active
+	 * kernel ->   user   switch + mmdrop_lazy_tlb() active
 	 *   user ->   user   switch
 	 */
 	if (!next->mm) {                                // to kernel
@@ -4910,7 +4911,7 @@ context_switch(struct rq *rq, struct tas
 
 		next->active_mm = prev->active_mm;
 		if (prev->mm)                           // from user
-			mmgrab(prev->active_mm);
+			mmgrab_lazy_tlb(prev->active_mm);
 		else
 			prev->active_mm = NULL;
 	} else {                                        // to user
@@ -4926,7 +4927,7 @@ context_switch(struct rq *rq, struct tas
 		switch_mm_irqs_off(prev->active_mm, next->mm, next);
 
 		if (!prev->mm) {                        // from kernel
-			/* will mmdrop() in finish_task_switch(). */
+			/* will mmdrop_lazy_tlb() in finish_task_switch(). */
 			rq->prev_mm = prev->active_mm;
 			prev->active_mm = NULL;
 		}
@@ -9442,7 +9443,7 @@ void __init sched_init(void)
 	/*
 	 * The boot idle thread does lazy MMU switching as well:
 	 */
-	mmgrab(&init_mm);
+	mmgrab_lazy_tlb(&init_mm);
 	enter_lazy_tlb(&init_mm, current);
 
 	/*
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 080/262] lazy tlb: allow lazy tlb mm refcounting to be configurable
  2021-11-05 20:34 incoming Andrew Morton
                   ` (78 preceding siblings ...)
  2021-11-05 20:38 ` [patch 079/262] lazy tlb: introduce lazy mm refcount helper functions Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-06  4:29   ` Andy Lutomirski
  2021-11-05 20:38 ` [patch 081/262] lazy tlb: shoot lazies, a non-refcounting lazy tlb option Andrew Morton
                   ` (181 subsequent siblings)
  261 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, anton, benh, linux-mm, luto, mm-commits, npiggin, paulus,
	rdunlap, torvalds

From: Nicholas Piggin <npiggin@gmail.com>
Subject: lazy tlb: allow lazy tlb mm refcounting to be configurable

Add CONFIG_MMU_TLB_REFCOUNT which enables refcounting of the lazy tlb mm
when it is context switched.  This can be disabled by architectures that
don't require this refcounting if they clean up lazy tlb mms when the last
refcount is dropped.  Currently this is always enabled, which is what
existing code does, so the patch is effectively a no-op.

Rename rq->prev_mm to rq->prev_lazy_mm, because that's what it is.

[akpm@linux-foundation.org: fix comment]
[npiggin@gmail.com: update comments]
  Link: https://lkml.kernel.org/r/1623121605.j47gdpccep.astroid@bobo.none
Link: https://lkml.kernel.org/r/20210605014216.446867-3-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/Kconfig             |   14 ++++++++++++++
 include/linux/sched/mm.h |   14 ++++++++++++--
 kernel/sched/core.c      |   22 ++++++++++++++++++----
 kernel/sched/sched.h     |    4 +++-
 4 files changed, 47 insertions(+), 7 deletions(-)

--- a/arch/Kconfig~lazy-tlb-allow-lazy-tlb-mm-refcounting-to-be-configurable
+++ a/arch/Kconfig
@@ -428,6 +428,20 @@ config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
 	  irqs disabled over activate_mm. Architectures that do IPI based TLB
 	  shootdowns should enable this.
 
+# Use normal mm refcounting for MMU_LAZY_TLB kernel thread references.
+# MMU_LAZY_TLB_REFCOUNT=n can improve the scalability of context switching
+# to/from kernel threads when the same mm is running on a lot of CPUs (a large
+# multi-threaded application), by reducing contention on the mm refcount.
+#
+# This can be disabled if the architecture ensures no CPUs are using an mm as a
+# "lazy tlb" beyond its final refcount (i.e., by the time __mmdrop frees the mm
+# or its kernel page tables). This could be arranged by arch_exit_mmap(), or
+# final exit(2) TLB flush, for example. arch code must also ensure the
+# _lazy_tlb variants of mmgrab/mmdrop are used when dropping the lazy reference
+# to a kthread ->active_mm (non-arch code has been converted already).
+config MMU_LAZY_TLB_REFCOUNT
+	def_bool y
+
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
 	bool
 
--- a/include/linux/sched/mm.h~lazy-tlb-allow-lazy-tlb-mm-refcounting-to-be-configurable
+++ a/include/linux/sched/mm.h
@@ -52,12 +52,22 @@ static inline void mmdrop(struct mm_stru
 /* Helpers for lazy TLB mm refcounting */
 static inline void mmgrab_lazy_tlb(struct mm_struct *mm)
 {
-	mmgrab(mm);
+	if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_REFCOUNT))
+		mmgrab(mm);
 }
 
 static inline void mmdrop_lazy_tlb(struct mm_struct *mm)
 {
-	mmdrop(mm);
+	if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_REFCOUNT)) {
+		mmdrop(mm);
+	} else {
+		/*
+		 * mmdrop_lazy_tlb must provide a full memory barrier, see the
+		 * membarrier comment in finish_task_switch which relies on
+		 * this.
+		 */
+		smp_mb();
+	}
 }
 
 /**
--- a/kernel/sched/core.c~lazy-tlb-allow-lazy-tlb-mm-refcounting-to-be-configurable
+++ a/kernel/sched/core.c
@@ -4772,7 +4772,7 @@ static struct rq *finish_task_switch(str
 	__releases(rq->lock)
 {
 	struct rq *rq = this_rq();
-	struct mm_struct *mm = rq->prev_mm;
+	struct mm_struct *mm = NULL;
 	long prev_state;
 
 	/*
@@ -4791,7 +4791,10 @@ static struct rq *finish_task_switch(str
 		      current->comm, current->pid, preempt_count()))
 		preempt_count_set(FORK_PREEMPT_COUNT);
 
-	rq->prev_mm = NULL;
+#ifdef CONFIG_MMU_LAZY_TLB_REFCOUNT
+	mm = rq->prev_lazy_mm;
+	rq->prev_lazy_mm = NULL;
+#endif
 
 	/*
 	 * A task struct has one reference for the use as "current".
@@ -4927,9 +4930,20 @@ context_switch(struct rq *rq, struct tas
 		switch_mm_irqs_off(prev->active_mm, next->mm, next);
 
 		if (!prev->mm) {                        // from kernel
-			/* will mmdrop_lazy_tlb() in finish_task_switch(). */
-			rq->prev_mm = prev->active_mm;
+#ifdef CONFIG_MMU_LAZY_TLB_REFCOUNT
+			/* Will mmdrop_lazy_tlb() in finish_task_switch(). */
+			rq->prev_lazy_mm = prev->active_mm;
 			prev->active_mm = NULL;
+#else
+			/*
+			 * Without MMU_LAZY_TLB_REFCOUNT there is no lazy
+			 * tracking (because no rq->prev_lazy_mm) in
+			 * finish_task_switch, so no mmdrop_lazy_tlb(), so no
+			 * memory barrier for membarrier (see the membarrier
+			 * comment in finish_task_switch()).  Do it here.
+			 */
+			smp_mb();
+#endif
 		}
 	}
 
--- a/kernel/sched/sched.h~lazy-tlb-allow-lazy-tlb-mm-refcounting-to-be-configurable
+++ a/kernel/sched/sched.h
@@ -977,7 +977,9 @@ struct rq {
 	struct task_struct	*idle;
 	struct task_struct	*stop;
 	unsigned long		next_balance;
-	struct mm_struct	*prev_mm;
+#ifdef CONFIG_MMU_LAZY_TLB_REFCOUNT
+	struct mm_struct	*prev_lazy_mm;
+#endif
 
 	unsigned int		clock_update_flags;
 	u64			clock;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 081/262] lazy tlb: shoot lazies, a non-refcounting lazy tlb option
  2021-11-05 20:34 incoming Andrew Morton
                   ` (79 preceding siblings ...)
  2021-11-05 20:38 ` [patch 080/262] lazy tlb: allow lazy tlb mm refcounting to be configurable Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:38 ` [patch 082/262] powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN Andrew Morton
                   ` (180 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, anton, benh, linux-mm, luto, mm-commits, npiggin, paulus,
	rdunlap, torvalds

From: Nicholas Piggin <npiggin@gmail.com>
Subject: lazy tlb: shoot lazies, a non-refcounting lazy tlb option

On big systems, the mm refcount can become highly contented when doing a
lot of context switching with threaded applications (particularly
switching between the idle thread and an application thread).

Abandoning lazy tlb slows switching down quite a bit in the important
user->idle->user cases, so instead implement a non-refcounted scheme that
causes __mmdrop() to IPI all CPUs in the mm_cpumask and shoot down any
remaining lazy ones.

Shootdown IPIs are some concern, but they have not been observed to be a
big problem with this scheme (the powerpc implementation generated 314
additional interrupts on a 144 CPU system during a kernel compile).  There
are a number of strategies that could be employed to reduce IPIs if they
turn out to be a problem for some workload.

[npiggin@gmail.com: update comments]
  Link: https://lkml.kernel.org/r/1623121901.mszkmmum0n.astroid@bobo.none
Link: https://lkml.kernel.org/r/20210605014216.446867-4-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/Kconfig  |   14 +++++++++++++
 kernel/fork.c |   51 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

--- a/arch/Kconfig~lazy-tlb-shoot-lazies-a-non-refcounting-lazy-tlb-option
+++ a/arch/Kconfig
@@ -441,6 +441,20 @@ config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
 # to a kthread ->active_mm (non-arch code has been converted already).
 config MMU_LAZY_TLB_REFCOUNT
 	def_bool y
+	depends on !MMU_LAZY_TLB_SHOOTDOWN
+
+# This option allows MMU_LAZY_TLB_REFCOUNT=n. It ensures no CPUs are using an
+# mm as a lazy tlb beyond its last reference count, by shooting down these
+# users before the mm is deallocated. __mmdrop() first IPIs all CPUs that may
+# be using the mm as a lazy tlb, so that they may switch themselves to using
+# init_mm for their active mm. mm_cpumask(mm) is used to determine which CPUs
+# may be using mm as a lazy tlb mm.
+#
+# To implement this, an arch must ensure mm_cpumask(mm) contains at least all
+# possible CPUs in which the mm is lazy, and it must meet the requirements for
+# MMU_LAZY_TLB_REFCOUNT=n (see above).
+config MMU_LAZY_TLB_SHOOTDOWN
+	bool
 
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
 	bool
--- a/kernel/fork.c~lazy-tlb-shoot-lazies-a-non-refcounting-lazy-tlb-option
+++ a/kernel/fork.c
@@ -686,6 +686,53 @@ static void check_mm(struct mm_struct *m
 #define allocate_mm()	(kmem_cache_alloc(mm_cachep, GFP_KERNEL))
 #define free_mm(mm)	(kmem_cache_free(mm_cachep, (mm)))
 
+static void do_shoot_lazy_tlb(void *arg)
+{
+	struct mm_struct *mm = arg;
+
+	if (current->active_mm == mm) {
+		WARN_ON_ONCE(current->mm);
+		current->active_mm = &init_mm;
+		switch_mm(mm, &init_mm, current);
+	}
+}
+
+static void do_check_lazy_tlb(void *arg)
+{
+	struct mm_struct *mm = arg;
+
+	WARN_ON_ONCE(current->active_mm == mm);
+}
+
+static void shoot_lazy_tlbs(struct mm_struct *mm)
+{
+	if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_SHOOTDOWN)) {
+		/*
+		 * IPI overheads have not found to be expensive, but they could
+		 * be reduced in a number of possible ways, for example (in
+		 * roughly increasing order of complexity):
+		 * - A batch of mms requiring IPIs could be gathered and freed
+		 *   at once.
+		 * - CPUs could store their active mm somewhere that can be
+		 *   remotely checked without a lock, to filter out
+		 *   false-positives in the cpumask.
+		 * - After mm_users or mm_count reaches zero, switching away
+		 *   from the mm could clear mm_cpumask to reduce some IPIs
+		 *   (some batching or delaying would help).
+		 * - A delayed freeing and RCU-like quiescing sequence based on
+		 *   mm switching to avoid IPIs completely.
+		 */
+		on_each_cpu_mask(mm_cpumask(mm), do_shoot_lazy_tlb, (void *)mm, 1);
+		if (IS_ENABLED(CONFIG_DEBUG_VM))
+			on_each_cpu(do_check_lazy_tlb, (void *)mm, 1);
+	} else {
+		/*
+		 * In this case, lazy tlb mms are refounted and would not reach
+		 * __mmdrop until all CPUs have switched away and mmdrop()ed.
+		 */
+	}
+}
+
 /*
  * Called when the last reference to the mm
  * is dropped: either by a lazy thread or by
@@ -695,6 +742,10 @@ void __mmdrop(struct mm_struct *mm)
 {
 	BUG_ON(mm == &init_mm);
 	WARN_ON_ONCE(mm == current->mm);
+
+	/* Ensure no CPUs are using this as their lazy tlb mm */
+	shoot_lazy_tlbs(mm);
+
 	WARN_ON_ONCE(mm == current->active_mm);
 	mm_free_pgd(mm);
 	destroy_context(mm);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 082/262] powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN
  2021-11-05 20:34 incoming Andrew Morton
                   ` (80 preceding siblings ...)
  2021-11-05 20:38 ` [patch 081/262] lazy tlb: shoot lazies, a non-refcounting lazy tlb option Andrew Morton
@ 2021-11-05 20:38 ` Andrew Morton
  2021-11-05 20:39 ` [patch 083/262] memory: remove unused CONFIG_MEM_BLOCK_SIZE Andrew Morton
                   ` (179 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:38 UTC (permalink / raw)
  To: akpm, anton, benh, linux-mm, luto, mm-commits, npiggin, paulus,
	rdunlap, torvalds

From: Nicholas Piggin <npiggin@gmail.com>
Subject: powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN

On a 16-socket 192-core POWER8 system, a context switching benchmark
with as many software threads as CPUs (so each switch will go in and
out of idle), upstream can achieve a rate of about 1 million context
switches per second. After this patch it goes up to 118 million.

No real datya for real world workloads unfortunately.  I think it's
always been a "known" cacheline, it just showed up badly on
will-it-scale tests recently when Anton was doing a sweep of low
hanging scalability issues on big systems.

We have some very big systems running certain in-memory databases that
get into very high contention conditions on mutexes that push context
switch rates right up and with idle times pretty high, which would get
a lot of parallel context switching between user and idle thread, we
might be getting a bit of this contention there.

It's not something at the top of profiles though.  And on
multi-threaded workloads like this, the normal refcounting of the user
mm still has fundmaental contention.  It's tricky to get the change
tested on these workloads (machine time is very limited and I can't
drive the software).

I suspect it could also show in things that do high net or disk IO
rates (enough to need a lot of cores), and do some user processing
steps along the way.  You'd potentially get a lot of idle switching.


This infrastructure could be beneficial to other architectures.  The
cacheline is going to bounce in the same situations on other archs, so
I would say yes.  Rik at one stage had some patches to try avoid it for
x86 some years ago, I don't know what happened to those.

The way powerpc has to maintain mm_cpumask for its TLB flushing makes
it relatively easy to do this shootdown, and we decided the additional
IPIs were less of a concern than the bouncing.  Others have different
concerns, but I tried to make it generic and add comments explaining
what other archs can do, or possibly different ways it might be
achieved.

Link: https://lkml.kernel.org/r/20210605014216.446867-5-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/Kconfig |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/powerpc/Kconfig~powerpc-64s-enable-mmu_lazy_tlb_shootdown
+++ a/arch/powerpc/Kconfig
@@ -249,6 +249,7 @@ config PPC
 	select IRQ_FORCED_THREADING
 	select MMU_GATHER_PAGE_SIZE
 	select MMU_GATHER_RCU_TABLE_FREE
+	select MMU_LAZY_TLB_SHOOTDOWN		if PPC_BOOK3S_64
 	select MODULES_USE_ELF_RELA
 	select NEED_DMA_MAP_STATE		if PPC64 || NOT_COHERENT_CACHE
 	select NEED_SG_DMA_LENGTH
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 083/262] memory: remove unused CONFIG_MEM_BLOCK_SIZE
  2021-11-05 20:34 incoming Andrew Morton
                   ` (81 preceding siblings ...)
  2021-11-05 20:38 ` [patch 082/262] powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 084/262] mm/mprotect.c: avoid repeated assignment in do_mprotect_pkey() Andrew Morton
                   ` (178 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, dave.hansen, david, linux-mm, lukas.bulwahn, mhocko,
	mm-commits, torvalds

From: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Subject: memory: remove unused CONFIG_MEM_BLOCK_SIZE

Commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove
functions") defines CONFIG_MEM_BLOCK_SIZE, but this has never been
utilized anywhere.

It is a good practice to keep the CONFIG_* defines exclusively for the
Kbuild system.  So, drop this unused definition.

This issue was noticed due to running ./scripts/checkkconfigsymbols.py.

Link: https://lkml.kernel.org/r/20211006120354.7468-1-lukas.bulwahn@gmail.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memory.h |    1 -
 1 file changed, 1 deletion(-)

--- a/include/linux/memory.h~memory-remove-unused-config_mem_block_size
+++ a/include/linux/memory.h
@@ -140,7 +140,6 @@ typedef int (*walk_memory_blocks_func_t)
 extern int walk_memory_blocks(unsigned long start, unsigned long size,
 			      void *arg, walk_memory_blocks_func_t func);
 extern int for_each_memory_block(void *arg, walk_memory_blocks_func_t func);
-#define CONFIG_MEM_BLOCK_SIZE	(PAGES_PER_SECTION<<PAGE_SHIFT)
 
 extern int memory_group_register_static(int nid, unsigned long max_pages);
 extern int memory_group_register_dynamic(int nid, unsigned long unit_pages);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 084/262] mm/mprotect.c: avoid repeated assignment in do_mprotect_pkey()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (82 preceding siblings ...)
  2021-11-05 20:39 ` [patch 083/262] memory: remove unused CONFIG_MEM_BLOCK_SIZE Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 085/262] mm/mremap: don't account pages in vma_to_resize() Andrew Morton
                   ` (177 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, linux-mm, liu.song11, mm-commits, torvalds

From: Liu Song <liu.song11@zte.com.cn>
Subject: mm/mprotect.c: avoid repeated assignment in do_mprotect_pkey()

After adjustment, the repeated assignment of "prev" is avoided, and the
readability of the code is improved.

Link: https://lkml.kernel.org/r/20211012152444.4127-1-fishland@aliyun.com
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mprotect.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/mm/mprotect.c~mm-mprotectc-avoid-repeated-assignment-in-do_mprotect_pkey
+++ a/mm/mprotect.c
@@ -563,7 +563,7 @@ static int do_mprotect_pkey(unsigned lon
 	error = -ENOMEM;
 	if (!vma)
 		goto out;
-	prev = vma->vm_prev;
+
 	if (unlikely(grows & PROT_GROWSDOWN)) {
 		if (vma->vm_start >= end)
 			goto out;
@@ -581,8 +581,11 @@ static int do_mprotect_pkey(unsigned lon
 				goto out;
 		}
 	}
+
 	if (start > vma->vm_start)
 		prev = vma;
+	else
+		prev = vma->vm_prev;
 
 	for (nstart = start ; ; ) {
 		unsigned long mask_off_old_flags;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 085/262] mm/mremap: don't account pages in vma_to_resize()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (83 preceding siblings ...)
  2021-11-05 20:39 ` [patch 084/262] mm/mprotect.c: avoid repeated assignment in do_mprotect_pkey() Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 086/262] include/linux/io-mapping.h: remove fallback for writecombine Andrew Morton
                   ` (176 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, bgeffon, catalin.marinas, chenwandun, dan.carpenter,
	dan.j.williams, dave.jiang, dima, hughd, jgg, jhubbard,
	kirill.shutemov, linux-mm, linux, luto, mike.kravetz, minchan,
	mingo, mm-commits, rcampbell, tglx, torvalds, tsbogend, vbabka,
	viro, vishal.l.verma, wangkefeng.wang, weiyongjun1, will

From: Dmitry Safonov <dima@arista.com>
Subject: mm/mremap: don't account pages in vma_to_resize()

All this vm_unacct_memory(charged) dance seems to complicate the life
without a good reason.  Furthermore, it seems not always done right on
error-pathes in mremap_to().  And worse than that: this `charged'
difference is sometimes double-accounted for growing MREMAP_DONTUNMAP
mremap()s in move_vma():

	if (security_vm_enough_memory_mm(mm, new_len >> PAGE_SHIFT))

Let's not do this.  Account memory in mremap() fast-path for growing VMAs
or in move_vma() for actually moving things.  The same simpler way as it's
done by vm_stat_account(), but with a difference to call
security_vm_enough_memory_mm() before copying/adjusting VMA.

Originally noticed by Chen Wandun:
https://lkml.kernel.org/r/20210717101942.120607-1-chenwandun@huawei.com

Link: https://lkml.kernel.org/r/20210721131320.522061-1-dima@arista.com
Fixes: e346b3813067 ("mm/mremap: add MREMAP_DONTUNMAP to mremap()")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Brian Geffon <bgeffon@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Wandun <chenwandun@huawei.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mremap.c |   50 ++++++++++++++++++++++----------------------------
 1 file changed, 22 insertions(+), 28 deletions(-)

--- a/mm/mremap.c~mm-mremap-dont-account-pages-in-vma_to_resize
+++ a/mm/mremap.c
@@ -565,6 +565,7 @@ static unsigned long move_vma(struct vm_
 		bool *locked, unsigned long flags,
 		struct vm_userfaultfd_ctx *uf, struct list_head *uf_unmap)
 {
+	long to_account = new_len - old_len;
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *new_vma;
 	unsigned long vm_flags = vma->vm_flags;
@@ -583,6 +584,9 @@ static unsigned long move_vma(struct vm_
 	if (mm->map_count >= sysctl_max_map_count - 3)
 		return -ENOMEM;
 
+	if (unlikely(flags & MREMAP_DONTUNMAP))
+		to_account = new_len;
+
 	if (vma->vm_ops && vma->vm_ops->may_split) {
 		if (vma->vm_start != old_addr)
 			err = vma->vm_ops->may_split(vma, old_addr);
@@ -604,8 +608,8 @@ static unsigned long move_vma(struct vm_
 	if (err)
 		return err;
 
-	if (unlikely(flags & MREMAP_DONTUNMAP && vm_flags & VM_ACCOUNT)) {
-		if (security_vm_enough_memory_mm(mm, new_len >> PAGE_SHIFT))
+	if (vm_flags & VM_ACCOUNT) {
+		if (security_vm_enough_memory_mm(mm, to_account >> PAGE_SHIFT))
 			return -ENOMEM;
 	}
 
@@ -613,8 +617,8 @@ static unsigned long move_vma(struct vm_
 	new_vma = copy_vma(&vma, new_addr, new_len, new_pgoff,
 			   &need_rmap_locks);
 	if (!new_vma) {
-		if (unlikely(flags & MREMAP_DONTUNMAP && vm_flags & VM_ACCOUNT))
-			vm_unacct_memory(new_len >> PAGE_SHIFT);
+		if (vm_flags & VM_ACCOUNT)
+			vm_unacct_memory(to_account >> PAGE_SHIFT);
 		return -ENOMEM;
 	}
 
@@ -708,8 +712,7 @@ static unsigned long move_vma(struct vm_
 }
 
 static struct vm_area_struct *vma_to_resize(unsigned long addr,
-	unsigned long old_len, unsigned long new_len, unsigned long flags,
-	unsigned long *p)
+	unsigned long old_len, unsigned long new_len, unsigned long flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
@@ -768,13 +771,6 @@ static struct vm_area_struct *vma_to_res
 				(new_len - old_len) >> PAGE_SHIFT))
 		return ERR_PTR(-ENOMEM);
 
-	if (vma->vm_flags & VM_ACCOUNT) {
-		unsigned long charged = (new_len - old_len) >> PAGE_SHIFT;
-		if (security_vm_enough_memory_mm(mm, charged))
-			return ERR_PTR(-ENOMEM);
-		*p = charged;
-	}
-
 	return vma;
 }
 
@@ -787,7 +783,6 @@ static unsigned long mremap_to(unsigned
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
 	unsigned long ret = -EINVAL;
-	unsigned long charged = 0;
 	unsigned long map_flags = 0;
 
 	if (offset_in_page(new_addr))
@@ -830,7 +825,7 @@ static unsigned long mremap_to(unsigned
 		old_len = new_len;
 	}
 
-	vma = vma_to_resize(addr, old_len, new_len, flags, &charged);
+	vma = vma_to_resize(addr, old_len, new_len, flags);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
 		goto out;
@@ -853,7 +848,7 @@ static unsigned long mremap_to(unsigned
 				((addr - vma->vm_start) >> PAGE_SHIFT),
 				map_flags);
 	if (IS_ERR_VALUE(ret))
-		goto out1;
+		goto out;
 
 	/* We got a new mapping */
 	if (!(flags & MREMAP_FIXED))
@@ -862,12 +857,6 @@ static unsigned long mremap_to(unsigned
 	ret = move_vma(vma, addr, old_len, new_len, new_addr, locked, flags, uf,
 		       uf_unmap);
 
-	if (!(offset_in_page(ret)))
-		goto out;
-
-out1:
-	vm_unacct_memory(charged);
-
 out:
 	return ret;
 }
@@ -899,7 +888,6 @@ SYSCALL_DEFINE5(mremap, unsigned long, a
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
 	unsigned long ret = -EINVAL;
-	unsigned long charged = 0;
 	bool locked = false;
 	bool downgraded = false;
 	struct vm_userfaultfd_ctx uf = NULL_VM_UFFD_CTX;
@@ -981,7 +969,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, a
 	/*
 	 * Ok, we need to grow..
 	 */
-	vma = vma_to_resize(addr, old_len, new_len, flags, &charged);
+	vma = vma_to_resize(addr, old_len, new_len, flags);
 	if (IS_ERR(vma)) {
 		ret = PTR_ERR(vma);
 		goto out;
@@ -992,10 +980,18 @@ SYSCALL_DEFINE5(mremap, unsigned long, a
 	if (old_len == vma->vm_end - addr) {
 		/* can we just expand the current mapping? */
 		if (vma_expandable(vma, new_len - old_len)) {
-			int pages = (new_len - old_len) >> PAGE_SHIFT;
+			long pages = (new_len - old_len) >> PAGE_SHIFT;
+
+			if (vma->vm_flags & VM_ACCOUNT) {
+				if (security_vm_enough_memory_mm(mm, pages)) {
+					ret = -ENOMEM;
+					goto out;
+				}
+			}
 
 			if (vma_adjust(vma, vma->vm_start, addr + new_len,
 				       vma->vm_pgoff, NULL)) {
+				vm_unacct_memory(pages);
 				ret = -ENOMEM;
 				goto out;
 			}
@@ -1034,10 +1030,8 @@ SYSCALL_DEFINE5(mremap, unsigned long, a
 			       &locked, flags, &uf, &uf_unmap);
 	}
 out:
-	if (offset_in_page(ret)) {
-		vm_unacct_memory(charged);
+	if (offset_in_page(ret))
 		locked = false;
-	}
 	if (downgraded)
 		mmap_read_unlock(current->mm);
 	else
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 086/262] include/linux/io-mapping.h: remove fallback for writecombine
  2021-11-05 20:34 incoming Andrew Morton
                   ` (84 preceding siblings ...)
  2021-11-05 20:39 ` [patch 085/262] mm/mremap: don't account pages in vma_to_resize() Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 087/262] mm: mmap_lock: remove redundant newline in TP_printk Andrew Morton
                   ` (175 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, chris, daniel.vetter, joonas.lahtinen, linux-mm,
	lucas.demarchi, mm-commits, peterz, torvalds

From: Lucas De Marchi <lucas.demarchi@intel.com>
Subject: include/linux/io-mapping.h: remove fallback for writecombine

The fallback was introduced in commit 80c33624e472 ("io-mapping: Fixup for
different names of writecombine") to fix the build on microblaze.

5 years later, it seems all archs now provide a pgprot_writecombine(), so
just remove the other possible fallbacks.  For microblaze,
pgprot_writecombine() is available since commit 97ccedd793ac ("microblaze:
Provide pgprot_device/writecombine macros for nommu").

This is build-tested on microblaze with a hack to always build
mm/io-mapping.o and without DIYing on an x86-only macro (_PAGE_CACHE_MASK)

Link: https://lkml.kernel.org/r/20211020204838.1142908-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/io-mapping.h |    6 ------
 1 file changed, 6 deletions(-)

--- a/include/linux/io-mapping.h~io-mapping-remove-fallback-for-writecombine
+++ a/include/linux/io-mapping.h
@@ -132,13 +132,7 @@ io_mapping_init_wc(struct io_mapping *io
 
 	iomap->base = base;
 	iomap->size = size;
-#if defined(pgprot_noncached_wc) /* archs can't agree on a name ... */
-	iomap->prot = pgprot_noncached_wc(PAGE_KERNEL);
-#elif defined(pgprot_writecombine)
 	iomap->prot = pgprot_writecombine(PAGE_KERNEL);
-#else
-	iomap->prot = pgprot_noncached(PAGE_KERNEL);
-#endif
 
 	return iomap;
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 087/262] mm: mmap_lock: remove redundant newline  in TP_printk
  2021-11-05 20:34 incoming Andrew Morton
                   ` (85 preceding siblings ...)
  2021-11-05 20:39 ` [patch 086/262] include/linux/io-mapping.h: remove fallback for writecombine Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 088/262] mm: mmap_lock: use DECLARE_EVENT_CLASS and DEFINE_EVENT_FN Andrew Morton
                   ` (174 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, axelrasmussen, ligang.bdlg, linux-mm, mingo, mm-commits,
	rostedt, torvalds, vbabka

From: Gang Li <ligang.bdlg@bytedance.com>
Subject: mm: mmap_lock: remove redundant newline  in TP_printk

Ftrace core will add newline automatically on printing, so using it in
TP_printkcreates a blank line.

Link: https://lkml.kernel.org/r/20211009071105.69544-1-ligang.bdlg@bytedance.com
Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/trace/events/mmap_lock.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/include/trace/events/mmap_lock.h~mm-mmap_lock-remove-redundant-newline-in-tp_printk
+++ a/include/trace/events/mmap_lock.h
@@ -32,7 +32,7 @@ TRACE_EVENT_FN(mmap_lock_start_locking,
 	),
 
 	TP_printk(
-		"mm=%p memcg_path=%s write=%s\n",
+		"mm=%p memcg_path=%s write=%s",
 		__entry->mm,
 		__get_str(memcg_path),
 		__entry->write ? "true" : "false"
@@ -63,7 +63,7 @@ TRACE_EVENT_FN(mmap_lock_acquire_returne
 	),
 
 	TP_printk(
-		"mm=%p memcg_path=%s write=%s success=%s\n",
+		"mm=%p memcg_path=%s write=%s success=%s",
 		__entry->mm,
 		__get_str(memcg_path),
 		__entry->write ? "true" : "false",
@@ -92,7 +92,7 @@ TRACE_EVENT_FN(mmap_lock_released,
 	),
 
 	TP_printk(
-		"mm=%p memcg_path=%s write=%s\n",
+		"mm=%p memcg_path=%s write=%s",
 		__entry->mm,
 		__get_str(memcg_path),
 		__entry->write ? "true" : "false"
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 088/262] mm: mmap_lock: use DECLARE_EVENT_CLASS and DEFINE_EVENT_FN
  2021-11-05 20:34 incoming Andrew Morton
                   ` (86 preceding siblings ...)
  2021-11-05 20:39 ` [patch 087/262] mm: mmap_lock: remove redundant newline in TP_printk Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 089/262] mm/vmalloc: repair warn_alloc()s in __vmalloc_area_node() Andrew Morton
                   ` (173 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, axelrasmussen, ligang.bdlg, linux-mm, mingo, mm-commits,
	rostedt, torvalds, vbabka

From: Gang Li <ligang.bdlg@bytedance.com>
Subject: mm: mmap_lock: use DECLARE_EVENT_CLASS and DEFINE_EVENT_FN

By using DECLARE_EVENT_CLASS and TRACE_EVENT_FN, we can save a lot of
space from duplicate code.

Link: https://lkml.kernel.org/r/20211009071243.70286-1-ligang.bdlg@bytedance.com
Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/trace/events/mmap_lock.h |   44 +++++++----------------------
 1 file changed, 12 insertions(+), 32 deletions(-)

--- a/include/trace/events/mmap_lock.h~mm-mmap_lock-use-declare_event_class-and-define_event_fn
+++ a/include/trace/events/mmap_lock.h
@@ -13,7 +13,7 @@ struct mm_struct;
 extern int trace_mmap_lock_reg(void);
 extern void trace_mmap_lock_unreg(void);
 
-TRACE_EVENT_FN(mmap_lock_start_locking,
+DECLARE_EVENT_CLASS(mmap_lock,
 
 	TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write),
 
@@ -36,11 +36,19 @@ TRACE_EVENT_FN(mmap_lock_start_locking,
 		__entry->mm,
 		__get_str(memcg_path),
 		__entry->write ? "true" : "false"
-	),
-
-	trace_mmap_lock_reg, trace_mmap_lock_unreg
+	)
 );
 
+#define DEFINE_MMAP_LOCK_EVENT(name)                                    \
+	DEFINE_EVENT_FN(mmap_lock, name,                                \
+		TP_PROTO(struct mm_struct *mm, const char *memcg_path,  \
+			bool write),                                    \
+		TP_ARGS(mm, memcg_path, write),                         \
+		trace_mmap_lock_reg, trace_mmap_lock_unreg)
+
+DEFINE_MMAP_LOCK_EVENT(mmap_lock_start_locking);
+DEFINE_MMAP_LOCK_EVENT(mmap_lock_released);
+
 TRACE_EVENT_FN(mmap_lock_acquire_returned,
 
 	TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write,
@@ -71,34 +79,6 @@ TRACE_EVENT_FN(mmap_lock_acquire_returne
 	),
 
 	trace_mmap_lock_reg, trace_mmap_lock_unreg
-);
-
-TRACE_EVENT_FN(mmap_lock_released,
-
-	TP_PROTO(struct mm_struct *mm, const char *memcg_path, bool write),
-
-	TP_ARGS(mm, memcg_path, write),
-
-	TP_STRUCT__entry(
-		__field(struct mm_struct *, mm)
-		__string(memcg_path, memcg_path)
-		__field(bool, write)
-	),
-
-	TP_fast_assign(
-		__entry->mm = mm;
-		__assign_str(memcg_path, memcg_path);
-		__entry->write = write;
-	),
-
-	TP_printk(
-		"mm=%p memcg_path=%s write=%s",
-		__entry->mm,
-		__get_str(memcg_path),
-		__entry->write ? "true" : "false"
-	),
-
-	trace_mmap_lock_reg, trace_mmap_lock_unreg
 );
 
 #endif /* _TRACE_MMAP_LOCK_H */
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 089/262] mm/vmalloc: repair warn_alloc()s in __vmalloc_area_node()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (87 preceding siblings ...)
  2021-11-05 20:39 ` [patch 088/262] mm: mmap_lock: use DECLARE_EVENT_CLASS and DEFINE_EVENT_FN Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 090/262] mm/vmalloc: don't allow VM_NO_GUARD on vmap() Andrew Morton
                   ` (172 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, hch, linux-mm, mm-commits, songmuchun, torvalds, urezki, vvs

From: Vasily Averin <vvs@virtuozzo.com>
Subject: mm/vmalloc: repair warn_alloc()s in __vmalloc_area_node()

Commit f255935b9767 ("mm: cleanup the gfp_mask handling in
__vmalloc_area_node") added __GFP_NOWARN to gfp_mask unconditionally
however it disabled all output inside warn_alloc() call.  This patch saves
original gfp_mask and provides it to all warn_alloc() calls.

Link: https://lkml.kernel.org/r/f4f3187b-9684-e426-565d-827c2a9bbb0e@virtuozzo.com
Fixes: f255935b9767 ("mm: cleanup the gfp_mask handling in __vmalloc_area_node")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/mm/vmalloc.c~mm-vmalloc-repair-warn_allocs-in-__vmalloc_area_node
+++ a/mm/vmalloc.c
@@ -2887,6 +2887,7 @@ static void *__vmalloc_area_node(struct
 				 int node)
 {
 	const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
+	const gfp_t orig_gfp_mask = gfp_mask;
 	unsigned long addr = (unsigned long)area->addr;
 	unsigned long size = get_vm_area_size(area);
 	unsigned long array_size;
@@ -2907,7 +2908,7 @@ static void *__vmalloc_area_node(struct
 	}
 
 	if (!area->pages) {
-		warn_alloc(gfp_mask, NULL,
+		warn_alloc(orig_gfp_mask, NULL,
 			"vmalloc error: size %lu, failed to allocated page array size %lu",
 			nr_small_pages * PAGE_SIZE, array_size);
 		free_vm_area(area);
@@ -2927,7 +2928,7 @@ static void *__vmalloc_area_node(struct
 	 * allocation request, free them via __vfree() if any.
 	 */
 	if (area->nr_pages != nr_small_pages) {
-		warn_alloc(gfp_mask, NULL,
+		warn_alloc(orig_gfp_mask, NULL,
 			"vmalloc error: size %lu, page order %u, failed to allocate pages",
 			area->nr_pages * PAGE_SIZE, page_order);
 		goto fail;
@@ -2935,7 +2936,7 @@ static void *__vmalloc_area_node(struct
 
 	if (vmap_pages_range(addr, addr + size, prot, area->pages,
 			page_shift) < 0) {
-		warn_alloc(gfp_mask, NULL,
+		warn_alloc(orig_gfp_mask, NULL,
 			"vmalloc error: size %lu, failed to map pages",
 			area->nr_pages * PAGE_SIZE);
 		goto fail;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 090/262] mm/vmalloc: don't allow VM_NO_GUARD on vmap()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (88 preceding siblings ...)
  2021-11-05 20:39 ` [patch 089/262] mm/vmalloc: repair warn_alloc()s in __vmalloc_area_node() Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 091/262] mm/vmalloc: make show_numa_info() aware of hugepage mappings Andrew Morton
                   ` (171 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, andreyknvl, david, hch, keescook, linux-mm, mgorman,
	mm-commits, peterz, torvalds, urezki, will

From: Peter Zijlstra <peterz@infradead.org>
Subject: mm/vmalloc: don't allow VM_NO_GUARD on vmap()

The vmalloc guard pages are added on top of each allocation, thereby
isolating any two allocations from one another.  The top guard of the
lower allocation is the bottom guard guard of the higher allocation etc.

Therefore VM_NO_GUARD is dangerous; it breaks the basic premise of
isolating separate allocations.

There are only two in-tree users of this flag, neither of which use it
through the exported interface.  Ensure it stays this way.

Link: https://lkml.kernel.org/r/YUMfdA36fuyZ+/xt@hirez.programming.kicks-ass.net
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/vmalloc.h |    2 +-
 mm/vmalloc.c            |    7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

--- a/include/linux/vmalloc.h~mm-vmalloc-dont-allow-vm_no_guard-on-vmap
+++ a/include/linux/vmalloc.h
@@ -22,7 +22,7 @@ struct notifier_block;		/* in notifier.h
 #define VM_USERMAP		0x00000008	/* suitable for remap_vmalloc_range */
 #define VM_DMA_COHERENT		0x00000010	/* dma_alloc_coherent */
 #define VM_UNINITIALIZED	0x00000020	/* vm_struct is not fully initialized */
-#define VM_NO_GUARD		0x00000040      /* don't add guard page */
+#define VM_NO_GUARD		0x00000040      /* ***DANGEROUS*** don't add guard page */
 #define VM_KASAN		0x00000080      /* has allocated kasan shadow memory */
 #define VM_FLUSH_RESET_PERMS	0x00000100	/* reset direct map and flush TLB on unmap, can't be freed in atomic context */
 #define VM_MAP_PUT_PAGES	0x00000200	/* put pages and free array in vfree */
--- a/mm/vmalloc.c~mm-vmalloc-dont-allow-vm_no_guard-on-vmap
+++ a/mm/vmalloc.c
@@ -2743,6 +2743,13 @@ void *vmap(struct page **pages, unsigned
 
 	might_sleep();
 
+	/*
+	 * Your top guard is someone else's bottom guard. Not having a top
+	 * guard compromises someone else's mappings too.
+	 */
+	if (WARN_ON_ONCE(flags & VM_NO_GUARD))
+		flags &= ~VM_NO_GUARD;
+
 	if (count > totalram_pages())
 		return NULL;
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 091/262] mm/vmalloc: make show_numa_info() aware of hugepage mappings
  2021-11-05 20:34 incoming Andrew Morton
                   ` (89 preceding siblings ...)
  2021-11-05 20:39 ` [patch 090/262] mm/vmalloc: don't allow VM_NO_GUARD on vmap() Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 092/262] mm/vmalloc: make sure to dump unpurged areas in /proc/vmallocinfo Andrew Morton
                   ` (170 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, edumazet, linux-mm, mm-commits, torvalds, urezki

From: Eric Dumazet <edumazet@google.com>
Subject: mm/vmalloc: make show_numa_info() aware of hugepage mappings

show_numa_info() can be slightly faster, by skipping over hugepages
directly.

Link: https://lkml.kernel.org/r/20211001172725.105824-1-eric.dumazet@gmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/mm/vmalloc.c~mm-vmalloc-make-show_numa_info-aware-of-hugepage-mappings
+++ a/mm/vmalloc.c
@@ -3864,6 +3864,7 @@ static void show_numa_info(struct seq_fi
 {
 	if (IS_ENABLED(CONFIG_NUMA)) {
 		unsigned int nr, *counters = m->private;
+		unsigned int step = 1U << vm_area_page_order(v);
 
 		if (!counters)
 			return;
@@ -3875,9 +3876,8 @@ static void show_numa_info(struct seq_fi
 
 		memset(counters, 0, nr_node_ids * sizeof(unsigned int));
 
-		for (nr = 0; nr < v->nr_pages; nr++)
-			counters[page_to_nid(v->pages[nr])]++;
-
+		for (nr = 0; nr < v->nr_pages; nr += step)
+			counters[page_to_nid(v->pages[nr])] += step;
 		for_each_node_state(nr, N_HIGH_MEMORY)
 			if (counters[nr])
 				seq_printf(m, " N%u=%u", nr, counters[nr]);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 092/262] mm/vmalloc: make sure to dump unpurged areas in /proc/vmallocinfo
  2021-11-05 20:34 incoming Andrew Morton
                   ` (90 preceding siblings ...)
  2021-11-05 20:39 ` [patch 091/262] mm/vmalloc: make show_numa_info() aware of hugepage mappings Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 093/262] mm/vmalloc: do not adjust the search size for alignment overhead Andrew Morton
                   ` (169 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, edumazet, linux-mm, lpf.vector, mm-commits, torvalds, urezki

From: Eric Dumazet <edumazet@google.com>
Subject: mm/vmalloc: make sure to dump unpurged areas in /proc/vmallocinfo

If last va found in vmap_area_list does not have a vm pointer,
vmallocinfo.s_show() returns 0, and show_purge_info() is not called as it
should.

Link: https://lkml.kernel.org/r/20211001170815.73321-1-eric.dumazet@gmail.com
Fixes: dd3b8353bae7 ("mm/vmalloc: do not keep unpurged areas in the busy tree")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Pengfei Li <lpf.vector@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/vmalloc.c~mm-vmalloc-make-sure-to-dump-unpurged-areas-in-proc-vmallocinfo
+++ a/mm/vmalloc.c
@@ -3913,7 +3913,7 @@ static int s_show(struct seq_file *m, vo
 			(void *)va->va_start, (void *)va->va_end,
 			va->va_end - va->va_start);
 
-		return 0;
+		goto final;
 	}
 
 	v = va->vm;
@@ -3954,6 +3954,7 @@ static int s_show(struct seq_file *m, vo
 	/*
 	 * As a final step, dump "unpurged" areas.
 	 */
+final:
 	if (list_is_last(&va->list, &vmap_area_list))
 		show_purge_info(m);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 093/262] mm/vmalloc: do not adjust the search size for alignment overhead
  2021-11-05 20:34 incoming Andrew Morton
                   ` (91 preceding siblings ...)
  2021-11-05 20:39 ` [patch 092/262] mm/vmalloc: make sure to dump unpurged areas in /proc/vmallocinfo Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 094/262] mm/vmalloc: check various alignments when debugging Andrew Morton
                   ` (168 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, david, hch, hdanton, linux-mm, mgorman, mhocko, mm-commits,
	npiggin, oleksiy.avramchenko, pifang, rostedt, torvalds, urezki,
	willy

From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Subject: mm/vmalloc: do not adjust the search size for alignment overhead

We used to include an alignment overhead into a search length, in that
case we guarantee that a found area will definitely fit after applying a
specific alignment that user specifies.  From the other hand we do not
guarantee that an area has the lowest address if an alignment is >=
PAGE_SIZE.

It means that, when a user specifies a special alignment together with a
range that corresponds to an exact requested size then an allocation will
fail.  This is what happens to KASAN, it wants the free block that exactly
matches a specified range during onlining memory banks:

[root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory82/state
[root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory83/state
[root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory85/state
[root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory84/state
[  223.858115] vmap allocation for size 16777216 failed: use vmalloc=<size> to increase size
[  223.859415] bash: vmalloc: allocation failure: 16777216 bytes, mode:0x6000c0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0
[  223.860992] CPU: 4 PID: 1644 Comm: bash Kdump: loaded Not tainted 4.18.0-339.el8.x86_64+debug #1
[  223.862149] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  223.863580] Call Trace:
[  223.863946]  dump_stack+0x8e/0xd0
[  223.864420]  warn_alloc.cold.90+0x8a/0x1b2
[  223.864990]  ? zone_watermark_ok_safe+0x300/0x300
[  223.865626]  ? slab_free_freelist_hook+0x85/0x1a0
[  223.866264]  ? __get_vm_area_node+0x240/0x2c0
[  223.866858]  ? kfree+0xdd/0x570
[  223.867309]  ? kmem_cache_alloc_node_trace+0x157/0x230
[  223.868028]  ? notifier_call_chain+0x90/0x160
[  223.868625]  __vmalloc_node_range+0x465/0x840
[  223.869230]  ? mark_held_locks+0xb7/0x120

Fix it by making sure that find_vmap_lowest_match() returns lowest start
address with any given alignment value, i.e.  for alignments bigger then
PAGE_SIZE the algorithm rolls back toward parent nodes checking right
sub-trees if the most left free block did not fit due to alignment
overhead.

Link: https://lkml.kernel.org/r/20211004142829.22222-1-urezki@gmail.com
Fixes: 68ad4a330433 ("mm/vmalloc.c: keep track of free blocks for vmap allocation")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reported-by: Ping Fang <pifang@redhat.com>
Tested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |   22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

--- a/mm/vmalloc.c~mm-vmalloc-do-not-adjust-the-search-size-for-alignment-overhead
+++ a/mm/vmalloc.c
@@ -1195,18 +1195,14 @@ find_vmap_lowest_match(unsigned long siz
 {
 	struct vmap_area *va;
 	struct rb_node *node;
-	unsigned long length;
 
 	/* Start from the root. */
 	node = free_vmap_area_root.rb_node;
 
-	/* Adjust the search size for alignment overhead. */
-	length = size + align - 1;
-
 	while (node) {
 		va = rb_entry(node, struct vmap_area, rb_node);
 
-		if (get_subtree_max_size(node->rb_left) >= length &&
+		if (get_subtree_max_size(node->rb_left) >= size &&
 				vstart < va->va_start) {
 			node = node->rb_left;
 		} else {
@@ -1216,9 +1212,9 @@ find_vmap_lowest_match(unsigned long siz
 			/*
 			 * Does not make sense to go deeper towards the right
 			 * sub-tree if it does not have a free block that is
-			 * equal or bigger to the requested search length.
+			 * equal or bigger to the requested search size.
 			 */
-			if (get_subtree_max_size(node->rb_right) >= length) {
+			if (get_subtree_max_size(node->rb_right) >= size) {
 				node = node->rb_right;
 				continue;
 			}
@@ -1226,15 +1222,23 @@ find_vmap_lowest_match(unsigned long siz
 			/*
 			 * OK. We roll back and find the first right sub-tree,
 			 * that will satisfy the search criteria. It can happen
-			 * only once due to "vstart" restriction.
+			 * due to "vstart" restriction or an alignment overhead
+			 * that is bigger then PAGE_SIZE.
 			 */
 			while ((node = rb_parent(node))) {
 				va = rb_entry(node, struct vmap_area, rb_node);
 				if (is_within_this_va(va, size, align, vstart))
 					return va;
 
-				if (get_subtree_max_size(node->rb_right) >= length &&
+				if (get_subtree_max_size(node->rb_right) >= size &&
 						vstart <= va->va_start) {
+					/*
+					 * Shift the vstart forward. Please note, we update it with
+					 * parent's start address adding "1" because we do not want
+					 * to enter same sub-tree after it has already been checked
+					 * and no suitable free block found there.
+					 */
+					vstart = va->va_start + 1;
 					node = node->rb_right;
 					break;
 				}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 094/262] mm/vmalloc: check various alignments when debugging
  2021-11-05 20:34 incoming Andrew Morton
                   ` (92 preceding siblings ...)
  2021-11-05 20:39 ` [patch 093/262] mm/vmalloc: do not adjust the search size for alignment overhead Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 095/262] vmalloc: back off when the current task is OOM-killed Andrew Morton
                   ` (167 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, david, hch, hdanton, linux-mm, mgorman, mhocko, mm-commits,
	npiggin, oleksiy.avramchenko, pifang, rostedt, torvalds, urezki,
	willy

From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Subject: mm/vmalloc: check various alignments when debugging

Before we did not guarantee a free block with lowest start address for
allocations with alignment >= PAGE_SIZE.  Because an alignment overhead
was included into a search length like below:

     length = size + align - 1;

doing so we make sure that a bigger block would fit after applying an
alignment adjustment.  Now there is no such limitation, i.e.  any
alignment that user wants to apply will result to a lowest address of
returned free area.

Link: https://lkml.kernel.org/r/20211004142829.22222-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Ping Fang <pifang@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/mm/vmalloc.c~mm-vmalloc-check-various-alignments-when-debugging
+++ a/mm/vmalloc.c
@@ -1269,7 +1269,7 @@ find_vmap_lowest_linear_match(unsigned l
 }
 
 static void
-find_vmap_lowest_match_check(unsigned long size)
+find_vmap_lowest_match_check(unsigned long size, unsigned long align)
 {
 	struct vmap_area *va_1, *va_2;
 	unsigned long vstart;
@@ -1278,8 +1278,8 @@ find_vmap_lowest_match_check(unsigned lo
 	get_random_bytes(&rnd, sizeof(rnd));
 	vstart = VMALLOC_START + rnd;
 
-	va_1 = find_vmap_lowest_match(size, 1, vstart);
-	va_2 = find_vmap_lowest_linear_match(size, 1, vstart);
+	va_1 = find_vmap_lowest_match(size, align, vstart);
+	va_2 = find_vmap_lowest_linear_match(size, align, vstart);
 
 	if (va_1 != va_2)
 		pr_emerg("not lowest: t: 0x%p, l: 0x%p, v: 0x%lx\n",
@@ -1458,7 +1458,7 @@ __alloc_vmap_area(unsigned long size, un
 		return vend;
 
 #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
-	find_vmap_lowest_match_check(size);
+	find_vmap_lowest_match_check(size, align);
 #endif
 
 	return nva_start_addr;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 095/262] vmalloc: back off when the current task is OOM-killed
  2021-11-05 20:34 incoming Andrew Morton
                   ` (93 preceding siblings ...)
  2021-11-05 20:39 ` [patch 094/262] mm/vmalloc: check various alignments when debugging Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 096/262] vmalloc: choose a better start address in vm_area_register_early() Andrew Morton
                   ` (166 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, hannes, linux-mm, mhocko, mm-commits, penguin-kernel,
	torvalds, urezki, vdavydov.dev, vvs

From: Vasily Averin <vvs@virtuozzo.com>
Subject: vmalloc: back off when the current task is OOM-killed

Huge vmalloc allocation on heavy loaded node can lead to a global memory
shortage.  Task called vmalloc can have worst badness and be selected by
OOM-killer, however taken fatal signal does not interrupt allocation
cycle.  Vmalloc repeat page allocaions again and again, exacerbating the
crisis and consuming the memory freed up by another killed tasks.

After a successful completion of the allocation procedure, a fatal signal
will be processed and task will be destroyed finally.  However it may not
release the consumed memory, since the allocated object may have a
lifetime unrelated to the completed task.  In the worst case, this can
lead to the host will panic due to "Out of memory and no killable
processes..."

This patch allows OOM-killer to break vmalloc cycle, makes OOM more
effective and avoid host panic.  It does not check oom condition directly,
however, and breaks page allocation cycle when fatal signal was received.

This may trigger some hidden problems, when caller does not handle vmalloc
failures, or when rollaback after failed vmalloc calls own vmallocs
inside.  However all of these scenarios are incorrect: vmalloc does not
guarantee successful allocation, it has never been called with
__GFP_NOFAIL and threfore either should not be used for any rollbacks or
should handle such errors correctly and not lead to critical failures.

Link: https://lkml.kernel.org/r/83efc664-3a65-2adb-d7c4-2885784cf109@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/mm/vmalloc.c~vmalloc-back-off-when-the-current-task-is-oom-killed
+++ a/mm/vmalloc.c
@@ -2871,6 +2871,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	/* High-order pages or fallback path if "bulk" fails. */
 
 	while (nr_allocated < nr_pages) {
+		if (fatal_signal_pending(current))
+			break;
+
 		if (nid == NUMA_NO_NODE)
 			page = alloc_pages(gfp, order);
 		else
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 096/262] vmalloc: choose a better start address in vm_area_register_early()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (94 preceding siblings ...)
  2021-11-05 20:39 ` [patch 095/262] vmalloc: back off when the current task is OOM-killed Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 097/262] arm64: support page mapping percpu first chunk allocator Andrew Morton
                   ` (165 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, andreyknvl, catalin.marinas, dvyukov, elver, gregkh,
	linux-mm, mm-commits, ryabinin.a.a, torvalds, wangkefeng.wang,
	will

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: vmalloc: choose a better start address in vm_area_register_early()

Percpu embedded first chunk allocator is the firstly option, but it could
fail on ARM64, eg,

  "percpu: max_distance=0x5fcfdc640000 too large for vmalloc space 0x781fefff0000"
  "percpu: max_distance=0x600000540000 too large for vmalloc space 0x7dffb7ff0000"
  "percpu: max_distance=0x5fff9adb0000 too large for vmalloc space 0x5dffb7ff0000"

then we could meet "WARNING: CPU: 15 PID: 461 at vmalloc.c:3087
pcpu_get_vm_areas+0x488/0x838" and the system cannot boot successfully.

Let's implement page mapping percpu first chunk allocator as a fallback to
the embedding allocator to increase the robustness of the system.

Also fix a crash when both NEED_PER_CPU_PAGE_FIRST_CHUNK and KASAN_VMALLOC
enabled.

Tested on ARM64 qemu with cmdline "percpu_alloc=page".


This patch (of 3):

There are some fixed locations in the vmalloc area be reserved in ARM(see
iotable_init()) and ARM64(see map_kernel()), but for
pcpu_page_first_chunk(), it calls vm_area_register_early() and choose
VMALLOC_START as the start address of vmap area which could be conflicted
with above address, then could trigger a BUG_ON in vm_area_add_early().

Let's choose a suit start address by traversing the vmlist.

Link: https://lkml.kernel.org/r/20210910053354.26721-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20210910053354.26721-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |   18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

--- a/mm/vmalloc.c~vmalloc-choose-a-better-start-address-in-vm_area_register_early
+++ a/mm/vmalloc.c
@@ -2276,15 +2276,21 @@ void __init vm_area_add_early(struct vm_
  */
 void __init vm_area_register_early(struct vm_struct *vm, size_t align)
 {
-	static size_t vm_init_off __initdata;
-	unsigned long addr;
+	unsigned long addr = ALIGN(VMALLOC_START, align);
+	struct vm_struct *cur, **p;
 
-	addr = ALIGN(VMALLOC_START + vm_init_off, align);
-	vm_init_off = PFN_ALIGN(addr + vm->size) - VMALLOC_START;
+	BUG_ON(vmap_initialized);
 
-	vm->addr = (void *)addr;
+	for (p = &vmlist; (cur = *p) != NULL; p = &cur->next) {
+		if ((unsigned long)cur->addr - addr >= vm->size)
+			break;
+		addr = ALIGN((unsigned long)cur->addr + cur->size, align);
+	}
 
-	vm_area_add_early(vm);
+	BUG_ON(addr > VMALLOC_END - vm->size);
+	vm->addr = (void *)addr;
+	vm->next = *p;
+	*p = vm;
 }
 
 static void vmap_init_free_space(void)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 097/262] arm64: support page mapping percpu first chunk allocator
  2021-11-05 20:34 incoming Andrew Morton
                   ` (95 preceding siblings ...)
  2021-11-05 20:39 ` [patch 096/262] vmalloc: choose a better start address in vm_area_register_early() Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 098/262] kasan: arm64: fix pcpu_page_first_chunk crash with KASAN_VMALLOC Andrew Morton
                   ` (164 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, andreyknvl, catalin.marinas, dvyukov, elver, gregkh,
	linux-mm, mm-commits, ryabinin.a.a, torvalds, wangkefeng.wang,
	will

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: arm64: support page mapping percpu first chunk allocator

Percpu embedded first chunk allocator is the firstly option, but it
could fails on ARM64, eg,
  "percpu: max_distance=0x5fcfdc640000 too large for vmalloc space 0x781fefff0000"
  "percpu: max_distance=0x600000540000 too large for vmalloc space 0x7dffb7ff0000"
  "percpu: max_distance=0x5fff9adb0000 too large for vmalloc space 0x5dffb7ff0000"

then we could meet "WARNING: CPU: 15 PID: 461 at vmalloc.c:3087
pcpu_get_vm_areas+0x488/0x838" and the system could not boot successfully.

Let's implement page mapping percpu first chunk allocator as a fallback to
the embedding allocator to increase the robustness of the system.

Link: https://lkml.kernel.org/r/20210910053354.26721-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Marco Elver <elver@google.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/Kconfig       |    4 +
 drivers/base/arch_numa.c |   82 ++++++++++++++++++++++++++++++++-----
 2 files changed, 76 insertions(+), 10 deletions(-)

--- a/arch/arm64/Kconfig~arm64-support-page-mapping-percpu-first-chunk-allocator
+++ a/arch/arm64/Kconfig
@@ -1042,6 +1042,10 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
 	def_bool y
 	depends on NUMA
 
+config NEED_PER_CPU_PAGE_FIRST_CHUNK
+	def_bool y
+	depends on NUMA
+
 source "kernel/Kconfig.hz"
 
 config ARCH_SPARSEMEM_ENABLE
--- a/drivers/base/arch_numa.c~arm64-support-page-mapping-percpu-first-chunk-allocator
+++ a/drivers/base/arch_numa.c
@@ -14,6 +14,7 @@
 #include <linux/of.h>
 
 #include <asm/sections.h>
+#include <asm/pgalloc.h>
 
 struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
 EXPORT_SYMBOL(node_data);
@@ -168,22 +169,83 @@ static void __init pcpu_fc_free(void *pt
 	memblock_free_early(__pa(ptr), size);
 }
 
+#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
+static void __init pcpu_populate_pte(unsigned long addr)
+{
+	pgd_t *pgd = pgd_offset_k(addr);
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	p4d = p4d_offset(pgd, addr);
+	if (p4d_none(*p4d)) {
+		pud_t *new;
+
+		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		if (!new)
+			goto err_alloc;
+		p4d_populate(&init_mm, p4d, new);
+	}
+
+	pud = pud_offset(p4d, addr);
+	if (pud_none(*pud)) {
+		pmd_t *new;
+
+		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		if (!new)
+			goto err_alloc;
+		pud_populate(&init_mm, pud, new);
+	}
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd)) {
+		pte_t *new;
+
+		new = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		if (!new)
+			goto err_alloc;
+		pmd_populate_kernel(&init_mm, pmd, new);
+	}
+
+	return;
+
+err_alloc:
+	panic("%s: Failed to allocate %lu bytes align=%lx from=%lx\n",
+	      __func__, PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
+}
+#endif
+
 void __init setup_per_cpu_areas(void)
 {
 	unsigned long delta;
 	unsigned int cpu;
-	int rc;
+	int rc = -EINVAL;
 
-	/*
-	 * Always reserve area for module percpu variables.  That's
-	 * what the legacy allocator did.
-	 */
-	rc = pcpu_embed_first_chunk(PERCPU_MODULE_RESERVE,
-				    PERCPU_DYNAMIC_RESERVE, PAGE_SIZE,
-				    pcpu_cpu_distance,
-				    pcpu_fc_alloc, pcpu_fc_free);
+	if (pcpu_chosen_fc != PCPU_FC_PAGE) {
+		/*
+		 * Always reserve area for module percpu variables.  That's
+		 * what the legacy allocator did.
+		 */
+		rc = pcpu_embed_first_chunk(PERCPU_MODULE_RESERVE,
+					    PERCPU_DYNAMIC_RESERVE, PAGE_SIZE,
+					    pcpu_cpu_distance,
+					    pcpu_fc_alloc, pcpu_fc_free);
+#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
+		if (rc < 0)
+			pr_warn("PERCPU: %s allocator failed (%d), falling back to page size\n",
+				   pcpu_fc_names[pcpu_chosen_fc], rc);
+#endif
+	}
+
+#ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
+	if (rc < 0)
+		rc = pcpu_page_first_chunk(PERCPU_MODULE_RESERVE,
+					   pcpu_fc_alloc,
+					   pcpu_fc_free,
+					   pcpu_populate_pte);
+#endif
 	if (rc < 0)
-		panic("Failed to initialize percpu areas.");
+		panic("Failed to initialize percpu areas (err=%d).", rc);
 
 	delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;
 	for_each_possible_cpu(cpu)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 098/262] kasan: arm64: fix pcpu_page_first_chunk crash with KASAN_VMALLOC
  2021-11-05 20:34 incoming Andrew Morton
                   ` (96 preceding siblings ...)
  2021-11-05 20:39 ` [patch 097/262] arm64: support page mapping percpu first chunk allocator Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 099/262] mm/vmalloc: be more explicit about supported gfp flags Andrew Morton
                   ` (163 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, andreyknvl, catalin.marinas, dvyukov, elver, gregkh,
	linux-mm, mm-commits, ryabinin.a.a, torvalds, wangkefeng.wang,
	will

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: kasan: arm64: fix pcpu_page_first_chunk crash with KASAN_VMALLOC

With KASAN_VMALLOC and NEED_PER_CPU_PAGE_FIRST_CHUNK, it crashes,

Unable to handle kernel paging request at virtual address ffff7000028f2000
...
swapper pgtable: 64k pages, 48-bit VAs, pgdp=0000000042440000
[ffff7000028f2000] pgd=000000063e7c0003, p4d=000000063e7c0003, pud=000000063e7c0003, pmd=000000063e7b0003, pte=0000000000000000
Internal error: Oops: 96000007 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.13.0-rc4-00003-gc6e6e28f3f30-dirty #62
Hardware name: linux,dummy-virt (DT)
pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO BTYPE=--)
pc : kasan_check_range+0x90/0x1a0
lr : memcpy+0x88/0xf4
sp : ffff80001378fe20
...
Call trace:
 kasan_check_range+0x90/0x1a0
 pcpu_page_first_chunk+0x3f0/0x568
 setup_per_cpu_areas+0xb8/0x184
 start_kernel+0x8c/0x328

The vm area used in vm_area_register_early() has no kasan shadow memory,
Let's add a new kasan_populate_early_vm_area_shadow() function to populate
the vm area shadow memory to fix the issue.

[wangkefeng.wang@huawei.com: fix redefinition of 'kasan_populate_early_vm_area_shadow']
  Link: https://lkml.kernel.org/r/20211011123211.3936196-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20210910053354.26721-4-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Marco Elver <elver@google.com>		[KASAN]
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>	[KASAN]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/mm/kasan_init.c |   16 ++++++++++++++++
 include/linux/kasan.h      |    6 ++++++
 mm/kasan/shadow.c          |    5 +++++
 mm/vmalloc.c               |    1 +
 4 files changed, 28 insertions(+)

--- a/arch/arm64/mm/kasan_init.c~kasan-arm64-fix-pcpu_page_first_chunk-crash-with-kasan_vmalloc
+++ a/arch/arm64/mm/kasan_init.c
@@ -287,6 +287,22 @@ static void __init kasan_init_depth(void
 	init_task.kasan_depth = 0;
 }
 
+#ifdef CONFIG_KASAN_VMALLOC
+void __init kasan_populate_early_vm_area_shadow(void *start, unsigned long size)
+{
+	unsigned long shadow_start, shadow_end;
+
+	if (!is_vmalloc_or_module_addr(start))
+		return;
+
+	shadow_start = (unsigned long)kasan_mem_to_shadow(start);
+	shadow_start = ALIGN_DOWN(shadow_start, PAGE_SIZE);
+	shadow_end = (unsigned long)kasan_mem_to_shadow(start + size);
+	shadow_end = ALIGN(shadow_end, PAGE_SIZE);
+	kasan_map_populate(shadow_start, shadow_end, NUMA_NO_NODE);
+}
+#endif
+
 void __init kasan_init(void)
 {
 	kasan_init_shadow();
--- a/include/linux/kasan.h~kasan-arm64-fix-pcpu_page_first_chunk-crash-with-kasan_vmalloc
+++ a/include/linux/kasan.h
@@ -436,6 +436,8 @@ void kasan_release_vmalloc(unsigned long
 			   unsigned long free_region_start,
 			   unsigned long free_region_end);
 
+void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
+
 #else /* CONFIG_KASAN_VMALLOC */
 
 static inline int kasan_populate_vmalloc(unsigned long start,
@@ -453,6 +455,10 @@ static inline void kasan_release_vmalloc
 					 unsigned long free_region_start,
 					 unsigned long free_region_end) {}
 
+static inline void kasan_populate_early_vm_area_shadow(void *start,
+						       unsigned long size)
+{ }
+
 #endif /* CONFIG_KASAN_VMALLOC */
 
 #if (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)) && \
--- a/mm/kasan/shadow.c~kasan-arm64-fix-pcpu_page_first_chunk-crash-with-kasan_vmalloc
+++ a/mm/kasan/shadow.c
@@ -254,6 +254,11 @@ core_initcall(kasan_memhotplug_init);
 
 #ifdef CONFIG_KASAN_VMALLOC
 
+void __init __weak kasan_populate_early_vm_area_shadow(void *start,
+						       unsigned long size)
+{
+}
+
 static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 				      void *unused)
 {
--- a/mm/vmalloc.c~kasan-arm64-fix-pcpu_page_first_chunk-crash-with-kasan_vmalloc
+++ a/mm/vmalloc.c
@@ -2291,6 +2291,7 @@ void __init vm_area_register_early(struc
 	vm->addr = (void *)addr;
 	vm->next = *p;
 	*p = vm;
+	kasan_populate_early_vm_area_shadow(vm->addr, vm->size);
 }
 
 static void vmap_init_free_space(void)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 099/262] mm/vmalloc: be more explicit about supported gfp flags
  2021-11-05 20:34 incoming Andrew Morton
                   ` (97 preceding siblings ...)
  2021-11-05 20:39 ` [patch 098/262] kasan: arm64: fix pcpu_page_first_chunk crash with KASAN_VMALLOC Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-08  9:25   ` Michal Hocko
  2021-11-05 20:39 ` [patch 100/262] mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation Andrew Morton
                   ` (162 subsequent siblings)
  261 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, david, hch, idryomov, jlayton, linux-mm, mhocko,
	mm-commits, neilb, torvalds, urezki

From: Michal Hocko <mhocko@suse.com>
Subject: mm/vmalloc: be more explicit about supported gfp flags

The core of the vmalloc allocator __vmalloc_area_node doesn't say anything
about gfp mask argument.  Not all gfp flags are supported though.  Be more
explicit about constraints.

Link: https://lkml.kernel.org/r/20211020082545.4830-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmalloc.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- a/mm/vmalloc.c~mm-vmalloc-be-more-explicit-about-supported-gfp-flags
+++ a/mm/vmalloc.c
@@ -2983,8 +2983,16 @@ fail:
  * @caller:		  caller's return address
  *
  * Allocate enough pages to cover @size from the page level
- * allocator with @gfp_mask flags.  Map them into contiguous
- * kernel virtual space, using a pagetable protection of @prot.
+ * allocator with @gfp_mask flags. Please note that the full set of gfp
+ * flags are not supported. GFP_KERNEL would be a preferred allocation mode
+ * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
+ * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
+ * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
+ * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
+ * __GFP_NOWARN can be used to suppress error messages about failures.
+ *
+ * Map them into contiguous kernel virtual space, using a pagetable
+ * protection of @prot.
  *
  * Return: the address of the area or %NULL on failure
  */
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 100/262] mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation
  2021-11-05 20:34 incoming Andrew Morton
                   ` (98 preceding siblings ...)
  2021-11-05 20:39 ` [patch 099/262] mm/vmalloc: be more explicit about supported gfp flags Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 101/262] lib/test_vmalloc.c: use swap() to make code cleaner Andrew Morton
                   ` (161 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, chenwandun, edumazet, guohanjun, linux-mm, mm-commits,
	npiggin, shakeelb, torvalds, urezki, wangkefeng.wang

From: Chen Wandun <chenwandun@huawei.com>
Subject: mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation

"mm/vmalloc: fix numa spreading for large hash tables" will cause
significant performance regressions in some situations as Andrew
mentioned in [1].  The main situation is vmalloc, vmalloc will allocate
pages with NUMA_NO_NODE by default, that will result in alloc page one
by one;

In order to solve this, __alloc_pages_bulk and mempolicy should be
considered at the same time.

1) If node is specified in memory allocation request, it will alloc all
   pages by __alloc_pages_bulk.

2) If interleaving allocate memory, it will cauculate how many pages
   should be allocated in each node, and use __alloc_pages_bulk to alloc
   pages in each node.

[1]: https://lore.kernel.org/lkml/CALvZod4G3SzP3kWxQYn0fj+VgG-G3yWXz=gz17+3N57ru1iajw@mail.gmail.com/t/#m750c8e3231206134293b089feaa090590afa0f60

[akpm@linux-foundation.org: coding style fixes]
[akpm@linux-foundation.org: make two functions static]
[akpm@linux-foundation.org: fix CONFIG_NUMA=n build]
Link: https://lkml.kernel.org/r/20211021080744.874701-3-chenwandun@huawei.com
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/gfp.h |    4 ++
 mm/mempolicy.c      |   82 ++++++++++++++++++++++++++++++++++++++++++
 mm/vmalloc.c        |   20 ++++++++--
 3 files changed, 102 insertions(+), 4 deletions(-)

--- a/include/linux/gfp.h~mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation
+++ a/include/linux/gfp.h
@@ -535,6 +535,10 @@ unsigned long __alloc_pages_bulk(gfp_t g
 				struct list_head *page_list,
 				struct page **page_array);
 
+unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+				unsigned long nr_pages,
+				struct page **page_array);
+
 /* Bulk allocate order-0 pages */
 static inline unsigned long
 alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
--- a/mm/mempolicy.c~mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation
+++ a/mm/mempolicy.c
@@ -2196,6 +2196,88 @@ struct page *alloc_pages(gfp_t gfp, unsi
 }
 EXPORT_SYMBOL(alloc_pages);
 
+static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
+		struct mempolicy *pol, unsigned long nr_pages,
+		struct page **page_array)
+{
+	int nodes;
+	unsigned long nr_pages_per_node;
+	int delta;
+	int i;
+	unsigned long nr_allocated;
+	unsigned long total_allocated = 0;
+
+	nodes = nodes_weight(pol->nodes);
+	nr_pages_per_node = nr_pages / nodes;
+	delta = nr_pages - nodes * nr_pages_per_node;
+
+	for (i = 0; i < nodes; i++) {
+		if (delta) {
+			nr_allocated = __alloc_pages_bulk(gfp,
+					interleave_nodes(pol), NULL,
+					nr_pages_per_node + 1, NULL,
+					page_array);
+			delta--;
+		} else {
+			nr_allocated = __alloc_pages_bulk(gfp,
+					interleave_nodes(pol), NULL,
+					nr_pages_per_node, NULL, page_array);
+		}
+
+		page_array += nr_allocated;
+		total_allocated += nr_allocated;
+	}
+
+	return total_allocated;
+}
+
+static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
+		struct mempolicy *pol, unsigned long nr_pages,
+		struct page **page_array)
+{
+	gfp_t preferred_gfp;
+	unsigned long nr_allocated = 0;
+
+	preferred_gfp = gfp | __GFP_NOWARN;
+	preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
+
+	nr_allocated  = __alloc_pages_bulk(preferred_gfp, nid, &pol->nodes,
+					   nr_pages, NULL, page_array);
+
+	if (nr_allocated < nr_pages)
+		nr_allocated += __alloc_pages_bulk(gfp, numa_node_id(), NULL,
+				nr_pages - nr_allocated, NULL,
+				page_array + nr_allocated);
+	return nr_allocated;
+}
+
+/* alloc pages bulk and mempolicy should be considered at the
+ * same time in some situation such as vmalloc.
+ *
+ * It can accelerate memory allocation especially interleaving
+ * allocate memory.
+ */
+unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+		unsigned long nr_pages, struct page **page_array)
+{
+	struct mempolicy *pol = &default_policy;
+
+	if (!in_interrupt() && !(gfp & __GFP_THISNODE))
+		pol = get_task_policy(current);
+
+	if (pol->mode == MPOL_INTERLEAVE)
+		return alloc_pages_bulk_array_interleave(gfp, pol,
+							 nr_pages, page_array);
+
+	if (pol->mode == MPOL_PREFERRED_MANY)
+		return alloc_pages_bulk_array_preferred_many(gfp,
+				numa_node_id(), pol, nr_pages, page_array);
+
+	return __alloc_pages_bulk(gfp, policy_node(gfp, pol, numa_node_id()),
+				  policy_nodemask(gfp, pol), nr_pages, NULL,
+				  page_array);
+}
+
 int vma_dup_policy(struct vm_area_struct *src, struct vm_area_struct *dst)
 {
 	struct mempolicy *pol = mpol_dup(vma_policy(src));
--- a/mm/vmalloc.c~mm-vmalloc-introduce-alloc_pages_bulk_array_mempolicy-to-accelerate-memory-allocation
+++ a/mm/vmalloc.c
@@ -2843,7 +2843,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	 * to fails, fallback to a single page allocator that is
 	 * more permissive.
 	 */
-	if (!order && nid != NUMA_NO_NODE) {
+	if (!order) {
 		while (nr_allocated < nr_pages) {
 			unsigned int nr, nr_pages_request;
 
@@ -2855,8 +2855,20 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			 */
 			nr_pages_request = min(100U, nr_pages - nr_allocated);
 
-			nr = alloc_pages_bulk_array_node(gfp, nid,
-				nr_pages_request, pages + nr_allocated);
+			/* memory allocation should consider mempolicy, we can't
+			 * wrongly use nearest node when nid == NUMA_NO_NODE,
+			 * otherwise memory may be allocated in only one node,
+			 * but mempolcy want to alloc memory by interleaving.
+			 */
+			if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
+				nr = alloc_pages_bulk_array_mempolicy(gfp,
+							nr_pages_request,
+							pages + nr_allocated);
+
+			else
+				nr = alloc_pages_bulk_array_node(gfp, nid,
+							nr_pages_request,
+							pages + nr_allocated);
 
 			nr_allocated += nr;
 			cond_resched();
@@ -2868,7 +2880,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			if (nr != nr_pages_request)
 				break;
 		}
-	} else if (order)
+	} else
 		/*
 		 * Compound pages required for remap_vmalloc_page if
 		 * high-order pages.
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 101/262] lib/test_vmalloc.c: use swap() to make code cleaner
  2021-11-05 20:34 incoming Andrew Morton
                   ` (99 preceding siblings ...)
  2021-11-05 20:39 ` [patch 100/262] mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:39 ` [patch 102/262] mm/large system hash: avoid possible NULL deref in alloc_large_system_hash Andrew Morton
                   ` (160 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, deng.changcheng, linux-mm, mm-commits, torvalds, urezki, zealci

From: Changcheng Deng <deng.changcheng@zte.com.cn>
Subject: lib/test_vmalloc.c: use swap() to make code cleaner

Use swap() in order to make code cleaner. Issue found by coccinelle.

Link: https://lkml.kernel.org/r/20211028111443.15744-1-deng.changcheng@zte.com.cn
Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 lib/test_vmalloc.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/lib/test_vmalloc.c~lib-test_vmallocc-use-swap-to-make-code-cleaner
+++ a/lib/test_vmalloc.c
@@ -393,7 +393,7 @@ static struct test_driver {
 static void shuffle_array(int *arr, int n)
 {
 	unsigned int rnd;
-	int i, j, x;
+	int i, j;
 
 	for (i = n - 1; i > 0; i--)  {
 		get_random_bytes(&rnd, sizeof(rnd));
@@ -402,9 +402,7 @@ static void shuffle_array(int *arr, int
 		j = rnd % i;
 
 		/* Swap indexes. */
-		x = arr[i];
-		arr[i] = arr[j];
-		arr[j] = x;
+		swap(arr[i], arr[j]);
 	}
 }
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 102/262] mm/large system hash: avoid possible NULL deref in alloc_large_system_hash
  2021-11-05 20:34 incoming Andrew Morton
                   ` (100 preceding siblings ...)
  2021-11-05 20:39 ` [patch 101/262] lib/test_vmalloc.c: use swap() to make code cleaner Andrew Morton
@ 2021-11-05 20:39 ` Andrew Morton
  2021-11-05 20:40 ` [patch 103/262] mm/page_alloc.c: remove meaningless VM_BUG_ON() in pindex_to_order() Andrew Morton
                   ` (159 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:39 UTC (permalink / raw)
  To: akpm, edumazet, linux-mm, mm-commits, npiggin, torvalds

From: Eric Dumazet <edumazet@google.com>
Subject: mm/large system hash: avoid possible NULL deref in alloc_large_system_hash

If __vmalloc() returned NULL, is_vm_area_hugepages(NULL) will fault if
CONFIG_HAVE_ARCH_HUGE_VMALLOC=y

Link: https://lkml.kernel.org/r/20210915212530.2321545-1-eric.dumazet@gmail.com
Fixes: 121e6f3258fe ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-large-system-hash-avoid-possible-null-deref-in-alloc_large_system_hash
+++ a/mm/page_alloc.c
@@ -8762,7 +8762,8 @@ void *__init alloc_large_system_hash(con
 		} else if (get_order(size) >= MAX_ORDER || hashdist) {
 			table = __vmalloc(size, gfp_flags);
 			virt = true;
-			huge = is_vm_area_hugepages(table);
+			if (table)
+				huge = is_vm_area_hugepages(table);
 		} else {
 			/*
 			 * If bucketsize is not a power-of-two, we may free
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 103/262] mm/page_alloc.c: remove meaningless VM_BUG_ON() in pindex_to_order()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (101 preceding siblings ...)
  2021-11-05 20:39 ` [patch 102/262] mm/large system hash: avoid possible NULL deref in alloc_large_system_hash Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 104/262] mm/page_alloc.c: simplify the code by using macro K() Andrew Morton
                   ` (158 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, david, linmiaohe, linux-mm, mgorman, mm-commits, peterz,
	sfr, torvalds, vbabka

From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm/page_alloc.c: remove meaningless VM_BUG_ON() in pindex_to_order()

Patch series "Cleanups and fixup for page_alloc", v2.

This series contains cleanups to remove meaningless VM_BUG_ON(), use
helpers to simplify the code and remove obsolete comment.  Also we avoid
allocating highmem pages via alloc_pages_exact[_nid].  More details can be
found in the respective changelogs.  


This patch (of 5):

It's meaningless to VM_BUG_ON() order != pageblock_order just after
setting order to pageblock_order.  Remove it.

Link: https://lkml.kernel.org/r/20210902121242.41607-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210902121242.41607-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/mm/page_alloc.c~mm-page_allocc-remove-meaningless-vm_bug_on-in-pindex_to_order
+++ a/mm/page_alloc.c
@@ -677,10 +677,8 @@ static inline int pindex_to_order(unsign
 	int order = pindex / MIGRATE_PCPTYPES;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (order > PAGE_ALLOC_COSTLY_ORDER) {
+	if (order > PAGE_ALLOC_COSTLY_ORDER)
 		order = pageblock_order;
-		VM_BUG_ON(order != pageblock_order);
-	}
 #else
 	VM_BUG_ON(order > PAGE_ALLOC_COSTLY_ORDER);
 #endif
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 104/262] mm/page_alloc.c: simplify the code by using macro K()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (102 preceding siblings ...)
  2021-11-05 20:40 ` [patch 103/262] mm/page_alloc.c: remove meaningless VM_BUG_ON() in pindex_to_order() Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 105/262] mm/page_alloc.c: fix obsolete comment in free_pcppages_bulk() Andrew Morton
                   ` (157 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, david, linmiaohe, linux-mm, mgorman, mm-commits, peterz,
	sfr, torvalds, vbabka

From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm/page_alloc.c: simplify the code by using macro K()

Use helper macro K() to convert the pages to the corresponding size. 
Minor readability improvement.

Link: https://lkml.kernel.org/r/20210902121242.41607-3-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |   12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

--- a/mm/page_alloc.c~mm-page_allocc-simplify-the-code-by-using-macro-k
+++ a/mm/page_alloc.c
@@ -8130,8 +8130,7 @@ unsigned long free_reserved_area(void *s
 	}
 
 	if (pages && s)
-		pr_info("Freeing %s memory: %ldK\n",
-			s, pages << (PAGE_SHIFT - 10));
+		pr_info("Freeing %s memory: %ldK\n", s, K(pages));
 
 	return pages;
 }
@@ -8176,14 +8175,13 @@ void __init mem_init_print_info(void)
 		", %luK highmem"
 #endif
 		")\n",
-		nr_free_pages() << (PAGE_SHIFT - 10),
-		physpages << (PAGE_SHIFT - 10),
+		K(nr_free_pages()), K(physpages),
 		codesize >> 10, datasize >> 10, rosize >> 10,
 		(init_data_size + init_code_size) >> 10, bss_size >> 10,
-		(physpages - totalram_pages() - totalcma_pages) << (PAGE_SHIFT - 10),
-		totalcma_pages << (PAGE_SHIFT - 10)
+		K(physpages - totalram_pages() - totalcma_pages),
+		K(totalcma_pages)
 #ifdef	CONFIG_HIGHMEM
-		, totalhigh_pages() << (PAGE_SHIFT - 10)
+		, K(totalhigh_pages())
 #endif
 		);
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 105/262] mm/page_alloc.c: fix obsolete comment in free_pcppages_bulk()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (103 preceding siblings ...)
  2021-11-05 20:40 ` [patch 104/262] mm/page_alloc.c: simplify the code by using macro K() Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 106/262] mm/page_alloc.c: use helper function zone_spans_pfn() Andrew Morton
                   ` (156 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, david, linmiaohe, linux-mm, mgorman, mm-commits, peterz,
	sfr, torvalds, vbabka

From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm/page_alloc.c: fix obsolete comment in free_pcppages_bulk()

The second two paragraphs about "all pages pinned" and pages_scanned is
obsolete.  And There are PAGE_ALLOC_COSTLY_ORDER + 1 + NR_PCP_THP orders
in pcp.  So the same order assumption is not held now.

Link: https://lkml.kernel.org/r/20210902121242.41607-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

--- a/mm/page_alloc.c~mm-page_allocc-fix-obsolete-comment-in-free_pcppages_bulk
+++ a/mm/page_alloc.c
@@ -1428,14 +1428,8 @@ static inline void prefetch_buddy(struct
 
 /*
  * Frees a number of pages from the PCP lists
- * Assumes all pages on list are in same zone, and of same order.
+ * Assumes all pages on list are in same zone.
  * count is the number of pages to free.
- *
- * If the zone was previously in an "all pages pinned" state then look to
- * see if this freeing clears that state.
- *
- * And clear the zone's pages_scanned counter, to hold off the "all pages are
- * pinned" detection logic.
  */
 static void free_pcppages_bulk(struct zone *zone, int count,
 					struct per_cpu_pages *pcp)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 106/262] mm/page_alloc.c: use helper function zone_spans_pfn()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (104 preceding siblings ...)
  2021-11-05 20:40 ` [patch 105/262] mm/page_alloc.c: fix obsolete comment in free_pcppages_bulk() Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 107/262] mm/page_alloc.c: avoid allocating highmem pages via alloc_pages_exact[_nid] Andrew Morton
                   ` (155 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, david, linmiaohe, linux-mm, mgorman, mm-commits, peterz,
	sfr, torvalds, vbabka

From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm/page_alloc.c: use helper function zone_spans_pfn()

Use helper function zone_spans_pfn() to check whether pfn is within a zone
to simplify the code slightly.

Link: https://lkml.kernel.org/r/20210902121242.41607-5-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_allocc-use-helper-function-zone_spans_pfn
+++ a/mm/page_alloc.c
@@ -1583,7 +1583,7 @@ static void __meminit init_reserved_page
 	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
 		struct zone *zone = &pgdat->node_zones[zid];
 
-		if (pfn >= zone->zone_start_pfn && pfn < zone_end_pfn(zone))
+		if (zone_spans_pfn(zone, pfn))
 			break;
 	}
 	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 107/262] mm/page_alloc.c: avoid allocating highmem pages via alloc_pages_exact[_nid]
  2021-11-05 20:34 incoming Andrew Morton
                   ` (105 preceding siblings ...)
  2021-11-05 20:40 ` [patch 106/262] mm/page_alloc.c: use helper function zone_spans_pfn() Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 108/262] mm/page_alloc: print node fallback order Andrew Morton
                   ` (154 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, david, linmiaohe, linux-mm, mgorman, mm-commits, peterz,
	sfr, torvalds, vbabka

From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm/page_alloc.c: avoid allocating highmem pages via alloc_pages_exact[_nid]

Don't use with __GFP_HIGHMEM because page_address() cannot represent
highmem pages without kmap().  Newly allocated pages would leak as
page_address() will return NULL for highmem pages here.  But It works now
because the callers do not specify __GFP_HIGHMEM now.

Link: https://lkml.kernel.org/r/20210902121242.41607-6-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/mm/page_alloc.c~mm-page_allocc-avoid-allocating-highmem-pages-via-alloc_pages_exact
+++ a/mm/page_alloc.c
@@ -5610,8 +5610,8 @@ void *alloc_pages_exact(size_t size, gfp
 	unsigned int order = get_order(size);
 	unsigned long addr;
 
-	if (WARN_ON_ONCE(gfp_mask & __GFP_COMP))
-		gfp_mask &= ~__GFP_COMP;
+	if (WARN_ON_ONCE(gfp_mask & (__GFP_COMP | __GFP_HIGHMEM)))
+		gfp_mask &= ~(__GFP_COMP | __GFP_HIGHMEM);
 
 	addr = __get_free_pages(gfp_mask, order);
 	return make_alloc_exact(addr, order, size);
@@ -5635,8 +5635,8 @@ void * __meminit alloc_pages_exact_nid(i
 	unsigned int order = get_order(size);
 	struct page *p;
 
-	if (WARN_ON_ONCE(gfp_mask & __GFP_COMP))
-		gfp_mask &= ~__GFP_COMP;
+	if (WARN_ON_ONCE(gfp_mask & (__GFP_COMP | __GFP_HIGHMEM)))
+		gfp_mask &= ~(__GFP_COMP | __GFP_HIGHMEM);
 
 	p = alloc_pages_node(nid, gfp_mask, order);
 	if (!p)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 108/262] mm/page_alloc: print node fallback order
  2021-11-05 20:34 incoming Andrew Morton
                   ` (106 preceding siblings ...)
  2021-11-05 20:40 ` [patch 107/262] mm/page_alloc.c: avoid allocating highmem pages via alloc_pages_exact[_nid] Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 109/262] mm/page_alloc: use accumulated load when building node fallback list Andrew Morton
                   ` (153 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, anshuman.khandual, bharata, kamezawa.hiroyu,
	krupa.ramakrishnan, lee.schermerhorn, linux-mm, mgorman,
	mm-commits, Sadagopan.Srinivasan, torvalds

From: Bharata B Rao <bharata@amd.com>
Subject: mm/page_alloc: print node fallback order

Patch series "Fix NUMA nodes fallback list ordering".

For a NUMA system that has multiple nodes at same distance from other
nodes, the fallback list generation prefers same node order for them
instead of round-robin thereby penalizing one node over others.  This
series fixes it.

More description of the problem and the fix is present in the patch
description.


This patch (of 2):

Print information message about the allocation fallback order for each
NUMA node during boot.

No functional changes here.  This makes it easier to illustrate the
problem in the node fallback list generation, which the next patch fixes.

Link: https://lkml.kernel.org/r/20210830121603.1081-1-bharata@amd.com
Link: https://lkml.kernel.org/r/20210830121603.1081-2-bharata@amd.com
Signed-off-by: Bharata B Rao <bharata@amd.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Krupa Ramakrishnan <krupa.ramakrishnan@amd.com>
Cc: Sadagopan Srinivasan <Sadagopan.Srinivasan@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/mm/page_alloc.c~mm-page_alloc-print-node-fallback-order
+++ a/mm/page_alloc.c
@@ -6262,6 +6262,10 @@ static void build_zonelists(pg_data_t *p
 
 	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
 	build_thisnode_zonelists(pgdat);
+	pr_info("Fallback order for Node %d: ", local_node);
+	for (node = 0; node < nr_nodes; node++)
+		pr_cont("%d ", node_order[node]);
+	pr_cont("\n");
 }
 
 #ifdef CONFIG_HAVE_MEMORYLESS_NODES
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 109/262] mm/page_alloc: use accumulated load when building node fallback list
  2021-11-05 20:34 incoming Andrew Morton
                   ` (107 preceding siblings ...)
  2021-11-05 20:40 ` [patch 108/262] mm/page_alloc: print node fallback order Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 110/262] mm: move node_reclaim_distance to fix NUMA without SMP Andrew Morton
                   ` (152 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, anshuman.khandual, bharata, kamezawa.hiroyu,
	krupa.ramakrishnan, lee.schermerhorn, linux-mm, mgorman,
	mm-commits, Sadagopan.Srinivasan, torvalds

From: Krupa Ramakrishnan <krupa.ramakrishnan@amd.com>
Subject: mm/page_alloc: use accumulated load when building node fallback list

In build_zonelists(), when the fallback list is built for the nodes, the
node load gets reinitialized during each iteration.  This results in nodes
with same distances occupying the same slot in different node fallback
lists rather than appearing in the intended round- robin manner.  This
results in one node getting picked for allocation more compared to other
nodes with the same distance.

As an example, consider a 4 node system with the following distance
matrix.

Node 0  1  2  3
----------------
0    10 12 32 32
1    12 10 32 32
2    32 32 10 12
3    32 32 12 10

For this case, the node fallback list gets built like this:

Node	Fallback list
---------------------
0	0 1 2 3
1	1 0 3 2
2	2 3 0 1
3	3 2 0 1 <-- Unexpected fallback order

In the fallback list for nodes 2 and 3, the nodes 0 and 1 appear in the
same order which results in more allocations getting satisfied from node 0
compared to node 1.

The effect of this on remote memory bandwidth as seen by stream benchmark
is shown below:

Case 1: Bandwidth from cores on nodes 2 & 3 to memory on nodes 0 & 1
	(numactl -m 0,1 ./stream_lowOverhead ... --cores <from 2, 3>)
Case 2: Bandwidth from cores on nodes 0 & 1 to memory on nodes 2 & 3
	(numactl -m 2,3 ./stream_lowOverhead ... --cores <from 0, 1>)

----------------------------------------
		BANDWIDTH (MB/s)
    TEST	Case 1		Case 2
----------------------------------------
    COPY	57479.6		110791.8
   SCALE	55372.9		105685.9
     ADD	50460.6		96734.2
  TRIADD	50397.6		97119.1
----------------------------------------

The bandwidth drop in Case 1 occurs because most of the allocations get
satisfied by node 0 as it appears first in the fallback order for both
nodes 2 and 3.

This can be fixed by accumulating the node load in build_zonelists()
rather than reinitializing it during each iteration.  With this the nodes
with the same distance rightly get assigned in the round robin manner.  In
fact this was how it was originally until the commit f0c0b2b808f2 ("change
zonelist order: zonelist order selection logic") dropped the load
accumulation and resorted to initializing the load during each iteration. 
While zonelist ordering was removed by commit c9bff3eebc09 ("mm,
page_alloc: rip out ZONELIST_ORDER_ZONE"), the change to the node load
accumulation in build_zonelists() remained.  So essentially this patch
reverts back to the accumulated node load logic.

After this fix, the fallback order gets built like this:

Node Fallback list
------------------
0    0 1 2 3
1    1 0 3 2
2    2 3 0 1
3    3 2 1 0 <-- Note the change here

The bandwidth in Case 1 improves and matches Case 2 as shown below.

----------------------------------------
		BANDWIDTH (MB/s)
    TEST	Case 1		Case 2
----------------------------------------
    COPY	110438.9	110107.2
   SCALE	105930.5	105817.5
     ADD	97005.1		96159.8
  TRIADD	97441.5		96757.1
----------------------------------------

The correctness of the fallback list generation has been verified for the
above node configuration where the node 3 starts as memory-less node and
comes up online only during memory hotplug.

[bharata@amd.com: Added changelog, review, test validation]
Link: https://lkml.kernel.org/r/20210830121603.1081-3-bharata@amd.com
Fixes: f0c0b2b808f2 ("change zonelist order: zonelist order selection logic")
Signed-off-by: Krupa Ramakrishnan <krupa.ramakrishnan@amd.com>
Co-developed-by: Sadagopan Srinivasan <Sadagopan.Srinivasan@amd.com>
Signed-off-by: Sadagopan Srinivasan <Sadagopan.Srinivasan@amd.com>
Signed-off-by: Bharata B Rao <bharata@amd.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_alloc-use-accumulated-load-when-building-node-fallback-list
+++ a/mm/page_alloc.c
@@ -6253,7 +6253,7 @@ static void build_zonelists(pg_data_t *p
 		 */
 		if (node_distance(local_node, node) !=
 		    node_distance(local_node, prev_node))
-			node_load[node] = load;
+			node_load[node] += load;
 
 		node_order[nr_nodes++] = node;
 		prev_node = node;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 110/262] mm: move node_reclaim_distance to fix NUMA without SMP
  2021-11-05 20:34 incoming Andrew Morton
                   ` (108 preceding siblings ...)
  2021-11-05 20:40 ` [patch 109/262] mm/page_alloc: use accumulated load when building node fallback list Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 111/262] mm: move fold_vm_numa_events() " Andrew Morton
                   ` (151 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, dalias, geert+renesas, gonsolo, juri.lelli, linux-mm, matt,
	mgorman, mingo, mm-commits, peterz, torvalds, vbabka,
	vincent.guittot, ysato

From: Geert Uytterhoeven <geert+renesas@glider.be>
Subject: mm: move node_reclaim_distance to fix NUMA without SMP

Patch series "Fix NUMA without SMP".

SuperH is the only architecture which still supports NUMA without SMP, for
good reasons (various memories scattered around the address space, each
with varying latencies).  This series fixes two build errors due to
variables and functions used by the NUMA code being provided by SMP-only
source files or sections.


This patch (of 2):

If CONFIG_NUMA=y, but CONFIG_SMP=n (e.g. sh/migor_defconfig):

    sh4-linux-gnu-ld: mm/page_alloc.o: in function `get_page_from_freelist':
    page_alloc.c:(.text+0x2c24): undefined reference to `node_reclaim_distance'

Fix this by moving the declaration of node_reclaim_distance from an
SMP-only to a generic file.

Link: https://lkml.kernel.org/r/cover.1631781495.git.geert+renesas@glider.be
Link: https://lkml.kernel.org/r/6432666a648dde85635341e6c918cee97c97d264.1631781495.git.geert+renesas@glider.be
Fixes: a55c7454a8c887b2 ("sched/topology: Improve load balancing on AMD EPYC systems")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Suggested-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Cc: Rich Felker <dalias@libc.org>
Cc: Gon Solo <gonsolo@gmail.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/sched/topology.c |    1 -
 mm/page_alloc.c         |    2 ++
 2 files changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/sched/topology.c~mm-move-node_reclaim_distance-to-fix-numa-without-smp
+++ a/kernel/sched/topology.c
@@ -1481,7 +1481,6 @@ static int			sched_domains_curr_level;
 int				sched_max_numa_distance;
 static int			*sched_domains_numa_distance;
 static struct cpumask		***sched_domains_numa_masks;
-int __read_mostly		node_reclaim_distance = RECLAIM_DISTANCE;
 
 static unsigned long __read_mostly *sched_numa_onlined_nodes;
 #endif
--- a/mm/page_alloc.c~mm-move-node_reclaim_distance-to-fix-numa-without-smp
+++ a/mm/page_alloc.c
@@ -3960,6 +3960,8 @@ bool zone_watermark_ok_safe(struct zone
 }
 
 #ifdef CONFIG_NUMA
+int __read_mostly node_reclaim_distance = RECLAIM_DISTANCE;
+
 static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
 {
 	return node_distance(zone_to_nid(local_zone), zone_to_nid(zone)) <=
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 111/262] mm: move fold_vm_numa_events() to fix NUMA without SMP
  2021-11-05 20:34 incoming Andrew Morton
                   ` (109 preceding siblings ...)
  2021-11-05 20:40 ` [patch 110/262] mm: move node_reclaim_distance to fix NUMA without SMP Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 112/262] mm/page_alloc.c: do not acquire zone lock in is_free_buddy_page() Andrew Morton
                   ` (150 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, dalias, geert+renesas, gonsolo, juri.lelli, linux-mm, matt,
	mgorman, mingo, mm-commits, peterz, torvalds, vbabka,
	vincent.guittot, ysato

From: Geert Uytterhoeven <geert+renesas@glider.be>
Subject: mm: move fold_vm_numa_events() to fix NUMA without SMP

If CONFIG_NUMA=y, but CONFIG_SMP=n (e.g. sh/migor_defconfig):

    sh4-linux-gnu-ld: mm/vmstat.o: in function `vmstat_start':
    vmstat.c:(.text+0x97c): undefined reference to `fold_vm_numa_events'
    sh4-linux-gnu-ld: drivers/base/node.o: in function `node_read_vmstat':
    node.c:(.text+0x140): undefined reference to `fold_vm_numa_events'
    sh4-linux-gnu-ld: drivers/base/node.o: in function `node_read_numastat':
    node.c:(.text+0x1d0): undefined reference to `fold_vm_numa_events'

Fix this by moving fold_vm_numa_events() outside the SMP-only section.

Link: https://lkml.kernel.org/r/9d16ccdd9ef32803d7100c84f737de6a749314fb.1631781495.git.geert+renesas@glider.be
Fixes: f19298b9516c1a03 ("mm/vmstat: convert NUMA statistics to basic NUMA counters")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Gon Solo <gonsolo@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmstat.c |   56 +++++++++++++++++++++++++-------------------------
 1 file changed, 28 insertions(+), 28 deletions(-)

--- a/mm/vmstat.c~mm-move-fold_vm_numa_events-to-fix-numa-without-smp
+++ a/mm/vmstat.c
@@ -165,6 +165,34 @@ atomic_long_t vm_numa_event[NR_VM_NUMA_E
 EXPORT_SYMBOL(vm_zone_stat);
 EXPORT_SYMBOL(vm_node_stat);
 
+#ifdef CONFIG_NUMA
+static void fold_vm_zone_numa_events(struct zone *zone)
+{
+	unsigned long zone_numa_events[NR_VM_NUMA_EVENT_ITEMS] = { 0, };
+	int cpu;
+	enum numa_stat_item item;
+
+	for_each_online_cpu(cpu) {
+		struct per_cpu_zonestat *pzstats;
+
+		pzstats = per_cpu_ptr(zone->per_cpu_zonestats, cpu);
+		for (item = 0; item < NR_VM_NUMA_EVENT_ITEMS; item++)
+			zone_numa_events[item] += xchg(&pzstats->vm_numa_event[item], 0);
+	}
+
+	for (item = 0; item < NR_VM_NUMA_EVENT_ITEMS; item++)
+		zone_numa_event_add(zone_numa_events[item], zone, item);
+}
+
+void fold_vm_numa_events(void)
+{
+	struct zone *zone;
+
+	for_each_populated_zone(zone)
+		fold_vm_zone_numa_events(zone);
+}
+#endif
+
 #ifdef CONFIG_SMP
 
 int calculate_pressure_threshold(struct zone *zone)
@@ -771,34 +799,6 @@ static int fold_diff(int *zone_diff, int
 	return changes;
 }
 
-#ifdef CONFIG_NUMA
-static void fold_vm_zone_numa_events(struct zone *zone)
-{
-	unsigned long zone_numa_events[NR_VM_NUMA_EVENT_ITEMS] = { 0, };
-	int cpu;
-	enum numa_stat_item item;
-
-	for_each_online_cpu(cpu) {
-		struct per_cpu_zonestat *pzstats;
-
-		pzstats = per_cpu_ptr(zone->per_cpu_zonestats, cpu);
-		for (item = 0; item < NR_VM_NUMA_EVENT_ITEMS; item++)
-			zone_numa_events[item] += xchg(&pzstats->vm_numa_event[item], 0);
-	}
-
-	for (item = 0; item < NR_VM_NUMA_EVENT_ITEMS; item++)
-		zone_numa_event_add(zone_numa_events[item], zone, item);
-}
-
-void fold_vm_numa_events(void)
-{
-	struct zone *zone;
-
-	for_each_populated_zone(zone)
-		fold_vm_zone_numa_events(zone);
-}
-#endif
-
 /*
  * Update the zone counters for the current cpu.
  *
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 112/262] mm/page_alloc.c: do not acquire zone lock in is_free_buddy_page()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (110 preceding siblings ...)
  2021-11-05 20:40 ` [patch 111/262] mm: move fold_vm_numa_events() " Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 113/262] mm/page_alloc: detect allocation forbidden by cpuset and bail out early Andrew Morton
                   ` (149 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, edumazet, hughd, linux-mm, mm-commits, torvalds

From: Eric Dumazet <edumazet@google.com>
Subject: mm/page_alloc.c: do not acquire zone lock in is_free_buddy_page()

Grabbing zone lock in is_free_buddy_page() gives a wrong sense of safety,
and has potential performance implications when zone is experiencing lock
contention.

In any case, if a caller needs a stable result, it should grab zone lock
before calling this function.

Link: https://lkml.kernel.org/r/20210922152833.4023972-1-eric.dumazet@gmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/mm/page_alloc.c~mm-do-not-acquire-zone-lock-in-is_free_buddy_page
+++ a/mm/page_alloc.c
@@ -9356,21 +9356,21 @@ void __offline_isolated_pages(unsigned l
 }
 #endif
 
+/*
+ * This function returns a stable result only if called under zone lock.
+ */
 bool is_free_buddy_page(struct page *page)
 {
-	struct zone *zone = page_zone(page);
 	unsigned long pfn = page_to_pfn(page);
-	unsigned long flags;
 	unsigned int order;
 
-	spin_lock_irqsave(&zone->lock, flags);
 	for (order = 0; order < MAX_ORDER; order++) {
 		struct page *page_head = page - (pfn & ((1 << order) - 1));
 
-		if (PageBuddy(page_head) && buddy_order(page_head) >= order)
+		if (PageBuddy(page_head) &&
+		    buddy_order_unsafe(page_head) >= order)
 			break;
 	}
-	spin_unlock_irqrestore(&zone->lock, flags);
 
 	return order < MAX_ORDER;
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 113/262] mm/page_alloc: detect allocation forbidden by cpuset and bail out early
  2021-11-05 20:34 incoming Andrew Morton
                   ` (111 preceding siblings ...)
  2021-11-05 20:40 ` [patch 112/262] mm/page_alloc.c: do not acquire zone lock in is_free_buddy_page() Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 114/262] mm/page_alloc.c: show watermark_boost of zone in zoneinfo Andrew Morton
                   ` (148 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, feng.tang, hannes, linux-mm, lizefan.x, mgorman, mhocko,
	mm-commits, rientjes, tj, torvalds, vbabka

From: Feng Tang <feng.tang@intel.com>
Subject: mm/page_alloc: detect allocation forbidden by cpuset and bail out early

There was a report that starting an Ubuntu in docker while using cpuset to
bind it to movable nodes (a node only has movable zone, like a node for
hotplug or a Persistent Memory node in normal usage) will fail due to
memory allocation failure, and then OOM is involved and many other
innocent processes got killed.  It can be reproduced with command: $docker
run -it --rm --cpuset-mems 4 ubuntu:latest bash -c "grep Mems_allowed
/proc/self/status" (node 4 is a movable node)

  runc:[2:INIT] invoked oom-killer: gfp_mask=0x500cc2(GFP_HIGHUSER|__GFP_ACCOUNT), order=0, oom_score_adj=0
  CPU: 8 PID: 8291 Comm: runc:[2:INIT] Tainted: G        W I E     5.8.2-0.g71b519a-default #1 openSUSE Tumbleweed (unreleased)
  Hardware name: Dell Inc. PowerEdge R640/0PHYDR, BIOS 2.6.4 04/09/2020
  Call Trace:
   dump_stack+0x6b/0x88
   dump_header+0x4a/0x1e2
   oom_kill_process.cold+0xb/0x10
   out_of_memory.part.0+0xaf/0x230
   out_of_memory+0x3d/0x80
   __alloc_pages_slowpath.constprop.0+0x954/0xa20
   __alloc_pages_nodemask+0x2d3/0x300
   pipe_write+0x322/0x590
   new_sync_write+0x196/0x1b0
   vfs_write+0x1c3/0x1f0
   ksys_write+0xa7/0xe0
   do_syscall_64+0x52/0xd0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Mem-Info:
  active_anon:392832 inactive_anon:182 isolated_anon:0
   active_file:68130 inactive_file:151527 isolated_file:0
   unevictable:2701 dirty:0 writeback:7
   slab_reclaimable:51418 slab_unreclaimable:116300
   mapped:45825 shmem:735 pagetables:2540 bounce:0
   free:159849484 free_pcp:73 free_cma:0
  Node 4 active_anon:1448kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB all_unreclaimable? no
  Node 4 Movable free:130021408kB min:9140kB low:139160kB high:269180kB reserved_highatomic:0KB active_anon:1448kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:130023424kB managed:130023424kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:292kB local_pcp:84kB free_cma:0kB
  lowmem_reserve[]: 0 0 0 0 0
  Node 4 Movable: 1*4kB (M) 0*8kB 0*16kB 1*32kB (M) 0*64kB 0*128kB 1*256kB (M) 1*512kB (M) 1*1024kB (M) 0*2048kB 31743*4096kB (M) = 130021156kB

  oom-kill:constraint=CONSTRAINT_CPUSET,nodemask=(null),cpuset=docker-9976a269caec812c134fa317f27487ee36e1129beba7278a463dd53e5fb9997b.scope,mems_allowed=4,global_oom,task_memcg=/system.slice/containerd.service,task=containerd,pid=4100,uid=0
  Out of memory: Killed process 4100 (containerd) total-vm:4077036kB, anon-rss:51184kB, file-rss:26016kB, shmem-rss:0kB, UID:0 pgtables:676kB oom_score_adj:0
  oom_reaper: reaped process 8248 (docker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
  oom_reaper: reaped process 2054 (node_exporter), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
  oom_reaper: reaped process 1452 (systemd-journal), now anon-rss:0kB, file-rss:8564kB, shmem-rss:4kB
  oom_reaper: reaped process 2146 (munin-node), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
  oom_reaper: reaped process 8291 (runc:[2:INIT]), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

The reason is, in the case, the target cpuset nodes only have movable
zone, while the creation of an OS in docker sometimes needs to allocate
memory in non-movable zones (dma/dma32/normal) like GFP_HIGHUSER, and the
cpuset limit forbids the allocation, then out-of-memory killing is
involved even when normal nodes and movable nodes both have many free
memory.

The OOM killer cannot help to resolve the situation as there is no usable
memory for the request in the cpuset scope.  The only reasonable measure
to take is to fail the allocation right away and have the caller to deal
with it.

So add a check for cases like this in the slowpath of allocation, and bail
out early returning NULL for the allocation.

As page allocation is one of the hottest path in kernel, this check will
hurt all users with sane cpuset configuration, add a static branch check
and detect the abnormal config in cpuset memory binding setup so that the
extra check cost in page allocation is not paid by everyone.

[thanks to Micho Hocko and David Rientjes for suggesting not handling
 it inside OOM code, adding cpuset check, refining comments]

Link: https://lkml.kernel.org/r/1632481657-68112-1-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/cpuset.h |   17 +++++++++++++++++
 include/linux/mmzone.h |   22 ++++++++++++++++++++++
 kernel/cgroup/cpuset.c |   23 +++++++++++++++++++++++
 mm/page_alloc.c        |   13 +++++++++++++
 4 files changed, 75 insertions(+)

--- a/include/linux/cpuset.h~mm-page_alloc-detect-allocation-forbidden-by-cpuset-and-bail-out-early
+++ a/include/linux/cpuset.h
@@ -34,6 +34,8 @@
  */
 extern struct static_key_false cpusets_pre_enable_key;
 extern struct static_key_false cpusets_enabled_key;
+extern struct static_key_false cpusets_insane_config_key;
+
 static inline bool cpusets_enabled(void)
 {
 	return static_branch_unlikely(&cpusets_enabled_key);
@@ -51,6 +53,19 @@ static inline void cpuset_dec(void)
 	static_branch_dec_cpuslocked(&cpusets_pre_enable_key);
 }
 
+/*
+ * This will get enabled whenever a cpuset configuration is considered
+ * unsupportable in general. E.g. movable only node which cannot satisfy
+ * any non movable allocations (see update_nodemask). Page allocator
+ * needs to make additional checks for those configurations and this
+ * check is meant to guard those checks without any overhead for sane
+ * configurations.
+ */
+static inline bool cpusets_insane_config(void)
+{
+	return static_branch_unlikely(&cpusets_insane_config_key);
+}
+
 extern int cpuset_init(void);
 extern void cpuset_init_smp(void);
 extern void cpuset_force_rebuild(void);
@@ -167,6 +182,8 @@ static inline void set_mems_allowed(node
 
 static inline bool cpusets_enabled(void) { return false; }
 
+static inline bool cpusets_insane_config(void) { return false; }
+
 static inline int cpuset_init(void) { return 0; }
 static inline void cpuset_init_smp(void) {}
 
--- a/include/linux/mmzone.h~mm-page_alloc-detect-allocation-forbidden-by-cpuset-and-bail-out-early
+++ a/include/linux/mmzone.h
@@ -1220,6 +1220,28 @@ static inline struct zoneref *first_zone
 #define for_each_zone_zonelist(zone, z, zlist, highidx) \
 	for_each_zone_zonelist_nodemask(zone, z, zlist, highidx, NULL)
 
+/* Whether the 'nodes' are all movable nodes */
+static inline bool movable_only_nodes(nodemask_t *nodes)
+{
+	struct zonelist *zonelist;
+	struct zoneref *z;
+	int nid;
+
+	if (nodes_empty(*nodes))
+		return false;
+
+	/*
+	 * We can chose arbitrary node from the nodemask to get a
+	 * zonelist as they are interlinked. We just need to find
+	 * at least one zone that can satisfy kernel allocations.
+	 */
+	nid = first_node(*nodes);
+	zonelist = &NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK];
+	z = first_zones_zonelist(zonelist, ZONE_NORMAL,	nodes);
+	return (!z->zone) ? true : false;
+}
+
+
 #ifdef CONFIG_SPARSEMEM
 #include <asm/sparsemem.h>
 #endif
--- a/kernel/cgroup/cpuset.c~mm-page_alloc-detect-allocation-forbidden-by-cpuset-and-bail-out-early
+++ a/kernel/cgroup/cpuset.c
@@ -69,6 +69,13 @@
 DEFINE_STATIC_KEY_FALSE(cpusets_pre_enable_key);
 DEFINE_STATIC_KEY_FALSE(cpusets_enabled_key);
 
+/*
+ * There could be abnormal cpuset configurations for cpu or memory
+ * node binding, add this key to provide a quick low-cost judgement
+ * of the situation.
+ */
+DEFINE_STATIC_KEY_FALSE(cpusets_insane_config_key);
+
 /* See "Frequency meter" comments, below. */
 
 struct fmeter {
@@ -372,6 +379,17 @@ static DECLARE_WORK(cpuset_hotplug_work,
 
 static DECLARE_WAIT_QUEUE_HEAD(cpuset_attach_wq);
 
+static inline void check_insane_mems_config(nodemask_t *nodes)
+{
+	if (!cpusets_insane_config() &&
+		movable_only_nodes(nodes)) {
+		static_branch_enable(&cpusets_insane_config_key);
+		pr_info("Unsupported (movable nodes only) cpuset configuration detected (nmask=%*pbl)!\n"
+			"Cpuset allocations might fail even with a lot of memory available.\n",
+			nodemask_pr_args(nodes));
+	}
+}
+
 /*
  * Cgroup v2 behavior is used on the "cpus" and "mems" control files when
  * on default hierarchy or when the cpuset_v2_mode flag is set by mounting
@@ -1870,6 +1888,8 @@ static int update_nodemask(struct cpuset
 	if (retval < 0)
 		goto done;
 
+	check_insane_mems_config(&trialcs->mems_allowed);
+
 	spin_lock_irq(&callback_lock);
 	cs->mems_allowed = trialcs->mems_allowed;
 	spin_unlock_irq(&callback_lock);
@@ -3173,6 +3193,9 @@ update_tasks:
 	cpus_updated = !cpumask_equal(&new_cpus, cs->effective_cpus);
 	mems_updated = !nodes_equal(new_mems, cs->effective_mems);
 
+	if (mems_updated)
+		check_insane_mems_config(&new_mems);
+
 	if (is_in_v2_mode())
 		hotplug_update_tasks(cs, &new_cpus, &new_mems,
 				     cpus_updated, mems_updated);
--- a/mm/page_alloc.c~mm-page_alloc-detect-allocation-forbidden-by-cpuset-and-bail-out-early
+++ a/mm/page_alloc.c
@@ -4910,6 +4910,19 @@ retry_cpuset:
 	if (!ac->preferred_zoneref->zone)
 		goto nopage;
 
+	/*
+	 * Check for insane configurations where the cpuset doesn't contain
+	 * any suitable zone to satisfy the request - e.g. non-movable
+	 * GFP_HIGHUSER allocations from MOVABLE nodes only.
+	 */
+	if (cpusets_insane_config() && (gfp_mask & __GFP_HARDWALL)) {
+		struct zoneref *z = first_zones_zonelist(ac->zonelist,
+					ac->highest_zoneidx,
+					&cpuset_current_mems_allowed);
+		if (!z->zone)
+			goto nopage;
+	}
+
 	if (alloc_flags & ALLOC_KSWAPD)
 		wake_all_kswapds(order, gfp_mask, ac);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 114/262] mm/page_alloc.c: show watermark_boost of zone in zoneinfo
  2021-11-05 20:34 incoming Andrew Morton
                   ` (112 preceding siblings ...)
  2021-11-05 20:40 ` [patch 113/262] mm/page_alloc: detect allocation forbidden by cpuset and bail out early Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 115/262] mm: create a new system state and fix core_kernel_text() Andrew Morton
                   ` (147 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, liangcaifan19, linux-mm, mm-commits, torvalds, zhang.lyra

From: Liangcai Fan <liangcaifan19@gmail.com>
Subject: mm/page_alloc.c: show watermark_boost of zone in zoneinfo

min/low/high_wmark_pages(z) is defined as
(z->_watermark[WMARK_MIN/LOW/HIGH] + z->watermark_boost).
If kswapd is frequently waked up due to the increase of
min/low/high_wmark_pages, printing watermark_boost can quickly locate
whether watermark_boost or _watermark[WMARK_MIN/LOW/HIGH] caused
min/low/high_wmark_pages to increase.

Link: https://lkml.kernel.org/r/1632472566-12246-1-git-send-email-liangcaifan19@gmail.com
Signed-off-by: Liangcai Fan <liangcaifan19@gmail.com>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    2 ++
 mm/vmstat.c     |    2 ++
 2 files changed, 4 insertions(+)

--- a/mm/page_alloc.c~mm-show-watermark_boost-of-zone-in-zoneinfo
+++ a/mm/page_alloc.c
@@ -5993,6 +5993,7 @@ void show_free_areas(unsigned int filter
 		printk(KERN_CONT
 			"%s"
 			" free:%lukB"
+			" boost:%lukB"
 			" min:%lukB"
 			" low:%lukB"
 			" high:%lukB"
@@ -6013,6 +6014,7 @@ void show_free_areas(unsigned int filter
 			"\n",
 			zone->name,
 			K(zone_page_state(zone, NR_FREE_PAGES)),
+			K(zone->watermark_boost),
 			K(min_wmark_pages(zone)),
 			K(low_wmark_pages(zone)),
 			K(high_wmark_pages(zone)),
--- a/mm/vmstat.c~mm-show-watermark_boost-of-zone-in-zoneinfo
+++ a/mm/vmstat.c
@@ -1656,6 +1656,7 @@ static void zoneinfo_show_print(struct s
 	}
 	seq_printf(m,
 		   "\n  pages free     %lu"
+		   "\n        boost    %lu"
 		   "\n        min      %lu"
 		   "\n        low      %lu"
 		   "\n        high     %lu"
@@ -1664,6 +1665,7 @@ static void zoneinfo_show_print(struct s
 		   "\n        managed  %lu"
 		   "\n        cma      %lu",
 		   zone_page_state(zone, NR_FREE_PAGES),
+		   zone->watermark_boost,
 		   min_wmark_pages(zone),
 		   low_wmark_pages(zone),
 		   high_wmark_pages(zone),
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 115/262] mm: create a new system state and fix core_kernel_text()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (113 preceding siblings ...)
  2021-11-05 20:40 ` [patch 114/262] mm/page_alloc.c: show watermark_boost of zone in zoneinfo Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 116/262] mm: make generic arch_is_kernel_initmem_freed() do what it says Andrew Morton
                   ` (146 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, benh, christophe.leroy, gerald.schaefer, hca, linux-mm,
	mm-commits, paulus, torvalds, wangkefeng.wang

From: Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: mm: create a new system state and fix core_kernel_text()

core_kernel_text() considers that until system_state in at least
SYSTEM_RUNNING, init memory is valid.

But init memory is freed a few lines before setting SYSTEM_RUNNING, so we
have a small period of time when core_kernel_text() is wrong.

Create an intermediate system state called SYSTEM_FREEING_INIT that is set
before starting freeing init memory, and use it in core_kernel_text() to
report init memory invalid earlier.

Link: https://lkml.kernel.org/r/9ecfdee7dd4d741d172cb93ff1d87f1c58127c9a.1633001016.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/kernel.h |    1 +
 init/main.c            |    2 ++
 kernel/extable.c       |    2 +-
 3 files changed, 4 insertions(+), 1 deletion(-)

--- a/include/linux/kernel.h~mm-create-a-new-system-state-and-fix-core_kernel_text
+++ a/include/linux/kernel.h
@@ -248,6 +248,7 @@ extern bool early_boot_irqs_disabled;
 extern enum system_states {
 	SYSTEM_BOOTING,
 	SYSTEM_SCHEDULING,
+	SYSTEM_FREEING_INITMEM,
 	SYSTEM_RUNNING,
 	SYSTEM_HALT,
 	SYSTEM_POWER_OFF,
--- a/init/main.c~mm-create-a-new-system-state-and-fix-core_kernel_text
+++ a/init/main.c
@@ -1506,6 +1506,8 @@ static int __ref kernel_init(void *unuse
 	kernel_init_freeable();
 	/* need to finish all async __init code before freeing the memory */
 	async_synchronize_full();
+
+	system_state = SYSTEM_FREEING_INITMEM;
 	kprobe_free_init_mem();
 	ftrace_free_init_mem();
 	kgdb_free_init_mem();
--- a/kernel/extable.c~mm-create-a-new-system-state-and-fix-core_kernel_text
+++ a/kernel/extable.c
@@ -76,7 +76,7 @@ int notrace core_kernel_text(unsigned lo
 	    addr < (unsigned long)_etext)
 		return 1;
 
-	if (system_state < SYSTEM_RUNNING &&
+	if (system_state < SYSTEM_FREEING_INITMEM &&
 	    init_kernel_text(addr))
 		return 1;
 	return 0;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 116/262] mm: make generic arch_is_kernel_initmem_freed() do what it says
  2021-11-05 20:34 incoming Andrew Morton
                   ` (114 preceding siblings ...)
  2021-11-05 20:40 ` [patch 115/262] mm: create a new system state and fix core_kernel_text() Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 117/262] powerpc: use generic version of arch_is_kernel_initmem_freed() Andrew Morton
                   ` (145 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, benh, christophe.leroy, gerald.schaefer, hca, linux-mm,
	mm-commits, paulus, torvalds, wangkefeng.wang

From: Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: mm: make generic arch_is_kernel_initmem_freed() do what it says

Commit 7a5da02de8d6 ("locking/lockdep: check for freed initmem in
static_obj()") added arch_is_kernel_initmem_freed() which is supposed to
report whether an object is part of already freed init memory.

For the time being, the generic version of arch_is_kernel_initmem_freed()
always reports 'false', allthough free_initmem() is generically called on
all architectures.

Therefore, change the generic version of arch_is_kernel_initmem_freed() to
check whether free_initmem() has been called.  If so, then check if a
given address falls into init memory.

To ease the use of system_state, move it out of line into its only caller
which is lockdep.c

Link: https://lkml.kernel.org/r/1d40783e676e07858be97d881f449ee7ea8adfb1.1633001016.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/asm-generic/sections.h |   14 --------------
 kernel/locking/lockdep.c       |   15 +++++++++++++++
 2 files changed, 15 insertions(+), 14 deletions(-)

--- a/include/asm-generic/sections.h~mm-make-generic-arch_is_kernel_initmem_freed-do-what-it-says
+++ a/include/asm-generic/sections.h
@@ -80,20 +80,6 @@ static inline int arch_is_kernel_data(un
 }
 #endif
 
-/*
- * Check if an address is part of freed initmem. This is needed on architectures
- * with virt == phys kernel mapping, for code that wants to check if an address
- * is part of a static object within [_stext, _end]. After initmem is freed,
- * memory can be allocated from it, and such allocations would then have
- * addresses within the range [_stext, _end].
- */
-#ifndef arch_is_kernel_initmem_freed
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	return 0;
-}
-#endif
-
 /**
  * memory_contains - checks if an object is contained within a memory region
  * @begin: virtual address of the beginning of the memory region
--- a/kernel/locking/lockdep.c~mm-make-generic-arch_is_kernel_initmem_freed-do-what-it-says
+++ a/kernel/locking/lockdep.c
@@ -788,6 +788,21 @@ static int very_verbose(struct lock_clas
  * Is this the address of a static object:
  */
 #ifdef __KERNEL__
+/*
+ * Check if an address is part of freed initmem. After initmem is freed,
+ * memory can be allocated from it, and such allocations would then have
+ * addresses within the range [_stext, _end].
+ */
+#ifndef arch_is_kernel_initmem_freed
+static int arch_is_kernel_initmem_freed(unsigned long addr)
+{
+	if (system_state < SYSTEM_FREEING_INITMEM)
+		return 0;
+
+	return init_section_contains((void *)addr, 1);
+}
+#endif
+
 static int static_obj(const void *obj)
 {
 	unsigned long start = (unsigned long) &_stext,
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 117/262] powerpc: use generic version of arch_is_kernel_initmem_freed()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (115 preceding siblings ...)
  2021-11-05 20:40 ` [patch 116/262] mm: make generic arch_is_kernel_initmem_freed() do what it says Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 118/262] s390: " Andrew Morton
                   ` (144 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, benh, christophe.leroy, gerald.schaefer, hca, linux-mm,
	mm-commits, paulus, torvalds, wangkefeng.wang

From: Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: powerpc: use generic version of arch_is_kernel_initmem_freed()

Generic version of arch_is_kernel_initmem_freed() now does the same as
powerpc version.

Remove the powerpc version.

Link: https://lkml.kernel.org/r/c53764eb45d41491e2b21da2e7812239897dbebb.1633001016.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/include/asm/sections.h |   13 -------------
 1 file changed, 13 deletions(-)

--- a/arch/powerpc/include/asm/sections.h~powerpc-use-generic-version-of-arch_is_kernel_initmem_freed
+++ a/arch/powerpc/include/asm/sections.h
@@ -6,21 +6,8 @@
 #include <linux/elf.h>
 #include <linux/uaccess.h>
 
-#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed
-
 #include <asm-generic/sections.h>
 
-extern bool init_mem_is_free;
-
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	if (!init_mem_is_free)
-		return 0;
-
-	return addr >= (unsigned long)__init_begin &&
-		addr < (unsigned long)__init_end;
-}
-
 extern char __head_end[];
 
 #ifdef __powerpc64__
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 118/262] s390: use generic version of arch_is_kernel_initmem_freed()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (116 preceding siblings ...)
  2021-11-05 20:40 ` [patch 117/262] powerpc: use generic version of arch_is_kernel_initmem_freed() Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 119/262] mm: page_alloc: use migrate_disable() in drain_local_pages_wq() Andrew Morton
                   ` (143 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, benh, christophe.leroy, gerald.schaefer, hca, linux-mm,
	mm-commits, paulus, torvalds, wangkefeng.wang

From: Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: s390: use generic version of arch_is_kernel_initmem_freed()

Generic version of arch_is_kernel_initmem_freed() now does the same
as s390 version.

Remove the s390 version.

Link: https://lkml.kernel.org/r/b6feb5dfe611a322de482762fc2df3a9eece70c7.1633001016.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/s390/include/asm/sections.h |   12 ------------
 arch/s390/mm/init.c              |    3 ---
 2 files changed, 15 deletions(-)

--- a/arch/s390/include/asm/sections.h~s390-use-generic-version-of-arch_is_kernel_initmem_freed
+++ a/arch/s390/include/asm/sections.h
@@ -2,20 +2,8 @@
 #ifndef _S390_SECTIONS_H
 #define _S390_SECTIONS_H
 
-#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed
-
 #include <asm-generic/sections.h>
 
-extern bool initmem_freed;
-
-static inline int arch_is_kernel_initmem_freed(unsigned long addr)
-{
-	if (!initmem_freed)
-		return 0;
-	return addr >= (unsigned long)__init_begin &&
-	       addr < (unsigned long)__init_end;
-}
-
 /*
  * .boot.data section contains variables "shared" between the decompressor and
  * the decompressed kernel. The decompressor will store values in them, and
--- a/arch/s390/mm/init.c~s390-use-generic-version-of-arch_is_kernel_initmem_freed
+++ a/arch/s390/mm/init.c
@@ -58,8 +58,6 @@ unsigned long empty_zero_page, zero_page
 EXPORT_SYMBOL(empty_zero_page);
 EXPORT_SYMBOL(zero_page_mask);
 
-bool initmem_freed;
-
 static void __init setup_zero_pages(void)
 {
 	unsigned int order;
@@ -214,7 +212,6 @@ void __init mem_init(void)
 
 void free_initmem(void)
 {
-	initmem_freed = true;
 	__set_memory((unsigned long)_sinittext,
 		     (unsigned long)(_einittext - _sinittext) >> PAGE_SHIFT,
 		     SET_MEMORY_RW | SET_MEMORY_NX);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 119/262] mm: page_alloc: use migrate_disable() in drain_local_pages_wq()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (117 preceding siblings ...)
  2021-11-05 20:40 ` [patch 118/262] s390: " Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 120/262] mm/page_alloc: use clamp() to simplify code Andrew Morton
                   ` (142 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, bigeasy, linux-mm, mm-commits, peterz, tglx, torvalds

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: mm: page_alloc: use migrate_disable() in drain_local_pages_wq()

drain_local_pages_wq() disables preemption to avoid CPU migration during
CPU hotplug and can't use cpus_read_lock().

Using migrate_disable() works here, too.  The scheduler won't take the CPU
offline until the task left the migrate-disable section.  The problem with
disabled preemption here is that drain_local_pages() acquires locks which
are turned into sleeping locks on PREEMPT_RT and can't be acquired with
disabled preemption.

Use migrate_disable() in drain_local_pages_wq().

Link: https://lkml.kernel.org/r/20211015210933.viw6rjvo64qtqxn4@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-use-migrate_disable-in-drain_local_pages_wq
+++ a/mm/page_alloc.c
@@ -3141,9 +3141,9 @@ static void drain_local_pages_wq(struct
 	 * cpu which is alright but we also have to make sure to not move to
 	 * a different one.
 	 */
-	preempt_disable();
+	migrate_disable();
 	drain_local_pages(drain->zone);
-	preempt_enable();
+	migrate_enable();
 }
 
 /*
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 120/262] mm/page_alloc: use clamp() to simplify code
  2021-11-05 20:34 incoming Andrew Morton
                   ` (118 preceding siblings ...)
  2021-11-05 20:40 ` [patch 119/262] mm: page_alloc: use migrate_disable() in drain_local_pages_wq() Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:40 ` [patch 121/262] mm: fix data race in PagePoisoned() Andrew Morton
                   ` (141 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, bobo.shaobowang, david, huawei.libin, linux-mm, mm-commits,
	torvalds, weiyongjun1

From: Wang ShaoBo <bobo.shaobowang@huawei.com>
Subject: mm/page_alloc: use clamp() to simplify code

This patch uses clamp() to simplify code in init_per_zone_wmark_min().

Link: https://lkml.kernel.org/r/20211021034830.1049150-1-bobo.shaobowang@huawei.com
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |   12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-use-clamp-to-simplify-code
+++ a/mm/page_alloc.c
@@ -8477,16 +8477,12 @@ int __meminit init_per_zone_wmark_min(vo
 	lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
 	new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
 
-	if (new_min_free_kbytes > user_min_free_kbytes) {
-		min_free_kbytes = new_min_free_kbytes;
-		if (min_free_kbytes < 128)
-			min_free_kbytes = 128;
-		if (min_free_kbytes > 262144)
-			min_free_kbytes = 262144;
-	} else {
+	if (new_min_free_kbytes > user_min_free_kbytes)
+		min_free_kbytes = clamp(new_min_free_kbytes, 128, 262144);
+	else
 		pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
 				new_min_free_kbytes, user_min_free_kbytes);
-	}
+
 	setup_per_zone_wmarks();
 	refresh_zone_stat_thresholds();
 	setup_per_zone_lowmem_reserve();
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 121/262] mm: fix data race in PagePoisoned()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (119 preceding siblings ...)
  2021-11-05 20:40 ` [patch 120/262] mm/page_alloc: use clamp() to simplify code Andrew Morton
@ 2021-11-05 20:40 ` Andrew Morton
  2021-11-05 20:41 ` [patch 122/262] mm/memory_failure: constify static mm_walk_ops Andrew Morton
                   ` (140 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:40 UTC (permalink / raw)
  To: akpm, elver, kirill.shutemov, linux-mm, mm-commits, n-horiguchi,
	oliver.sang, torvalds, will

From: Marco Elver <elver@google.com>
Subject: mm: fix data race in PagePoisoned()

PagePoisoned() accesses page->flags which can be updated concurrently:

  | BUG: KCSAN: data-race in next_uptodate_page / unlock_page
  |
  | write (marked) to 0xffffea00050f37c0 of 8 bytes by task 1872 on cpu 1:
  |  instrument_atomic_write           include/linux/instrumented.h:87 [inline]
  |  clear_bit_unlock_is_negative_byte include/asm-generic/bitops/instrumented-lock.h:74 [inline]
  |  unlock_page+0x102/0x1b0           mm/filemap.c:1465
  |  filemap_map_pages+0x6c6/0x890     mm/filemap.c:3057
  |  ...
  | read to 0xffffea00050f37c0 of 8 bytes by task 1873 on cpu 0:
  |  PagePoisoned                   include/linux/page-flags.h:204 [inline]
  |  PageReadahead                  include/linux/page-flags.h:382 [inline]
  |  next_uptodate_page+0x456/0x830 mm/filemap.c:2975
  |  ...
  | CPU: 0 PID: 1873 Comm: systemd-udevd Not tainted 5.11.0-rc4-00001-gf9ce0be71d1f #1

To avoid the compiler tearing or otherwise optimizing the access, use
READ_ONCE() to access flags.

Link: https://lore.kernel.org/all/20210826144157.GA26950@xsang-OptiPlex-9020/
Link: https://lkml.kernel.org/r/20210913113542.2658064-1-elver@google.com
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/page-flags.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/page-flags.h~mm-fix-data-race-in-pagepoisoned
+++ a/include/linux/page-flags.h
@@ -215,7 +215,7 @@ static __always_inline int PageCompound(
 #define	PAGE_POISON_PATTERN	-1l
 static inline int PagePoisoned(const struct page *page)
 {
-	return page->flags == PAGE_POISON_PATTERN;
+	return READ_ONCE(page->flags) == PAGE_POISON_PATTERN;
 }
 
 #ifdef CONFIG_DEBUG_VM
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 122/262] mm/memory_failure: constify static mm_walk_ops
  2021-11-05 20:34 incoming Andrew Morton
                   ` (120 preceding siblings ...)
  2021-11-05 20:40 ` [patch 121/262] mm: fix data race in PagePoisoned() Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 123/262] mm: filemap: coding style cleanup for filemap_map_pmd() Andrew Morton
                   ` (139 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, anshuman.khandual, linux-mm, mm-commits, naoya.horiguchi,
	rikard.falkeborn, torvalds

From: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Subject: mm/memory_failure: constify static mm_walk_ops

The only usage of hwp_walk_ops is to pass its address to walk_page_range()
which takes a pointer to const mm_walk_ops as argument.  Make it const to
allow the compiler to put it in read-only memory.

Link: https://lkml.kernel.org/r/20211014075042.17174-3-rikard.falkeborn@gmail.com
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/memory-failure.c~mm-memory_failure-constify-static-mm_walk_ops
+++ a/mm/memory-failure.c
@@ -674,7 +674,7 @@ static int hwpoison_hugetlb_range(pte_t
 #define hwpoison_hugetlb_range	NULL
 #endif
 
-static struct mm_walk_ops hwp_walk_ops = {
+static const struct mm_walk_ops hwp_walk_ops = {
 	.pmd_entry = hwpoison_pte_range,
 	.hugetlb_entry = hwpoison_hugetlb_range,
 };
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 123/262] mm: filemap: coding style cleanup for filemap_map_pmd()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (121 preceding siblings ...)
  2021-11-05 20:41 ` [patch 122/262] mm/memory_failure: constify static mm_walk_ops Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 124/262] mm: hwpoison: refactor refcount check handling Andrew Morton
                   ` (138 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, hughd, kirill.shutemov, linux-mm, mm-commits,
	naoya.horiguchi, osalvador, peterx, shy828301, torvalds, willy

From: Yang Shi <shy828301@gmail.com>
Subject: mm: filemap: coding style cleanup for filemap_map_pmd()

Patch series "Solve silent data loss caused by poisoned page cache (shmem/tmpfs)", v5.

When discussing the patch that splits page cache THP in order to offline
the poisoned page, Noaya mentioned there is a bigger problem [1] that
prevents this from working since the page cache page will be truncated if
uncorrectable errors happen.  By looking this deeper it turns out this
approach (truncating poisoned page) may incur silent data loss for all
non-readonly filesystems if the page is dirty.  It may be worse for
in-memory filesystem, e.g.  shmem/tmpfs since the data blocks are actually
gone.

To solve this problem we could keep the poisoned dirty page in page cache
then notify the users on any later access, e.g.  page fault, read/write,
etc.  The clean page could be truncated as is since they can be reread
from disk later on.

The consequence is the filesystems may find poisoned page and manipulate
it as healthy page since all the filesystems actually don't check if the
page is poisoned or not in all the relevant paths except page fault.  In
general, we need make the filesystems be aware of poisoned page before we
could keep the poisoned page in page cache in order to solve the data loss
problem.

To make filesystems be aware of poisoned page we should consider:
- The page should be not written back: clearing dirty flag could prevent from
  writeback.
- The page should not be dropped (it shows as a clean page) by drop caches or
  other callers: the refcount pin from hwpoison could prevent from invalidating
  (called by cache drop, inode cache shrinking, etc), but it doesn't avoid
  invalidation in DIO path.
- The page should be able to get truncated/hole punched/unlinked: it works as it
  is.
- Notify users when the page is accessed, e.g. read/write, page fault and other
  paths (compression, encryption, etc).

The scope of the last one is huge since almost all filesystems need do it
once a page is returned from page cache lookup.  There are a couple of
options to do it:

1. Check hwpoison flag for every path, the most straightforward way.
2. Return NULL for poisoned page from page cache lookup, the most callsites
   check if NULL is returned, this should have least work I think.  But the
   error handling in filesystems just return -ENOMEM, the error code will incur
   confusion to the users obviously.
3. To improve #2, we could return error pointer, e.g. ERR_PTR(-EIO), but this
   will involve significant amount of code change as well since all the paths
   need check if the pointer is ERR or not just like option #1.

I did prototype for both #1 and #3, but it seems #3 may require more
changes than #1.  For #3 ERR_PTR will be returned so all the callers need
to check the return value otherwise invalid pointer may be dereferenced,
but not all callers really care about the content of the page, for
example, partial truncate which just sets the truncated range in one page
to 0.  So for such paths it needs additional modification if ERR_PTR is
returned.  And if the callers have their own way to handle the problematic
pages we need to add a new FGP flag to tell FGP functions to return the
pointer to the page.

It may happen very rarely, but once it happens the consequence (data
corruption) could be very bad and it is very hard to debug.  It seems this
problem had been slightly discussed before, but seems no action was taken
at that time.  [2]

As the aforementioned investigation, it needs huge amount of work to solve
the potential data loss for all filesystems.  But it is much easier for
in-memory filesystems and such filesystems actually suffer more than
others since even the data blocks are gone due to truncating.  So this
patchset starts from shmem/tmpfs by taking option #1.

TODO:
* The unpoison has been broken since commit 0ed950d1f281 ("mm,hwpoison: make
  get_hwpoison_page() call get_any_page()"), and this patch series make
  refcount check for unpoisoning shmem page fail.
* Expand to other filesystems.  But I haven't heard feedback from filesystem
  developers yet.

Patch breakdown:
Patch #1: cleanup, depended by patch #2
Patch #2: fix THP with hwpoisoned subpage(s) PMD map bug
Patch #3: coding style cleanup
Patch #4: refactor and preparation.
Patch #5: keep the poisoned page in page cache and handle such case for all
          the paths.
Patch #6: the previous patches unblock page cache THP split, so this patch
          add page cache THP split support.


This patch (of 4):

A minor cleanup to the indent.

Link: https://lkml.kernel.org/r/20211020210755.23964-1-shy828301@gmail.com
Link: https://lkml.kernel.org/r/20211020210755.23964-4-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/filemap.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/mm/filemap.c~mm-filemap-coding-style-cleanup-for-filemap_map_pmd
+++ a/mm/filemap.c
@@ -3203,12 +3203,12 @@ static bool filemap_map_pmd(struct vm_fa
 	}
 
 	if (pmd_none(*vmf->pmd) && PageTransHuge(page)) {
-	    vm_fault_t ret = do_set_pmd(vmf, page);
-	    if (!ret) {
-		    /* The page is mapped successfully, reference consumed. */
-		    unlock_page(page);
-		    return true;
-	    }
+		vm_fault_t ret = do_set_pmd(vmf, page);
+		if (!ret) {
+			/* The page is mapped successfully, reference consumed. */
+			unlock_page(page);
+			return true;
+		}
 	}
 
 	if (pmd_none(*vmf->pmd))
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 124/262] mm: hwpoison: refactor refcount check handling
  2021-11-05 20:34 incoming Andrew Morton
                   ` (122 preceding siblings ...)
  2021-11-05 20:41 ` [patch 123/262] mm: filemap: coding style cleanup for filemap_map_pmd() Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 125/262] mm: shmem: don't truncate page if memory failure happens Andrew Morton
                   ` (137 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, hughd, kirill.shutemov, linux-mm, mm-commits,
	naoya.horiguchi, osalvador, peterx, shy828301, torvalds, willy

From: Yang Shi <shy828301@gmail.com>
Subject: mm: hwpoison: refactor refcount check handling

Memory failure will report failure if the page still has extra pinned
refcount other than from hwpoison after the handler is done.  Actually the
check is not necessary for all handlers, so move the check into specific
handlers.  This would make the following keeping shmem page in page cache
patch easier.

There may be expected extra pin for some cases, for example, when the page
is dirty and in swapcache.

Link: https://lkml.kernel.org/r/20211020210755.23964-5-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |   93 ++++++++++++++++++++++++++++--------------
 1 file changed, 64 insertions(+), 29 deletions(-)

--- a/mm/memory-failure.c~mm-hwpoison-refactor-refcount-check-handling
+++ a/mm/memory-failure.c
@@ -807,12 +807,44 @@ static int truncate_error_page(struct pa
 	return ret;
 }
 
+struct page_state {
+	unsigned long mask;
+	unsigned long res;
+	enum mf_action_page_type type;
+
+	/* Callback ->action() has to unlock the relevant page inside it. */
+	int (*action)(struct page_state *ps, struct page *p);
+};
+
+/*
+ * Return true if page is still referenced by others, otherwise return
+ * false.
+ *
+ * The extra_pins is true when one extra refcount is expected.
+ */
+static bool has_extra_refcount(struct page_state *ps, struct page *p,
+			       bool extra_pins)
+{
+	int count = page_count(p) - 1;
+
+	if (extra_pins)
+		count -= 1;
+
+	if (count > 0) {
+		pr_err("Memory failure: %#lx: %s still referenced by %d users\n",
+		       page_to_pfn(p), action_page_types[ps->type], count);
+		return true;
+	}
+
+	return false;
+}
+
 /*
  * Error hit kernel page.
  * Do nothing, try to be lucky and not touch this instead. For a few cases we
  * could be more sophisticated.
  */
-static int me_kernel(struct page *p, unsigned long pfn)
+static int me_kernel(struct page_state *ps, struct page *p)
 {
 	unlock_page(p);
 	return MF_IGNORED;
@@ -821,9 +853,9 @@ static int me_kernel(struct page *p, uns
 /*
  * Page in unknown state. Do nothing.
  */
-static int me_unknown(struct page *p, unsigned long pfn)
+static int me_unknown(struct page_state *ps, struct page *p)
 {
-	pr_err("Memory failure: %#lx: Unknown page state\n", pfn);
+	pr_err("Memory failure: %#lx: Unknown page state\n", page_to_pfn(p));
 	unlock_page(p);
 	return MF_FAILED;
 }
@@ -831,7 +863,7 @@ static int me_unknown(struct page *p, un
 /*
  * Clean (or cleaned) page cache page.
  */
-static int me_pagecache_clean(struct page *p, unsigned long pfn)
+static int me_pagecache_clean(struct page_state *ps, struct page *p)
 {
 	int ret;
 	struct address_space *mapping;
@@ -868,9 +900,13 @@ static int me_pagecache_clean(struct pag
 	 *
 	 * Open: to take i_rwsem or not for this? Right now we don't.
 	 */
-	ret = truncate_error_page(p, pfn, mapping);
+	ret = truncate_error_page(p, page_to_pfn(p), mapping);
 out:
 	unlock_page(p);
+
+	if (has_extra_refcount(ps, p, false))
+		ret = MF_FAILED;
+
 	return ret;
 }
 
@@ -879,7 +915,7 @@ out:
  * Issues: when the error hit a hole page the error is not properly
  * propagated.
  */
-static int me_pagecache_dirty(struct page *p, unsigned long pfn)
+static int me_pagecache_dirty(struct page_state *ps, struct page *p)
 {
 	struct address_space *mapping = page_mapping(p);
 
@@ -923,7 +959,7 @@ static int me_pagecache_dirty(struct pag
 		mapping_set_error(mapping, -EIO);
 	}
 
-	return me_pagecache_clean(p, pfn);
+	return me_pagecache_clean(ps, p);
 }
 
 /*
@@ -945,9 +981,10 @@ static int me_pagecache_dirty(struct pag
  * Clean swap cache pages can be directly isolated. A later page fault will
  * bring in the known good data from disk.
  */
-static int me_swapcache_dirty(struct page *p, unsigned long pfn)
+static int me_swapcache_dirty(struct page_state *ps, struct page *p)
 {
 	int ret;
+	bool extra_pins = false;
 
 	ClearPageDirty(p);
 	/* Trigger EIO in shmem: */
@@ -955,10 +992,17 @@ static int me_swapcache_dirty(struct pag
 
 	ret = delete_from_lru_cache(p) ? MF_FAILED : MF_DELAYED;
 	unlock_page(p);
+
+	if (ret == MF_DELAYED)
+		extra_pins = true;
+
+	if (has_extra_refcount(ps, p, extra_pins))
+		ret = MF_FAILED;
+
 	return ret;
 }
 
-static int me_swapcache_clean(struct page *p, unsigned long pfn)
+static int me_swapcache_clean(struct page_state *ps, struct page *p)
 {
 	int ret;
 
@@ -966,6 +1010,10 @@ static int me_swapcache_clean(struct pag
 
 	ret = delete_from_lru_cache(p) ? MF_FAILED : MF_RECOVERED;
 	unlock_page(p);
+
+	if (has_extra_refcount(ps, p, false))
+		ret = MF_FAILED;
+
 	return ret;
 }
 
@@ -975,7 +1023,7 @@ static int me_swapcache_clean(struct pag
  * - Error on hugepage is contained in hugepage unit (not in raw page unit.)
  *   To narrow down kill region to one page, we need to break up pmd.
  */
-static int me_huge_page(struct page *p, unsigned long pfn)
+static int me_huge_page(struct page_state *ps, struct page *p)
 {
 	int res;
 	struct page *hpage = compound_head(p);
@@ -986,7 +1034,7 @@ static int me_huge_page(struct page *p,
 
 	mapping = page_mapping(hpage);
 	if (mapping) {
-		res = truncate_error_page(hpage, pfn, mapping);
+		res = truncate_error_page(hpage, page_to_pfn(p), mapping);
 		unlock_page(hpage);
 	} else {
 		res = MF_FAILED;
@@ -1004,6 +1052,9 @@ static int me_huge_page(struct page *p,
 		}
 	}
 
+	if (has_extra_refcount(ps, p, false))
+		res = MF_FAILED;
+
 	return res;
 }
 
@@ -1029,14 +1080,7 @@ static int me_huge_page(struct page *p,
 #define slab		(1UL << PG_slab)
 #define reserved	(1UL << PG_reserved)
 
-static struct page_state {
-	unsigned long mask;
-	unsigned long res;
-	enum mf_action_page_type type;
-
-	/* Callback ->action() has to unlock the relevant page inside it. */
-	int (*action)(struct page *p, unsigned long pfn);
-} error_states[] = {
+static struct page_state error_states[] = {
 	{ reserved,	reserved,	MF_MSG_KERNEL,	me_kernel },
 	/*
 	 * free pages are specially detected outside this table:
@@ -1096,19 +1140,10 @@ static int page_action(struct page_state
 			unsigned long pfn)
 {
 	int result;
-	int count;
 
 	/* page p should be unlocked after returning from ps->action().  */
-	result = ps->action(p, pfn);
+	result = ps->action(ps, p);
 
-	count = page_count(p) - 1;
-	if (ps->action == me_swapcache_dirty && result == MF_DELAYED)
-		count--;
-	if (count > 0) {
-		pr_err("Memory failure: %#lx: %s still referenced by %d users\n",
-		       pfn, action_page_types[ps->type], count);
-		result = MF_FAILED;
-	}
 	action_result(pfn, ps->type, result);
 
 	/* Could do more checks here if page looks ok */
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 125/262] mm: shmem: don't truncate page if memory failure happens
  2021-11-05 20:34 incoming Andrew Morton
                   ` (123 preceding siblings ...)
  2021-11-05 20:41 ` [patch 124/262] mm: hwpoison: refactor refcount check handling Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 126/262] mm: hwpoison: handle non-anonymous THP correctly Andrew Morton
                   ` (136 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, arnd, hughd, kirill.shutemov, linux-mm, mm-commits,
	naoya.horiguchi, osalvador, peterx, shy828301, torvalds, willy

From: Yang Shi <shy828301@gmail.com>
Subject: mm: shmem: don't truncate page if memory failure happens

The current behavior of memory failure is to truncate the page cache
regardless of dirty or clean.  If the page is dirty the later access will
get the obsolete data from disk without any notification to the users. 
This may cause silent data loss.  It is even worse for shmem since shmem
is in-memory filesystem, truncating page cache means discarding data
blocks.  The later read would return all zero.

The right approach is to keep the corrupted page in page cache, any later
access would return error for syscalls or SIGBUS for page fault, until the
file is truncated, hole punched or removed.  The regular storage backed
filesystems would be more complicated so this patch is focused on shmem. 
This also unblock the support for soft offlining shmem THP.

[arnd@arndb.de: fix uninitialized variable use in me_pagecache_clean()]
  Link: https://lkml.kernel.org/r/20211022064748.4173718-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20211020210755.23964-6-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |   14 +++++++++++---
 mm/shmem.c          |   38 +++++++++++++++++++++++++++++++++++---
 mm/userfaultfd.c    |    5 +++++
 3 files changed, 51 insertions(+), 6 deletions(-)

--- a/mm/memory-failure.c~mm-shmem-dont-truncate-page-if-memory-failure-happens
+++ a/mm/memory-failure.c
@@ -58,6 +58,7 @@
 #include <linux/ratelimit.h>
 #include <linux/page-isolation.h>
 #include <linux/pagewalk.h>
+#include <linux/shmem_fs.h>
 #include "internal.h"
 #include "ras/ras_event.h"
 
@@ -867,6 +868,7 @@ static int me_pagecache_clean(struct pag
 {
 	int ret;
 	struct address_space *mapping;
+	bool extra_pins;
 
 	delete_from_lru_cache(p);
 
@@ -896,17 +898,23 @@ static int me_pagecache_clean(struct pag
 	}
 
 	/*
+	 * The shmem page is kept in page cache instead of truncating
+	 * so is expected to have an extra refcount after error-handling.
+	 */
+	extra_pins = shmem_mapping(mapping);
+
+	/*
 	 * Truncation is a bit tricky. Enable it per file system for now.
 	 *
 	 * Open: to take i_rwsem or not for this? Right now we don't.
 	 */
 	ret = truncate_error_page(p, page_to_pfn(p), mapping);
+	if (has_extra_refcount(ps, p, extra_pins))
+		ret = MF_FAILED;
+
 out:
 	unlock_page(p);
 
-	if (has_extra_refcount(ps, p, false))
-		ret = MF_FAILED;
-
 	return ret;
 }
 
--- a/mm/shmem.c~mm-shmem-dont-truncate-page-if-memory-failure-happens
+++ a/mm/shmem.c
@@ -2454,6 +2454,7 @@ shmem_write_begin(struct file *file, str
 	struct inode *inode = mapping->host;
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	pgoff_t index = pos >> PAGE_SHIFT;
+	int ret = 0;
 
 	/* i_rwsem is held by caller */
 	if (unlikely(info->seals & (F_SEAL_GROW |
@@ -2464,7 +2465,15 @@ shmem_write_begin(struct file *file, str
 			return -EPERM;
 	}
 
-	return shmem_getpage(inode, index, pagep, SGP_WRITE);
+	ret = shmem_getpage(inode, index, pagep, SGP_WRITE);
+
+	if (*pagep && PageHWPoison(*pagep)) {
+		unlock_page(*pagep);
+		put_page(*pagep);
+		ret = -EIO;
+	}
+
+	return ret;
 }
 
 static int
@@ -2551,6 +2560,12 @@ static ssize_t shmem_file_read_iter(stru
 			if (sgp == SGP_CACHE)
 				set_page_dirty(page);
 			unlock_page(page);
+
+			if (PageHWPoison(page)) {
+				put_page(page);
+				error = -EIO;
+				break;
+			}
 		}
 
 		/*
@@ -3112,7 +3127,8 @@ static const char *shmem_get_link(struct
 		page = find_get_page(inode->i_mapping, 0);
 		if (!page)
 			return ERR_PTR(-ECHILD);
-		if (!PageUptodate(page)) {
+		if (PageHWPoison(page) ||
+		    !PageUptodate(page)) {
 			put_page(page);
 			return ERR_PTR(-ECHILD);
 		}
@@ -3120,6 +3136,11 @@ static const char *shmem_get_link(struct
 		error = shmem_getpage(inode, 0, &page, SGP_READ);
 		if (error)
 			return ERR_PTR(error);
+		if (page && PageHWPoison(page)) {
+			unlock_page(page);
+			put_page(page);
+			return ERR_PTR(-ECHILD);
+		}
 		unlock_page(page);
 	}
 	set_delayed_call(done, shmem_put_link, page);
@@ -3770,6 +3791,13 @@ static void shmem_destroy_inodecache(voi
 	kmem_cache_destroy(shmem_inode_cachep);
 }
 
+/* Keep the page in page cache instead of truncating it */
+static int shmem_error_remove_page(struct address_space *mapping,
+				   struct page *page)
+{
+	return 0;
+}
+
 const struct address_space_operations shmem_aops = {
 	.writepage	= shmem_writepage,
 	.set_page_dirty	= __set_page_dirty_no_writeback,
@@ -3780,7 +3808,7 @@ const struct address_space_operations sh
 #ifdef CONFIG_MIGRATION
 	.migratepage	= migrate_page,
 #endif
-	.error_remove_page = generic_error_remove_page,
+	.error_remove_page = shmem_error_remove_page,
 };
 EXPORT_SYMBOL(shmem_aops);
 
@@ -4191,6 +4219,10 @@ struct page *shmem_read_mapping_page_gfp
 		page = ERR_PTR(error);
 	else
 		unlock_page(page);
+
+	if (PageHWPoison(page))
+		page = ERR_PTR(-EIO);
+
 	return page;
 #else
 	/*
--- a/mm/userfaultfd.c~mm-shmem-dont-truncate-page-if-memory-failure-happens
+++ a/mm/userfaultfd.c
@@ -232,6 +232,11 @@ static int mcontinue_atomic_pte(struct m
 		goto out;
 	}
 
+	if (PageHWPoison(page)) {
+		ret = -EIO;
+		goto out_release;
+	}
+
 	ret = mfill_atomic_install_pte(dst_mm, dst_pmd, dst_vma, dst_addr,
 				       page, false, wp_copy);
 	if (ret)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 126/262] mm: hwpoison: handle non-anonymous THP correctly
  2021-11-05 20:34 incoming Andrew Morton
                   ` (124 preceding siblings ...)
  2021-11-05 20:41 ` [patch 125/262] mm: shmem: don't truncate page if memory failure happens Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 127/262] mm/hugetlb: drop __unmap_hugepage_range definition from hugetlb.h Andrew Morton
                   ` (135 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, hughd, kirill.shutemov, linux-mm, mm-commits,
	naoya.horiguchi, osalvador, peterx, shy828301, torvalds, willy

From: Yang Shi <shy828301@gmail.com>
Subject: mm: hwpoison: handle non-anonymous THP correctly

Currently hwpoison doesn't handle non-anonymous THP, but since v4.8 THP
support for tmpfs and read-only file cache has been added.  They could be
offlined by split THP, just like anonymous THP.

Link: https://lkml.kernel.org/r/20211020210755.23964-7-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory-failure.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--- a/mm/memory-failure.c~mm-hwpoison-handle-non-anonymous-thp-correctly
+++ a/mm/memory-failure.c
@@ -1444,14 +1444,11 @@ static int identify_page_state(unsigned
 static int try_to_split_thp_page(struct page *page, const char *msg)
 {
 	lock_page(page);
-	if (!PageAnon(page) || unlikely(split_huge_page(page))) {
+	if (unlikely(split_huge_page(page))) {
 		unsigned long pfn = page_to_pfn(page);
 
 		unlock_page(page);
-		if (!PageAnon(page))
-			pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
-		else
-			pr_info("%s: %#lx: thp split failed\n", msg, pfn);
+		pr_info("%s: %#lx: thp split failed\n", msg, pfn);
 		put_page(page);
 		return -EBUSY;
 	}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 127/262] mm/hugetlb: drop __unmap_hugepage_range definition from hugetlb.h
  2021-11-05 20:34 incoming Andrew Morton
                   ` (125 preceding siblings ...)
  2021-11-05 20:41 ` [patch 126/262] mm: hwpoison: handle non-anonymous THP correctly Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 128/262] hugetlb: add demote hugetlb page sysfs interfaces Andrew Morton
                   ` (134 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, david, jhubbard, linux-mm, mike.kravetz, mm-commits,
	peterx, songmuchun, torvalds

From: Peter Xu <peterx@redhat.com>
Subject: mm/hugetlb: drop __unmap_hugepage_range definition from hugetlb.h

Remove __unmap_hugepage_range() from the header file, because it is only
used in hugetlb.c.

Link: https://lkml.kernel.org/r/20210917165108.9341-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/hugetlb.h |   10 ----------
 mm/hugetlb.c            |    6 +++---
 2 files changed, 3 insertions(+), 13 deletions(-)

--- a/include/linux/hugetlb.h~mm-hugetlb-drop-__unmap_hugepage_range-definition-from-hugetlbh
+++ a/include/linux/hugetlb.h
@@ -143,9 +143,6 @@ void __unmap_hugepage_range_final(struct
 			  struct vm_area_struct *vma,
 			  unsigned long start, unsigned long end,
 			  struct page *ref_page);
-void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
-				unsigned long start, unsigned long end,
-				struct page *ref_page);
 void hugetlb_report_meminfo(struct seq_file *);
 int hugetlb_report_node_meminfo(char *buf, int len, int nid);
 void hugetlb_show_meminfo(void);
@@ -382,13 +379,6 @@ static inline void __unmap_hugepage_rang
 			struct vm_area_struct *vma, unsigned long start,
 			unsigned long end, struct page *ref_page)
 {
-	BUG();
-}
-
-static inline void __unmap_hugepage_range(struct mmu_gather *tlb,
-			struct vm_area_struct *vma, unsigned long start,
-			unsigned long end, struct page *ref_page)
-{
 	BUG();
 }
 
--- a/mm/hugetlb.c~mm-hugetlb-drop-__unmap_hugepage_range-definition-from-hugetlbh
+++ a/mm/hugetlb.c
@@ -4426,9 +4426,9 @@ again:
 	return ret;
 }
 
-void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
-			    unsigned long start, unsigned long end,
-			    struct page *ref_page)
+static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
+				   unsigned long start, unsigned long end,
+				   struct page *ref_page)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long address;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 128/262] hugetlb: add demote hugetlb page sysfs interfaces
  2021-11-05 20:34 incoming Andrew Morton
                   ` (126 preceding siblings ...)
  2021-11-05 20:41 ` [patch 127/262] mm/hugetlb: drop __unmap_hugepage_range definition from hugetlb.h Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 129/262] mm/cma: add cma_pages_valid to determine if pages are in CMA Andrew Morton
                   ` (133 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, aneesh.kumar, david, linux-mm, mhocko, mike.kravetz,
	mm-commits, naoya.horiguchi, nghialm78, osalvador, rientjes,
	songmuchun, torvalds, ziy

From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: hugetlb: add demote hugetlb page sysfs interfaces

Patch series "hugetlb: add demote/split page functionality", v4.

The concurrent use of multiple hugetlb page sizes on a single system is
becoming more common.  One of the reasons is better TLB support for
gigantic page sizes on x86 hardware.  In addition, hugetlb pages are being
used to back VMs in hosting environments.

When using hugetlb pages to back VMs, it is often desirable to preallocate
hugetlb pools.  This avoids the delay and uncertainty of allocating
hugetlb pages at VM startup.  In addition, preallocating huge pages
minimizes the issue of memory fragmentation that increases the longer the
system is up and running.

In such environments, a combination of larger and smaller hugetlb pages
are preallocated in anticipation of backing VMs of various sizes.  Over
time, the preallocated pool of smaller hugetlb pages may become depleted
while larger hugetlb pages still remain.  In such situations, it is
desirable to convert larger hugetlb pages to smaller hugetlb pages.

Converting larger to smaller hugetlb pages can be accomplished today by
first freeing the larger page to the buddy allocator and then allocating
the smaller pages.  For example, to convert 50 GB pages on x86:
gb_pages=`cat .../hugepages-1048576kB/nr_hugepages`
m2_pages=`cat .../hugepages-2048kB/nr_hugepages`
echo $(($gb_pages - 50)) > .../hugepages-1048576kB/nr_hugepages
echo $(($m2_pages + 25600)) > .../hugepages-2048kB/nr_hugepages

On an idle system this operation is fairly reliable and results are as
expected.  The number of 2MB pages is increased as expected and the time
of the operation is a second or two.

However, when there is activity on the system the following issues
arise:
1) This process can take quite some time, especially if allocation of
   the smaller pages is not immediate and requires migration/compaction.
2) There is no guarantee that the total size of smaller pages allocated
   will match the size of the larger page which was freed.  This is
   because the area freed by the larger page could quickly be
   fragmented.
In a test environment with a load that continually fills the page cache
with clean pages, results such as the following can be observed:

Unexpected number of 2MB pages allocated: Expected 25600, have 19944
real    0m42.092s
user    0m0.008s
sys     0m41.467s

To address these issues, introduce the concept of hugetlb page demotion. 
Demotion provides a means of 'in place' splitting of a hugetlb page to
pages of a smaller size.  This avoids freeing pages to buddy and then
trying to allocate from buddy.

Page demotion is controlled via sysfs files that reside in the per-hugetlb
page size and per node directories.
- demote_size   Target page size for demotion, a smaller huge page size.
		File can be written to chose a smaller huge page size if
		multiple are available.
- demote        Writable number of hugetlb pages to be demoted

To demote 50 GB huge pages, one would:
cat .../hugepages-1048576kB/free_hugepages   /* optional, verify free pages */
cat .../hugepages-1048576kB/demote_size      /* optional, verify target size */
echo 50 > .../hugepages-1048576kB/demote

Only hugetlb pages which are free at the time of the request can be
demoted.  Demotion does not add to the complexity of surplus pages and
honors reserved huge pages.  Therefore, when a value is written to the
sysfs demote file, that value is only the maximum number of pages which
will be demoted.  It is possible fewer will actually be demoted.  The
recently introduced per-hstate mutex is used to synchronize demote
operations with other operations that modify hugetlb pools.

Real world use cases
--------------------
The above scenario describes a real world use case where hugetlb pages are
used to back VMs on x86.  Both issues of long allocation times and not
necessarily getting the expected number of smaller huge pages after a free
and allocate cycle have been experienced.  The occurrence of these issues
is dependent on other activity within the host and can not be predicted.


This patch (of 5):

Two new sysfs files are added to demote hugtlb pages.  These files are
both per-hugetlb page size and per node.  Files are:

  demote_size - The size in Kb that pages are demoted to. (read-write)
  demote - The number of huge pages to demote. (write-only)

By default, demote_size is the next smallest huge page size.  Valid huge
page sizes less than huge page size may be written to this file.  When
huge pages are demoted, they are demoted to this size.

Writing a value to demote will result in an attempt to demote that number
of hugetlb pages to an appropriate number of demote_size pages.

NOTE: Demote interfaces are only provided for huge page sizes if there is
a smaller target demote huge page size.  For example, on x86 1GB huge
pages will have demote interfaces.  2MB huge pages will not have demote
interfaces.

This patch does not provide full demote functionality.  It only provides
the sysfs interfaces.

It also provides documentation for the new interfaces.

[mike.kravetz@oracle.com: n_mask initialization does not need to be protected by the mutex]
  Link: https://lkml.kernel.org/r/0530e4ef-2492-5186-f919-5db68edea654@oracle.com
Link: https://lkml.kernel.org/r/20211007181918.136982-2-mike.kravetz@oracle.com
Link: https://lkml.kernel.org/r/20211007181918.136982-2-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: David Rientjes <rientjes@google.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Nghia Le <nghialm78@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/hugetlbpage.rst |   30 +++
 include/linux/hugetlb.h                      |    1 
 mm/hugetlb.c                                 |  155 ++++++++++++++++-
 3 files changed, 183 insertions(+), 3 deletions(-)

--- a/Documentation/admin-guide/mm/hugetlbpage.rst~hugetlb-add-demote-hugetlb-page-sysfs-interfaces
+++ a/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -234,8 +234,12 @@ will exist, of the form::
 
 	hugepages-${size}kB
 
-Inside each of these directories, the same set of files will exist::
+Inside each of these directories, the set of files contained in ``/proc``
+will exist.  In addition, two additional interfaces for demoting huge
+pages may exist::
 
+        demote
+        demote_size
 	nr_hugepages
 	nr_hugepages_mempolicy
 	nr_overcommit_hugepages
@@ -243,7 +247,29 @@ Inside each of these directories, the sa
 	resv_hugepages
 	surplus_hugepages
 
-which function as described above for the default huge page-sized case.
+The demote interfaces provide the ability to split a huge page into
+smaller huge pages.  For example, the x86 architecture supports both
+1GB and 2MB huge pages sizes.  A 1GB huge page can be split into 512
+2MB huge pages.  Demote interfaces are not available for the smallest
+huge page size.  The demote interfaces are:
+
+demote_size
+        is the size of demoted pages.  When a page is demoted a corresponding
+        number of huge pages of demote_size will be created.  By default,
+        demote_size is set to the next smaller huge page size.  If there are
+        multiple smaller huge page sizes, demote_size can be set to any of
+        these smaller sizes.  Only huge page sizes less than the current huge
+        pages size are allowed.
+
+demote
+        is used to demote a number of huge pages.  A user with root privileges
+        can write to this file.  It may not be possible to demote the
+        requested number of huge pages.  To determine how many pages were
+        actually demoted, compare the value of nr_hugepages before and after
+        writing to the demote interface.  demote is a write only interface.
+
+The interfaces which are the same as in ``/proc`` (all except demote and
+demote_size) function as described above for the default huge page-sized case.
 
 .. _mem_policy_and_hp_alloc:
 
--- a/include/linux/hugetlb.h~hugetlb-add-demote-hugetlb-page-sysfs-interfaces
+++ a/include/linux/hugetlb.h
@@ -586,6 +586,7 @@ struct hstate {
 	int next_nid_to_alloc;
 	int next_nid_to_free;
 	unsigned int order;
+	unsigned int demote_order;
 	unsigned long mask;
 	unsigned long max_huge_pages;
 	unsigned long nr_huge_pages;
--- a/mm/hugetlb.c~hugetlb-add-demote-hugetlb-page-sysfs-interfaces
+++ a/mm/hugetlb.c
@@ -2986,7 +2986,7 @@ free:
 
 static void __init hugetlb_init_hstates(void)
 {
-	struct hstate *h;
+	struct hstate *h, *h2;
 
 	for_each_hstate(h) {
 		if (minimum_order > huge_page_order(h))
@@ -2995,6 +2995,22 @@ static void __init hugetlb_init_hstates(
 		/* oversize hugepages were init'ed in early boot */
 		if (!hstate_is_gigantic(h))
 			hugetlb_hstate_alloc_pages(h);
+
+		/*
+		 * Set demote order for each hstate.  Note that
+		 * h->demote_order is initially 0.
+		 * - We can not demote gigantic pages if runtime freeing
+		 *   is not supported, so skip this.
+		 */
+		if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
+			continue;
+		for_each_hstate(h2) {
+			if (h2 == h)
+				continue;
+			if (h2->order < h->order &&
+			    h2->order > h->demote_order)
+				h->demote_order = h2->order;
+		}
 	}
 	VM_BUG_ON(minimum_order == UINT_MAX);
 }
@@ -3235,9 +3251,31 @@ out:
 	return 0;
 }
 
+static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
+	__must_hold(&hugetlb_lock)
+{
+	int rc = 0;
+
+	lockdep_assert_held(&hugetlb_lock);
+
+	/* We should never get here if no demote order */
+	if (!h->demote_order) {
+		pr_warn("HugeTLB: NULL demote order passed to demote_pool_huge_page.\n");
+		return -EINVAL;		/* internal error */
+	}
+
+	/*
+	 * TODO - demote fucntionality will be added in subsequent patch
+	 */
+	return rc;
+}
+
 #define HSTATE_ATTR_RO(_name) \
 	static struct kobj_attribute _name##_attr = __ATTR_RO(_name)
 
+#define HSTATE_ATTR_WO(_name) \
+	static struct kobj_attribute _name##_attr = __ATTR_WO(_name)
+
 #define HSTATE_ATTR(_name) \
 	static struct kobj_attribute _name##_attr = \
 		__ATTR(_name, 0644, _name##_show, _name##_store)
@@ -3433,6 +3471,105 @@ static ssize_t surplus_hugepages_show(st
 }
 HSTATE_ATTR_RO(surplus_hugepages);
 
+static ssize_t demote_store(struct kobject *kobj,
+	       struct kobj_attribute *attr, const char *buf, size_t len)
+{
+	unsigned long nr_demote;
+	unsigned long nr_available;
+	nodemask_t nodes_allowed, *n_mask;
+	struct hstate *h;
+	int err = 0;
+	int nid;
+
+	err = kstrtoul(buf, 10, &nr_demote);
+	if (err)
+		return err;
+	h = kobj_to_hstate(kobj, &nid);
+
+	if (nid != NUMA_NO_NODE) {
+		init_nodemask_of_node(&nodes_allowed, nid);
+		n_mask = &nodes_allowed;
+	} else {
+		n_mask = &node_states[N_MEMORY];
+	}
+
+	/* Synchronize with other sysfs operations modifying huge pages */
+	mutex_lock(&h->resize_lock);
+	spin_lock_irq(&hugetlb_lock);
+
+	while (nr_demote) {
+		/*
+		 * Check for available pages to demote each time thorough the
+		 * loop as demote_pool_huge_page will drop hugetlb_lock.
+		 *
+		 * NOTE: demote_pool_huge_page does not yet drop hugetlb_lock
+		 * but will when full demote functionality is added in a later
+		 * patch.
+		 */
+		if (nid != NUMA_NO_NODE)
+			nr_available = h->free_huge_pages_node[nid];
+		else
+			nr_available = h->free_huge_pages;
+		nr_available -= h->resv_huge_pages;
+		if (!nr_available)
+			break;
+
+		err = demote_pool_huge_page(h, n_mask);
+		if (err)
+			break;
+
+		nr_demote--;
+	}
+
+	spin_unlock_irq(&hugetlb_lock);
+	mutex_unlock(&h->resize_lock);
+
+	if (err)
+		return err;
+	return len;
+}
+HSTATE_ATTR_WO(demote);
+
+static ssize_t demote_size_show(struct kobject *kobj,
+					struct kobj_attribute *attr, char *buf)
+{
+	int nid;
+	struct hstate *h = kobj_to_hstate(kobj, &nid);
+	unsigned long demote_size = (PAGE_SIZE << h->demote_order) / SZ_1K;
+
+	return sysfs_emit(buf, "%lukB\n", demote_size);
+}
+
+static ssize_t demote_size_store(struct kobject *kobj,
+					struct kobj_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct hstate *h, *demote_hstate;
+	unsigned long demote_size;
+	unsigned int demote_order;
+	int nid;
+
+	demote_size = (unsigned long)memparse(buf, NULL);
+
+	demote_hstate = size_to_hstate(demote_size);
+	if (!demote_hstate)
+		return -EINVAL;
+	demote_order = demote_hstate->order;
+
+	/* demote order must be smaller than hstate order */
+	h = kobj_to_hstate(kobj, &nid);
+	if (demote_order >= h->order)
+		return -EINVAL;
+
+	/* resize_lock synchronizes access to demote size and writes */
+	mutex_lock(&h->resize_lock);
+	h->demote_order = demote_order;
+	mutex_unlock(&h->resize_lock);
+
+	return count;
+}
+HSTATE_ATTR(demote_size);
+
 static struct attribute *hstate_attrs[] = {
 	&nr_hugepages_attr.attr,
 	&nr_overcommit_hugepages_attr.attr,
@@ -3449,6 +3586,16 @@ static const struct attribute_group hsta
 	.attrs = hstate_attrs,
 };
 
+static struct attribute *hstate_demote_attrs[] = {
+	&demote_size_attr.attr,
+	&demote_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group hstate_demote_attr_group = {
+	.attrs = hstate_demote_attrs,
+};
+
 static int hugetlb_sysfs_add_hstate(struct hstate *h, struct kobject *parent,
 				    struct kobject **hstate_kobjs,
 				    const struct attribute_group *hstate_attr_group)
@@ -3466,6 +3613,12 @@ static int hugetlb_sysfs_add_hstate(stru
 		hstate_kobjs[hi] = NULL;
 	}
 
+	if (h->demote_order) {
+		if (sysfs_create_group(hstate_kobjs[hi],
+					&hstate_demote_attr_group))
+			pr_warn("HugeTLB unable to create demote interfaces for %s\n", h->name);
+	}
+
 	return retval;
 }
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 129/262] mm/cma: add cma_pages_valid to determine if pages are in CMA
  2021-11-05 20:34 incoming Andrew Morton
                   ` (127 preceding siblings ...)
  2021-11-05 20:41 ` [patch 128/262] hugetlb: add demote hugetlb page sysfs interfaces Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 130/262] hugetlb: be sure to free demoted CMA pages to CMA Andrew Morton
                   ` (132 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, aneesh.kumar, david, linux-mm, mhocko, mike.kravetz,
	mm-commits, naoya.horiguchi, nghialm78, osalvador, rientjes,
	songmuchun, torvalds, ziy

From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: mm/cma: add cma_pages_valid to determine if pages are in CMA

Add new interface cma_pages_valid() which indicates if the specified pages
are part of a CMA region.  This interface will be used in a subsequent
patch by hugetlb code.

In order to keep the same amount of DEBUG information, a pr_debug() call
was added to cma_pages_valid().  In the case where the page passed to
cma_release is not in cma region, the debug message will be printed from
cma_pages_valid as opposed to cma_release.

Link: https://lkml.kernel.org/r/20211007181918.136982-3-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Nghia Le <nghialm78@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/cma.h |    1 +
 mm/cma.c            |   24 ++++++++++++++++++++----
 2 files changed, 21 insertions(+), 4 deletions(-)

--- a/include/linux/cma.h~mm-cma-add-cma_pages_valid-to-determine-if-pages-are-in-cma
+++ a/include/linux/cma.h
@@ -46,6 +46,7 @@ extern int cma_init_reserved_mem(phys_ad
 					struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
 			      bool no_warn);
+extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
 
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
--- a/mm/cma.c~mm-cma-add-cma_pages_valid-to-determine-if-pages-are-in-cma
+++ a/mm/cma.c
@@ -524,6 +524,25 @@ out:
 	return page;
 }
 
+bool cma_pages_valid(struct cma *cma, const struct page *pages,
+		     unsigned long count)
+{
+	unsigned long pfn;
+
+	if (!cma || !pages)
+		return false;
+
+	pfn = page_to_pfn(pages);
+
+	if (pfn < cma->base_pfn || pfn >= cma->base_pfn + cma->count) {
+		pr_debug("%s(page %p, count %lu)\n", __func__,
+						(void *)pages, count);
+		return false;
+	}
+
+	return true;
+}
+
 /**
  * cma_release() - release allocated pages
  * @cma:   Contiguous memory region for which the allocation is performed.
@@ -539,16 +558,13 @@ bool cma_release(struct cma *cma, const
 {
 	unsigned long pfn;
 
-	if (!cma || !pages)
+	if (!cma_pages_valid(cma, pages, count))
 		return false;
 
 	pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count);
 
 	pfn = page_to_pfn(pages);
 
-	if (pfn < cma->base_pfn || pfn >= cma->base_pfn + cma->count)
-		return false;
-
 	VM_BUG_ON(pfn + count > cma->base_pfn + cma->count);
 
 	free_contig_range(pfn, count);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 130/262] hugetlb: be sure to free demoted CMA pages to CMA
  2021-11-05 20:34 incoming Andrew Morton
                   ` (128 preceding siblings ...)
  2021-11-05 20:41 ` [patch 129/262] mm/cma: add cma_pages_valid to determine if pages are in CMA Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 131/262] hugetlb: add demote bool to gigantic page routines Andrew Morton
                   ` (131 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, aneesh.kumar, david, linux-mm, mhocko, mike.kravetz,
	mm-commits, naoya.horiguchi, nghialm78, osalvador, rientjes,
	songmuchun, torvalds, ziy

From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: hugetlb: be sure to free demoted CMA pages to CMA

When huge page demotion is fully implemented, gigantic pages can be
demoted to a smaller huge page size.  For example, on x86 a 1G page can be
demoted to 512 2M pages.  However, gigantic pages can potentially be
allocated from CMA.  If a gigantic page which was allocated from CMA is
demoted, the corresponding demoted pages needs to be returned to CMA.

Use the new interface cma_pages_valid() to determine if a non-gigantic
hugetlb page should be freed to CMA.  Also, clear mapping field of these
pages as expected by cma_release.

This also requires a change to CMA region creation for gigantic pages. 
CMA uses a per-region bit map to track allocations.  When setting up the
region, you specify how many pages each bit represents.  Currently, only
gigantic pages are allocated/freed from CMA so the region is set up such
that one bit represents a gigantic page size allocation.

With demote, a gigantic page (allocation) could be split into smaller size
pages.  And, these smaller size pages will be freed to CMA.  So, since the
per-region bit map needs to be set up to represent the smallest
allocation/free size, it now needs to be set to the smallest huge page
size which can be freed to CMA.

Unfortunately, we set up the CMA region for huge pages before we set up
huge pages sizes (hstates).  So, technically we do not know the smallest
huge page size as this can change via command line options and
architecture specific code.  Therefore, at region setup time we use
HUGETLB_PAGE_ORDER as the smallest possible huge page size that can be
given back to CMA.  It is possible that this value is sub-optimal for some
architectures/config options.  If needed, this can be addressed in follow
on work.

Link: https://lkml.kernel.org/r/20211007181918.136982-4-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Nghia Le <nghialm78@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |   41 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

--- a/mm/hugetlb.c~hugetlb-be-sure-to-free-demoted-cma-pages-to-cma
+++ a/mm/hugetlb.c
@@ -50,6 +50,16 @@ struct hstate hstates[HUGE_MAX_HSTATE];
 
 #ifdef CONFIG_CMA
 static struct cma *hugetlb_cma[MAX_NUMNODES];
+static bool hugetlb_cma_page(struct page *page, unsigned int order)
+{
+	return cma_pages_valid(hugetlb_cma[page_to_nid(page)], page,
+				1 << order);
+}
+#else
+static bool hugetlb_cma_page(struct page *page, unsigned int order)
+{
+	return false;
+}
 #endif
 static unsigned long hugetlb_cma_size __initdata;
 
@@ -1272,6 +1282,7 @@ static void destroy_compound_gigantic_pa
 	atomic_set(compound_pincount_ptr(page), 0);
 
 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
+		p->mapping = NULL;
 		clear_compound_head(p);
 		set_page_refcounted(p);
 	}
@@ -1476,7 +1487,13 @@ static void __update_and_free_page(struc
 				1 << PG_active | 1 << PG_private |
 				1 << PG_writeback);
 	}
-	if (hstate_is_gigantic(h)) {
+
+	/*
+	 * Non-gigantic pages demoted from CMA allocated gigantic pages
+	 * need to be given back to CMA in free_gigantic_page.
+	 */
+	if (hstate_is_gigantic(h) ||
+	    hugetlb_cma_page(page, huge_page_order(h))) {
 		destroy_compound_gigantic_page(page, huge_page_order(h));
 		free_gigantic_page(page, huge_page_order(h));
 	} else {
@@ -3001,9 +3018,13 @@ static void __init hugetlb_init_hstates(
 		 * h->demote_order is initially 0.
 		 * - We can not demote gigantic pages if runtime freeing
 		 *   is not supported, so skip this.
+		 * - If CMA allocation is possible, we can not demote
+		 *   HUGETLB_PAGE_ORDER or smaller size pages.
 		 */
 		if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
 			continue;
+		if (hugetlb_cma_size && h->order <= HUGETLB_PAGE_ORDER)
+			continue;
 		for_each_hstate(h2) {
 			if (h2 == h)
 				continue;
@@ -3555,6 +3576,8 @@ static ssize_t demote_size_store(struct
 	if (!demote_hstate)
 		return -EINVAL;
 	demote_order = demote_hstate->order;
+	if (demote_order < HUGETLB_PAGE_ORDER)
+		return -EINVAL;
 
 	/* demote order must be smaller than hstate order */
 	h = kobj_to_hstate(kobj, &nid);
@@ -6543,6 +6566,7 @@ void __init hugetlb_cma_reserve(int orde
 	if (hugetlb_cma_size < (PAGE_SIZE << order)) {
 		pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n",
 			(PAGE_SIZE << order) / SZ_1M);
+		hugetlb_cma_size = 0;
 		return;
 	}
 
@@ -6563,7 +6587,13 @@ void __init hugetlb_cma_reserve(int orde
 		size = round_up(size, PAGE_SIZE << order);
 
 		snprintf(name, sizeof(name), "hugetlb%d", nid);
-		res = cma_declare_contiguous_nid(0, size, 0, PAGE_SIZE << order,
+		/*
+		 * Note that 'order per bit' is based on smallest size that
+		 * may be returned to CMA allocator in the case of
+		 * huge page demotion.
+		 */
+		res = cma_declare_contiguous_nid(0, size, 0,
+						PAGE_SIZE << HUGETLB_PAGE_ORDER,
 						 0, false, name,
 						 &hugetlb_cma[nid], nid);
 		if (res) {
@@ -6579,6 +6609,13 @@ void __init hugetlb_cma_reserve(int orde
 		if (reserved >= hugetlb_cma_size)
 			break;
 	}
+
+	if (!reserved)
+		/*
+		 * hugetlb_cma_size is used to determine if allocations from
+		 * cma are possible.  Set to zero if no cma regions are set up.
+		 */
+		hugetlb_cma_size = 0;
 }
 
 void __init hugetlb_cma_check(void)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 131/262] hugetlb: add demote bool to gigantic page routines
  2021-11-05 20:34 incoming Andrew Morton
                   ` (129 preceding siblings ...)
  2021-11-05 20:41 ` [patch 130/262] hugetlb: be sure to free demoted CMA pages to CMA Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 132/262] hugetlb: add hugetlb demote page support Andrew Morton
                   ` (130 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, aneesh.kumar, david, linux-mm, mhocko, mike.kravetz,
	mm-commits, naoya.horiguchi, nghialm78, osalvador, rientjes,
	songmuchun, torvalds, ziy

From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: hugetlb: add demote bool to gigantic page routines

The routines remove_hugetlb_page and destroy_compound_gigantic_page
will remove a gigantic page and make the set of base pages ready to be
returned to a lower level allocator.  In the process of doing this,
they make all base pages reference counted.

The routine prep_compound_gigantic_page creates a gigantic page from a
set of base pages.  It assumes that all these base pages are reference
counted.

During demotion, a gigantic page will be split into huge pages of a
smaller size.  This logically involves use of the routines,
remove_hugetlb_page, and destroy_compound_gigantic_page followed by
prep_compound*_page for each smaller huge page.

When pages are reference counted (ref count >= 0), additional
speculative ref counts could be taken as described in previous commits
[1] and [2].  This could result in errors while demoting a huge page. 
Quite a bit of code would need to be created to handle all possible
issues.

Instead of dealing with the possibility of speculative ref counts,
avoid the possibility by keeping ref counts at zero during the demote
process.  Add a boolean 'demote' to the routines remove_hugetlb_page,
destroy_compound_gigantic_page and prep_compound_gigantic_page.  If the
boolean is set, the remove and destroy routines will not reference
count pages and the prep routine will not expect reference counted
pages.

'*_for_demote' wrappers of the routines will be added in a subsequent
patch where this functionality is used.

[1] https://lore.kernel.org/linux-mm/20210622021423.154662-3-mike.kravetz@oracle.com/
[2] https://lore.kernel.org/linux-mm/20210809184832.18342-3-mike.kravetz@oracle.com/

Link: https://lkml.kernel.org/r/20211007181918.136982-5-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Nghia Le <nghialm78@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |   54 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 43 insertions(+), 11 deletions(-)

--- a/mm/hugetlb.c~hugetlb-add-demote-bool-to-gigantic-page-routines
+++ a/mm/hugetlb.c
@@ -1271,8 +1271,8 @@ static int hstate_next_node_to_free(stru
 		nr_nodes--)
 
 #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-static void destroy_compound_gigantic_page(struct page *page,
-					unsigned int order)
+static void __destroy_compound_gigantic_page(struct page *page,
+					unsigned int order, bool demote)
 {
 	int i;
 	int nr_pages = 1 << order;
@@ -1284,7 +1284,8 @@ static void destroy_compound_gigantic_pa
 	for (i = 1; i < nr_pages; i++, p = mem_map_next(p, page, i)) {
 		p->mapping = NULL;
 		clear_compound_head(p);
-		set_page_refcounted(p);
+		if (!demote)
+			set_page_refcounted(p);
 	}
 
 	set_compound_order(page, 0);
@@ -1292,6 +1293,12 @@ static void destroy_compound_gigantic_pa
 	__ClearPageHead(page);
 }
 
+static void destroy_compound_gigantic_page(struct page *page,
+					unsigned int order)
+{
+	__destroy_compound_gigantic_page(page, order, false);
+}
+
 static void free_gigantic_page(struct page *page, unsigned int order)
 {
 	/*
@@ -1364,12 +1371,15 @@ static inline void destroy_compound_giga
 
 /*
  * Remove hugetlb page from lists, and update dtor so that page appears
- * as just a compound page.  A reference is held on the page.
+ * as just a compound page.
+ *
+ * A reference is held on the page, except in the case of demote.
  *
  * Must be called with hugetlb lock held.
  */
-static void remove_hugetlb_page(struct hstate *h, struct page *page,
-							bool adjust_surplus)
+static void __remove_hugetlb_page(struct hstate *h, struct page *page,
+							bool adjust_surplus,
+							bool demote)
 {
 	int nid = page_to_nid(page);
 
@@ -1407,8 +1417,12 @@ static void remove_hugetlb_page(struct h
 	 *
 	 * This handles the case where more than one ref is held when and
 	 * after update_and_free_page is called.
+	 *
+	 * In the case of demote we do not ref count the page as it will soon
+	 * be turned into a page of smaller size.
 	 */
-	set_page_refcounted(page);
+	if (!demote)
+		set_page_refcounted(page);
 	if (hstate_is_gigantic(h))
 		set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
 	else
@@ -1418,6 +1432,12 @@ static void remove_hugetlb_page(struct h
 	h->nr_huge_pages_node[nid]--;
 }
 
+static void remove_hugetlb_page(struct hstate *h, struct page *page,
+							bool adjust_surplus)
+{
+	__remove_hugetlb_page(h, page, adjust_surplus, false);
+}
+
 static void add_hugetlb_page(struct hstate *h, struct page *page,
 			     bool adjust_surplus)
 {
@@ -1681,7 +1701,8 @@ static void prep_new_huge_page(struct hs
 	spin_unlock_irq(&hugetlb_lock);
 }
 
-static bool prep_compound_gigantic_page(struct page *page, unsigned int order)
+static bool __prep_compound_gigantic_page(struct page *page, unsigned int order,
+								bool demote)
 {
 	int i, j;
 	int nr_pages = 1 << order;
@@ -1719,10 +1740,16 @@ static bool prep_compound_gigantic_page(
 		 * the set of pages can not be converted to a gigantic page.
 		 * The caller who allocated the pages should then discard the
 		 * pages using the appropriate free interface.
+		 *
+		 * In the case of demote, the ref count will be zero.
 		 */
-		if (!page_ref_freeze(p, 1)) {
-			pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
-			goto out_error;
+		if (!demote) {
+			if (!page_ref_freeze(p, 1)) {
+				pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
+				goto out_error;
+			}
+		} else {
+			VM_BUG_ON_PAGE(page_count(p), p);
 		}
 		set_page_count(p, 0);
 		set_compound_head(p, page);
@@ -1747,6 +1774,11 @@ out_error:
 	return false;
 }
 
+static bool prep_compound_gigantic_page(struct page *page, unsigned int order)
+{
+	return __prep_compound_gigantic_page(page, order, false);
+}
+
 /*
  * PageHuge() only returns true for hugetlbfs pages, but not for normal or
  * transparent huge pages.  See the PageTransHuge() documentation for more
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 132/262] hugetlb: add hugetlb demote page support
  2021-11-05 20:34 incoming Andrew Morton
                   ` (130 preceding siblings ...)
  2021-11-05 20:41 ` [patch 131/262] hugetlb: add demote bool to gigantic page routines Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 133/262] mm: khugepaged: recalculate min_free_kbytes after stopping khugepaged Andrew Morton
                   ` (129 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, aneesh.kumar, david, linux-mm, mhocko, mike.kravetz,
	mm-commits, naoya.horiguchi, nghialm78, osalvador, rientjes,
	songmuchun, torvalds, ziy

From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: hugetlb: add hugetlb demote page support

Demote page functionality will split a huge page into a number of huge
pages of a smaller size.  For example, on x86 a 1GB huge page can be
demoted into 512 2M huge pages.  Demotion is done 'in place' by simply
splitting the huge page.

Added '*_for_demote' wrappers for remove_hugetlb_page,
destroy_compound_hugetlb_page and prep_compound_gigantic_page for use by
demote code.

[mike.kravetz@oracle.com: v4]
  Link: https://lkml.kernel.org/r/6ca29b8e-527c-d6ec-900e-e6a43e4f8b73@oracle.com
Link: https://lkml.kernel.org/r/20211007181918.136982-6-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Nghia Le <nghialm78@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |  100 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 92 insertions(+), 8 deletions(-)

--- a/mm/hugetlb.c~hugetlb-add-hugetlb-demote-page-support
+++ a/mm/hugetlb.c
@@ -1270,7 +1270,7 @@ static int hstate_next_node_to_free(stru
 		((node = hstate_next_node_to_free(hs, mask)) || 1);	\
 		nr_nodes--)
 
-#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
+/* used to demote non-gigantic_huge pages as well */
 static void __destroy_compound_gigantic_page(struct page *page,
 					unsigned int order, bool demote)
 {
@@ -1293,6 +1293,13 @@ static void __destroy_compound_gigantic_
 	__ClearPageHead(page);
 }
 
+static void destroy_compound_hugetlb_page_for_demote(struct page *page,
+					unsigned int order)
+{
+	__destroy_compound_gigantic_page(page, order, true);
+}
+
+#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
 static void destroy_compound_gigantic_page(struct page *page,
 					unsigned int order)
 {
@@ -1438,6 +1445,12 @@ static void remove_hugetlb_page(struct h
 	__remove_hugetlb_page(h, page, adjust_surplus, false);
 }
 
+static void remove_hugetlb_page_for_demote(struct hstate *h, struct page *page,
+							bool adjust_surplus)
+{
+	__remove_hugetlb_page(h, page, adjust_surplus, true);
+}
+
 static void add_hugetlb_page(struct hstate *h, struct page *page,
 			     bool adjust_surplus)
 {
@@ -1779,6 +1792,12 @@ static bool prep_compound_gigantic_page(
 	return __prep_compound_gigantic_page(page, order, false);
 }
 
+static bool prep_compound_gigantic_page_for_demote(struct page *page,
+							unsigned int order)
+{
+	return __prep_compound_gigantic_page(page, order, true);
+}
+
 /*
  * PageHuge() only returns true for hugetlbfs pages, but not for normal or
  * transparent huge pages.  See the PageTransHuge() documentation for more
@@ -3304,9 +3323,72 @@ out:
 	return 0;
 }
 
+static int demote_free_huge_page(struct hstate *h, struct page *page)
+{
+	int i, nid = page_to_nid(page);
+	struct hstate *target_hstate;
+	int rc = 0;
+
+	target_hstate = size_to_hstate(PAGE_SIZE << h->demote_order);
+
+	remove_hugetlb_page_for_demote(h, page, false);
+	spin_unlock_irq(&hugetlb_lock);
+
+	rc = alloc_huge_page_vmemmap(h, page);
+	if (rc) {
+		/* Allocation of vmemmmap failed, we can not demote page */
+		spin_lock_irq(&hugetlb_lock);
+		set_page_refcounted(page);
+		add_hugetlb_page(h, page, false);
+		return rc;
+	}
+
+	/*
+	 * Use destroy_compound_hugetlb_page_for_demote for all huge page
+	 * sizes as it will not ref count pages.
+	 */
+	destroy_compound_hugetlb_page_for_demote(page, huge_page_order(h));
+
+	/*
+	 * Taking target hstate mutex synchronizes with set_max_huge_pages.
+	 * Without the mutex, pages added to target hstate could be marked
+	 * as surplus.
+	 *
+	 * Note that we already hold h->resize_lock.  To prevent deadlock,
+	 * use the convention of always taking larger size hstate mutex first.
+	 */
+	mutex_lock(&target_hstate->resize_lock);
+	for (i = 0; i < pages_per_huge_page(h);
+				i += pages_per_huge_page(target_hstate)) {
+		if (hstate_is_gigantic(target_hstate))
+			prep_compound_gigantic_page_for_demote(page + i,
+							target_hstate->order);
+		else
+			prep_compound_page(page + i, target_hstate->order);
+		set_page_private(page + i, 0);
+		set_page_refcounted(page + i);
+		prep_new_huge_page(target_hstate, page + i, nid);
+		put_page(page + i);
+	}
+	mutex_unlock(&target_hstate->resize_lock);
+
+	spin_lock_irq(&hugetlb_lock);
+
+	/*
+	 * Not absolutely necessary, but for consistency update max_huge_pages
+	 * based on pool changes for the demoted page.
+	 */
+	h->max_huge_pages--;
+	target_hstate->max_huge_pages += pages_per_huge_page(h);
+
+	return rc;
+}
+
 static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
 	__must_hold(&hugetlb_lock)
 {
+	int nr_nodes, node;
+	struct page *page;
 	int rc = 0;
 
 	lockdep_assert_held(&hugetlb_lock);
@@ -3317,9 +3399,15 @@ static int demote_pool_huge_page(struct
 		return -EINVAL;		/* internal error */
 	}
 
-	/*
-	 * TODO - demote fucntionality will be added in subsequent patch
-	 */
+	for_each_node_mask_to_free(h, nr_nodes, node, nodes_allowed) {
+		if (!list_empty(&h->hugepage_freelists[node])) {
+			page = list_entry(h->hugepage_freelists[node].next,
+					struct page, lru);
+			rc = demote_free_huge_page(h, page);
+			break;
+		}
+	}
+
 	return rc;
 }
 
@@ -3554,10 +3642,6 @@ static ssize_t demote_store(struct kobje
 		/*
 		 * Check for available pages to demote each time thorough the
 		 * loop as demote_pool_huge_page will drop hugetlb_lock.
-		 *
-		 * NOTE: demote_pool_huge_page does not yet drop hugetlb_lock
-		 * but will when full demote functionality is added in a later
-		 * patch.
 		 */
 		if (nid != NUMA_NO_NODE)
 			nr_available = h->free_huge_pages_node[nid];
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 133/262] mm: khugepaged: recalculate min_free_kbytes after stopping khugepaged
  2021-11-05 20:34 incoming Andrew Morton
                   ` (131 preceding siblings ...)
  2021-11-05 20:41 ` [patch 132/262] hugetlb: add hugetlb demote page support Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 134/262] mm, hugepages: add mremap() support for hugepage backed vma Andrew Morton
                   ` (128 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, liangcaifan19, linux-mm, mike.kravetz, mm-commits,
	torvalds, zhang.lyra

From: Liangcai Fan <liangcaifan19@gmail.com>
Subject: mm: khugepaged: recalculate min_free_kbytes after stopping khugepaged

When initializing transparent huge pages, min_free_kbytes would be
calculated according to what khugepaged expected.
So when disable transparent huge pages, min_free_kbytes should be
recalculated instead of the higher value set by khugepaged.

Link: https://lkml.kernel.org/r/1633937809-16558-1-git-send-email-liangcaifan19@gmail.com
Signed-off-by: Liangcai Fan <liangcaifan19@gmail.com>
Signed-off-by: Chunyan Zhang <zhang.lyra@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h |    1 +
 mm/khugepaged.c    |   10 ++++++++--
 mm/page_alloc.c    |    7 ++++++-
 3 files changed, 15 insertions(+), 3 deletions(-)

--- a/include/linux/mm.h~mm-khugepaged-recalculate-min_free_kbytes-after-stopping-khugepaged
+++ a/include/linux/mm.h
@@ -2453,6 +2453,7 @@ extern void memmap_init_range(unsigned l
 		unsigned long, unsigned long, enum meminit_context,
 		struct vmem_altmap *, int migratetype);
 extern void setup_per_zone_wmarks(void);
+extern void calculate_min_free_kbytes(void);
 extern int __meminit init_per_zone_wmark_min(void);
 extern void mem_init(void);
 extern void __init mmap_init(void);
--- a/mm/khugepaged.c~mm-khugepaged-recalculate-min_free_kbytes-after-stopping-khugepaged
+++ a/mm/khugepaged.c
@@ -2299,6 +2299,11 @@ static void set_recommended_min_free_kby
 	int nr_zones = 0;
 	unsigned long recommended_min;
 
+	if (!khugepaged_enabled()) {
+		calculate_min_free_kbytes();
+		goto update_wmarks;
+	}
+
 	for_each_populated_zone(zone) {
 		/*
 		 * We don't need to worry about fragmentation of
@@ -2334,6 +2339,8 @@ static void set_recommended_min_free_kby
 
 		min_free_kbytes = recommended_min;
 	}
+
+update_wmarks:
 	setup_per_zone_wmarks();
 }
 
@@ -2355,12 +2362,11 @@ int start_stop_khugepaged(void)
 
 		if (!list_empty(&khugepaged_scan.mm_head))
 			wake_up_interruptible(&khugepaged_wait);
-
-		set_recommended_min_free_kbytes();
 	} else if (khugepaged_thread) {
 		kthread_stop(khugepaged_thread);
 		khugepaged_thread = NULL;
 	}
+	set_recommended_min_free_kbytes();
 fail:
 	mutex_unlock(&khugepaged_mutex);
 	return err;
--- a/mm/page_alloc.c~mm-khugepaged-recalculate-min_free_kbytes-after-stopping-khugepaged
+++ a/mm/page_alloc.c
@@ -8469,7 +8469,7 @@ void setup_per_zone_wmarks(void)
  * 8192MB:	11584k
  * 16384MB:	16384k
  */
-int __meminit init_per_zone_wmark_min(void)
+void calculate_min_free_kbytes(void)
 {
 	unsigned long lowmem_kbytes;
 	int new_min_free_kbytes;
@@ -8483,6 +8483,11 @@ int __meminit init_per_zone_wmark_min(vo
 		pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
 				new_min_free_kbytes, user_min_free_kbytes);
 
+}
+
+int __meminit init_per_zone_wmark_min(void)
+{
+	calculate_min_free_kbytes();
 	setup_per_zone_wmarks();
 	refresh_zone_stat_thresholds();
 	setup_per_zone_lowmem_reserve();
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 134/262] mm, hugepages: add mremap() support for hugepage backed vma
  2021-11-05 20:34 incoming Andrew Morton
                   ` (132 preceding siblings ...)
  2021-11-05 20:41 ` [patch 133/262] mm: khugepaged: recalculate min_free_kbytes after stopping khugepaged Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 135/262] mm, hugepages: add hugetlb vma mremap() test Andrew Morton
                   ` (127 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, almasrymina, ckennelly, kenchen, kirill, linux-mm, mhocko,
	mike.kravetz, mm-commits, torvalds, vbabka

From: Mina Almasry <almasrymina@google.com>
Subject: mm, hugepages: add mremap() support for hugepage backed vma

Support mremap() for hugepage backed vma segment by simply repositioning
page table entries.  The page table entries are repositioned to the new
virtual address on mremap().

Hugetlb mremap() support is of course generic; my motivating use case is a
library (hugepage_text), which reloads the ELF text of executables in
hugepages.  This significantly increases the execution performance of said
executables.

Restricts the mremap operation on hugepages to up to the size of the
original mapping as the underlying hugetlb reservation is not yet capable
of handling remapping to a larger size.

During the mremap() operation we detect pmd_share'd mappings and we
unshare those during the mremap().  On access and fault the sharing is
established again.

Link: https://lkml.kernel.org/r/20211013195825.3058275-1-almasrymina@google.com
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Kirill Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/hugetlb.h |   19 ++++++
 mm/hugetlb.c            |  111 +++++++++++++++++++++++++++++++++++---
 mm/mremap.c             |   36 +++++++++++-
 3 files changed, 157 insertions(+), 9 deletions(-)

--- a/include/linux/hugetlb.h~mm-hugepages-add-mremap-support-for-hugepage-backed-vma
+++ a/include/linux/hugetlb.h
@@ -124,6 +124,7 @@ struct hugepage_subpool *hugepage_new_su
 void hugepage_put_subpool(struct hugepage_subpool *spool);
 
 void reset_vma_resv_huge_pages(struct vm_area_struct *vma);
+void clear_vma_resv_huge_pages(struct vm_area_struct *vma);
 int hugetlb_sysctl_handler(struct ctl_table *, int, void *, size_t *, loff_t *);
 int hugetlb_overcommit_handler(struct ctl_table *, int, void *, size_t *,
 		loff_t *);
@@ -132,6 +133,10 @@ int hugetlb_treat_movable_handler(struct
 int hugetlb_mempolicy_sysctl_handler(struct ctl_table *, int, void *, size_t *,
 		loff_t *);
 
+int move_hugetlb_page_tables(struct vm_area_struct *vma,
+			     struct vm_area_struct *new_vma,
+			     unsigned long old_addr, unsigned long new_addr,
+			     unsigned long len);
 int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *, struct vm_area_struct *);
 long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *,
 			 struct page **, struct vm_area_struct **,
@@ -215,6 +220,10 @@ static inline void reset_vma_resv_huge_p
 {
 }
 
+static inline void clear_vma_resv_huge_pages(struct vm_area_struct *vma)
+{
+}
+
 static inline unsigned long hugetlb_total_pages(void)
 {
 	return 0;
@@ -260,6 +269,16 @@ static inline int copy_hugetlb_page_rang
 {
 	BUG();
 	return 0;
+}
+
+static inline int move_hugetlb_page_tables(struct vm_area_struct *vma,
+					   struct vm_area_struct *new_vma,
+					   unsigned long old_addr,
+					   unsigned long new_addr,
+					   unsigned long len)
+{
+	BUG();
+	return 0;
 }
 
 static inline void hugetlb_report_meminfo(struct seq_file *m)
--- a/mm/hugetlb.c~mm-hugepages-add-mremap-support-for-hugepage-backed-vma
+++ a/mm/hugetlb.c
@@ -1014,6 +1014,35 @@ void reset_vma_resv_huge_pages(struct vm
 		vma->vm_private_data = (void *)0;
 }
 
+/*
+ * Reset and decrement one ref on hugepage private reservation.
+ * Called with mm->mmap_sem writer semaphore held.
+ * This function should be only used by move_vma() and operate on
+ * same sized vma. It should never come here with last ref on the
+ * reservation.
+ */
+void clear_vma_resv_huge_pages(struct vm_area_struct *vma)
+{
+	/*
+	 * Clear the old hugetlb private page reservation.
+	 * It has already been transferred to new_vma.
+	 *
+	 * During a mremap() operation of a hugetlb vma we call move_vma()
+	 * which copies vma into new_vma and unmaps vma. After the copy
+	 * operation both new_vma and vma share a reference to the resv_map
+	 * struct, and at that point vma is about to be unmapped. We don't
+	 * want to return the reservation to the pool at unmap of vma because
+	 * the reservation still lives on in new_vma, so simply decrement the
+	 * ref here and remove the resv_map reference from this vma.
+	 */
+	struct resv_map *reservations = vma_resv_map(vma);
+
+	if (reservations && is_vma_resv_set(vma, HPAGE_RESV_OWNER))
+		kref_put(&reservations->refs, resv_map_release);
+
+	reset_vma_resv_huge_pages(vma);
+}
+
 /* Returns true if the VMA has associated reserve pages */
 static bool vma_has_reserves(struct vm_area_struct *vma, long chg)
 {
@@ -4718,6 +4747,82 @@ again:
 	return ret;
 }
 
+static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
+			  unsigned long new_addr, pte_t *src_pte)
+{
+	struct hstate *h = hstate_vma(vma);
+	struct mm_struct *mm = vma->vm_mm;
+	pte_t *dst_pte, pte;
+	spinlock_t *src_ptl, *dst_ptl;
+
+	dst_pte = huge_pte_offset(mm, new_addr, huge_page_size(h));
+	dst_ptl = huge_pte_lock(h, mm, dst_pte);
+	src_ptl = huge_pte_lockptr(h, mm, src_pte);
+
+	/*
+	 * We don't have to worry about the ordering of src and dst ptlocks
+	 * because exclusive mmap_sem (or the i_mmap_lock) prevents deadlock.
+	 */
+	if (src_ptl != dst_ptl)
+		spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
+
+	pte = huge_ptep_get_and_clear(mm, old_addr, src_pte);
+	set_huge_pte_at(mm, new_addr, dst_pte, pte);
+
+	if (src_ptl != dst_ptl)
+		spin_unlock(src_ptl);
+	spin_unlock(dst_ptl);
+}
+
+int move_hugetlb_page_tables(struct vm_area_struct *vma,
+			     struct vm_area_struct *new_vma,
+			     unsigned long old_addr, unsigned long new_addr,
+			     unsigned long len)
+{
+	struct hstate *h = hstate_vma(vma);
+	struct address_space *mapping = vma->vm_file->f_mapping;
+	unsigned long sz = huge_page_size(h);
+	struct mm_struct *mm = vma->vm_mm;
+	unsigned long old_end = old_addr + len;
+	unsigned long old_addr_copy;
+	pte_t *src_pte, *dst_pte;
+	struct mmu_notifier_range range;
+
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, old_addr,
+				old_end);
+	adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
+	mmu_notifier_invalidate_range_start(&range);
+	/* Prevent race with file truncation */
+	i_mmap_lock_write(mapping);
+	for (; old_addr < old_end; old_addr += sz, new_addr += sz) {
+		src_pte = huge_pte_offset(mm, old_addr, sz);
+		if (!src_pte)
+			continue;
+		if (huge_pte_none(huge_ptep_get(src_pte)))
+			continue;
+
+		/* old_addr arg to huge_pmd_unshare() is a pointer and so the
+		 * arg may be modified. Pass a copy instead to preserve the
+		 * value in old_addr.
+		 */
+		old_addr_copy = old_addr;
+
+		if (huge_pmd_unshare(mm, vma, &old_addr_copy, src_pte))
+			continue;
+
+		dst_pte = huge_pte_alloc(mm, new_vma, new_addr, sz);
+		if (!dst_pte)
+			break;
+
+		move_huge_pte(vma, old_addr, new_addr, src_pte);
+	}
+	i_mmap_unlock_write(mapping);
+	flush_tlb_range(vma, old_end - len, old_end);
+	mmu_notifier_invalidate_range_end(&range);
+
+	return len + old_addr - old_end;
+}
+
 static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 				   unsigned long start, unsigned long end,
 				   struct page *ref_page)
@@ -6257,12 +6362,6 @@ void adjust_range_if_pmd_sharing_possibl
  * sharing is possible.  For hugetlbfs, this prevents removal of any page
  * table entries associated with the address space.  This is important as we
  * are setting up sharing based on existing page table entries (mappings).
- *
- * NOTE: This routine is only called from huge_pte_alloc.  Some callers of
- * huge_pte_alloc know that sharing is not possible and do not take
- * i_mmap_rwsem as a performance optimization.  This is handled by the
- * if !vma_shareable check at the beginning of the routine. i_mmap_rwsem is
- * only required for subsequent processing.
  */
 pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 		      unsigned long addr, pud_t *pud)
--- a/mm/mremap.c~mm-hugepages-add-mremap-support-for-hugepage-backed-vma
+++ a/mm/mremap.c
@@ -489,6 +489,10 @@ unsigned long move_page_tables(struct vm
 	old_end = old_addr + len;
 	flush_cache_range(vma, old_addr, old_end);
 
+	if (is_vm_hugetlb_page(vma))
+		return move_hugetlb_page_tables(vma, new_vma, old_addr,
+						new_addr, len);
+
 	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
 				old_addr, old_end);
 	mmu_notifier_invalidate_range_start(&range);
@@ -646,6 +650,10 @@ static unsigned long move_vma(struct vm_
 		mremap_userfaultfd_prep(new_vma, uf);
 	}
 
+	if (is_vm_hugetlb_page(vma)) {
+		clear_vma_resv_huge_pages(vma);
+	}
+
 	/* Conceal VM_ACCOUNT so old reservation is not undone */
 	if (vm_flags & VM_ACCOUNT && !(flags & MREMAP_DONTUNMAP)) {
 		vma->vm_flags &= ~VM_ACCOUNT;
@@ -739,9 +747,6 @@ static struct vm_area_struct *vma_to_res
 			(vma->vm_flags & (VM_DONTEXPAND | VM_PFNMAP)))
 		return ERR_PTR(-EINVAL);
 
-	if (is_vm_hugetlb_page(vma))
-		return ERR_PTR(-EINVAL);
-
 	/* We can't remap across vm area boundaries */
 	if (old_len > vma->vm_end - addr)
 		return ERR_PTR(-EFAULT);
@@ -937,6 +942,31 @@ SYSCALL_DEFINE5(mremap, unsigned long, a
 
 	if (mmap_write_lock_killable(current->mm))
 		return -EINTR;
+	vma = find_vma(mm, addr);
+	if (!vma || vma->vm_start > addr) {
+		ret = EFAULT;
+		goto out;
+	}
+
+	if (is_vm_hugetlb_page(vma)) {
+		struct hstate *h __maybe_unused = hstate_vma(vma);
+
+		old_len = ALIGN(old_len, huge_page_size(h));
+		new_len = ALIGN(new_len, huge_page_size(h));
+
+		/* addrs must be huge page aligned */
+		if (addr & ~huge_page_mask(h))
+			goto out;
+		if (new_addr & ~huge_page_mask(h))
+			goto out;
+
+		/*
+		 * Don't allow remap expansion, because the underlying hugetlb
+		 * reservation is not yet capable to handle split reservation.
+		 */
+		if (new_len > old_len)
+			goto out;
+	}
 
 	if (flags & (MREMAP_FIXED | MREMAP_DONTUNMAP)) {
 		ret = mremap_to(addr, old_len, new_addr, new_len,
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 135/262] mm, hugepages: add hugetlb vma mremap() test
  2021-11-05 20:34 incoming Andrew Morton
                   ` (133 preceding siblings ...)
  2021-11-05 20:41 ` [patch 134/262] mm, hugepages: add mremap() support for hugepage backed vma Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 136/262] hugetlb: support node specified when using cma for gigantic hugepages Andrew Morton
                   ` (126 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, almasrymina, ckennelly, kenchen, kirill, linux-mm, mhocko,
	mike.kravetz, mm-commits, torvalds, vbabka, wanjiabing

From: Mina Almasry <almasrymina@google.com>
Subject: mm, hugepages: add hugetlb vma mremap() test

[almasrymina@google.com: v8]
  Link: https://lkml.kernel.org/r/20211014200542.4126947-2-almasrymina@google.com
[wanjiabing@vivo.com: remove duplicated include in hugepage-mremap]
  Link: https://lkml.kernel.org/r/20211021122944.8857-1-wanjiabing@vivo.com
Link: https://lkml.kernel.org/r/20211013195825.3058275-2-almasrymina@google.com
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Kirill Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/.gitignore        |    1 
 tools/testing/selftests/vm/Makefile          |    1 
 tools/testing/selftests/vm/hugepage-mremap.c |  160 +++++++++++++++++
 tools/testing/selftests/vm/run_vmtests.sh    |   11 +
 4 files changed, 173 insertions(+)

--- a/tools/testing/selftests/vm/.gitignore~mm-hugepages-add-hugetlb-vma-mremap-test
+++ a/tools/testing/selftests/vm/.gitignore
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 hugepage-mmap
+hugepage-mremap
 hugepage-shm
 khugepaged
 map_hugetlb
--- /dev/null
+++ a/tools/testing/selftests/vm/hugepage-mremap.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * hugepage-mremap:
+ *
+ * Example of remapping huge page memory in a user application using the
+ * mremap system call.  Code assumes a hugetlbfs filesystem is mounted
+ * at './huge'.  The code will use 10MB worth of huge pages.
+ */
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <fcntl.h> /* Definition of O_* constants */
+#include <sys/syscall.h> /* Definition of SYS_* constants */
+#include <unistd.h>
+#include <linux/userfaultfd.h>
+#include <sys/ioctl.h>
+
+#define LENGTH (1UL * 1024 * 1024 * 1024)
+
+#define PROTECTION (PROT_READ | PROT_WRITE | PROT_EXEC)
+#define FLAGS (MAP_SHARED | MAP_ANONYMOUS)
+
+static void check_bytes(char *addr)
+{
+	printf("First hex is %x\n", *((unsigned int *)addr));
+}
+
+static void write_bytes(char *addr)
+{
+	unsigned long i;
+
+	for (i = 0; i < LENGTH; i++)
+		*(addr + i) = (char)i;
+}
+
+static int read_bytes(char *addr)
+{
+	unsigned long i;
+
+	check_bytes(addr);
+	for (i = 0; i < LENGTH; i++)
+		if (*(addr + i) != (char)i) {
+			printf("Mismatch at %lu\n", i);
+			return 1;
+		}
+	return 0;
+}
+
+static void register_region_with_uffd(char *addr, size_t len)
+{
+	long uffd; /* userfaultfd file descriptor */
+	struct uffdio_api uffdio_api;
+	struct uffdio_register uffdio_register;
+
+	/* Create and enable userfaultfd object. */
+
+	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
+	if (uffd == -1) {
+		perror("userfaultfd");
+		exit(1);
+	}
+
+	uffdio_api.api = UFFD_API;
+	uffdio_api.features = 0;
+	if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1) {
+		perror("ioctl-UFFDIO_API");
+		exit(1);
+	}
+
+	/* Create a private anonymous mapping. The memory will be
+	 * demand-zero paged--that is, not yet allocated. When we
+	 * actually touch the memory, it will be allocated via
+	 * the userfaultfd.
+	 */
+
+	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
+		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (addr == MAP_FAILED) {
+		perror("mmap");
+		exit(1);
+	}
+
+	printf("Address returned by mmap() = %p\n", addr);
+
+	/* Register the memory range of the mapping we just created for
+	 * handling by the userfaultfd object. In mode, we request to track
+	 * missing pages (i.e., pages that have not yet been faulted in).
+	 */
+
+	uffdio_register.range.start = (unsigned long)addr;
+	uffdio_register.range.len = len;
+	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1) {
+		perror("ioctl-UFFDIO_REGISTER");
+		exit(1);
+	}
+}
+
+int main(void)
+{
+	int ret = 0;
+
+	int fd = open("/huge/test", O_CREAT | O_RDWR, 0755);
+
+	if (fd < 0) {
+		perror("Open failed");
+		exit(1);
+	}
+
+	/* mmap to a PUD aligned address to hopefully trigger pmd sharing. */
+	unsigned long suggested_addr = 0x7eaa40000000;
+	void *haddr = mmap((void *)suggested_addr, LENGTH, PROTECTION,
+			   MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
+	printf("Map haddr: Returned address is %p\n", haddr);
+	if (haddr == MAP_FAILED) {
+		perror("mmap1");
+		exit(1);
+	}
+
+	/* mmap again to a dummy address to hopefully trigger pmd sharing. */
+	suggested_addr = 0x7daa40000000;
+	void *daddr = mmap((void *)suggested_addr, LENGTH, PROTECTION,
+			   MAP_HUGETLB | MAP_SHARED | MAP_POPULATE, fd, 0);
+	printf("Map daddr: Returned address is %p\n", daddr);
+	if (daddr == MAP_FAILED) {
+		perror("mmap3");
+		exit(1);
+	}
+
+	suggested_addr = 0x7faa40000000;
+	void *vaddr =
+		mmap((void *)suggested_addr, LENGTH, PROTECTION, FLAGS, -1, 0);
+	printf("Map vaddr: Returned address is %p\n", vaddr);
+	if (vaddr == MAP_FAILED) {
+		perror("mmap2");
+		exit(1);
+	}
+
+	register_region_with_uffd(haddr, LENGTH);
+
+	void *addr = mremap(haddr, LENGTH, LENGTH,
+			    MREMAP_MAYMOVE | MREMAP_FIXED, vaddr);
+	if (addr == MAP_FAILED) {
+		perror("mremap");
+		exit(1);
+	}
+
+	printf("Mremap: Returned address is %p\n", addr);
+	check_bytes(addr);
+	write_bytes(addr);
+	ret = read_bytes(addr);
+
+	munmap(addr, LENGTH);
+
+	return ret;
+}
--- a/tools/testing/selftests/vm/Makefile~mm-hugepages-add-hugetlb-vma-mremap-test
+++ a/tools/testing/selftests/vm/Makefile
@@ -29,6 +29,7 @@ TEST_GEN_FILES = compaction_test
 TEST_GEN_FILES += gup_test
 TEST_GEN_FILES += hmm-tests
 TEST_GEN_FILES += hugepage-mmap
+TEST_GEN_FILES += hugepage-mremap
 TEST_GEN_FILES += hugepage-shm
 TEST_GEN_FILES += khugepaged
 TEST_GEN_FILES += madv_populate
--- a/tools/testing/selftests/vm/run_vmtests.sh~mm-hugepages-add-hugetlb-vma-mremap-test
+++ a/tools/testing/selftests/vm/run_vmtests.sh
@@ -108,6 +108,17 @@ else
 	echo "[PASS]"
 fi
 
+echo "-----------------------"
+echo "running hugepage-mremap"
+echo "-----------------------"
+./hugepage-mremap
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
 echo "NOTE: The above hugetlb tests provide minimal coverage.  Use"
 echo "      https://github.com/libhugetlbfs/libhugetlbfs.git for"
 echo "      hugetlb regression testing."
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 136/262] hugetlb: support node specified when using cma for gigantic hugepages
  2021-11-05 20:34 incoming Andrew Morton
                   ` (134 preceding siblings ...)
  2021-11-05 20:41 ` [patch 135/262] mm, hugepages: add hugetlb vma mremap() test Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 137/262] mm: remove duplicate include in hugepage-mremap.c Andrew Morton
                   ` (125 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, baolin.wang, corbet, guro, linux-mm, mhocko, mike.kravetz,
	mm-commits, torvalds

From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: hugetlb: support node specified when using cma for gigantic hugepages

Now the size of CMA area for gigantic hugepages runtime allocation is
balanced for all online nodes, but we also want to specify the size of CMA
per-node, or only one node in some cases, which are similar with patch
[1].

For example, on some multi-nodes systems, each node's memory can be
different, allocating the same size of CMA for each node is not suitable
for the low-memory nodes.  Meanwhile some workloads like DPDK mentioned by
Zhenguo in patch [1] only need hugepages in one node.

On the other hand, we have some machines with multiple types of memory,
like DRAM and PMEM (persistent memory).  On this system, we may want to
specify all the hugepages only on DRAM node, or specify the proportion of
DRAM node and PMEM node, to tuning the performance of the workloads.

Thus this patch adds node format for 'hugetlb_cma' parameter to support
specifying the size of CMA per-node.  An example is as follows:

hugetlb_cma=0:5G,2:5G

which means allocating 5G size of CMA area on node 0 and node 2
respectively.  And the users should use the node specific sysfs file to
allocate the gigantic hugepages if specified the CMA size on that node.

[1]
https://lkml.kernel.org/r/20211005054729.86457-1-yaozhenguo1@gmail.com

Link: https://lkml.kernel.org/r/bb790775ca60bb8f4b26956bb3f6988f74e075c7.1634261144.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/kernel-parameters.txt |    6 
 mm/hugetlb.c                                    |   86 ++++++++++++--
 2 files changed, 81 insertions(+), 11 deletions(-)

--- a/Documentation/admin-guide/kernel-parameters.txt~hugetlb-support-node-specified-when-using-cma-for-gigantic-hugepages
+++ a/Documentation/admin-guide/kernel-parameters.txt
@@ -1587,8 +1587,10 @@
 			registers.  Default set by CONFIG_HPET_MMAP_DEFAULT.
 
 	hugetlb_cma=	[HW,CMA] The size of a CMA area used for allocation
-			of gigantic hugepages.
-			Format: nn[KMGTPE]
+			of gigantic hugepages. Or using node format, the size
+			of a CMA area per node can be specified.
+			Format: nn[KMGTPE] or (node format)
+				<node>:nn[KMGTPE][,<node>:nn[KMGTPE]]
 
 			Reserve a CMA area of given size and allocate gigantic
 			hugepages using the CMA allocator. If enabled, the
--- a/mm/hugetlb.c~hugetlb-support-node-specified-when-using-cma-for-gigantic-hugepages
+++ a/mm/hugetlb.c
@@ -50,6 +50,7 @@ struct hstate hstates[HUGE_MAX_HSTATE];
 
 #ifdef CONFIG_CMA
 static struct cma *hugetlb_cma[MAX_NUMNODES];
+static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
 static bool hugetlb_cma_page(struct page *page, unsigned int order)
 {
 	return cma_pages_valid(hugetlb_cma[page_to_nid(page)], page,
@@ -6762,7 +6763,38 @@ static bool cma_reserve_called __initdat
 
 static int __init cmdline_parse_hugetlb_cma(char *p)
 {
-	hugetlb_cma_size = memparse(p, &p);
+	int nid, count = 0;
+	unsigned long tmp;
+	char *s = p;
+
+	while (*s) {
+		if (sscanf(s, "%lu%n", &tmp, &count) != 1)
+			break;
+
+		if (s[count] == ':') {
+			nid = tmp;
+			if (nid < 0 || nid >= MAX_NUMNODES)
+				break;
+
+			s += count + 1;
+			tmp = memparse(s, &s);
+			hugetlb_cma_size_in_node[nid] = tmp;
+			hugetlb_cma_size += tmp;
+
+			/*
+			 * Skip the separator if have one, otherwise
+			 * break the parsing.
+			 */
+			if (*s == ',')
+				s++;
+			else
+				break;
+		} else {
+			hugetlb_cma_size = memparse(p, &p);
+			break;
+		}
+	}
+
 	return 0;
 }
 
@@ -6771,6 +6803,7 @@ early_param("hugetlb_cma", cmdline_parse
 void __init hugetlb_cma_reserve(int order)
 {
 	unsigned long size, reserved, per_node;
+	bool node_specific_cma_alloc = false;
 	int nid;
 
 	cma_reserve_called = true;
@@ -6778,6 +6811,31 @@ void __init hugetlb_cma_reserve(int orde
 	if (!hugetlb_cma_size)
 		return;
 
+	for (nid = 0; nid < MAX_NUMNODES; nid++) {
+		if (hugetlb_cma_size_in_node[nid] == 0)
+			continue;
+
+		if (!node_state(nid, N_ONLINE)) {
+			pr_warn("hugetlb_cma: invalid node %d specified\n", nid);
+			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
+			hugetlb_cma_size_in_node[nid] = 0;
+			continue;
+		}
+
+		if (hugetlb_cma_size_in_node[nid] < (PAGE_SIZE << order)) {
+			pr_warn("hugetlb_cma: cma area of node %d should be at least %lu MiB\n",
+				nid, (PAGE_SIZE << order) / SZ_1M);
+			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
+			hugetlb_cma_size_in_node[nid] = 0;
+		} else {
+			node_specific_cma_alloc = true;
+		}
+	}
+
+	/* Validate the CMA size again in case some invalid nodes specified. */
+	if (!hugetlb_cma_size)
+		return;
+
 	if (hugetlb_cma_size < (PAGE_SIZE << order)) {
 		pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n",
 			(PAGE_SIZE << order) / SZ_1M);
@@ -6785,20 +6843,30 @@ void __init hugetlb_cma_reserve(int orde
 		return;
 	}
 
-	/*
-	 * If 3 GB area is requested on a machine with 4 numa nodes,
-	 * let's allocate 1 GB on first three nodes and ignore the last one.
-	 */
-	per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes);
-	pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n",
-		hugetlb_cma_size / SZ_1M, per_node / SZ_1M);
+	if (!node_specific_cma_alloc) {
+		/*
+		 * If 3 GB area is requested on a machine with 4 numa nodes,
+		 * let's allocate 1 GB on first three nodes and ignore the last one.
+		 */
+		per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes);
+		pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n",
+			hugetlb_cma_size / SZ_1M, per_node / SZ_1M);
+	}
 
 	reserved = 0;
 	for_each_node_state(nid, N_ONLINE) {
 		int res;
 		char name[CMA_MAX_NAME];
 
-		size = min(per_node, hugetlb_cma_size - reserved);
+		if (node_specific_cma_alloc) {
+			if (hugetlb_cma_size_in_node[nid] == 0)
+				continue;
+
+			size = hugetlb_cma_size_in_node[nid];
+		} else {
+			size = min(per_node, hugetlb_cma_size - reserved);
+		}
+
 		size = round_up(size, PAGE_SIZE << order);
 
 		snprintf(name, sizeof(name), "hugetlb%d", nid);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 137/262] mm: remove duplicate include in hugepage-mremap.c
  2021-11-05 20:34 incoming Andrew Morton
                   ` (135 preceding siblings ...)
  2021-11-05 20:41 ` [patch 136/262] hugetlb: support node specified when using cma for gigantic hugepages Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 138/262] hugetlb_cgroup: remove unused hugetlb_cgroup_from_counter macro Andrew Morton
                   ` (124 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, ran.jianping, shuah, torvalds, zealci

From: Ran Jianping <ran.jianping@zte.com.cn>
Subject: mm: remove duplicate include in hugepage-mremap.c

Remove duplicate includes 'unistd.h' included in
 '/tools/testing/selftests/vm/hugepage-mremap.c'  is duplicated.It is also
 included on 23 line.

Link: https://lkml.kernel.org/r/20211018102336.869726-1-ran.jianping@zte.com.cn
Signed-off-by: Ran Jianping <ran.jianping@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/hugepage-mremap.c |    1 -
 1 file changed, 1 deletion(-)

--- a/tools/testing/selftests/vm/hugepage-mremap.c~mm-remove-duplicate-include-in-hugepage-mremapc
+++ a/tools/testing/selftests/vm/hugepage-mremap.c
@@ -15,7 +15,6 @@
 #include <errno.h>
 #include <fcntl.h> /* Definition of O_* constants */
 #include <sys/syscall.h> /* Definition of SYS_* constants */
-#include <unistd.h>
 #include <linux/userfaultfd.h>
 #include <sys/ioctl.h>
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 138/262] hugetlb_cgroup: remove unused hugetlb_cgroup_from_counter macro
  2021-11-05 20:34 incoming Andrew Morton
                   ` (136 preceding siblings ...)
  2021-11-05 20:41 ` [patch 137/262] mm: remove duplicate include in hugepage-mremap.c Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 139/262] hugetlb: replace the obsolete hugetlb_instantiation_mutex in the comments Andrew Morton
                   ` (123 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, baolin.wang, linux-mm, mhocko, mike.kravetz, mm-commits, torvalds

From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: hugetlb_cgroup: remove unused hugetlb_cgroup_from_counter macro

Patch series "Some cleanups and improvements for hugetlb".

This patchset does some cleanups and improvements for hugetlb and
hugetlb_cgroup.


This patch (of 4):

Since commit 726b7bbe ("hugetlb_cgroup: fix illegal access to memory"),
the hugetlb_cgroup_from_counter() macro is not used any more, remove it.

Link: https://lkml.kernel.org/r/cover.1634797639.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/f03b29b801fa9942466ab15334ec09988e124ae6.1634797639.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb_cgroup.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/mm/hugetlb_cgroup.c~hugetlb_cgroup-remove-unused-hugetlb_cgroup_from_counter-macro
+++ a/mm/hugetlb_cgroup.c
@@ -27,9 +27,6 @@
 #define MEMFILE_IDX(val)	(((val) >> 16) & 0xffff)
 #define MEMFILE_ATTR(val)	((val) & 0xffff)
 
-#define hugetlb_cgroup_from_counter(counter, idx)                   \
-	container_of(counter, struct hugetlb_cgroup, hugepage[idx])
-
 static struct hugetlb_cgroup *root_h_cgroup __read_mostly;
 
 static inline struct page_counter *
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 139/262] hugetlb: replace the obsolete hugetlb_instantiation_mutex in the comments
  2021-11-05 20:34 incoming Andrew Morton
                   ` (137 preceding siblings ...)
  2021-11-05 20:41 ` [patch 138/262] hugetlb_cgroup: remove unused hugetlb_cgroup_from_counter macro Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:41 ` [patch 140/262] hugetlb: remove redundant validation in has_same_uncharge_info() Andrew Morton
                   ` (122 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, baolin.wang, linux-mm, mhocko, mike.kravetz, mm-commits, torvalds

From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: hugetlb: replace the obsolete hugetlb_instantiation_mutex in the comments

After commit 8382d914ebf7 ("mm, hugetlb: improve page-fault scalability"),
the hugetlb_instantiation_mutex lock had been replaced by
hugetlb_fault_mutex_table to serializes faults on the same logical page. 
Thus update the obsolete hugetlb_instantiation_mutex related comments.

Link: https://lkml.kernel.org/r/4b3febeae37455ff7b74aa0aad16cc6909cf0926.1634797639.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/hugetlb.c~hugetlb-replace-the-obsolete-hugetlb_instantiation_mutex-in-the-comments
+++ a/mm/hugetlb.c
@@ -5014,7 +5014,7 @@ static void unmap_ref_private(struct mm_
 
 /*
  * Hugetlb_cow() should be called with page lock of the original hugepage held.
- * Called with hugetlb_instantiation_mutex held and pte_page locked so we
+ * Called with hugetlb_fault_mutex_table held and pte_page locked so we
  * cannot race with other handlers or page migration.
  * Keep the pte_same checks anyway to make transition from the mutex easier.
  */
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 140/262] hugetlb: remove redundant validation in has_same_uncharge_info()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (138 preceding siblings ...)
  2021-11-05 20:41 ` [patch 139/262] hugetlb: replace the obsolete hugetlb_instantiation_mutex in the comments Andrew Morton
@ 2021-11-05 20:41 ` Andrew Morton
  2021-11-05 20:42 ` [patch 141/262] hugetlb: remove redundant VM_BUG_ON() in add_reservation_in_range() Andrew Morton
                   ` (121 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:41 UTC (permalink / raw)
  To: akpm, baolin.wang, linux-mm, mhocko, mike.kravetz, mm-commits, torvalds

From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: hugetlb: remove redundant validation in has_same_uncharge_info()

The callers of has_same_uncharge_info() has accessed the original
file_region and new file_region, and they are impossible to be NULL now. 
So we can remove the file_region validation in has_same_uncharge_info() to
simplify the code.

Link: https://lkml.kernel.org/r/97fc68d3f8d34f63c204645e10d7a718997e50b7.1634797639.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/hugetlb.c~hugetlb-remove-redundant-validation-in-has_same_uncharge_info
+++ a/mm/hugetlb.c
@@ -332,8 +332,7 @@ static bool has_same_uncharge_info(struc
 				   struct file_region *org)
 {
 #ifdef CONFIG_CGROUP_HUGETLB
-	return rg && org &&
-	       rg->reservation_counter == org->reservation_counter &&
+	return rg->reservation_counter == org->reservation_counter &&
 	       rg->css == org->css;
 
 #else
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 141/262] hugetlb: remove redundant VM_BUG_ON() in add_reservation_in_range()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (139 preceding siblings ...)
  2021-11-05 20:41 ` [patch 140/262] hugetlb: remove redundant validation in has_same_uncharge_info() Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 142/262] hugetlb: remove unnecessary set_page_count in prep_compound_gigantic_page Andrew Morton
                   ` (120 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, baolin.wang, linux-mm, mhocko, mike.kravetz, mm-commits, torvalds

From: Baolin Wang <baolin.wang@linux.alibaba.com>
Subject: hugetlb: remove redundant VM_BUG_ON() in add_reservation_in_range()

When calling hugetlb_resv_map_add(), we've guaranteed that the parameter
'to' is always larger than 'from', so it never returns a negative value
from hugetlb_resv_map_add().  Thus remove the redundant VM_BUG_ON().

Link: https://lkml.kernel.org/r/2b565552f3d06753da1e8dda439c0d96d6d9a5a3.1634797639.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    1 -
 1 file changed, 1 deletion(-)

--- a/mm/hugetlb.c~hugetlb-remove-redundant-vm_bug_on-in-add_reservation_in_range
+++ a/mm/hugetlb.c
@@ -445,7 +445,6 @@ static long add_reservation_in_range(str
 		add += hugetlb_resv_map_add(resv, rg, last_accounted_offset,
 					    t, h, h_cg, regions_needed);
 
-	VM_BUG_ON(add < 0);
 	return add;
 }
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 142/262] hugetlb: remove unnecessary set_page_count in prep_compound_gigantic_page
  2021-11-05 20:34 incoming Andrew Morton
                   ` (140 preceding siblings ...)
  2021-11-05 20:42 ` [patch 141/262] hugetlb: remove redundant VM_BUG_ON() in add_reservation_in_range() Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 143/262] userfaultfd/selftests: don't rely on GNU extensions for random numbers Andrew Morton
                   ` (119 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, linux-mm, mike.kravetz, mm-commits, osalvador,
	pasha.tatashin, songmuchun, torvalds, willy

From: Mike Kravetz <mike.kravetz@oracle.com>
Subject: hugetlb: remove unnecessary set_page_count in prep_compound_gigantic_page

In commit 7118fc2906e29 ("hugetlb: address ref count racing in
prep_compound_gigantic_page"), page_ref_freeze is used to atomically zero
the ref count of tail pages iff they are 1.  The unconditional call to
set_page_count(0) was left in the code.  This call is after
page_ref_freeze so it is really a noop.

Remove redundant and unnecessary set_page_count call.

Link: https://lkml.kernel.org/r/20211026220635.35187-1-mike.kravetz@oracle.com
Fixes: 7118fc2906e29 ("hugetlb: address ref count racing in prep_compound_gigantic_page")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Suggested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hugetlb.c |    1 -
 1 file changed, 1 deletion(-)

--- a/mm/hugetlb.c~hugetlb-remove-unnecessary-set_page_count-in-prep_compound_gigantic_page
+++ a/mm/hugetlb.c
@@ -1792,7 +1792,6 @@ static bool __prep_compound_gigantic_pag
 		} else {
 			VM_BUG_ON_PAGE(page_count(p), p);
 		}
-		set_page_count(p, 0);
 		set_compound_head(p, page);
 	}
 	atomic_set(compound_mapcount_ptr(page), -1);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 143/262] userfaultfd/selftests: don't rely on GNU extensions for random numbers
  2021-11-05 20:34 incoming Andrew Morton
                   ` (141 preceding siblings ...)
  2021-11-05 20:42 ` [patch 142/262] hugetlb: remove unnecessary set_page_count in prep_compound_gigantic_page Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 144/262] userfaultfd/selftests: fix feature support detection Andrew Morton
                   ` (118 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, axelrasmussen, linux-mm, mm-commits, peterx, shuah, torvalds

From: Axel Rasmussen <axelrasmussen@google.com>
Subject: userfaultfd/selftests: don't rely on GNU extensions for random numbers

Patch series "Small userfaultfd selftest fixups", v2.


This patch (of 3):

Two arguments for doing this:

First, and maybe most importantly, the resulting code is significantly
shorter / simpler.

Then, we avoid using GNU libc extensions.  Why does this matter?  It makes
testing userfaultfd with the selftest easier e.g.  on distros which use
something other than glibc (e.g., Alpine, which uses musl); basically, it
makes the test more portable.

Link: https://lkml.kernel.org/r/20210930212309.4001967-2-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/userfaultfd.c |   26 +++------------------
 1 file changed, 4 insertions(+), 22 deletions(-)

--- a/tools/testing/selftests/vm/userfaultfd.c~userfaultfd-selftests-dont-rely-on-gnu-extensions-for-random-numbers
+++ a/tools/testing/selftests/vm/userfaultfd.c
@@ -57,6 +57,7 @@
 #include <assert.h>
 #include <inttypes.h>
 #include <stdint.h>
+#include <sys/random.h>
 
 #include "../kselftest.h"
 
@@ -518,22 +519,10 @@ static void continue_range(int ufd, __u6
 static void *locking_thread(void *arg)
 {
 	unsigned long cpu = (unsigned long) arg;
-	struct random_data rand;
 	unsigned long page_nr = *(&(page_nr)); /* uninitialized warning */
-	int32_t rand_nr;
 	unsigned long long count;
-	char randstate[64];
-	unsigned int seed;
 
-	if (bounces & BOUNCE_RANDOM) {
-		seed = (unsigned int) time(NULL) - bounces;
-		if (!(bounces & BOUNCE_RACINGFAULTS))
-			seed += cpu;
-		bzero(&rand, sizeof(rand));
-		bzero(&randstate, sizeof(randstate));
-		if (initstate_r(seed, randstate, sizeof(randstate), &rand))
-			err("initstate_r failed");
-	} else {
+	if (!(bounces & BOUNCE_RANDOM)) {
 		page_nr = -bounces;
 		if (!(bounces & BOUNCE_RACINGFAULTS))
 			page_nr += cpu * nr_pages_per_cpu;
@@ -541,15 +530,8 @@ static void *locking_thread(void *arg)
 
 	while (!finished) {
 		if (bounces & BOUNCE_RANDOM) {
-			if (random_r(&rand, &rand_nr))
-				err("random_r failed");
-			page_nr = rand_nr;
-			if (sizeof(page_nr) > sizeof(rand_nr)) {
-				if (random_r(&rand, &rand_nr))
-					err("random_r failed");
-				page_nr |= (((unsigned long) rand_nr) << 16) <<
-					   16;
-			}
+			if (getrandom(&page_nr, sizeof(page_nr), 0) != sizeof(page_nr))
+				err("getrandom failed");
 		} else
 			page_nr += 1;
 		page_nr %= nr_pages;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 144/262] userfaultfd/selftests: fix feature support detection
  2021-11-05 20:34 incoming Andrew Morton
                   ` (142 preceding siblings ...)
  2021-11-05 20:42 ` [patch 143/262] userfaultfd/selftests: don't rely on GNU extensions for random numbers Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 145/262] userfaultfd/selftests: fix calculation of expected ioctls Andrew Morton
                   ` (117 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, axelrasmussen, linux-mm, mm-commits, peterx, shuah, torvalds

From: Axel Rasmussen <axelrasmussen@google.com>
Subject: userfaultfd/selftests: fix feature support detection

Before any tests are run, in set_test_type, we decide what feature(s) we
are going to be testing, based upon our command line arguments.  However,
the supported features are not just a function of the memory type being
used, so this is broken.

For instance, consider writeprotect support.  It is "normally" supported
for anonymous memory, but furthermore it requires that the kernel has
CONFIG_HAVE_ARCH_USERFAULTFD_WP.  So, it is *not* supported at all on
aarch64, for example.

So, this commit fixes this by querying the kernel for the set of features
it supports in set_test_type, by opening a userfaultfd and issuing a
UFFDIO_API ioctl.  Based upon the reported features, we toggle what tests
are enabled.

Link: https://lkml.kernel.org/r/20210930212309.4001967-3-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/userfaultfd.c |   54 ++++++++++++---------
 1 file changed, 31 insertions(+), 23 deletions(-)

--- a/tools/testing/selftests/vm/userfaultfd.c~userfaultfd-selftests-fix-feature-support-detection
+++ a/tools/testing/selftests/vm/userfaultfd.c
@@ -346,6 +346,16 @@ static struct uffd_test_ops hugetlb_uffd
 
 static struct uffd_test_ops *uffd_test_ops;
 
+static inline uint64_t uffd_minor_feature(void)
+{
+	if (test_type == TEST_HUGETLB && map_shared)
+		return UFFD_FEATURE_MINOR_HUGETLBFS;
+	else if (test_type == TEST_SHMEM)
+		return UFFD_FEATURE_MINOR_SHMEM;
+	else
+		return 0;
+}
+
 static void userfaultfd_open(uint64_t *features)
 {
 	struct uffdio_api uffdio_api;
@@ -406,7 +416,7 @@ static void uffd_test_ctx_clear(void)
 	munmap_area((void **)&area_dst_alias);
 }
 
-static void uffd_test_ctx_init_ext(uint64_t *features)
+static void uffd_test_ctx_init(uint64_t features)
 {
 	unsigned long nr, cpu;
 
@@ -415,7 +425,7 @@ static void uffd_test_ctx_init_ext(uint6
 	uffd_test_ops->allocate_area((void **)&area_src);
 	uffd_test_ops->allocate_area((void **)&area_dst);
 
-	userfaultfd_open(features);
+	userfaultfd_open(&features);
 
 	count_verify = malloc(nr_pages * sizeof(unsigned long long));
 	if (!count_verify)
@@ -463,11 +473,6 @@ static void uffd_test_ctx_init_ext(uint6
 			err("pipe");
 }
 
-static inline void uffd_test_ctx_init(uint64_t features)
-{
-	uffd_test_ctx_init_ext(&features);
-}
-
 static int my_bcmp(char *str1, char *str2, size_t n)
 {
 	unsigned long i;
@@ -1208,7 +1213,6 @@ static int userfaultfd_minor_test(void)
 	void *expected_page;
 	char c;
 	struct uffd_stats stats = { 0 };
-	uint64_t req_features, features_out;
 
 	if (!test_uffdio_minor)
 		return 0;
@@ -1216,21 +1220,7 @@ static int userfaultfd_minor_test(void)
 	printf("testing minor faults: ");
 	fflush(stdout);
 
-	if (test_type == TEST_HUGETLB)
-		req_features = UFFD_FEATURE_MINOR_HUGETLBFS;
-	else if (test_type == TEST_SHMEM)
-		req_features = UFFD_FEATURE_MINOR_SHMEM;
-	else
-		return 1;
-
-	features_out = req_features;
-	uffd_test_ctx_init_ext(&features_out);
-	/* If kernel reports required features aren't supported, skip test. */
-	if ((features_out & req_features) != req_features) {
-		printf("skipping test due to lack of feature support\n");
-		fflush(stdout);
-		return 0;
-	}
+	uffd_test_ctx_init(uffd_minor_feature());
 
 	uffdio_register.range.start = (unsigned long)area_dst_alias;
 	uffdio_register.range.len = nr_pages * page_size;
@@ -1591,6 +1581,8 @@ unsigned long default_huge_page_size(voi
 
 static void set_test_type(const char *type)
 {
+	uint64_t features = UFFD_API_FEATURES;
+
 	if (!strcmp(type, "anon")) {
 		test_type = TEST_ANON;
 		uffd_test_ops = &anon_uffd_test_ops;
@@ -1624,6 +1616,22 @@ static void set_test_type(const char *ty
 	if ((unsigned long) area_count(NULL, 0) + sizeof(unsigned long long) * 2
 	    > page_size)
 		err("Impossible to run this test");
+
+	/*
+	 * Whether we can test certain features depends not just on test type,
+	 * but also on whether or not this particular kernel supports the
+	 * feature.
+	 */
+
+	userfaultfd_open(&features);
+
+	test_uffdio_wp = test_uffdio_wp &&
+		(features & UFFD_FEATURE_PAGEFAULT_FLAG_WP);
+	test_uffdio_minor = test_uffdio_minor &&
+		(features & uffd_minor_feature());
+
+	close(uffd);
+	uffd = -1;
 }
 
 static void sigalrm(int sig)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 145/262] userfaultfd/selftests: fix calculation of expected ioctls
  2021-11-05 20:34 incoming Andrew Morton
                   ` (143 preceding siblings ...)
  2021-11-05 20:42 ` [patch 144/262] userfaultfd/selftests: fix feature support detection Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 146/262] mm/page_isolation: fix potential missing call to unset_migratetype_isolate() Andrew Morton
                   ` (116 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, axelrasmussen, linux-mm, mm-commits, peterx, shuah, torvalds

From: Axel Rasmussen <axelrasmussen@google.com>
Subject: userfaultfd/selftests: fix calculation of expected ioctls

Today, we assert that the ioctls the kernel reports as supported for a
registration match a precomputed list.  We decide which ioctls are
supported by examining the memory type.  Then, in several locations we
"fix up" this list by adding or removing things this initial decision got
wrong.

What ioctls the kernel reports is actually a function of several things:
- The memory type
- Kernel feature support (e.g., no writeprotect on aarch64)
- The registration type (e.g., CONTINUE only supported for MINOR mode)

So, we can't fully compute this at the start, in set_test_type.  It varies
per test, depending on what registration mode(s) those tests use.

Instead, introduce a new function which computes the correct list.  This
centralizes the add/remove of ioctls depending on these function inputs in
one place, so we don't have to repeat ourselves in various tests.

Not only is the resulting code a bit shorter, but it fixes a real bug in
the existing code: previously, we would incorrectly require the
writeprotect ioctl to be present on aarch64, where it isn't actually
supported.

Link: https://lkml.kernel.org/r/20210930212309.4001967-4-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/userfaultfd.c |   77 ++++++++++-----------
 1 file changed, 38 insertions(+), 39 deletions(-)

--- a/tools/testing/selftests/vm/userfaultfd.c~userfaultfd-selftests-fix-calculation-of-expected-ioctls
+++ a/tools/testing/selftests/vm/userfaultfd.c
@@ -308,37 +308,24 @@ static void shmem_alias_mapping(__u64 *s
 }
 
 struct uffd_test_ops {
-	unsigned long expected_ioctls;
 	void (*allocate_area)(void **alloc_area);
 	void (*release_pages)(char *rel_area);
 	void (*alias_mapping)(__u64 *start, size_t len, unsigned long offset);
 };
 
-#define SHMEM_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
-					 (1 << _UFFDIO_COPY) | \
-					 (1 << _UFFDIO_ZEROPAGE))
-
-#define ANON_EXPECTED_IOCTLS		((1 << _UFFDIO_WAKE) | \
-					 (1 << _UFFDIO_COPY) | \
-					 (1 << _UFFDIO_ZEROPAGE) | \
-					 (1 << _UFFDIO_WRITEPROTECT))
-
 static struct uffd_test_ops anon_uffd_test_ops = {
-	.expected_ioctls = ANON_EXPECTED_IOCTLS,
 	.allocate_area	= anon_allocate_area,
 	.release_pages	= anon_release_pages,
 	.alias_mapping = noop_alias_mapping,
 };
 
 static struct uffd_test_ops shmem_uffd_test_ops = {
-	.expected_ioctls = SHMEM_EXPECTED_IOCTLS,
 	.allocate_area	= shmem_allocate_area,
 	.release_pages	= shmem_release_pages,
 	.alias_mapping = shmem_alias_mapping,
 };
 
 static struct uffd_test_ops hugetlb_uffd_test_ops = {
-	.expected_ioctls = UFFD_API_RANGE_IOCTLS_BASIC & ~(1 << _UFFDIO_CONTINUE),
 	.allocate_area	= hugetlb_allocate_area,
 	.release_pages	= hugetlb_release_pages,
 	.alias_mapping = hugetlb_alias_mapping,
@@ -356,6 +343,33 @@ static inline uint64_t uffd_minor_featur
 		return 0;
 }
 
+static uint64_t get_expected_ioctls(uint64_t mode)
+{
+	uint64_t ioctls = UFFD_API_RANGE_IOCTLS;
+
+	if (test_type == TEST_HUGETLB)
+		ioctls &= ~(1 << _UFFDIO_ZEROPAGE);
+
+	if (!((mode & UFFDIO_REGISTER_MODE_WP) && test_uffdio_wp))
+		ioctls &= ~(1 << _UFFDIO_WRITEPROTECT);
+
+	if (!((mode & UFFDIO_REGISTER_MODE_MINOR) && test_uffdio_minor))
+		ioctls &= ~(1 << _UFFDIO_CONTINUE);
+
+	return ioctls;
+}
+
+static void assert_expected_ioctls_present(uint64_t mode, uint64_t ioctls)
+{
+	uint64_t expected = get_expected_ioctls(mode);
+	uint64_t actual = ioctls & expected;
+
+	if (actual != expected) {
+		err("missing ioctl(s): expected %"PRIx64" actual: %"PRIx64,
+		    expected, actual);
+	}
+}
+
 static void userfaultfd_open(uint64_t *features)
 {
 	struct uffdio_api uffdio_api;
@@ -1017,11 +1031,9 @@ static int __uffdio_zeropage(int ufd, un
 {
 	struct uffdio_zeropage uffdio_zeropage;
 	int ret;
-	unsigned long has_zeropage;
+	bool has_zeropage = get_expected_ioctls(0) & (1 << _UFFDIO_ZEROPAGE);
 	__s64 res;
 
-	has_zeropage = uffd_test_ops->expected_ioctls & (1 << _UFFDIO_ZEROPAGE);
-
 	if (offset >= nr_pages * page_size)
 		err("unexpected offset %lu", offset);
 	uffdio_zeropage.range.start = (unsigned long) area_dst + offset;
@@ -1061,7 +1073,6 @@ static int uffdio_zeropage(int ufd, unsi
 static int userfaultfd_zeropage_test(void)
 {
 	struct uffdio_register uffdio_register;
-	unsigned long expected_ioctls;
 
 	printf("testing UFFDIO_ZEROPAGE: ");
 	fflush(stdout);
@@ -1076,9 +1087,8 @@ static int userfaultfd_zeropage_test(voi
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		err("register failure");
 
-	expected_ioctls = uffd_test_ops->expected_ioctls;
-	if ((uffdio_register.ioctls & expected_ioctls) != expected_ioctls)
-		err("unexpected missing ioctl for anon memory");
+	assert_expected_ioctls_present(
+		uffdio_register.mode, uffdio_register.ioctls);
 
 	if (uffdio_zeropage(uffd, 0))
 		if (my_bcmp(area_dst, zeropage, page_size))
@@ -1091,7 +1101,6 @@ static int userfaultfd_zeropage_test(voi
 static int userfaultfd_events_test(void)
 {
 	struct uffdio_register uffdio_register;
-	unsigned long expected_ioctls;
 	pthread_t uffd_mon;
 	int err, features;
 	pid_t pid;
@@ -1115,9 +1124,8 @@ static int userfaultfd_events_test(void)
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		err("register failure");
 
-	expected_ioctls = uffd_test_ops->expected_ioctls;
-	if ((uffdio_register.ioctls & expected_ioctls) != expected_ioctls)
-		err("unexpected missing ioctl for anon memory");
+	assert_expected_ioctls_present(
+		uffdio_register.mode, uffdio_register.ioctls);
 
 	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, &stats))
 		err("uffd_poll_thread create");
@@ -1145,7 +1153,6 @@ static int userfaultfd_events_test(void)
 static int userfaultfd_sig_test(void)
 {
 	struct uffdio_register uffdio_register;
-	unsigned long expected_ioctls;
 	unsigned long userfaults;
 	pthread_t uffd_mon;
 	int err, features;
@@ -1169,9 +1176,8 @@ static int userfaultfd_sig_test(void)
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		err("register failure");
 
-	expected_ioctls = uffd_test_ops->expected_ioctls;
-	if ((uffdio_register.ioctls & expected_ioctls) != expected_ioctls)
-		err("unexpected missing ioctl for anon memory");
+	assert_expected_ioctls_present(
+		uffdio_register.mode, uffdio_register.ioctls);
 
 	if (faulting_process(1))
 		err("faulting process failed");
@@ -1206,7 +1212,6 @@ static int userfaultfd_sig_test(void)
 static int userfaultfd_minor_test(void)
 {
 	struct uffdio_register uffdio_register;
-	unsigned long expected_ioctls;
 	unsigned long p;
 	pthread_t uffd_mon;
 	uint8_t expected_byte;
@@ -1228,10 +1233,8 @@ static int userfaultfd_minor_test(void)
 	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 		err("register failure");
 
-	expected_ioctls = uffd_test_ops->expected_ioctls;
-	expected_ioctls |= 1 << _UFFDIO_CONTINUE;
-	if ((uffdio_register.ioctls & expected_ioctls) != expected_ioctls)
-		err("unexpected missing ioctl(s)");
+	assert_expected_ioctls_present(
+		uffdio_register.mode, uffdio_register.ioctls);
 
 	/*
 	 * After registering with UFFD, populate the non-UFFD-registered side of
@@ -1428,8 +1431,6 @@ static int userfaultfd_stress(void)
 	pthread_attr_setstacksize(&attr, 16*1024*1024);
 
 	while (bounces--) {
-		unsigned long expected_ioctls;
-
 		printf("bounces: %d, mode:", bounces);
 		if (bounces & BOUNCE_RANDOM)
 			printf(" rnd");
@@ -1457,10 +1458,8 @@ static int userfaultfd_stress(void)
 			uffdio_register.mode |= UFFDIO_REGISTER_MODE_WP;
 		if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
 			err("register failure");
-		expected_ioctls = uffd_test_ops->expected_ioctls;
-		if ((uffdio_register.ioctls & expected_ioctls) !=
-		    expected_ioctls)
-			err("unexpected missing ioctl for anon memory");
+		assert_expected_ioctls_present(
+			uffdio_register.mode, uffdio_register.ioctls);
 
 		if (area_dst_alias) {
 			uffdio_register.range.start = (unsigned long)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 146/262] mm/page_isolation: fix potential missing call to unset_migratetype_isolate()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (144 preceding siblings ...)
  2021-11-05 20:42 ` [patch 145/262] userfaultfd/selftests: fix calculation of expected ioctls Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 147/262] mm/page_isolation: guard against possible putback unisolated page Andrew Morton
                   ` (115 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, david, linmiaohe, linux-mm, mhocko, mm-commits, torvalds, vbabka

From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm/page_isolation: fix potential missing call to unset_migratetype_isolate()

In start_isolate_page_range() undo path, pfn_to_online_page() just checks
the first pfn in a pageblock while __first_valid_page() will traverse the
pageblock until the first online pfn is found.  So we may miss the call to
unset_migratetype_isolate() in undo path and pages will remain isolated
unexpectedly.  Fix this by calling undo_isolate_page_range() and this will
also help to simplify the code further.  Note we shouldn't ever trigger it
because MAX_ORDER-1 aligned pfn ranges shouldn't contain memory holes now.

Link: https://lkml.kernel.org/r/20210914114348.15569-1-linmiaohe@huawei.com
Fixes: 2ce13640b3f4 ("mm: __first_valid_page skip over offline pages")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_isolation.c |   20 +++-----------------
 1 file changed, 3 insertions(+), 17 deletions(-)

--- a/mm/page_isolation.c~mm-page_isolation-fix-potential-missing-call-to-unset_migratetype_isolate
+++ a/mm/page_isolation.c
@@ -183,7 +183,6 @@ int start_isolate_page_range(unsigned lo
 			     unsigned migratetype, int flags)
 {
 	unsigned long pfn;
-	unsigned long undo_pfn;
 	struct page *page;
 
 	BUG_ON(!IS_ALIGNED(start_pfn, pageblock_nr_pages));
@@ -193,25 +192,12 @@ int start_isolate_page_range(unsigned lo
 	     pfn < end_pfn;
 	     pfn += pageblock_nr_pages) {
 		page = __first_valid_page(pfn, pageblock_nr_pages);
-		if (page) {
-			if (set_migratetype_isolate(page, migratetype, flags)) {
-				undo_pfn = pfn;
-				goto undo;
-			}
+		if (page && set_migratetype_isolate(page, migratetype, flags)) {
+			undo_isolate_page_range(start_pfn, pfn, migratetype);
+			return -EBUSY;
 		}
 	}
 	return 0;
-undo:
-	for (pfn = start_pfn;
-	     pfn < undo_pfn;
-	     pfn += pageblock_nr_pages) {
-		struct page *page = pfn_to_online_page(pfn);
-		if (!page)
-			continue;
-		unset_migratetype_isolate(page, migratetype);
-	}
-
-	return -EBUSY;
 }
 
 /*
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 147/262] mm/page_isolation: guard against possible putback unisolated page
  2021-11-05 20:34 incoming Andrew Morton
                   ` (145 preceding siblings ...)
  2021-11-05 20:42 ` [patch 146/262] mm/page_isolation: fix potential missing call to unset_migratetype_isolate() Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 148/262] mm/vmscan.c: fix -Wunused-but-set-variable warning Andrew Morton
                   ` (114 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, david, iamjoonsoo.kim, jhubbard, linmiaohe, linux-mm,
	mm-commits, torvalds, vbabka

From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm/page_isolation: guard against possible putback unisolated page

Isolating a free page in an isolated pageblock is expected to always work
as watermarks don't apply here.  But if __isolate_free_page() failed, due
to condition changes, the page will be left on the free list.  And the
page will be put back to free list again via __putback_isolated_page(). 
This may trigger VM_BUG_ON_PAGE() on page->flags checking in
__free_one_page() if PageReported is set.  Or we will corrupt the free
list because list_add() will be called for pages already on another list. 
Add a VM_WARN_ON() to complain about this change.

Link: https://lkml.kernel.org/r/20210914114508.23725-1-linmiaohe@huawei.com
Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_isolation.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

--- a/mm/page_isolation.c~mm-page_isolation-guard-against-possible-putback-unisolated-page
+++ a/mm/page_isolation.c
@@ -94,8 +94,13 @@ static void unset_migratetype_isolate(st
 			buddy = page + (buddy_pfn - pfn);
 
 			if (!is_migrate_isolate_page(buddy)) {
-				__isolate_free_page(page, order);
-				isolated_page = true;
+				isolated_page = !!__isolate_free_page(page, order);
+				/*
+				 * Isolating a free page in an isolated pageblock
+				 * is expected to always work as watermarks don't
+				 * apply here.
+				 */
+				VM_WARN_ON(!isolated_page);
 			}
 		}
 	}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 148/262] mm/vmscan.c: fix -Wunused-but-set-variable warning
  2021-11-05 20:34 incoming Andrew Morton
                   ` (146 preceding siblings ...)
  2021-11-05 20:42 ` [patch 147/262] mm/page_isolation: guard against possible putback unisolated page Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested Andrew Morton
                   ` (113 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, shy828301, songkai01, torvalds

From: Kai Song <songkai01@inspur.com>
Subject: mm/vmscan.c: fix -Wunused-but-set-variable warning

We fix the following warning when building kernel with W=1:
mm/vmscan.c:1362:6: warning: variable 'err' set but not used [-Wunused-but-set-variable]

Link: https://lkml.kernel.org/r/20210924181218.21165-1-songkai01@inspur.com
Signed-off-by: Kai Song <songkai01@inspur.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmscan.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/mm/vmscan.c~mm-vmscanc-fix-wunused-but-set-variable-warning
+++ a/mm/vmscan.c
@@ -1337,7 +1337,6 @@ static unsigned int demote_page_list(str
 {
 	int target_nid = next_demotion_node(pgdat->node_id);
 	unsigned int nr_succeeded;
-	int err;
 
 	if (list_empty(demote_pages))
 		return 0;
@@ -1346,7 +1345,7 @@ static unsigned int demote_page_list(str
 		return 0;
 
 	/* Demotion ignores all cpuset and mempolicy settings */
-	err = migrate_pages(demote_pages, alloc_demote_page, NULL,
+	migrate_pages(demote_pages, alloc_demote_page, NULL,
 			    target_nid, MIGRATE_ASYNC, MR_DEMOTION,
 			    &nr_succeeded);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested
  2021-11-05 20:34 incoming Andrew Morton
                   ` (147 preceding siblings ...)
  2021-11-05 20:42 ` [patch 148/262] mm/vmscan.c: fix -Wunused-but-set-variable warning Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 21:02   ` Matthew Wilcox
  2021-11-05 20:42 ` [patch 150/262] mm/vmscan: throttle reclaim and compaction when too may pages are isolated Andrew Morton
                   ` (112 subsequent siblings)
  261 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: adilger.kernel, akpm, corbet, david, djwong, hannes, linux-mm,
	mgorman, mhocko, mm-commits, neilb, riel, torvalds, tytso,
	vbabka, willy

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm/vmscan: throttle reclaim until some writeback completes if congested

Patch series "Remove dependency on congestion_wait in mm/", v5.

This series that removes all calls to congestion_wait in mm/ and deletes
wait_iff_congested.  It's not a clever implementation but congestion_wait
has been broken for a long time
(https://lore.kernel.org/linux-mm/45d8b7a6-8548-65f5-cccf-9f451d4ae3d4@kernel.dk/).
Even if congestion throttling worked, it was never a great idea.  While
excessive dirty/writeback pages at the tail of the LRU is one possibility
that reclaim may be slow, there is also the problem of too many pages
being isolated and reclaim failing for other reasons (elevated references,
too many pages isolated, excessive LRU contention etc).

This series replaces the "congestion" throttling with 3 different types.

o If there are too many dirty/writeback pages, sleep until a timeout or
  enough pages get cleaned
o If too many pages are isolated, sleep until enough isolated pages are
  either reclaimed or put back on the LRU
o If no progress is being made, direct reclaim tasks sleep until another
  task makes progress with acceptable efficiency.

This was initially tested with a mix of workloads that used to trigger
corner cases that no longer work.  A new test case was created called
"stutterp" (pagereclaim-stutterp-noreaders in mmtests) using a freshly
created XFS filesystem.  Note that it may be necessary to increase the
timeout of ssh if executing remotely as ssh itself can get throttled and
the connection may timeout.

stutterp varies the number of "worker" processes from 4 up to NR_CPUS*4 to
check the impact as the number of direct reclaimers increase.  It has four
types of worker.

o One "anon latency" worker creates small mappings with mmap() and times
  how long it takes to fault the mapping reading it 4K at a time
o X file writers which is fio randomly writing X files where the total
  size of the files add up to the allowed dirty_ratio.  fio is allowed to
  run for a warmup period to allow some file-backed pages to accumulate. 
  The duration of the warmup is based on the best-case linear write speed
  of the storage.
o Y file readers which is fio randomly reading small files
o Z anon memory hogs which continually map (100-dirty_ratio)% of memory
o Total estimated WSS = (100+dirty_ration) percentage of memory

X+Y+Z+1 == NR_WORKERS varying from 4 up to NR_CPUS*4

The intent is to maximise the total WSS with a mix of file and anon memory
where some anonymous memory must be swapped and there is a high likelihood
of dirty/writeback pages reaching the end of the LRU.

The test can be configured to have no background readers to stress
dirty/writeback pages.  The results below are based on having zero
readers.

The short summary of the results is that the series works and stalls until
some event occurs but the timeouts may need adjustment.

The test results are not broken down by patch as the series should be
treated as one block that replaces a broken throttling mechanism with a
working one.

Finally, three machines were tested but I'm reporting the worst set of
results.  The other two machines had much better latencies for example.

First the results of the "anon latency" latency

stutterp
                              5.15.0-rc1             5.15.0-rc1
                                 vanilla mm-reclaimcongest-v5r4
Amean     mmap-4      31.4003 (   0.00%)   2661.0198 (-8374.52%)
Amean     mmap-7      38.1641 (   0.00%)    149.2891 (-291.18%)
Amean     mmap-12     60.0981 (   0.00%)    187.8105 (-212.51%)
Amean     mmap-21    161.2699 (   0.00%)    213.9107 ( -32.64%)
Amean     mmap-30    174.5589 (   0.00%)    377.7548 (-116.41%)
Amean     mmap-48   8106.8160 (   0.00%)   1070.5616 (  86.79%)
Stddev    mmap-4      41.3455 (   0.00%)  27573.9676 (-66591.66%)
Stddev    mmap-7      53.5556 (   0.00%)   4608.5860 (-8505.23%)
Stddev    mmap-12    171.3897 (   0.00%)   5559.4542 (-3143.75%)
Stddev    mmap-21   1506.6752 (   0.00%)   5746.2507 (-281.39%)
Stddev    mmap-30    557.5806 (   0.00%)   7678.1624 (-1277.05%)
Stddev    mmap-48  61681.5718 (   0.00%)  14507.2830 (  76.48%)
Max-90    mmap-4      31.4243 (   0.00%)     83.1457 (-164.59%)
Max-90    mmap-7      41.0410 (   0.00%)     41.0720 (  -0.08%)
Max-90    mmap-12     66.5255 (   0.00%)     53.9073 (  18.97%)
Max-90    mmap-21    146.7479 (   0.00%)    105.9540 (  27.80%)
Max-90    mmap-30    193.9513 (   0.00%)     64.3067 (  66.84%)
Max-90    mmap-48    277.9137 (   0.00%)    591.0594 (-112.68%)
Max       mmap-4    1913.8009 (   0.00%) 299623.9695 (-15555.96%)
Max       mmap-7    2423.9665 (   0.00%) 204453.1708 (-8334.65%)
Max       mmap-12   6845.6573 (   0.00%) 221090.3366 (-3129.64%)
Max       mmap-21  56278.6508 (   0.00%) 213877.3496 (-280.03%)
Max       mmap-30  19716.2990 (   0.00%) 216287.6229 (-997.00%)
Max       mmap-48 477923.9400 (   0.00%) 245414.8238 (  48.65%)

For most thread counts, the time to mmap() is unfortunately increased.  In
earlier versions of the series, this was lower but a large number of
throttling events were reaching their timeout increasing the amount of
inefficient scanning of the LRU.  There is no prioritisation of reclaim
tasks making progress based on each tasks rate of page allocation versus
progress of reclaim.  The variance is also impacted for high worker counts
but in all cases, the differences in latency are not statistically
significant due to very large maximum outliers.  Max-90 shows that 90% of
the stalls are comparable but the Max results show the massive outliers
which are increased to to stalling.

It is expected that this will be very machine dependant.  Due to the test
design, reclaim is difficult so allocations stall and there are variances
depending on whether THPs can be allocated or not.  The amount of memory
will affect exactly how bad the corner cases are and how often they
trigger.  The warmup period calculation is not ideal as it's based on
linear writes where as fio is randomly writing multiple files from
multiple tasks so the start state of the test is variable.  For example,
these are the latencies on a single-socket machine that had more memory

Amean     mmap-4      42.2287 (   0.00%)     49.6838 * -17.65%*
Amean     mmap-7     216.4326 (   0.00%)     47.4451 *  78.08%*
Amean     mmap-12   2412.0588 (   0.00%)     51.7497 (  97.85%)
Amean     mmap-21   5546.2548 (   0.00%)     51.8862 (  99.06%)
Amean     mmap-30   1085.3121 (   0.00%)     72.1004 (  93.36%)

The overall system CPU usage and elapsed time is as follows

                  5.15.0-rc3  5.15.0-rc3
                     vanilla mm-reclaimcongest-v5r4
Duration User        6989.03      983.42
Duration System      7308.12      799.68
Duration Elapsed     2277.67     2092.98

The patches reduce system CPU usage by 89% as the vanilla kernel is rarely
stalling.

The high-level /proc/vmstats show

                                     5.15.0-rc1     5.15.0-rc1
                                        vanilla mm-reclaimcongest-v5r2
Ops Direct pages scanned          1056608451.00   503594991.00
Ops Kswapd pages scanned           109795048.00   147289810.00
Ops Kswapd pages reclaimed          63269243.00    31036005.00
Ops Direct pages reclaimed          10803973.00     6328887.00
Ops Kswapd efficiency %                   57.62          21.07
Ops Kswapd velocity                    48204.98       57572.86
Ops Direct efficiency %                    1.02           1.26
Ops Direct velocity                   463898.83      196845.97

Kswapd scanned less pages but the detailed pattern is different.  The
vanilla kernel scans slowly over time where as the patches exhibits burst
patterns of scan activity.  Direct reclaim scanning is reduced by 52% due
to stalling.

The pattern for stealing pages is also slightly different.  Both kernels
exhibit spikes but the vanilla kernel when reclaiming shows pages being
reclaimed over a period of time where as the patches tend to reclaim in
spikes.  The difference is that vanilla is not throttling and instead
scanning constantly finding some pages over time where as the patched
kernel throttles and reclaims in spikes.

Ops Percentage direct scans               90.59          77.37

For direct reclaim, vanilla scanned 90.59% of pages where as with the
patches, 77.37% were direct reclaim due to throttling

Ops Page writes by reclaim           2613590.00     1687131.00

Page writes from reclaim context are reduced.

Ops Page writes anon                 2932752.00     1917048.00

And there is less swapping.

Ops Page reclaim immediate         996248528.00   107664764.00

The number of pages encountered at the tail of the LRU tagged for immediate
reclaim but still dirty/writeback is reduced by 89%.

Ops Slabs scanned                     164284.00      153608.00

Slab scan activity is similar.

ftrace was used to gather stall activity

Vanilla
-------
      1 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=16000
      2 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=12000
      8 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=8000
     29 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=4000
  82394 writeback_wait_iff_congested: usec_timeout=100000 usec_delayed=0

The fast majority of wait_iff_congested calls do not stall at all.
What is likely happening is that cond_resched() reschedules the task for
a short period when the BDI is not registering congestion (which it never
will in this test setup).

      1 writeback_congestion_wait: usec_timeout=100000 usec_delayed=120000
      2 writeback_congestion_wait: usec_timeout=100000 usec_delayed=132000
      4 writeback_congestion_wait: usec_timeout=100000 usec_delayed=112000
    380 writeback_congestion_wait: usec_timeout=100000 usec_delayed=108000
    778 writeback_congestion_wait: usec_timeout=100000 usec_delayed=104000

congestion_wait if called always exceeds the timeout as there is no
trigger to wake it up.

Bottom line: Vanilla will throttle but it's not effective.

Patch series
------------

Kswapd throttle activity was always due to scanning pages tagged for
immediate reclaim at the tail of the LRU

      1 usec_timeout=100000 usect_delayed=72000 reason=VMSCAN_THROTTLE_WRITEBACK
      4 usec_timeout=100000 usect_delayed=20000 reason=VMSCAN_THROTTLE_WRITEBACK
      5 usec_timeout=100000 usect_delayed=12000 reason=VMSCAN_THROTTLE_WRITEBACK
      6 usec_timeout=100000 usect_delayed=16000 reason=VMSCAN_THROTTLE_WRITEBACK
     11 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK
     11 usec_timeout=100000 usect_delayed=8000 reason=VMSCAN_THROTTLE_WRITEBACK
     94 usec_timeout=100000 usect_delayed=0 reason=VMSCAN_THROTTLE_WRITEBACK
    112 usec_timeout=100000 usect_delayed=4000 reason=VMSCAN_THROTTLE_WRITEBACK

The majority of events did not stall or stalled for a short period. 
Roughly 16% of stalls reached the timeout before expiry.  For direct
reclaim, the number of times stalled for each reason were

   6624 reason=VMSCAN_THROTTLE_ISOLATED
  93246 reason=VMSCAN_THROTTLE_NOPROGRESS
  96934 reason=VMSCAN_THROTTLE_WRITEBACK

The most common reason to stall was due to excessive pages tagged for
immediate reclaim at the tail of the LRU followed by a failure to make
forward.  A relatively small number were due to too many pages isolated
from the LRU by parallel threads

For VMSCAN_THROTTLE_ISOLATED, the breakdown of delays was
 
      9 usec_timeout=20000 usect_delayed=4000 reason=VMSCAN_THROTTLE_ISOLATED
     12 usec_timeout=20000 usect_delayed=16000 reason=VMSCAN_THROTTLE_ISOLATED
     83 usec_timeout=20000 usect_delayed=20000 reason=VMSCAN_THROTTLE_ISOLATED
   6520 usec_timeout=20000 usect_delayed=0 reason=VMSCAN_THROTTLE_ISOLATED

Most did not stall at all. A small number reached the timeout.

For VMSCAN_THROTTLE_NOPROGRESS, the breakdown of stalls were all over the
map

      1 usec_timeout=500000 usect_delayed=324000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usec_timeout=500000 usect_delayed=332000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usec_timeout=500000 usect_delayed=348000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usec_timeout=500000 usect_delayed=360000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=228000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=260000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=340000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=364000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=372000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=428000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=460000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usec_timeout=500000 usect_delayed=464000 reason=VMSCAN_THROTTLE_NOPROGRESS
      3 usec_timeout=500000 usect_delayed=244000 reason=VMSCAN_THROTTLE_NOPROGRESS
      3 usec_timeout=500000 usect_delayed=252000 reason=VMSCAN_THROTTLE_NOPROGRESS
      3 usec_timeout=500000 usect_delayed=272000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=188000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=268000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=328000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=380000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=392000 reason=VMSCAN_THROTTLE_NOPROGRESS
      4 usec_timeout=500000 usect_delayed=432000 reason=VMSCAN_THROTTLE_NOPROGRESS
      5 usec_timeout=500000 usect_delayed=204000 reason=VMSCAN_THROTTLE_NOPROGRESS
      5 usec_timeout=500000 usect_delayed=220000 reason=VMSCAN_THROTTLE_NOPROGRESS
      5 usec_timeout=500000 usect_delayed=412000 reason=VMSCAN_THROTTLE_NOPROGRESS
      5 usec_timeout=500000 usect_delayed=436000 reason=VMSCAN_THROTTLE_NOPROGRESS
      6 usec_timeout=500000 usect_delayed=488000 reason=VMSCAN_THROTTLE_NOPROGRESS
      7 usec_timeout=500000 usect_delayed=212000 reason=VMSCAN_THROTTLE_NOPROGRESS
      7 usec_timeout=500000 usect_delayed=300000 reason=VMSCAN_THROTTLE_NOPROGRESS
      7 usec_timeout=500000 usect_delayed=316000 reason=VMSCAN_THROTTLE_NOPROGRESS
      7 usec_timeout=500000 usect_delayed=472000 reason=VMSCAN_THROTTLE_NOPROGRESS
      8 usec_timeout=500000 usect_delayed=248000 reason=VMSCAN_THROTTLE_NOPROGRESS
      8 usec_timeout=500000 usect_delayed=356000 reason=VMSCAN_THROTTLE_NOPROGRESS
      8 usec_timeout=500000 usect_delayed=456000 reason=VMSCAN_THROTTLE_NOPROGRESS
      9 usec_timeout=500000 usect_delayed=124000 reason=VMSCAN_THROTTLE_NOPROGRESS
      9 usec_timeout=500000 usect_delayed=376000 reason=VMSCAN_THROTTLE_NOPROGRESS
      9 usec_timeout=500000 usect_delayed=484000 reason=VMSCAN_THROTTLE_NOPROGRESS
     10 usec_timeout=500000 usect_delayed=172000 reason=VMSCAN_THROTTLE_NOPROGRESS
     10 usec_timeout=500000 usect_delayed=420000 reason=VMSCAN_THROTTLE_NOPROGRESS
     10 usec_timeout=500000 usect_delayed=452000 reason=VMSCAN_THROTTLE_NOPROGRESS
     11 usec_timeout=500000 usect_delayed=256000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=112000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=116000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=144000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=152000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=264000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=384000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=424000 reason=VMSCAN_THROTTLE_NOPROGRESS
     12 usec_timeout=500000 usect_delayed=492000 reason=VMSCAN_THROTTLE_NOPROGRESS
     13 usec_timeout=500000 usect_delayed=184000 reason=VMSCAN_THROTTLE_NOPROGRESS
     13 usec_timeout=500000 usect_delayed=444000 reason=VMSCAN_THROTTLE_NOPROGRESS
     14 usec_timeout=500000 usect_delayed=308000 reason=VMSCAN_THROTTLE_NOPROGRESS
     14 usec_timeout=500000 usect_delayed=440000 reason=VMSCAN_THROTTLE_NOPROGRESS
     14 usec_timeout=500000 usect_delayed=476000 reason=VMSCAN_THROTTLE_NOPROGRESS
     16 usec_timeout=500000 usect_delayed=140000 reason=VMSCAN_THROTTLE_NOPROGRESS
     17 usec_timeout=500000 usect_delayed=232000 reason=VMSCAN_THROTTLE_NOPROGRESS
     17 usec_timeout=500000 usect_delayed=240000 reason=VMSCAN_THROTTLE_NOPROGRESS
     17 usec_timeout=500000 usect_delayed=280000 reason=VMSCAN_THROTTLE_NOPROGRESS
     18 usec_timeout=500000 usect_delayed=404000 reason=VMSCAN_THROTTLE_NOPROGRESS
     20 usec_timeout=500000 usect_delayed=148000 reason=VMSCAN_THROTTLE_NOPROGRESS
     20 usec_timeout=500000 usect_delayed=216000 reason=VMSCAN_THROTTLE_NOPROGRESS
     20 usec_timeout=500000 usect_delayed=468000 reason=VMSCAN_THROTTLE_NOPROGRESS
     21 usec_timeout=500000 usect_delayed=448000 reason=VMSCAN_THROTTLE_NOPROGRESS
     23 usec_timeout=500000 usect_delayed=168000 reason=VMSCAN_THROTTLE_NOPROGRESS
     23 usec_timeout=500000 usect_delayed=296000 reason=VMSCAN_THROTTLE_NOPROGRESS
     25 usec_timeout=500000 usect_delayed=132000 reason=VMSCAN_THROTTLE_NOPROGRESS
     25 usec_timeout=500000 usect_delayed=352000 reason=VMSCAN_THROTTLE_NOPROGRESS
     26 usec_timeout=500000 usect_delayed=180000 reason=VMSCAN_THROTTLE_NOPROGRESS
     27 usec_timeout=500000 usect_delayed=284000 reason=VMSCAN_THROTTLE_NOPROGRESS
     28 usec_timeout=500000 usect_delayed=164000 reason=VMSCAN_THROTTLE_NOPROGRESS
     29 usec_timeout=500000 usect_delayed=136000 reason=VMSCAN_THROTTLE_NOPROGRESS
     30 usec_timeout=500000 usect_delayed=200000 reason=VMSCAN_THROTTLE_NOPROGRESS
     30 usec_timeout=500000 usect_delayed=400000 reason=VMSCAN_THROTTLE_NOPROGRESS
     31 usec_timeout=500000 usect_delayed=196000 reason=VMSCAN_THROTTLE_NOPROGRESS
     32 usec_timeout=500000 usect_delayed=156000 reason=VMSCAN_THROTTLE_NOPROGRESS
     33 usec_timeout=500000 usect_delayed=224000 reason=VMSCAN_THROTTLE_NOPROGRESS
     35 usec_timeout=500000 usect_delayed=128000 reason=VMSCAN_THROTTLE_NOPROGRESS
     35 usec_timeout=500000 usect_delayed=176000 reason=VMSCAN_THROTTLE_NOPROGRESS
     36 usec_timeout=500000 usect_delayed=368000 reason=VMSCAN_THROTTLE_NOPROGRESS
     36 usec_timeout=500000 usect_delayed=496000 reason=VMSCAN_THROTTLE_NOPROGRESS
     37 usec_timeout=500000 usect_delayed=312000 reason=VMSCAN_THROTTLE_NOPROGRESS
     38 usec_timeout=500000 usect_delayed=304000 reason=VMSCAN_THROTTLE_NOPROGRESS
     40 usec_timeout=500000 usect_delayed=288000 reason=VMSCAN_THROTTLE_NOPROGRESS
     43 usec_timeout=500000 usect_delayed=408000 reason=VMSCAN_THROTTLE_NOPROGRESS
     55 usec_timeout=500000 usect_delayed=416000 reason=VMSCAN_THROTTLE_NOPROGRESS
     56 usec_timeout=500000 usect_delayed=76000 reason=VMSCAN_THROTTLE_NOPROGRESS
     58 usec_timeout=500000 usect_delayed=120000 reason=VMSCAN_THROTTLE_NOPROGRESS
     59 usec_timeout=500000 usect_delayed=208000 reason=VMSCAN_THROTTLE_NOPROGRESS
     61 usec_timeout=500000 usect_delayed=68000 reason=VMSCAN_THROTTLE_NOPROGRESS
     71 usec_timeout=500000 usect_delayed=192000 reason=VMSCAN_THROTTLE_NOPROGRESS
     71 usec_timeout=500000 usect_delayed=480000 reason=VMSCAN_THROTTLE_NOPROGRESS
     79 usec_timeout=500000 usect_delayed=60000 reason=VMSCAN_THROTTLE_NOPROGRESS
     82 usec_timeout=500000 usect_delayed=320000 reason=VMSCAN_THROTTLE_NOPROGRESS
     82 usec_timeout=500000 usect_delayed=92000 reason=VMSCAN_THROTTLE_NOPROGRESS
     85 usec_timeout=500000 usect_delayed=64000 reason=VMSCAN_THROTTLE_NOPROGRESS
     85 usec_timeout=500000 usect_delayed=80000 reason=VMSCAN_THROTTLE_NOPROGRESS
     88 usec_timeout=500000 usect_delayed=84000 reason=VMSCAN_THROTTLE_NOPROGRESS
     90 usec_timeout=500000 usect_delayed=160000 reason=VMSCAN_THROTTLE_NOPROGRESS
     90 usec_timeout=500000 usect_delayed=292000 reason=VMSCAN_THROTTLE_NOPROGRESS
     94 usec_timeout=500000 usect_delayed=56000 reason=VMSCAN_THROTTLE_NOPROGRESS
    118 usec_timeout=500000 usect_delayed=88000 reason=VMSCAN_THROTTLE_NOPROGRESS
    119 usec_timeout=500000 usect_delayed=72000 reason=VMSCAN_THROTTLE_NOPROGRESS
    126 usec_timeout=500000 usect_delayed=108000 reason=VMSCAN_THROTTLE_NOPROGRESS
    146 usec_timeout=500000 usect_delayed=52000 reason=VMSCAN_THROTTLE_NOPROGRESS
    148 usec_timeout=500000 usect_delayed=36000 reason=VMSCAN_THROTTLE_NOPROGRESS
    148 usec_timeout=500000 usect_delayed=48000 reason=VMSCAN_THROTTLE_NOPROGRESS
    159 usec_timeout=500000 usect_delayed=28000 reason=VMSCAN_THROTTLE_NOPROGRESS
    178 usec_timeout=500000 usect_delayed=44000 reason=VMSCAN_THROTTLE_NOPROGRESS
    183 usec_timeout=500000 usect_delayed=40000 reason=VMSCAN_THROTTLE_NOPROGRESS
    237 usec_timeout=500000 usect_delayed=100000 reason=VMSCAN_THROTTLE_NOPROGRESS
    266 usec_timeout=500000 usect_delayed=32000 reason=VMSCAN_THROTTLE_NOPROGRESS
    313 usec_timeout=500000 usect_delayed=24000 reason=VMSCAN_THROTTLE_NOPROGRESS
    347 usec_timeout=500000 usect_delayed=96000 reason=VMSCAN_THROTTLE_NOPROGRESS
    470 usec_timeout=500000 usect_delayed=20000 reason=VMSCAN_THROTTLE_NOPROGRESS
    559 usec_timeout=500000 usect_delayed=16000 reason=VMSCAN_THROTTLE_NOPROGRESS
    964 usec_timeout=500000 usect_delayed=12000 reason=VMSCAN_THROTTLE_NOPROGRESS
   2001 usec_timeout=500000 usect_delayed=104000 reason=VMSCAN_THROTTLE_NOPROGRESS
   2447 usec_timeout=500000 usect_delayed=8000 reason=VMSCAN_THROTTLE_NOPROGRESS
   7888 usec_timeout=500000 usect_delayed=4000 reason=VMSCAN_THROTTLE_NOPROGRESS
  22727 usec_timeout=500000 usect_delayed=0 reason=VMSCAN_THROTTLE_NOPROGRESS
  51305 usec_timeout=500000 usect_delayed=500000 reason=VMSCAN_THROTTLE_NOPROGRESS

The full timeout is often hit but a large number also do not stall at all.
The remainder slept a little allowing other reclaim tasks to make
progress.

While this timeout could be further increased, it could also negatively
impact worst-case behaviour when there is no prioritisation of what task
should make progress.

For VMSCAN_THROTTLE_WRITEBACK, the breakdown was

      1 usec_timeout=100000 usect_delayed=44000 reason=VMSCAN_THROTTLE_WRITEBACK
      2 usec_timeout=100000 usect_delayed=76000 reason=VMSCAN_THROTTLE_WRITEBACK
      3 usec_timeout=100000 usect_delayed=80000 reason=VMSCAN_THROTTLE_WRITEBACK
      5 usec_timeout=100000 usect_delayed=48000 reason=VMSCAN_THROTTLE_WRITEBACK
      5 usec_timeout=100000 usect_delayed=84000 reason=VMSCAN_THROTTLE_WRITEBACK
      6 usec_timeout=100000 usect_delayed=72000 reason=VMSCAN_THROTTLE_WRITEBACK
      7 usec_timeout=100000 usect_delayed=88000 reason=VMSCAN_THROTTLE_WRITEBACK
     11 usec_timeout=100000 usect_delayed=56000 reason=VMSCAN_THROTTLE_WRITEBACK
     12 usec_timeout=100000 usect_delayed=64000 reason=VMSCAN_THROTTLE_WRITEBACK
     16 usec_timeout=100000 usect_delayed=92000 reason=VMSCAN_THROTTLE_WRITEBACK
     24 usec_timeout=100000 usect_delayed=68000 reason=VMSCAN_THROTTLE_WRITEBACK
     28 usec_timeout=100000 usect_delayed=32000 reason=VMSCAN_THROTTLE_WRITEBACK
     30 usec_timeout=100000 usect_delayed=60000 reason=VMSCAN_THROTTLE_WRITEBACK
     30 usec_timeout=100000 usect_delayed=96000 reason=VMSCAN_THROTTLE_WRITEBACK
     32 usec_timeout=100000 usect_delayed=52000 reason=VMSCAN_THROTTLE_WRITEBACK
     42 usec_timeout=100000 usect_delayed=40000 reason=VMSCAN_THROTTLE_WRITEBACK
     77 usec_timeout=100000 usect_delayed=28000 reason=VMSCAN_THROTTLE_WRITEBACK
     99 usec_timeout=100000 usect_delayed=36000 reason=VMSCAN_THROTTLE_WRITEBACK
    137 usec_timeout=100000 usect_delayed=24000 reason=VMSCAN_THROTTLE_WRITEBACK
    190 usec_timeout=100000 usect_delayed=20000 reason=VMSCAN_THROTTLE_WRITEBACK
    339 usec_timeout=100000 usect_delayed=16000 reason=VMSCAN_THROTTLE_WRITEBACK
    518 usec_timeout=100000 usect_delayed=12000 reason=VMSCAN_THROTTLE_WRITEBACK
    852 usec_timeout=100000 usect_delayed=8000 reason=VMSCAN_THROTTLE_WRITEBACK
   3359 usec_timeout=100000 usect_delayed=4000 reason=VMSCAN_THROTTLE_WRITEBACK
   7147 usec_timeout=100000 usect_delayed=0 reason=VMSCAN_THROTTLE_WRITEBACK
  83962 usec_timeout=100000 usect_delayed=100000 reason=VMSCAN_THROTTLE_WRITEBACK

The majority hit the timeout in direct reclaim context although a sizable
number did not stall at all.  This is very different to kswapd where only
a tiny percentage of stalls due to writeback reached the timeout.

Bottom line, the throttling appears to work and the wakeup events may
limit worst case stalls.  There might be some grounds for adjusting
timeouts but it's likely futile as the worst-case scenarios depend on the
workload, memory size and the speed of the storage.  A better approach to
improve the series further would be to prioritise tasks based on their
rate of allocation with the caveat that it may be very expensive to track.


This patch (of 5):

Page reclaim throttles on wait_iff_congested under the following
conditions:

o kswapd is encountering pages under writeback and marked for immediate
  reclaim implying that pages are cycling through the LRU faster than
  pages can be cleaned.

o Direct reclaim will stall if all dirty pages are backed by congested
  inodes.

wait_iff_congested is almost completely broken with few exceptions.  This
patch adds a new node-based workqueue and tracks the number of throttled
tasks and pages written back since throttling started.  If enough pages
belonging to the node are written back then the throttled tasks will wake
early.  If not, the throttled tasks sleeps until the timeout expires.

[neilb@suse.de: Uninterruptible sleep and simpler wakeups]
[hdanton@sina.com: Avoid race when reclaim starts]
[vbabka@suse.cz: vmstat irq-safe api, clarifications]
Link: https://lkml.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net
Link: https://lkml.kernel.org/r/20211022144651.19914-2-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: NeilBrown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Darrick J . Wong" <djwong@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/backing-dev.h      |    1 
 include/linux/mmzone.h           |   13 ++++
 include/trace/events/vmscan.h    |   34 ++++++++++++
 include/trace/events/writeback.h |    7 --
 mm/backing-dev.c                 |   48 ----------------
 mm/filemap.c                     |    1 
 mm/internal.h                    |   11 +++
 mm/page_alloc.c                  |    5 +
 mm/vmscan.c                      |   82 ++++++++++++++++++++++++-----
 mm/vmstat.c                      |    1 
 10 files changed, 135 insertions(+), 68 deletions(-)

--- a/include/linux/backing-dev.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/include/linux/backing-dev.h
@@ -154,7 +154,6 @@ static inline int wb_congested(struct bd
 }
 
 long congestion_wait(int sync, long timeout);
-long wait_iff_congested(int sync, long timeout);
 
 static inline bool mapping_can_writeback(struct address_space *mapping)
 {
--- a/include/linux/mmzone.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/include/linux/mmzone.h
@@ -199,6 +199,7 @@ enum node_stat_item {
 	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
 	NR_DIRTIED,		/* page dirtyings since bootup */
 	NR_WRITTEN,		/* page writings since bootup */
+	NR_THROTTLED_WRITTEN,	/* NR_WRITTEN while reclaim throttled */
 	NR_KERNEL_MISC_RECLAIMABLE,	/* reclaimable non-slab kernel pages */
 	NR_FOLL_PIN_ACQUIRED,	/* via: pin_user_page(), gup flag: FOLL_PIN */
 	NR_FOLL_PIN_RELEASED,	/* pages returned via unpin_user_page() */
@@ -272,6 +273,11 @@ enum lru_list {
 	NR_LRU_LISTS
 };
 
+enum vmscan_throttle_state {
+	VMSCAN_THROTTLE_WRITEBACK,
+	NR_VMSCAN_THROTTLE,
+};
+
 #define for_each_lru(lru) for (lru = 0; lru < NR_LRU_LISTS; lru++)
 
 #define for_each_evictable_lru(lru) for (lru = 0; lru <= LRU_ACTIVE_FILE; lru++)
@@ -841,6 +847,13 @@ typedef struct pglist_data {
 	int node_id;
 	wait_queue_head_t kswapd_wait;
 	wait_queue_head_t pfmemalloc_wait;
+
+	/* workqueues for throttling reclaim for different reasons. */
+	wait_queue_head_t reclaim_wait[NR_VMSCAN_THROTTLE];
+
+	atomic_t nr_writeback_throttled;/* nr of writeback-throttled tasks */
+	unsigned long nr_reclaim_start;	/* nr pages written while throttled
+					 * when throttling started. */
 	struct task_struct *kswapd;	/* Protected by
 					   mem_hotplug_begin/end() */
 	int kswapd_order;
--- a/include/trace/events/vmscan.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/include/trace/events/vmscan.h
@@ -27,6 +27,14 @@
 		{RECLAIM_WB_ASYNC,	"RECLAIM_WB_ASYNC"}	\
 		) : "RECLAIM_WB_NONE"
 
+#define _VMSCAN_THROTTLE_WRITEBACK	(1 << VMSCAN_THROTTLE_WRITEBACK)
+
+#define show_throttle_flags(flags)						\
+	(flags) ? __print_flags(flags, "|",					\
+		{_VMSCAN_THROTTLE_WRITEBACK,	"VMSCAN_THROTTLE_WRITEBACK"}	\
+		) : "VMSCAN_THROTTLE_NONE"
+
+
 #define trace_reclaim_flags(file) ( \
 	(file ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
 	(RECLAIM_WB_ASYNC) \
@@ -454,6 +462,32 @@ DEFINE_EVENT(mm_vmscan_direct_reclaim_en
 	TP_ARGS(nr_reclaimed)
 );
 
+TRACE_EVENT(mm_vmscan_throttled,
+
+	TP_PROTO(int nid, int usec_timeout, int usec_delayed, int reason),
+
+	TP_ARGS(nid, usec_timeout, usec_delayed, reason),
+
+	TP_STRUCT__entry(
+		__field(int, nid)
+		__field(int, usec_timeout)
+		__field(int, usec_delayed)
+		__field(int, reason)
+	),
+
+	TP_fast_assign(
+		__entry->nid = nid;
+		__entry->usec_timeout = usec_timeout;
+		__entry->usec_delayed = usec_delayed;
+		__entry->reason = 1U << reason;
+	),
+
+	TP_printk("nid=%d usec_timeout=%d usect_delayed=%d reason=%s",
+		__entry->nid,
+		__entry->usec_timeout,
+		__entry->usec_delayed,
+		show_throttle_flags(__entry->reason))
+);
 #endif /* _TRACE_VMSCAN_H */
 
 /* This part must be outside protection */
--- a/include/trace/events/writeback.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/include/trace/events/writeback.h
@@ -763,13 +763,6 @@ DEFINE_EVENT(writeback_congest_waited_te
 	TP_ARGS(usec_timeout, usec_delayed)
 );
 
-DEFINE_EVENT(writeback_congest_waited_template, writeback_wait_iff_congested,
-
-	TP_PROTO(unsigned int usec_timeout, unsigned int usec_delayed),
-
-	TP_ARGS(usec_timeout, usec_delayed)
-);
-
 DECLARE_EVENT_CLASS(writeback_single_inode_template,
 
 	TP_PROTO(struct inode *inode,
--- a/mm/backing-dev.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/mm/backing-dev.c
@@ -1038,51 +1038,3 @@ long congestion_wait(int sync, long time
 	return ret;
 }
 EXPORT_SYMBOL(congestion_wait);
-
-/**
- * wait_iff_congested - Conditionally wait for a backing_dev to become uncongested or a pgdat to complete writes
- * @sync: SYNC or ASYNC IO
- * @timeout: timeout in jiffies
- *
- * In the event of a congested backing_dev (any backing_dev) this waits
- * for up to @timeout jiffies for either a BDI to exit congestion of the
- * given @sync queue or a write to complete.
- *
- * The return value is 0 if the sleep is for the full timeout. Otherwise,
- * it is the number of jiffies that were still remaining when the function
- * returned. return_value == timeout implies the function did not sleep.
- */
-long wait_iff_congested(int sync, long timeout)
-{
-	long ret;
-	unsigned long start = jiffies;
-	DEFINE_WAIT(wait);
-	wait_queue_head_t *wqh = &congestion_wqh[sync];
-
-	/*
-	 * If there is no congestion, yield if necessary instead
-	 * of sleeping on the congestion queue
-	 */
-	if (atomic_read(&nr_wb_congested[sync]) == 0) {
-		cond_resched();
-
-		/* In case we scheduled, work out time remaining */
-		ret = timeout - (jiffies - start);
-		if (ret < 0)
-			ret = 0;
-
-		goto out;
-	}
-
-	/* Sleep until uncongested or a write happens */
-	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
-	ret = io_schedule_timeout(timeout);
-	finish_wait(wqh, &wait);
-
-out:
-	trace_writeback_wait_iff_congested(jiffies_to_usecs(timeout),
-					jiffies_to_usecs(jiffies - start));
-
-	return ret;
-}
-EXPORT_SYMBOL(wait_iff_congested);
--- a/mm/filemap.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/mm/filemap.c
@@ -1612,6 +1612,7 @@ void end_page_writeback(struct page *pag
 
 	smp_mb__after_atomic();
 	wake_up_page(page, PG_writeback);
+	acct_reclaim_writeback(page);
 	put_page(page);
 }
 EXPORT_SYMBOL(end_page_writeback);
--- a/mm/internal.h~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/mm/internal.h
@@ -34,6 +34,17 @@
 
 void page_writeback_init(void);
 
+void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page,
+						int nr_throttled);
+static inline void acct_reclaim_writeback(struct page *page)
+{
+	pg_data_t *pgdat = page_pgdat(page);
+	int nr_throttled = atomic_read(&pgdat->nr_writeback_throttled);
+
+	if (nr_throttled)
+		__acct_reclaim_writeback(pgdat, page, nr_throttled);
+}
+
 vm_fault_t do_swap_page(struct vm_fault *vmf);
 
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
--- a/mm/page_alloc.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/mm/page_alloc.c
@@ -7408,6 +7408,8 @@ static void pgdat_init_kcompactd(struct
 
 static void __meminit pgdat_init_internals(struct pglist_data *pgdat)
 {
+	int i;
+
 	pgdat_resize_init(pgdat);
 
 	pgdat_init_split_queue(pgdat);
@@ -7416,6 +7418,9 @@ static void __meminit pgdat_init_interna
 	init_waitqueue_head(&pgdat->kswapd_wait);
 	init_waitqueue_head(&pgdat->pfmemalloc_wait);
 
+	for (i = 0; i < NR_VMSCAN_THROTTLE; i++)
+		init_waitqueue_head(&pgdat->reclaim_wait[i]);
+
 	pgdat_page_ext_init(pgdat);
 	lruvec_init(&pgdat->__lruvec);
 }
--- a/mm/vmscan.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/mm/vmscan.c
@@ -1006,6 +1006,64 @@ static void handle_write_error(struct ad
 	unlock_page(page);
 }
 
+static void
+reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
+							long timeout)
+{
+	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
+	long ret;
+	DEFINE_WAIT(wait);
+
+	/*
+	 * Do not throttle IO workers, kthreads other than kswapd or
+	 * workqueues. They may be required for reclaim to make
+	 * forward progress (e.g. journalling workqueues or kthreads).
+	 */
+	if (!current_is_kswapd() &&
+	    current->flags & (PF_IO_WORKER|PF_KTHREAD))
+		return;
+
+	if (atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) {
+		WRITE_ONCE(pgdat->nr_reclaim_start,
+			node_page_state(pgdat, NR_THROTTLED_WRITTEN));
+	}
+
+	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
+	ret = schedule_timeout(timeout);
+	finish_wait(wqh, &wait);
+	atomic_dec(&pgdat->nr_writeback_throttled);
+
+	trace_mm_vmscan_throttled(pgdat->node_id, jiffies_to_usecs(timeout),
+				jiffies_to_usecs(timeout - ret),
+				reason);
+}
+
+/*
+ * Account for pages written if tasks are throttled waiting on dirty
+ * pages to clean. If enough pages have been cleaned since throttling
+ * started then wakeup the throttled tasks.
+ */
+void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page,
+							int nr_throttled)
+{
+	unsigned long nr_written;
+
+	inc_node_page_state(page, NR_THROTTLED_WRITTEN);
+
+	/*
+	 * This is an inaccurate read as the per-cpu deltas may not
+	 * be synchronised. However, given that the system is
+	 * writeback throttled, it is not worth taking the penalty
+	 * of getting an accurate count. At worst, the throttle
+	 * timeout guarantees forward progress.
+	 */
+	nr_written = node_page_state(pgdat, NR_THROTTLED_WRITTEN) -
+		READ_ONCE(pgdat->nr_reclaim_start);
+
+	if (nr_written > SWAP_CLUSTER_MAX * nr_throttled)
+		wake_up(&pgdat->reclaim_wait[VMSCAN_THROTTLE_WRITEBACK]);
+}
+
 /* possible outcome of pageout() */
 typedef enum {
 	/* failed to write page out, page is locked */
@@ -1411,9 +1469,8 @@ retry:
 
 		/*
 		 * The number of dirty pages determines if a node is marked
-		 * reclaim_congested which affects wait_iff_congested. kswapd
-		 * will stall and start writing pages if the tail of the LRU
-		 * is all dirty unqueued pages.
+		 * reclaim_congested. kswapd will stall and start writing
+		 * pages if the tail of the LRU is all dirty unqueued pages.
 		 */
 		page_check_dirty_writeback(page, &dirty, &writeback);
 		if (dirty || writeback)
@@ -3179,19 +3236,19 @@ again:
 		 * If kswapd scans pages marked for immediate
 		 * reclaim and under writeback (nr_immediate), it
 		 * implies that pages are cycling through the LRU
-		 * faster than they are written so also forcibly stall.
+		 * faster than they are written so forcibly stall
+		 * until some pages complete writeback.
 		 */
 		if (sc->nr.immediate)
-			congestion_wait(BLK_RW_ASYNC, HZ/10);
+			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10);
 	}
 
 	/*
-	 * Tag a node/memcg as congested if all the dirty pages
-	 * scanned were backed by a congested BDI and
-	 * wait_iff_congested will stall.
+	 * Tag a node/memcg as congested if all the dirty pages were marked
+	 * for writeback and immediate reclaim (counted in nr.congested).
 	 *
 	 * Legacy memcg will stall in page writeback so avoid forcibly
-	 * stalling in wait_iff_congested().
+	 * stalling in reclaim_throttle().
 	 */
 	if ((current_is_kswapd() ||
 	     (cgroup_reclaim(sc) && writeback_throttling_sane(sc))) &&
@@ -3199,15 +3256,15 @@ again:
 		set_bit(LRUVEC_CONGESTED, &target_lruvec->flags);
 
 	/*
-	 * Stall direct reclaim for IO completions if underlying BDIs
-	 * and node is congested. Allow kswapd to continue until it
+	 * Stall direct reclaim for IO completions if the lruvec is
+	 * node is congested. Allow kswapd to continue until it
 	 * starts encountering unqueued dirty pages or cycling through
 	 * the LRU too quickly.
 	 */
 	if (!current_is_kswapd() && current_may_throttle() &&
 	    !sc->hibernation_mode &&
 	    test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
-		wait_iff_congested(BLK_RW_ASYNC, HZ/10);
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10);
 
 	if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
 				    sc))
@@ -4285,6 +4342,7 @@ static int kswapd(void *p)
 
 	WRITE_ONCE(pgdat->kswapd_order, 0);
 	WRITE_ONCE(pgdat->kswapd_highest_zoneidx, MAX_NR_ZONES);
+	atomic_set(&pgdat->nr_writeback_throttled, 0);
 	for ( ; ; ) {
 		bool ret;
 
--- a/mm/vmstat.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
+++ a/mm/vmstat.c
@@ -1225,6 +1225,7 @@ const char * const vmstat_text[] = {
 	"nr_vmscan_immediate_reclaim",
 	"nr_dirtied",
 	"nr_written",
+	"nr_throttled_written",
 	"nr_kernel_misc_reclaimable",
 	"nr_foll_pin_acquired",
 	"nr_foll_pin_released",
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 150/262] mm/vmscan: throttle reclaim and compaction when too may pages are isolated
  2021-11-05 20:34 incoming Andrew Morton
                   ` (148 preceding siblings ...)
  2021-11-05 20:42 ` [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 151/262] mm/vmscan: throttle reclaim when no progress is being made Andrew Morton
                   ` (111 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: adilger.kernel, akpm, corbet, david, djwong, hannes, linux-mm,
	mgorman, mhocko, mm-commits, neilb, riel, torvalds, tytso,
	vbabka, willy

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm/vmscan: throttle reclaim and compaction when too may pages are isolated

Page reclaim throttles on congestion if too many parallel reclaim
instances have isolated too many pages.  This makes no sense, excessive
parallelisation has nothing to do with writeback or congestion.

This patch creates an additional workqueue to sleep on when too many pages
are isolated.  The throttled tasks are woken when the number of isolated
pages is reduced or a timeout occurs.  There may be some false positive
wakeups for GFP_NOIO/GFP_NOFS callers but the tasks will throttle again if
necessary.

[shy828301@gmail.com: Wake up from compaction context]
[vbabka@suse.cz: Account number of throttled tasks only for writeback]
Link: https://lkml.kernel.org/r/20211022144651.19914-3-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Darrick J . Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mmzone.h        |    1 +
 include/trace/events/vmscan.h |    4 +++-
 mm/compaction.c               |   10 ++++++++--
 mm/internal.h                 |   11 +++++++++++
 mm/vmscan.c                   |   22 ++++++++++++++++------
 5 files changed, 39 insertions(+), 9 deletions(-)

--- a/include/linux/mmzone.h~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated
+++ a/include/linux/mmzone.h
@@ -275,6 +275,7 @@ enum lru_list {
 
 enum vmscan_throttle_state {
 	VMSCAN_THROTTLE_WRITEBACK,
+	VMSCAN_THROTTLE_ISOLATED,
 	NR_VMSCAN_THROTTLE,
 };
 
--- a/include/trace/events/vmscan.h~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated
+++ a/include/trace/events/vmscan.h
@@ -28,10 +28,12 @@
 		) : "RECLAIM_WB_NONE"
 
 #define _VMSCAN_THROTTLE_WRITEBACK	(1 << VMSCAN_THROTTLE_WRITEBACK)
+#define _VMSCAN_THROTTLE_ISOLATED	(1 << VMSCAN_THROTTLE_ISOLATED)
 
 #define show_throttle_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",					\
-		{_VMSCAN_THROTTLE_WRITEBACK,	"VMSCAN_THROTTLE_WRITEBACK"}	\
+		{_VMSCAN_THROTTLE_WRITEBACK,	"VMSCAN_THROTTLE_WRITEBACK"},	\
+		{_VMSCAN_THROTTLE_ISOLATED,	"VMSCAN_THROTTLE_ISOLATED"}	\
 		) : "VMSCAN_THROTTLE_NONE"
 
 
--- a/mm/compaction.c~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated
+++ a/mm/compaction.c
@@ -761,6 +761,8 @@ isolate_freepages_range(struct compact_c
 /* Similar to reclaim, but different enough that they don't share logic */
 static bool too_many_isolated(pg_data_t *pgdat)
 {
+	bool too_many;
+
 	unsigned long active, inactive, isolated;
 
 	inactive = node_page_state(pgdat, NR_INACTIVE_FILE) +
@@ -770,7 +772,11 @@ static bool too_many_isolated(pg_data_t
 	isolated = node_page_state(pgdat, NR_ISOLATED_FILE) +
 			node_page_state(pgdat, NR_ISOLATED_ANON);
 
-	return isolated > (inactive + active) / 2;
+	too_many = isolated > (inactive + active) / 2;
+	if (!too_many)
+		wake_throttle_isolated(pgdat);
+
+	return too_many;
 }
 
 /**
@@ -822,7 +828,7 @@ isolate_migratepages_block(struct compac
 		if (cc->mode == MIGRATE_ASYNC)
 			return -EAGAIN;
 
-		congestion_wait(BLK_RW_ASYNC, HZ/10);
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10);
 
 		if (fatal_signal_pending(current))
 			return -EINTR;
--- a/mm/internal.h~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated
+++ a/mm/internal.h
@@ -45,6 +45,15 @@ static inline void acct_reclaim_writebac
 		__acct_reclaim_writeback(pgdat, page, nr_throttled);
 }
 
+static inline void wake_throttle_isolated(pg_data_t *pgdat)
+{
+	wait_queue_head_t *wqh;
+
+	wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_ISOLATED];
+	if (waitqueue_active(wqh))
+		wake_up(wqh);
+}
+
 vm_fault_t do_swap_page(struct vm_fault *vmf);
 
 void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
@@ -121,6 +130,8 @@ extern unsigned long highest_memmap_pfn;
  */
 extern int isolate_lru_page(struct page *page);
 extern void putback_lru_page(struct page *page);
+extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
+								long timeout);
 
 /*
  * in mm/rmap.c:
--- a/mm/vmscan.c~mm-vmscan-throttle-reclaim-and-compaction-when-too-may-pages-are-isolated
+++ a/mm/vmscan.c
@@ -1006,12 +1006,12 @@ static void handle_write_error(struct ad
 	unlock_page(page);
 }
 
-static void
-reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
+void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
 							long timeout)
 {
 	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
 	long ret;
+	bool acct_writeback = (reason == VMSCAN_THROTTLE_WRITEBACK);
 	DEFINE_WAIT(wait);
 
 	/*
@@ -1023,7 +1023,8 @@ reclaim_throttle(pg_data_t *pgdat, enum
 	    current->flags & (PF_IO_WORKER|PF_KTHREAD))
 		return;
 
-	if (atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) {
+	if (acct_writeback &&
+	    atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) {
 		WRITE_ONCE(pgdat->nr_reclaim_start,
 			node_page_state(pgdat, NR_THROTTLED_WRITTEN));
 	}
@@ -1031,7 +1032,9 @@ reclaim_throttle(pg_data_t *pgdat, enum
 	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
 	ret = schedule_timeout(timeout);
 	finish_wait(wqh, &wait);
-	atomic_dec(&pgdat->nr_writeback_throttled);
+
+	if (acct_writeback)
+		atomic_dec(&pgdat->nr_writeback_throttled);
 
 	trace_mm_vmscan_throttled(pgdat->node_id, jiffies_to_usecs(timeout),
 				jiffies_to_usecs(timeout - ret),
@@ -2175,6 +2178,7 @@ static int too_many_isolated(struct pgli
 		struct scan_control *sc)
 {
 	unsigned long inactive, isolated;
+	bool too_many;
 
 	if (current_is_kswapd())
 		return 0;
@@ -2198,7 +2202,13 @@ static int too_many_isolated(struct pgli
 	if ((sc->gfp_mask & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS))
 		inactive >>= 3;
 
-	return isolated > inactive;
+	too_many = isolated > inactive;
+
+	/* Wake up tasks throttled due to too_many_isolated. */
+	if (!too_many)
+		wake_throttle_isolated(pgdat);
+
+	return too_many;
 }
 
 /*
@@ -2307,8 +2317,8 @@ shrink_inactive_list(unsigned long nr_to
 			return 0;
 
 		/* wait a bit for the reclaimer. */
-		msleep(100);
 		stalled = true;
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10);
 
 		/* We are about to die and free our memory. Return now. */
 		if (fatal_signal_pending(current))
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 151/262] mm/vmscan: throttle reclaim when no progress is being made
  2021-11-05 20:34 incoming Andrew Morton
                   ` (149 preceding siblings ...)
  2021-11-05 20:42 ` [patch 150/262] mm/vmscan: throttle reclaim and compaction when too may pages are isolated Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 152/262] mm/writeback: throttle based on page writeback instead of congestion Andrew Morton
                   ` (110 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: adilger.kernel, akpm, corbet, david, djwong, hannes, linux-mm,
	mgorman, mhocko, mm-commits, neilb, riel, torvalds, tytso,
	vbabka, willy

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm/vmscan: throttle reclaim when no progress is being made

Memcg reclaim throttles on congestion if no reclaim progress is made. 
This makes little sense, it might be due to writeback or a host of other
factors.

For !memcg reclaim, it's messy.  Direct reclaim primarily is throttled in
the page allocator if it is failing to make progress.  Kswapd throttles if
too many pages are under writeback and marked for immediate reclaim.

This patch explicitly throttles if reclaim is failing to make progress.

[vbabka@suse.cz: Remove redundant code]
Link: https://lkml.kernel.org/r/20211022144651.19914-4-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Darrick J . Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mmzone.h        |    1 +
 include/trace/events/vmscan.h |    4 +++-
 mm/memcontrol.c               |   10 +---------
 mm/vmscan.c                   |   28 ++++++++++++++++++++++++++++
 4 files changed, 33 insertions(+), 10 deletions(-)

--- a/include/linux/mmzone.h~mm-vmscan-throttle-reclaim-when-no-progress-is-being-made
+++ a/include/linux/mmzone.h
@@ -276,6 +276,7 @@ enum lru_list {
 enum vmscan_throttle_state {
 	VMSCAN_THROTTLE_WRITEBACK,
 	VMSCAN_THROTTLE_ISOLATED,
+	VMSCAN_THROTTLE_NOPROGRESS,
 	NR_VMSCAN_THROTTLE,
 };
 
--- a/include/trace/events/vmscan.h~mm-vmscan-throttle-reclaim-when-no-progress-is-being-made
+++ a/include/trace/events/vmscan.h
@@ -29,11 +29,13 @@
 
 #define _VMSCAN_THROTTLE_WRITEBACK	(1 << VMSCAN_THROTTLE_WRITEBACK)
 #define _VMSCAN_THROTTLE_ISOLATED	(1 << VMSCAN_THROTTLE_ISOLATED)
+#define _VMSCAN_THROTTLE_NOPROGRESS	(1 << VMSCAN_THROTTLE_NOPROGRESS)
 
 #define show_throttle_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",					\
 		{_VMSCAN_THROTTLE_WRITEBACK,	"VMSCAN_THROTTLE_WRITEBACK"},	\
-		{_VMSCAN_THROTTLE_ISOLATED,	"VMSCAN_THROTTLE_ISOLATED"}	\
+		{_VMSCAN_THROTTLE_ISOLATED,	"VMSCAN_THROTTLE_ISOLATED"},	\
+		{_VMSCAN_THROTTLE_NOPROGRESS,	"VMSCAN_THROTTLE_NOPROGRESS"}	\
 		) : "VMSCAN_THROTTLE_NONE"
 
 
--- a/mm/memcontrol.c~mm-vmscan-throttle-reclaim-when-no-progress-is-being-made
+++ a/mm/memcontrol.c
@@ -3487,19 +3487,11 @@ static int mem_cgroup_force_empty(struct
 
 	/* try to free all pages in this cgroup */
 	while (nr_retries && page_counter_read(&memcg->memory)) {
-		int progress;
-
 		if (signal_pending(current))
 			return -EINTR;
 
-		progress = try_to_free_mem_cgroup_pages(memcg, 1,
-							GFP_KERNEL, true);
-		if (!progress) {
+		if (!try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL, true))
 			nr_retries--;
-			/* maybe some writeback is necessary */
-			congestion_wait(BLK_RW_ASYNC, HZ/10);
-		}
-
 	}
 
 	return 0;
--- a/mm/vmscan.c~mm-vmscan-throttle-reclaim-when-no-progress-is-being-made
+++ a/mm/vmscan.c
@@ -3322,6 +3322,33 @@ static inline bool compaction_ready(stru
 	return zone_watermark_ok_safe(zone, 0, watermark, sc->reclaim_idx);
 }
 
+static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
+{
+	/* If reclaim is making progress, wake any throttled tasks. */
+	if (sc->nr_reclaimed) {
+		wait_queue_head_t *wqh;
+
+		wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_NOPROGRESS];
+		if (waitqueue_active(wqh))
+			wake_up(wqh);
+
+		return;
+	}
+
+	/*
+	 * Do not throttle kswapd on NOPROGRESS as it will throttle on
+	 * VMSCAN_THROTTLE_WRITEBACK if there are too many pages under
+	 * writeback and marked for immediate reclaim at the tail of
+	 * the LRU.
+	 */
+	if (current_is_kswapd())
+		return;
+
+	/* Throttle if making no progress at high prioities. */
+	if (sc->priority < DEF_PRIORITY - 2)
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS, HZ/10);
+}
+
 /*
  * This is the direct reclaim path, for page-allocating processes.  We only
  * try to reclaim pages from zones which will satisfy the caller's allocation
@@ -3406,6 +3433,7 @@ static void shrink_zones(struct zonelist
 			continue;
 		last_pgdat = zone->zone_pgdat;
 		shrink_node(zone->zone_pgdat, sc);
+		consider_reclaim_throttle(zone->zone_pgdat, sc);
 	}
 
 	/*
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 152/262] mm/writeback: throttle based on page writeback instead of congestion
  2021-11-05 20:34 incoming Andrew Morton
                   ` (150 preceding siblings ...)
  2021-11-05 20:42 ` [patch 151/262] mm/vmscan: throttle reclaim when no progress is being made Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 153/262] mm/page_alloc: remove the throttling logic from the page allocator Andrew Morton
                   ` (109 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: adilger.kernel, akpm, corbet, david, djwong, hannes, linux-mm,
	mgorman, mhocko, mm-commits, neilb, riel, torvalds, tytso,
	vbabka, willy

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm/writeback: throttle based on page writeback instead of congestion

do_writepages throttles on congestion if the writepages() fails due to a
lack of memory but congestion_wait() is partially broken as the congestion
state is not updated for all BDIs.

This patch stalls waiting for a number of pages to complete writeback that
located on the local node.  The main weakness is that there is no
correlation between the location of the inode's pages and locality but
that is still better than congestion_wait.

Link: https://lkml.kernel.org/r/20211022144651.19914-5-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Darrick J . Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page-writeback.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- a/mm/page-writeback.c~mm-writeback-throttle-based-on-page-writeback-instead-of-congestion
+++ a/mm/page-writeback.c
@@ -2366,8 +2366,15 @@ int do_writepages(struct address_space *
 			ret = generic_writepages(mapping, wbc);
 		if ((ret != -ENOMEM) || (wbc->sync_mode != WB_SYNC_ALL))
 			break;
-		cond_resched();
-		congestion_wait(BLK_RW_ASYNC, HZ/50);
+
+		/*
+		 * Lacking an allocation context or the locality or writeback
+		 * state of any of the inode's pages, throttle based on
+		 * writeback activity on the local node. It's as good a
+		 * guess as any.
+		 */
+		reclaim_throttle(NODE_DATA(numa_node_id()),
+			VMSCAN_THROTTLE_WRITEBACK, HZ/50);
 	}
 	/*
 	 * Usually few pages are written by now from those we've just submitted
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 153/262] mm/page_alloc: remove the throttling logic from the page allocator
  2021-11-05 20:34 incoming Andrew Morton
                   ` (151 preceding siblings ...)
  2021-11-05 20:42 ` [patch 152/262] mm/writeback: throttle based on page writeback instead of congestion Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 154/262] mm/vmscan: centralise timeout values for reclaim_throttle Andrew Morton
                   ` (108 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: adilger.kernel, akpm, corbet, david, djwong, hannes, linux-mm,
	mgorman, mhocko, mm-commits, neilb, riel, torvalds, tytso,
	vbabka, willy

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm/page_alloc: remove the throttling logic from the page allocator

The page allocator stalls based on the number of pages that are waiting
for writeback to start but this should now be redundant. 
shrink_inactive_list() will wake flusher threads if the LRU tail are
unqueued dirty pages so the flusher should be active.  If it fails to make
progress due to pages under writeback not being completed quickly then it
should stall on VMSCAN_THROTTLE_WRITEBACK.

Link: https://lkml.kernel.org/r/20211022144651.19914-6-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Darrick J . Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |   21 +--------------------
 1 file changed, 1 insertion(+), 20 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-remove-the-throttling-logic-from-the-page-allocator
+++ a/mm/page_alloc.c
@@ -4791,30 +4791,11 @@ should_reclaim_retry(gfp_t gfp_mask, uns
 		trace_reclaim_retry_zone(z, order, reclaimable,
 				available, min_wmark, *no_progress_loops, wmark);
 		if (wmark) {
-			/*
-			 * If we didn't make any progress and have a lot of
-			 * dirty + writeback pages then we should wait for
-			 * an IO to complete to slow down the reclaim and
-			 * prevent from pre mature OOM
-			 */
-			if (!did_some_progress) {
-				unsigned long write_pending;
-
-				write_pending = zone_page_state_snapshot(zone,
-							NR_ZONE_WRITE_PENDING);
-
-				if (2 * write_pending > reclaimable) {
-					congestion_wait(BLK_RW_ASYNC, HZ/10);
-					return true;
-				}
-			}
-
 			ret = true;
-			goto out;
+			break;
 		}
 	}
 
-out:
 	/*
 	 * Memory allocation/reclaim might be called from a WQ context and the
 	 * current implementation of the WQ concurrency control doesn't
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 154/262] mm/vmscan: centralise timeout values for reclaim_throttle
  2021-11-05 20:34 incoming Andrew Morton
                   ` (152 preceding siblings ...)
  2021-11-05 20:42 ` [patch 153/262] mm/page_alloc: remove the throttling logic from the page allocator Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 155/262] mm/vmscan: increase the timeout if page reclaim is not making progress Andrew Morton
                   ` (107 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: adilger.kernel, akpm, corbet, david, djwong, hannes, linux-mm,
	mgorman, mhocko, mm-commits, neilb, riel, torvalds, tytso,
	vbabka, willy

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm/vmscan: centralise timeout values for reclaim_throttle

Neil Brown raised concerns about callers of reclaim_throttle specifying a
timeout value.  The original timeout values to congestion_wait() were
probably pulled out of thin air or copy&pasted from somewhere else.  This
patch centralises the timeout values and selects a timeout based on the
reason for reclaim throttling.  These figures are also pulled out of the
same thin air but better values may be derived

Running a workload that is throttling for inappropriate periods and
tracing mm_vmscan_throttled can be used to pick a more appropriate value. 
Excessive throttling would pick a lower timeout where as excessive CPU
usage in reclaim context would select a larger timeout.  Ideally a large
value would always be used and the wakeups would occur before a timeout
but that requires careful testing.

Link: https://lkml.kernel.org/r/20211022144651.19914-7-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Darrick J . Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/compaction.c     |    2 -
 mm/internal.h       |    3 --
 mm/page-writeback.c |    2 -
 mm/vmscan.c         |   50 +++++++++++++++++++++++++++++++-----------
 4 files changed, 40 insertions(+), 17 deletions(-)

--- a/mm/compaction.c~mm-vmscan-centralise-timeout-values-for-reclaim_throttle
+++ a/mm/compaction.c
@@ -828,7 +828,7 @@ isolate_migratepages_block(struct compac
 		if (cc->mode == MIGRATE_ASYNC)
 			return -EAGAIN;
 
-		reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10);
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED);
 
 		if (fatal_signal_pending(current))
 			return -EINTR;
--- a/mm/internal.h~mm-vmscan-centralise-timeout-values-for-reclaim_throttle
+++ a/mm/internal.h
@@ -130,8 +130,7 @@ extern unsigned long highest_memmap_pfn;
  */
 extern int isolate_lru_page(struct page *page);
 extern void putback_lru_page(struct page *page);
-extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
-								long timeout);
+extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason);
 
 /*
  * in mm/rmap.c:
--- a/mm/page-writeback.c~mm-vmscan-centralise-timeout-values-for-reclaim_throttle
+++ a/mm/page-writeback.c
@@ -2374,7 +2374,7 @@ int do_writepages(struct address_space *
 		 * guess as any.
 		 */
 		reclaim_throttle(NODE_DATA(numa_node_id()),
-			VMSCAN_THROTTLE_WRITEBACK, HZ/50);
+			VMSCAN_THROTTLE_WRITEBACK);
 	}
 	/*
 	 * Usually few pages are written by now from those we've just submitted
--- a/mm/vmscan.c~mm-vmscan-centralise-timeout-values-for-reclaim_throttle
+++ a/mm/vmscan.c
@@ -1006,12 +1006,10 @@ static void handle_write_error(struct ad
 	unlock_page(page);
 }
 
-void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
-							long timeout)
+void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 {
 	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
-	long ret;
-	bool acct_writeback = (reason == VMSCAN_THROTTLE_WRITEBACK);
+	long timeout, ret;
 	DEFINE_WAIT(wait);
 
 	/*
@@ -1023,17 +1021,43 @@ void reclaim_throttle(pg_data_t *pgdat,
 	    current->flags & (PF_IO_WORKER|PF_KTHREAD))
 		return;
 
-	if (acct_writeback &&
-	    atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) {
-		WRITE_ONCE(pgdat->nr_reclaim_start,
-			node_page_state(pgdat, NR_THROTTLED_WRITTEN));
+	/*
+	 * These figures are pulled out of thin air.
+	 * VMSCAN_THROTTLE_ISOLATED is a transient condition based on too many
+	 * parallel reclaimers which is a short-lived event so the timeout is
+	 * short. Failing to make progress or waiting on writeback are
+	 * potentially long-lived events so use a longer timeout. This is shaky
+	 * logic as a failure to make progress could be due to anything from
+	 * writeback to a slow device to excessive references pages at the tail
+	 * of the inactive LRU.
+	 */
+	switch(reason) {
+	case VMSCAN_THROTTLE_WRITEBACK:
+		timeout = HZ/10;
+
+		if (atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) {
+			WRITE_ONCE(pgdat->nr_reclaim_start,
+				node_page_state(pgdat, NR_THROTTLED_WRITTEN));
+		}
+
+		break;
+	case VMSCAN_THROTTLE_NOPROGRESS:
+		timeout = HZ/10;
+		break;
+	case VMSCAN_THROTTLE_ISOLATED:
+		timeout = HZ/50;
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		timeout = HZ;
+		break;
 	}
 
 	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
 	ret = schedule_timeout(timeout);
 	finish_wait(wqh, &wait);
 
-	if (acct_writeback)
+	if (reason == VMSCAN_THROTTLE_WRITEBACK)
 		atomic_dec(&pgdat->nr_writeback_throttled);
 
 	trace_mm_vmscan_throttled(pgdat->node_id, jiffies_to_usecs(timeout),
@@ -2318,7 +2342,7 @@ shrink_inactive_list(unsigned long nr_to
 
 		/* wait a bit for the reclaimer. */
 		stalled = true;
-		reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10);
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED);
 
 		/* We are about to die and free our memory. Return now. */
 		if (fatal_signal_pending(current))
@@ -3250,7 +3274,7 @@ again:
 		 * until some pages complete writeback.
 		 */
 		if (sc->nr.immediate)
-			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10);
+			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
 	}
 
 	/*
@@ -3274,7 +3298,7 @@ again:
 	if (!current_is_kswapd() && current_may_throttle() &&
 	    !sc->hibernation_mode &&
 	    test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
-		reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK, HZ/10);
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
 
 	if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
 				    sc))
@@ -3346,7 +3370,7 @@ static void consider_reclaim_throttle(pg
 
 	/* Throttle if making no progress at high prioities. */
 	if (sc->priority < DEF_PRIORITY - 2)
-		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS, HZ/10);
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
 }
 
 /*
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 155/262] mm/vmscan: increase the timeout if page reclaim is not making progress
  2021-11-05 20:34 incoming Andrew Morton
                   ` (153 preceding siblings ...)
  2021-11-05 20:42 ` [patch 154/262] mm/vmscan: centralise timeout values for reclaim_throttle Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 156/262] mm/vmscan: delay waking of tasks throttled on NOPROGRESS Andrew Morton
                   ` (106 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: adilger.kernel, akpm, corbet, david, djwong, hannes, linux-mm,
	mgorman, mhocko, mm-commits, neilb, riel, torvalds, tytso,
	vbabka, willy

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm/vmscan: increase the timeout if page reclaim is not making progress

Tracing of the stutterp workload showed the following delays

      1 usect_delayed=124000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=128000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=176000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=536000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=544000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=556000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=624000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=716000 reason=VMSCAN_THROTTLE_NOPROGRESS
      1 usect_delayed=772000 reason=VMSCAN_THROTTLE_NOPROGRESS
      2 usect_delayed=512000 reason=VMSCAN_THROTTLE_NOPROGRESS
     16 usect_delayed=120000 reason=VMSCAN_THROTTLE_NOPROGRESS
     53 usect_delayed=116000 reason=VMSCAN_THROTTLE_NOPROGRESS
    116 usect_delayed=112000 reason=VMSCAN_THROTTLE_NOPROGRESS
   5907 usect_delayed=108000 reason=VMSCAN_THROTTLE_NOPROGRESS
  71741 usect_delayed=104000 reason=VMSCAN_THROTTLE_NOPROGRESS

All the throttling hit the full timeout and then there was wakeup delays
meaning that the wakeups are premature as no other reclaimer such as
kswapd has made progress.  This patch increases the maximum timeout.

Link: https://lkml.kernel.org/r/20211022144651.19914-8-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Darrick J . Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmscan.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/vmscan.c~mm-vmscan-increase-the-timeout-if-page-reclaim-is-not-making-progress
+++ a/mm/vmscan.c
@@ -1042,7 +1042,7 @@ void reclaim_throttle(pg_data_t *pgdat,
 
 		break;
 	case VMSCAN_THROTTLE_NOPROGRESS:
-		timeout = HZ/10;
+		timeout = HZ/2;
 		break;
 	case VMSCAN_THROTTLE_ISOLATED:
 		timeout = HZ/50;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 156/262] mm/vmscan: delay waking of tasks throttled on NOPROGRESS
  2021-11-05 20:34 incoming Andrew Morton
                   ` (154 preceding siblings ...)
  2021-11-05 20:42 ` [patch 155/262] mm/vmscan: increase the timeout if page reclaim is not making progress Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 157/262] mm/vmpressure: fix data-race with memcg->socket_pressure Andrew Morton
                   ` (105 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: adilger.kernel, akpm, corbet, david, djwong, hannes, linux-mm,
	mgorman, mhocko, mm-commits, neilb, riel, torvalds, tytso,
	vbabka, willy

From: Mel Gorman <mgorman@techsingularity.net>
Subject: mm/vmscan: delay waking of tasks throttled on NOPROGRESS

Tracing indicates that tasks throttled on NOPROGRESS are woken prematurely
resulting in occasional massive spikes in direct reclaim activity.  This
patch wakes tasks throttled on NOPROGRESS if reclaim efficiency is at
least 12%.

Link: https://lkml.kernel.org/r/20211022144651.19914-9-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Darrick J . Wong" <djwong@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmscan.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/mm/vmscan.c~mm-vmscan-delay-waking-of-tasks-throttled-on-noprogress
+++ a/mm/vmscan.c
@@ -3348,8 +3348,11 @@ static inline bool compaction_ready(stru
 
 static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
 {
-	/* If reclaim is making progress, wake any throttled tasks. */
-	if (sc->nr_reclaimed) {
+	/*
+	 * If reclaim is making progress greater than 12% efficiency then
+	 * wake all the NOPROGRESS throttled tasks.
+	 */
+	if (sc->nr_reclaimed > (sc->nr_scanned >> 3)) {
 		wait_queue_head_t *wqh;
 
 		wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_NOPROGRESS];
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 157/262] mm/vmpressure: fix data-race with memcg->socket_pressure
  2021-11-05 20:34 incoming Andrew Morton
                   ` (155 preceding siblings ...)
  2021-11-05 20:42 ` [patch 156/262] mm/vmscan: delay waking of tasks throttled on NOPROGRESS Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 158/262] tools/vm/page_owner_sort.c: count and sort by mem Andrew Morton
                   ` (104 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, alexs, guro, hannes, linux-mm, mhocko, mm-commits,
	richard.weiyang, shakeelb, songmuchun, songyuanzheng, torvalds,
	willy

From: Yuanzheng Song <songyuanzheng@huawei.com>
Subject: mm/vmpressure: fix data-race with memcg->socket_pressure

BUG: KCSAN: data-race in __sk_mem_reduce_allocated / vmpressure

write to 0xffff8881286f4938 of 8 bytes by task 24550 on cpu 3:
 vmpressure+0x218/0x230 mm/vmpressure.c:307
 shrink_node_memcgs+0x2b9/0x410 mm/vmscan.c:2658
 shrink_node+0x9d2/0x11d0 mm/vmscan.c:2769
 shrink_zones+0x29f/0x470 mm/vmscan.c:2972
 do_try_to_free_pages+0x193/0x6e0 mm/vmscan.c:3027
 try_to_free_mem_cgroup_pages+0x1c0/0x3f0 mm/vmscan.c:3345
 reclaim_high mm/memcontrol.c:2440 [inline]
 mem_cgroup_handle_over_high+0x18b/0x4d0 mm/memcontrol.c:2624
 tracehook_notify_resume include/linux/tracehook.h:197 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:164 [inline]
 exit_to_user_mode_prepare+0x110/0x170 kernel/entry/common.c:191
 syscall_exit_to_user_mode+0x16/0x30 kernel/entry/common.c:266
 ret_from_fork+0x15/0x30 arch/x86/entry/entry_64.S:289

read to 0xffff8881286f4938 of 8 bytes by interrupt on cpu 1:
 mem_cgroup_under_socket_pressure include/linux/memcontrol.h:1483 [inline]
 sk_under_memory_pressure include/net/sock.h:1314 [inline]
 __sk_mem_reduce_allocated+0x1d2/0x270 net/core/sock.c:2696
 __sk_mem_reclaim+0x44/0x50 net/core/sock.c:2711
 sk_mem_reclaim include/net/sock.h:1490 [inline]
 ......
 net_rx_action+0x17a/0x480 net/core/dev.c:6864
 __do_softirq+0x12c/0x2af kernel/softirq.c:298
 run_ksoftirqd+0x13/0x20 kernel/softirq.c:653
 smpboot_thread_fn+0x33f/0x510 kernel/smpboot.c:165
 kthread+0x1fc/0x220 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

When reading memcg->socket_pressure in mem_cgroup_under_socket_pressure()
and writing memcg->socket_pressure in vmpressure() at the same time,
the data-race occurs.

So fix it by using READ_ONCE() and WRITE_ONCE() to read and write
memcg->socket_pressure.

Link: https://lkml.kernel.org/r/20211025082843.671690-1-songyuanzheng@huawei.com
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Alex Shi <alexs@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memcontrol.h |    2 +-
 mm/vmpressure.c            |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/memcontrol.h~mm-vmpressure-fix-data-race-with-memcg-socket_pressure
+++ a/include/linux/memcontrol.h
@@ -1606,7 +1606,7 @@ static inline bool mem_cgroup_under_sock
 	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure)
 		return true;
 	do {
-		if (time_before(jiffies, memcg->socket_pressure))
+		if (time_before(jiffies, READ_ONCE(memcg->socket_pressure)))
 			return true;
 	} while ((memcg = parent_mem_cgroup(memcg)));
 	return false;
--- a/mm/vmpressure.c~mm-vmpressure-fix-data-race-with-memcg-socket_pressure
+++ a/mm/vmpressure.c
@@ -308,7 +308,7 @@ void vmpressure(gfp_t gfp, struct mem_cg
 			 * asserted for a second in which subsequent
 			 * pressure events can occur.
 			 */
-			memcg->socket_pressure = jiffies + HZ;
+			WRITE_ONCE(memcg->socket_pressure, jiffies + HZ);
 		}
 	}
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 158/262] tools/vm/page_owner_sort.c: count and sort by mem
  2021-11-05 20:34 incoming Andrew Morton
                   ` (156 preceding siblings ...)
  2021-11-05 20:42 ` [patch 157/262] mm/vmpressure: fix data-race with memcg->socket_pressure Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:42 ` [patch 159/262] tools/vm/page-types.c: make walk_file() aware of address range option Andrew Morton
                   ` (103 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, nixiaoming, tangbin, torvalds,
	weizhenliang, zhangshengju

From: Zhenliang Wei <weizhenliang@huawei.com>
Subject: tools/vm/page_owner_sort.c: count and sort by mem

When viewing page owner information, we may be more concerned about the
total memory rather than the times of stack appears.  Therefore, the
following adjustments are made:

1. Added the statistics on the total number of pages.

2. Added the optional parameter "-m" to configure the program to sort by
   memory (total pages).

The general output of page_owner is as follows:

	Page allocated via order XXX, ...
	PFN XXX ...
	 // Detailed stack

	Page allocated via order XXX, ...
	PFN XXX ...
	 // Detailed stack

The original page_owner_sort ignores PFN rows, puts the remaining rows
in buf, counts the times of buf, and finally sorts them according to
the times. General output:

	XXX times:
	Page allocated via order XXX, ...
	 // Detailed stack

Now, we use regexp to extract the page order value from the buf,
and count the total pages for the buf. General output:

	XXX times, XXX pages:
	Page allocated via order XXX, ...
	 // Detailed stack

By default, it is still sorted by the times of buf;
If you want to sort by the pages nums of buf, use the new -m parameter.

Link: https://lkml.kernel.org/r/1631678242-41033-1-git-send-email-weizhenliang@huawei.com
Signed-off-by: Zhenliang Wei <weizhenliang@huawei.com>
Cc: Tang Bin <tangbin@cmss.chinamobile.com>
Cc: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Cc: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/vm/page_owner.rst |   23 +++++++
 tools/vm/page_owner_sort.c      |   94 +++++++++++++++++++++++++++---
 2 files changed, 107 insertions(+), 10 deletions(-)

--- a/Documentation/vm/page_owner.rst~tools-vm-page_owner_sortc-count-and-sort-by-mem
+++ a/Documentation/vm/page_owner.rst
@@ -85,5 +85,26 @@ Usage
 	cat /sys/kernel/debug/page_owner > page_owner_full.txt
 	./page_owner_sort page_owner_full.txt sorted_page_owner.txt
 
+   The general output of ``page_owner_full.txt`` is as follows:
+
+	Page allocated via order XXX, ...
+	PFN XXX ...
+	 // Detailed stack
+
+	Page allocated via order XXX, ...
+	PFN XXX ...
+	 // Detailed stack
+
+   The ``page_owner_sort`` tool ignores ``PFN`` rows, puts the remaining rows
+   in buf, uses regexp to extract the page order value, counts the times
+   and pages of buf, and finally sorts them according to the times.
+
    See the result about who allocated each page
-   in the ``sorted_page_owner.txt``.
+   in the ``sorted_page_owner.txt``. General output:
+
+	XXX times, XXX pages:
+	Page allocated via order XXX, ...
+	 // Detailed stack
+
+   By default, ``page_owner_sort`` is sorted according to the times of buf.
+   If you want to sort by the pages nums of buf, use the ``-m`` parameter.
--- a/tools/vm/page_owner_sort.c~tools-vm-page_owner_sortc-count-and-sort-by-mem
+++ a/tools/vm/page_owner_sort.c
@@ -5,6 +5,8 @@
  * Example use:
  * cat /sys/kernel/debug/page_owner > page_owner_full.txt
  * ./page_owner_sort page_owner_full.txt sorted_page_owner.txt
+ * Or sort by total memory:
+ * ./page_owner_sort -m page_owner_full.txt sorted_page_owner.txt
  *
  * See Documentation/vm/page_owner.rst
 */
@@ -16,14 +18,18 @@
 #include <fcntl.h>
 #include <unistd.h>
 #include <string.h>
+#include <regex.h>
+#include <errno.h>
 
 struct block_list {
 	char *txt;
 	int len;
 	int num;
+	int page_num;
 };
 
-
+static int sort_by_memory;
+static regex_t order_pattern;
 static struct block_list *list;
 static int list_size;
 static int max_size;
@@ -59,12 +65,50 @@ static int compare_num(const void *p1, c
 	return l2->num - l1->num;
 }
 
+static int compare_page_num(const void *p1, const void *p2)
+{
+	const struct block_list *l1 = p1, *l2 = p2;
+
+	return l2->page_num - l1->page_num;
+}
+
+static int get_page_num(char *buf)
+{
+	int err, val_len, order_val;
+	char order_str[4] = {0};
+	char *endptr;
+	regmatch_t pmatch[2];
+
+	err = regexec(&order_pattern, buf, 2, pmatch, REG_NOTBOL);
+	if (err != 0 || pmatch[1].rm_so == -1) {
+		printf("no order pattern in %s\n", buf);
+		return 0;
+	}
+	val_len = pmatch[1].rm_eo - pmatch[1].rm_so;
+	if (val_len > 2) /* max_order should not exceed 2 digits */
+		goto wrong_order;
+
+	memcpy(order_str, buf + pmatch[1].rm_so, val_len);
+
+	errno = 0;
+	order_val = strtol(order_str, &endptr, 10);
+	if (errno != 0 || endptr == order_str || *endptr != '\0')
+		goto wrong_order;
+
+	return 1 << order_val;
+
+wrong_order:
+	printf("wrong order in follow buf:\n%s\n", buf);
+	return 0;
+}
+
 static void add_list(char *buf, int len)
 {
 	if (list_size != 0 &&
 	    len == list[list_size-1].len &&
 	    memcmp(buf, list[list_size-1].txt, len) == 0) {
 		list[list_size-1].num++;
+		list[list_size-1].page_num += get_page_num(buf);
 		return;
 	}
 	if (list_size == max_size) {
@@ -74,6 +118,7 @@ static void add_list(char *buf, int len)
 	list[list_size].txt = malloc(len+1);
 	list[list_size].len = len;
 	list[list_size].num = 1;
+	list[list_size].page_num = get_page_num(buf);
 	memcpy(list[list_size].txt, buf, len);
 	list[list_size].txt[len] = 0;
 	list_size++;
@@ -85,6 +130,13 @@ static void add_list(char *buf, int len)
 
 #define BUF_SIZE	(128 * 1024)
 
+static void usage(void)
+{
+	printf("Usage: ./page_owner_sort [-m] <input> <output>\n"
+		"-m	Sort by total memory. If this option is unset, sort by times\n"
+	);
+}
+
 int main(int argc, char **argv)
 {
 	FILE *fin, *fout;
@@ -92,21 +144,39 @@ int main(int argc, char **argv)
 	int ret, i, count;
 	struct block_list *list2;
 	struct stat st;
+	int err;
+	int opt;
 
-	if (argc < 3) {
-		printf("Usage: ./program <input> <output>\n");
-		perror("open: ");
+	while ((opt = getopt(argc, argv, "m")) != -1)
+		switch (opt) {
+		case 'm':
+			sort_by_memory = 1;
+			break;
+		default:
+			usage();
+			exit(1);
+		}
+
+	if (optind >= (argc - 1)) {
+		usage();
 		exit(1);
 	}
 
-	fin = fopen(argv[1], "r");
-	fout = fopen(argv[2], "w");
+	fin = fopen(argv[optind], "r");
+	fout = fopen(argv[optind + 1], "w");
 	if (!fin || !fout) {
-		printf("Usage: ./program <input> <output>\n");
+		usage();
 		perror("open: ");
 		exit(1);
 	}
 
+	err = regcomp(&order_pattern, "order\\s*([0-9]*),", REG_EXTENDED|REG_NEWLINE);
+	if (err != 0 || order_pattern.re_nsub != 1) {
+		printf("%s: Invalid pattern 'order\\s*([0-9]*),' code %d\n",
+			argv[0], err);
+		exit(1);
+	}
+
 	fstat(fileno(fin), &st);
 	max_size = st.st_size / 100; /* hack ... */
 
@@ -145,13 +215,19 @@ int main(int argc, char **argv)
 			list2[count++] = list[i];
 		} else {
 			list2[count-1].num += list[i].num;
+			list2[count-1].page_num += list[i].page_num;
 		}
 	}
 
-	qsort(list2, count, sizeof(list[0]), compare_num);
+	if (sort_by_memory)
+		qsort(list2, count, sizeof(list[0]), compare_page_num);
+	else
+		qsort(list2, count, sizeof(list[0]), compare_num);
 
 	for (i = 0; i < count; i++)
-		fprintf(fout, "%d times:\n%s\n", list2[i].num, list2[i].txt);
+		fprintf(fout, "%d times, %d pages:\n%s\n",
+				list2[i].num, list2[i].page_num, list2[i].txt);
 
+	regfree(&order_pattern);
 	return 0;
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 159/262] tools/vm/page-types.c: make walk_file() aware of address range option
  2021-11-05 20:34 incoming Andrew Morton
                   ` (157 preceding siblings ...)
  2021-11-05 20:42 ` [patch 158/262] tools/vm/page_owner_sort.c: count and sort by mem Andrew Morton
@ 2021-11-05 20:42 ` Andrew Morton
  2021-11-05 20:43 ` [patch 160/262] tools/vm/page-types.c: move show_file() to summary output Andrew Morton
                   ` (102 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:42 UTC (permalink / raw)
  To: akpm, changbin.du, chansen3, koct9i, linux-mm, mm-commits,
	naoya.horiguchi, torvalds, wangbin224

From: Naoya Horiguchi <naoya.horiguchi@nec.com>
Subject: tools/vm/page-types.c: make walk_file() aware of address range option

Patch series "tools/vm/page-types.c: a few improvements".

This patchset adds some improvements on tools/vm/page-types.c.  Patch 1/3
makes -a option (specify address range) work with -f (file cache mode). 
Patch 2/3 and 3/3 are to fix minor formatting issues of this tool.  These
would make life a little easier for the users of this tool.

Please see individual patches for more details about specific issues.


This patch (of 3):

-a|--addr option is used to limit the range of address to be scanned for
page status.  It works now for physical address space (dafult mode) or for
virtual address space (with -p option), but not for file address space
(with -f option).  So make walk_file() aware of -a option.

Link: https://lkml.kernel.org/r/20211004061325.1525902-1-naoya.horiguchi@linux.dev
Link: https://lkml.kernel.org/r/20211004061325.1525902-2-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Bin Wang <wangbin224@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/vm/page-types.c |   24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

--- a/tools/vm/page-types.c~tools-vm-page-typesc-make-walk_file-aware-of-address-range-option
+++ a/tools/vm/page-types.c
@@ -967,22 +967,19 @@ static struct sigaction sigbus_action =
 	.sa_flags = SA_SIGINFO,
 };
 
-static void walk_file(const char *name, const struct stat *st)
+static void walk_file_range(const char *name, int fd,
+			    unsigned long off, unsigned long end)
 {
 	uint8_t vec[PAGEMAP_BATCH];
 	uint64_t buf[PAGEMAP_BATCH], flags;
 	uint64_t cgroup = 0;
 	uint64_t mapcnt = 0;
 	unsigned long nr_pages, pfn, i;
-	off_t off, end = st->st_size;
-	int fd;
 	ssize_t len;
 	void *ptr;
 	int first = 1;
 
-	fd = checked_open(name, O_RDONLY|O_NOATIME|O_NOFOLLOW);
-
-	for (off = 0; off < end; off += len) {
+	for (; off < end; off += len) {
 		nr_pages = (end - off + page_size - 1) / page_size;
 		if (nr_pages > PAGEMAP_BATCH)
 			nr_pages = PAGEMAP_BATCH;
@@ -1043,6 +1040,21 @@ got_sigbus:
 				 flags, cgroup, mapcnt, buf[i]);
 		}
 	}
+}
+
+static void walk_file(const char *name, const struct stat *st)
+{
+	int i;
+	int fd;
+
+	fd = checked_open(name, O_RDONLY|O_NOATIME|O_NOFOLLOW);
+
+	if (!nr_addr_ranges)
+		add_addr_range(0, st->st_size / page_size);
+
+	for (i = 0; i < nr_addr_ranges; i++)
+		walk_file_range(name, fd, opt_offset[i] * page_size,
+				(opt_offset[i] + opt_size[i]) * page_size);
 
 	close(fd);
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 160/262] tools/vm/page-types.c: move show_file() to summary output
  2021-11-05 20:34 incoming Andrew Morton
                   ` (158 preceding siblings ...)
  2021-11-05 20:42 ` [patch 159/262] tools/vm/page-types.c: make walk_file() aware of address range option Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 161/262] tools/vm/page-types.c: print file offset in hexadecimal Andrew Morton
                   ` (101 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, changbin.du, chansen3, koct9i, linux-mm, mm-commits,
	naoya.horiguchi, torvalds, wangbin224

From: Naoya Horiguchi <naoya.horiguchi@nec.com>
Subject: tools/vm/page-types.c: move show_file() to summary output

Currently file info from show_file() is printed out within page list like
below, but this is inconvenient a little to utilize the page list from
other scripts (maybe needs additional filtering).

    $ ./page-types -f page-types.c -l
    foffset offset  len     flags
    page-types.c Inode: 15108680 Size: 30953 (8 pages)
    Modify: Sat Oct  2 23:11:20 2021 (2399 seconds ago)
    Access: Sat Oct  2 23:11:28 2021 (2391 seconds ago)
    0       d9f59e  1       ___U_lA____________________________________
    1       1031eb5 1       __RU_l_____________________________________
    2       13bf717 1       __RU_l_____________________________________
    3       13ac333 1       ___U_lA____________________________________
    4       d9f59f  1       __RU_l_____________________________________
    5       183fd49 1       ___U_lA____________________________________
    6       13cbf69 1       ___U_lA____________________________________
    7       d9ef05  1       ___U_lA____________________________________

                 flags      page-count       MB  symbolic-flags                     long-symbolic-flags
    0x000000000000002c               3        0  __RU_l_____________________________________        referenced,uptodate,lru
    0x0000000000000068               5        0  ___U_lA____________________________________        uptodate,lru,active
                 total               8        0

With this patch file info is printed out in summary part like below:

    $ ./page-types -f page-types.c -l
    foffset offset  len     flags
    0       d9f59e  1       ___U_lA_____________________________________
    1       1031eb5 1       __RU_l______________________________________
    2       13bf717 1       __RU_l______________________________________
    3       13ac333 1       ___U_lA_____________________________________
    4       d9f59f  1       __RU_l______________________________________
    5       183fd49 1       ___U_lA_____________________________________
    6       13cbf69 1       ___U_lA_____________________________________

    page-types.c Inode: 15108680 Size: 30953 (8 pages)
    Modify: Sat Oct  2 23:11:20 2021 (2435 seconds ago)
    Access: Sat Oct  2 23:11:28 2021 (2427 seconds ago)

                 flags      page-count       MB  symbolic-flags                     long-symbolic-flags
    0x000000000000002c               3        0  __RU_l______________________________________       referenced,uptodate,lru
    0x0000000000000068               4        0  ___U_lA_____________________________________       uptodate,lru,active
                 total               7        0

Link: https://lkml.kernel.org/r/20211004061325.1525902-3-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Bin Wang <wangbin224@huawei.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/vm/page-types.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

--- a/tools/vm/page-types.c~tools-vm-page-typesc-move-show_file-to-summary-output
+++ a/tools/vm/page-types.c
@@ -1034,7 +1034,6 @@ got_sigbus:
 			if (first && opt_list) {
 				first = 0;
 				flush_page_range();
-				show_file(name, st);
 			}
 			add_page(off / page_size + i, pfn,
 				 flags, cgroup, mapcnt, buf[i]);
@@ -1074,10 +1073,10 @@ int walk_tree(const char *name, const st
 	return 0;
 }
 
+struct stat st;
+
 static void walk_page_cache(void)
 {
-	struct stat st;
-
 	kpageflags_fd = checked_open(opt_kpageflags, O_RDONLY);
 	pagemap_fd = checked_open("/proc/self/pagemap", O_RDONLY);
 	sigaction(SIGBUS, &sigbus_action, NULL);
@@ -1374,6 +1373,11 @@ int main(int argc, char *argv[])
 	if (opt_list)
 		printf("\n\n");
 
+	if (opt_file) {
+		show_file(opt_file, &st);
+		printf("\n");
+	}
+
 	show_summary();
 
 	if (opt_list_mapcnt)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 161/262] tools/vm/page-types.c: print file offset in hexadecimal
  2021-11-05 20:34 incoming Andrew Morton
                   ` (159 preceding siblings ...)
  2021-11-05 20:43 ` [patch 160/262] tools/vm/page-types.c: move show_file() to summary output Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 162/262] arch_numa: simplify numa_distance allocation Andrew Morton
                   ` (100 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, changbin.du, chansen3, koct9i, linux-mm, mm-commits,
	naoya.horiguchi, torvalds, wangbin224

From: Naoya Horiguchi <naoya.horiguchi@nec.com>
Subject: tools/vm/page-types.c: print file offset in hexadecimal

In page list mode (with -l and -L option), virtual address and physical
address are printed in hexadecimal, but file offset is not, which is
confusing, so let's align it.

Link: https://lkml.kernel.org/r/20211004061325.1525902-4-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Bin Wang <wangbin224@huawei.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Christian Hansen <chansen3@cisco.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/vm/page-types.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/tools/vm/page-types.c~tools-vm-page-typesc-print-file-offset-in-hexadecimal
+++ a/tools/vm/page-types.c
@@ -390,7 +390,7 @@ static void show_page_range(unsigned lon
 		if (opt_pid)
 			printf("%lx\t", voff);
 		if (opt_file)
-			printf("%lu\t", voff);
+			printf("%lx\t", voff);
 		if (opt_list_cgroup)
 			printf("@%llu\t", (unsigned long long)cgroup0);
 		if (opt_list_mapcnt)
@@ -418,7 +418,7 @@ static void show_page(unsigned long voff
 	if (opt_pid)
 		printf("%lx\t", voffset);
 	if (opt_file)
-		printf("%lu\t", voffset);
+		printf("%lx\t", voffset);
 	if (opt_list_cgroup)
 		printf("@%llu\t", (unsigned long long)cgroup);
 	if (opt_list_mapcnt)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 162/262] arch_numa: simplify numa_distance allocation
  2021-11-05 20:34 incoming Andrew Morton
                   ` (160 preceding siblings ...)
  2021-11-05 20:43 ` [patch 161/262] tools/vm/page-types.c: print file offset in hexadecimal Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 163/262] xen/x86: free_p2m_page: use memblock_free_ptr() to free a virtual pointer Andrew Morton
                   ` (99 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, christophe.leroy, jgross, linux-mm, mm-commits, rppt,
	Shahab.Vahedi, torvalds

From: Mike Rapoport <rppt@linux.ibm.com>
Subject: arch_numa: simplify numa_distance allocation

Patch series "memblock: cleanup memblock_free interface", v2.

This is the fix for memblock freeing APIs mismatch [1].

The first patch is a cleanup of numa_distance allocation in arch_numa I've
spotted during the conversion.  The second patch is a fix for Xen memory
freeing on some of the error paths.

[1] https://lore.kernel.org/all/CAHk-=wj9k4LZTz+svCxLYs5Y1=+yKrbAUArH1+ghyG3OLd8VVg@mail.gmail.com


This patch (of 6):

Memory allocation of numa_distance uses memblock_phys_alloc_range()
without actual range limits, converts the returned physical address to
virtual and then only uses the virtual address for further initialization.

Simplify this by replacing memblock_phys_alloc_range() with
memblock_alloc().

Link: https://lkml.kernel.org/r/20210930185031.18648-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20210930185031.18648-2-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/base/arch_numa.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/drivers/base/arch_numa.c~arch_numa-simplify-numa_distance-allocation
+++ a/drivers/base/arch_numa.c
@@ -337,15 +337,13 @@ void __init numa_free_distance(void)
 static int __init numa_alloc_distance(void)
 {
 	size_t size;
-	u64 phys;
 	int i, j;
 
 	size = nr_node_ids * nr_node_ids * sizeof(numa_distance[0]);
-	phys = memblock_phys_alloc_range(size, PAGE_SIZE, 0, PFN_PHYS(max_pfn));
-	if (WARN_ON(!phys))
+	numa_distance = memblock_alloc(size, PAGE_SIZE);
+	if (WARN_ON(!numa_distance))
 		return -ENOMEM;
 
-	numa_distance = __va(phys);
 	numa_distance_cnt = nr_node_ids;
 
 	/* fill with the default distances */
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 163/262] xen/x86: free_p2m_page: use memblock_free_ptr() to free a virtual pointer
  2021-11-05 20:34 incoming Andrew Morton
                   ` (161 preceding siblings ...)
  2021-11-05 20:43 ` [patch 162/262] arch_numa: simplify numa_distance allocation Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 164/262] memblock: drop memblock_free_early_nid() and memblock_free_early() Andrew Morton
                   ` (98 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, christophe.leroy, jgross, linux-mm, mm-commits, rppt,
	Shahab.Vahedi, torvalds

From: Mike Rapoport <rppt@linux.ibm.com>
Subject: xen/x86: free_p2m_page: use memblock_free_ptr() to free a virtual pointer

free_p2m_page() wrongly passes a virtual pointer to memblock_free() that
treats it as a physical address.

Call memblock_free_ptr() instead that gets a virtual address to free the
memory.

Link: https://lkml.kernel.org/r/20210930185031.18648-3-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/xen/p2m.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/xen/p2m.c~xen-x86-free_p2m_page-use-memblock_free_ptr-to-free-a-virtual-pointer
+++ a/arch/x86/xen/p2m.c
@@ -197,7 +197,7 @@ static void * __ref alloc_p2m_page(void)
 static void __ref free_p2m_page(void *p)
 {
 	if (unlikely(!slab_is_available())) {
-		memblock_free((unsigned long)p, PAGE_SIZE);
+		memblock_free_ptr(p, PAGE_SIZE);
 		return;
 	}
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 164/262] memblock: drop memblock_free_early_nid() and memblock_free_early()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (162 preceding siblings ...)
  2021-11-05 20:43 ` [patch 163/262] xen/x86: free_p2m_page: use memblock_free_ptr() to free a virtual pointer Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 165/262] memblock: stop aliasing __memblock_free_late with memblock_free_late Andrew Morton
                   ` (97 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, christophe.leroy, jgross, linux-mm, mm-commits, rppt,
	Shahab.Vahedi, torvalds

From: Mike Rapoport <rppt@linux.ibm.com>
Subject: memblock: drop memblock_free_early_nid() and memblock_free_early()

memblock_free_early_nid() is unused and memblock_free_early() is an alias
for memblock_free().

Replace calls to memblock_free_early() with calls to memblock_free() and
remove memblock_free_early() and memblock_free_early_nid().

Link: https://lkml.kernel.org/r/20210930185031.18648-4-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/mips/mm/init.c                  |    2 +-
 arch/powerpc/platforms/pseries/svm.c |    3 +--
 arch/s390/kernel/smp.c               |    2 +-
 drivers/base/arch_numa.c             |    2 +-
 drivers/s390/char/sclp_early.c       |    2 +-
 include/linux/memblock.h             |   12 ------------
 kernel/dma/swiotlb.c                 |    2 +-
 lib/cpumask.c                        |    2 +-
 mm/percpu.c                          |    8 ++++----
 mm/sparse.c                          |    2 +-
 10 files changed, 12 insertions(+), 25 deletions(-)

--- a/arch/mips/mm/init.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/arch/mips/mm/init.c
@@ -529,7 +529,7 @@ static void * __init pcpu_fc_alloc(unsig
 
 static void __init pcpu_fc_free(void *ptr, size_t size)
 {
-	memblock_free_early(__pa(ptr), size);
+	memblock_free(__pa(ptr), size);
 }
 
 void __init setup_per_cpu_areas(void)
--- a/arch/powerpc/platforms/pseries/svm.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/arch/powerpc/platforms/pseries/svm.c
@@ -56,8 +56,7 @@ void __init svm_swiotlb_init(void)
 		return;
 
 
-	memblock_free_early(__pa(vstart),
-			    PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
+	memblock_free(__pa(vstart), PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
 	panic("SVM: Cannot allocate SWIOTLB buffer");
 }
 
--- a/arch/s390/kernel/smp.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/arch/s390/kernel/smp.c
@@ -880,7 +880,7 @@ void __init smp_detect_cpus(void)
 
 	/* Add CPUs present at boot */
 	__smp_rescan_cpus(info, true);
-	memblock_free_early((unsigned long)info, sizeof(*info));
+	memblock_free((unsigned long)info, sizeof(*info));
 }
 
 /*
--- a/drivers/base/arch_numa.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/drivers/base/arch_numa.c
@@ -166,7 +166,7 @@ static void * __init pcpu_fc_alloc(unsig
 
 static void __init pcpu_fc_free(void *ptr, size_t size)
 {
-	memblock_free_early(__pa(ptr), size);
+	memblock_free(__pa(ptr), size);
 }
 
 #ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
--- a/drivers/s390/char/sclp_early.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/drivers/s390/char/sclp_early.c
@@ -139,7 +139,7 @@ int __init sclp_early_get_core_info(stru
 	}
 	sclp_fill_core_info(info, sccb);
 out:
-	memblock_free_early((unsigned long)sccb, length);
+	memblock_free((unsigned long)sccb, length);
 	return rc;
 }
 
--- a/include/linux/memblock.h~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/include/linux/memblock.h
@@ -441,18 +441,6 @@ static inline void *memblock_alloc_node(
 				      MEMBLOCK_ALLOC_ACCESSIBLE, nid);
 }
 
-static inline void memblock_free_early(phys_addr_t base,
-					      phys_addr_t size)
-{
-	memblock_free(base, size);
-}
-
-static inline void memblock_free_early_nid(phys_addr_t base,
-						  phys_addr_t size, int nid)
-{
-	memblock_free(base, size);
-}
-
 static inline void memblock_free_late(phys_addr_t base, phys_addr_t size)
 {
 	__memblock_free_late(base, size);
--- a/kernel/dma/swiotlb.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/kernel/dma/swiotlb.c
@@ -247,7 +247,7 @@ swiotlb_init(int verbose)
 	return;
 
 fail_free_mem:
-	memblock_free_early(__pa(tlb), bytes);
+	memblock_free(__pa(tlb), bytes);
 fail:
 	pr_warn("Cannot allocate buffer");
 }
--- a/lib/cpumask.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/lib/cpumask.c
@@ -188,7 +188,7 @@ EXPORT_SYMBOL(free_cpumask_var);
  */
 void __init free_bootmem_cpumask_var(cpumask_var_t mask)
 {
-	memblock_free_early(__pa(mask), cpumask_size());
+	memblock_free(__pa(mask), cpumask_size());
 }
 #endif
 
--- a/mm/percpu.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/mm/percpu.c
@@ -2472,7 +2472,7 @@ struct pcpu_alloc_info * __init pcpu_all
  */
 void __init pcpu_free_alloc_info(struct pcpu_alloc_info *ai)
 {
-	memblock_free_early(__pa(ai), ai->__ai_size);
+	memblock_free(__pa(ai), ai->__ai_size);
 }
 
 /**
@@ -3134,7 +3134,7 @@ out_free_areas:
 out_free:
 	pcpu_free_alloc_info(ai);
 	if (areas)
-		memblock_free_early(__pa(areas), areas_size);
+		memblock_free(__pa(areas), areas_size);
 	return rc;
 }
 #endif /* BUILD_EMBED_FIRST_CHUNK */
@@ -3256,7 +3256,7 @@ enomem:
 		free_fn(page_address(pages[j]), PAGE_SIZE);
 	rc = -ENOMEM;
 out_free_ar:
-	memblock_free_early(__pa(pages), pages_size);
+	memblock_free(__pa(pages), pages_size);
 	pcpu_free_alloc_info(ai);
 	return rc;
 }
@@ -3286,7 +3286,7 @@ static void * __init pcpu_dfl_fc_alloc(u
 
 static void __init pcpu_dfl_fc_free(void *ptr, size_t size)
 {
-	memblock_free_early(__pa(ptr), size);
+	memblock_free(__pa(ptr), size);
 }
 
 void __init setup_per_cpu_areas(void)
--- a/mm/sparse.c~memblock-drop-memblock_free_early_nid-and-memblock_free_early
+++ a/mm/sparse.c
@@ -451,7 +451,7 @@ static void *sparsemap_buf_end __meminit
 static inline void __meminit sparse_buffer_free(unsigned long size)
 {
 	WARN_ON(!sparsemap_buf || size == 0);
-	memblock_free_early(__pa(sparsemap_buf), size);
+	memblock_free(__pa(sparsemap_buf), size);
 }
 
 static void __init sparse_buffer_init(unsigned long size, int nid)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 165/262] memblock: stop aliasing __memblock_free_late with memblock_free_late
  2021-11-05 20:34 incoming Andrew Morton
                   ` (163 preceding siblings ...)
  2021-11-05 20:43 ` [patch 164/262] memblock: drop memblock_free_early_nid() and memblock_free_early() Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 166/262] memblock: rename memblock_free to memblock_phys_free Andrew Morton
                   ` (96 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, christophe.leroy, jgross, linux-mm, mm-commits, rppt,
	Shahab.Vahedi, torvalds

From: Mike Rapoport <rppt@linux.ibm.com>
Subject: memblock: stop aliasing __memblock_free_late with memblock_free_late

memblock_free_late() is a NOP wrapper for __memblock_free_late(), there is
no point to keep this indirection.

Drop the wrapper and rename __memblock_free_late() to
memblock_free_late().

Link: https://lkml.kernel.org/r/20210930185031.18648-5-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memblock.h |    7 +------
 mm/memblock.c            |    8 ++++----
 2 files changed, 5 insertions(+), 10 deletions(-)

--- a/include/linux/memblock.h~memblock-stop-aliasing-__memblock_free_late-with-memblock_free_late
+++ a/include/linux/memblock.h
@@ -133,7 +133,7 @@ void __next_mem_range_rev(u64 *idx, int
 			  struct memblock_type *type_b, phys_addr_t *out_start,
 			  phys_addr_t *out_end, int *out_nid);
 
-void __memblock_free_late(phys_addr_t base, phys_addr_t size);
+void memblock_free_late(phys_addr_t base, phys_addr_t size);
 
 #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
 static inline void __next_physmem_range(u64 *idx, struct memblock_type *type,
@@ -441,11 +441,6 @@ static inline void *memblock_alloc_node(
 				      MEMBLOCK_ALLOC_ACCESSIBLE, nid);
 }
 
-static inline void memblock_free_late(phys_addr_t base, phys_addr_t size)
-{
-	__memblock_free_late(base, size);
-}
-
 /*
  * Set the allocation direction to bottom-up or top-down.
  */
--- a/mm/memblock.c~memblock-stop-aliasing-__memblock_free_late-with-memblock_free_late
+++ a/mm/memblock.c
@@ -366,14 +366,14 @@ void __init memblock_discard(void)
 		addr = __pa(memblock.reserved.regions);
 		size = PAGE_ALIGN(sizeof(struct memblock_region) *
 				  memblock.reserved.max);
-		__memblock_free_late(addr, size);
+		memblock_free_late(addr, size);
 	}
 
 	if (memblock.memory.regions != memblock_memory_init_regions) {
 		addr = __pa(memblock.memory.regions);
 		size = PAGE_ALIGN(sizeof(struct memblock_region) *
 				  memblock.memory.max);
-		__memblock_free_late(addr, size);
+		memblock_free_late(addr, size);
 	}
 
 	memblock_memory = NULL;
@@ -1589,7 +1589,7 @@ void * __init memblock_alloc_try_nid(
 }
 
 /**
- * __memblock_free_late - free pages directly to buddy allocator
+ * memblock_free_late - free pages directly to buddy allocator
  * @base: phys starting address of the  boot memory block
  * @size: size of the boot memory block in bytes
  *
@@ -1597,7 +1597,7 @@ void * __init memblock_alloc_try_nid(
  * down, but we are still initializing the system.  Pages are released directly
  * to the buddy allocator.
  */
-void __init __memblock_free_late(phys_addr_t base, phys_addr_t size)
+void __init memblock_free_late(phys_addr_t base, phys_addr_t size)
 {
 	phys_addr_t cursor, end;
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 166/262] memblock: rename memblock_free to memblock_phys_free
  2021-11-05 20:34 incoming Andrew Morton
                   ` (164 preceding siblings ...)
  2021-11-05 20:43 ` [patch 165/262] memblock: stop aliasing __memblock_free_late with memblock_free_late Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 167/262] memblock: use memblock_free for freeing virtual pointers Andrew Morton
                   ` (95 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, christophe.leroy, jgross, linux-mm, mm-commits, rppt,
	Shahab.Vahedi, torvalds

From: Mike Rapoport <rppt@linux.ibm.com>
Subject: memblock: rename memblock_free to memblock_phys_free

Since memblock_free() operates on a physical range, make its name reflect
it and rename it to memblock_phys_free(), so it will be a logical
counterpart to memblock_phys_alloc().

The callers are updated with the below semantic patch:

@@
expression addr;
expression size;
@@
- memblock_free(addr, size);
+ memblock_phys_free(addr, size);

Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/kernel/core_irongate.c         |    3 ++-
 arch/arc/mm/init.c                        |    2 +-
 arch/arm/mach-hisi/platmcpm.c             |    2 +-
 arch/arm/mm/init.c                        |    2 +-
 arch/arm64/mm/mmu.c                       |    4 ++--
 arch/mips/mm/init.c                       |    2 +-
 arch/mips/sgi-ip30/ip30-setup.c           |    6 +++---
 arch/powerpc/kernel/dt_cpu_ftrs.c         |    4 ++--
 arch/powerpc/kernel/paca.c                |    8 ++++----
 arch/powerpc/kernel/setup-common.c        |    2 +-
 arch/powerpc/kernel/setup_64.c            |    2 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |    2 +-
 arch/powerpc/platforms/pseries/svm.c      |    3 ++-
 arch/riscv/kernel/setup.c                 |    5 +++--
 arch/s390/kernel/setup.c                  |    8 ++++----
 arch/s390/kernel/smp.c                    |    4 ++--
 arch/s390/kernel/uv.c                     |    2 +-
 arch/s390/mm/kasan_init.c                 |    2 +-
 arch/sh/boards/mach-ap325rxa/setup.c      |    2 +-
 arch/sh/boards/mach-ecovec24/setup.c      |    4 ++--
 arch/sh/boards/mach-kfr2r09/setup.c       |    2 +-
 arch/sh/boards/mach-migor/setup.c         |    2 +-
 arch/sh/boards/mach-se/7724/setup.c       |    4 ++--
 arch/sparc/kernel/smp_64.c                |    2 +-
 arch/um/kernel/mem.c                      |    2 +-
 arch/x86/kernel/setup.c                   |    4 ++--
 arch/x86/mm/init.c                        |    2 +-
 arch/x86/xen/mmu_pv.c                     |    6 +++---
 arch/x86/xen/setup.c                      |    6 +++---
 drivers/base/arch_numa.c                  |    2 +-
 drivers/firmware/efi/memmap.c             |    2 +-
 drivers/of/kexec.c                        |    3 +--
 drivers/of/of_reserved_mem.c              |    5 +++--
 drivers/s390/char/sclp_early.c            |    2 +-
 drivers/usb/early/xhci-dbc.c              |   10 +++++-----
 drivers/xen/swiotlb-xen.c                 |    2 +-
 include/linux/memblock.h                  |    2 +-
 init/initramfs.c                          |    2 +-
 kernel/dma/swiotlb.c                      |    2 +-
 lib/cpumask.c                             |    2 +-
 mm/cma.c                                  |    2 +-
 mm/memblock.c                             |    8 ++++----
 mm/memory_hotplug.c                       |    2 +-
 mm/percpu.c                               |    8 ++++----
 mm/sparse.c                               |    2 +-
 45 files changed, 79 insertions(+), 76 deletions(-)

--- a/arch/alpha/kernel/core_irongate.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/alpha/kernel/core_irongate.c
@@ -233,7 +233,8 @@ albacore_init_arch(void)
 			unsigned long size;
 
 			size = initrd_end - initrd_start;
-			memblock_free(__pa(initrd_start), PAGE_ALIGN(size));
+			memblock_phys_free(__pa(initrd_start),
+					   PAGE_ALIGN(size));
 			if (!move_initrd(pci_mem))
 				printk("irongate_init_arch: initrd too big "
 				       "(%ldK)\ndisabling initrd\n",
--- a/arch/arc/mm/init.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/arc/mm/init.c
@@ -173,7 +173,7 @@ static void __init highmem_init(void)
 #ifdef CONFIG_HIGHMEM
 	unsigned long tmp;
 
-	memblock_free(high_mem_start, high_mem_sz);
+	memblock_phys_free(high_mem_start, high_mem_sz);
 	for (tmp = min_high_pfn; tmp < max_high_pfn; tmp++)
 		free_highmem_page(pfn_to_page(tmp));
 #endif
--- a/arch/arm64/mm/mmu.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/arm64/mm/mmu.c
@@ -738,8 +738,8 @@ void __init paging_init(void)
 	cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
 	init_mm.pgd = swapper_pg_dir;
 
-	memblock_free(__pa_symbol(init_pg_dir),
-		      __pa_symbol(init_pg_end) - __pa_symbol(init_pg_dir));
+	memblock_phys_free(__pa_symbol(init_pg_dir),
+			   __pa_symbol(init_pg_end) - __pa_symbol(init_pg_dir));
 
 	memblock_allow_resize();
 }
--- a/arch/arm/mach-hisi/platmcpm.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/arm/mach-hisi/platmcpm.c
@@ -339,7 +339,7 @@ err_fabric:
 err_sysctrl:
 	iounmap(relocation);
 err_reloc:
-	memblock_free(hip04_boot_method[0], hip04_boot_method[1]);
+	memblock_phys_free(hip04_boot_method[0], hip04_boot_method[1]);
 err:
 	return ret;
 }
--- a/arch/arm/mm/init.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/arm/mm/init.c
@@ -158,7 +158,7 @@ phys_addr_t __init arm_memblock_steal(ph
 		panic("Failed to steal %pa bytes at %pS\n",
 		      &size, (void *)_RET_IP_);
 
-	memblock_free(phys, size);
+	memblock_phys_free(phys, size);
 	memblock_remove(phys, size);
 
 	return phys;
--- a/arch/mips/mm/init.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/mips/mm/init.c
@@ -529,7 +529,7 @@ static void * __init pcpu_fc_alloc(unsig
 
 static void __init pcpu_fc_free(void *ptr, size_t size)
 {
-	memblock_free(__pa(ptr), size);
+	memblock_phys_free(__pa(ptr), size);
 }
 
 void __init setup_per_cpu_areas(void)
--- a/arch/mips/sgi-ip30/ip30-setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/mips/sgi-ip30/ip30-setup.c
@@ -69,10 +69,10 @@ static void __init ip30_mem_init(void)
 		total_mem += size;
 
 		if (addr >= IP30_REAL_MEMORY_START)
-			memblock_free(addr, size);
+			memblock_phys_free(addr, size);
 		else if ((addr + size) > IP30_REAL_MEMORY_START)
-			memblock_free(IP30_REAL_MEMORY_START,
-				     size - IP30_MAX_PROM_MEMORY);
+			memblock_phys_free(IP30_REAL_MEMORY_START,
+					   size - IP30_MAX_PROM_MEMORY);
 	}
 	pr_info("Detected %luMB of physical memory.\n", MEM_SHIFT(total_mem));
 }
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -1095,8 +1095,8 @@ static int __init dt_cpu_ftrs_scan_callb
 
 	cpufeatures_setup_finished();
 
-	memblock_free(__pa(dt_cpu_features),
-			sizeof(struct dt_cpu_feature)*nr_dt_cpu_features);
+	memblock_phys_free(__pa(dt_cpu_features),
+			   sizeof(struct dt_cpu_feature) * nr_dt_cpu_features);
 
 	return 0;
 }
--- a/arch/powerpc/kernel/paca.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/powerpc/kernel/paca.c
@@ -322,8 +322,8 @@ void __init free_unused_pacas(void)
 
 	new_ptrs_size = sizeof(struct paca_struct *) * nr_cpu_ids;
 	if (new_ptrs_size < paca_ptrs_size)
-		memblock_free(__pa(paca_ptrs) + new_ptrs_size,
-					paca_ptrs_size - new_ptrs_size);
+		memblock_phys_free(__pa(paca_ptrs) + new_ptrs_size,
+				   paca_ptrs_size - new_ptrs_size);
 
 	paca_nr_cpu_ids = nr_cpu_ids;
 	paca_ptrs_size = new_ptrs_size;
@@ -331,8 +331,8 @@ void __init free_unused_pacas(void)
 #ifdef CONFIG_PPC_BOOK3S_64
 	if (early_radix_enabled()) {
 		/* Ugly fixup, see new_slb_shadow() */
-		memblock_free(__pa(paca_ptrs[boot_cpuid]->slb_shadow_ptr),
-				sizeof(struct slb_shadow));
+		memblock_phys_free(__pa(paca_ptrs[boot_cpuid]->slb_shadow_ptr),
+				   sizeof(struct slb_shadow));
 		paca_ptrs[boot_cpuid]->slb_shadow_ptr = NULL;
 	}
 #endif
--- a/arch/powerpc/kernel/setup_64.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/powerpc/kernel/setup_64.c
@@ -812,7 +812,7 @@ static void * __init pcpu_alloc_bootmem(
 
 static void __init pcpu_free_bootmem(void *ptr, size_t size)
 {
-	memblock_free(__pa(ptr), size);
+	memblock_phys_free(__pa(ptr), size);
 }
 
 static int pcpu_cpu_distance(unsigned int from, unsigned int to)
--- a/arch/powerpc/kernel/setup-common.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/powerpc/kernel/setup-common.c
@@ -825,7 +825,7 @@ static void __init smp_setup_pacas(void)
 		set_hard_smp_processor_id(cpu, cpu_to_phys_id[cpu]);
 	}
 
-	memblock_free(__pa(cpu_to_phys_id), nr_cpu_ids * sizeof(u32));
+	memblock_phys_free(__pa(cpu_to_phys_id), nr_cpu_ids * sizeof(u32));
 	cpu_to_phys_id = NULL;
 }
 #endif
--- a/arch/powerpc/platforms/powernv/pci-ioda.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2981,7 +2981,7 @@ static void __init pnv_pci_init_ioda_phb
 	if (!phb->hose) {
 		pr_err("  Can't allocate PCI controller for %pOF\n",
 		       np);
-		memblock_free(__pa(phb), sizeof(struct pnv_phb));
+		memblock_phys_free(__pa(phb), sizeof(struct pnv_phb));
 		return;
 	}
 
--- a/arch/powerpc/platforms/pseries/svm.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/powerpc/platforms/pseries/svm.c
@@ -56,7 +56,8 @@ void __init svm_swiotlb_init(void)
 		return;
 
 
-	memblock_free(__pa(vstart), PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
+	memblock_phys_free(__pa(vstart),
+			   PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
 	panic("SVM: Cannot allocate SWIOTLB buffer");
 }
 
--- a/arch/riscv/kernel/setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/riscv/kernel/setup.c
@@ -230,13 +230,14 @@ static void __init init_resources(void)
 
 	/* Clean-up any unused pre-allocated resources */
 	if (res_idx >= 0)
-		memblock_free(__pa(mem_res), (res_idx + 1) * sizeof(*mem_res));
+		memblock_phys_free(__pa(mem_res),
+				   (res_idx + 1) * sizeof(*mem_res));
 	return;
 
  error:
 	/* Better an empty resource tree than an inconsistent one */
 	release_child_resources(&iomem_resource);
-	memblock_free(__pa(mem_res), mem_res_sz);
+	memblock_phys_free(__pa(mem_res), mem_res_sz);
 }
 
 
--- a/arch/s390/kernel/setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/s390/kernel/setup.c
@@ -693,7 +693,7 @@ static void __init reserve_crashkernel(v
 	}
 
 	if (register_memory_notifier(&kdump_mem_nb)) {
-		memblock_free(crash_base, crash_size);
+		memblock_phys_free(crash_base, crash_size);
 		return;
 	}
 
@@ -748,7 +748,7 @@ static void __init free_mem_detect_info(
 
 	get_mem_detect_reserved(&start, &size);
 	if (size)
-		memblock_free(start, size);
+		memblock_phys_free(start, size);
 }
 
 static const char * __init get_mem_info_source(void)
@@ -793,7 +793,7 @@ static void __init check_initrd(void)
 	if (initrd_data.start && initrd_data.size &&
 	    !memblock_is_region_memory(initrd_data.start, initrd_data.size)) {
 		pr_err("The initial RAM disk does not fit into the memory\n");
-		memblock_free(initrd_data.start, initrd_data.size);
+		memblock_phys_free(initrd_data.start, initrd_data.size);
 		initrd_start = initrd_end = 0;
 	}
 #endif
@@ -890,7 +890,7 @@ static void __init setup_randomness(void
 
 	if (stsi(vmms, 3, 2, 2) == 0 && vmms->count)
 		add_device_randomness(&vmms->vm, sizeof(vmms->vm[0]) * vmms->count);
-	memblock_free((unsigned long) vmms, PAGE_SIZE);
+	memblock_phys_free((unsigned long)vmms, PAGE_SIZE);
 }
 
 /*
--- a/arch/s390/kernel/smp.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/s390/kernel/smp.c
@@ -723,7 +723,7 @@ void __init smp_save_dump_cpus(void)
 			/* Get the CPU registers */
 			smp_save_cpu_regs(sa, addr, is_boot_cpu, page);
 	}
-	memblock_free(page, PAGE_SIZE);
+	memblock_phys_free(page, PAGE_SIZE);
 	diag_amode31_ops.diag308_reset();
 	pcpu_set_smt(0);
 }
@@ -880,7 +880,7 @@ void __init smp_detect_cpus(void)
 
 	/* Add CPUs present at boot */
 	__smp_rescan_cpus(info, true);
-	memblock_free((unsigned long)info, sizeof(*info));
+	memblock_phys_free((unsigned long)info, sizeof(*info));
 }
 
 /*
--- a/arch/s390/kernel/uv.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/s390/kernel/uv.c
@@ -64,7 +64,7 @@ void __init setup_uv(void)
 	}
 
 	if (uv_init(uv_stor_base, uv_info.uv_base_stor_len)) {
-		memblock_free(uv_stor_base, uv_info.uv_base_stor_len);
+		memblock_phys_free(uv_stor_base, uv_info.uv_base_stor_len);
 		goto fail;
 	}
 
--- a/arch/s390/mm/kasan_init.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/s390/mm/kasan_init.c
@@ -399,5 +399,5 @@ void __init kasan_copy_shadow_mapping(vo
 
 void __init kasan_free_early_identity(void)
 {
-	memblock_free(pgalloc_pos, pgalloc_freeable - pgalloc_pos);
+	memblock_phys_free(pgalloc_pos, pgalloc_freeable - pgalloc_pos);
 }
--- a/arch/sh/boards/mach-ap325rxa/setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/sh/boards/mach-ap325rxa/setup.c
@@ -560,7 +560,7 @@ static void __init ap325rxa_mv_mem_reser
 	if (!phys)
 		panic("Failed to allocate CEU memory\n");
 
-	memblock_free(phys, size);
+	memblock_phys_free(phys, size);
 	memblock_remove(phys, size);
 
 	ceu_dma_membase = phys;
--- a/arch/sh/boards/mach-ecovec24/setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/sh/boards/mach-ecovec24/setup.c
@@ -1502,7 +1502,7 @@ static void __init ecovec_mv_mem_reserve
 	if (!phys)
 		panic("Failed to allocate CEU0 memory\n");
 
-	memblock_free(phys, size);
+	memblock_phys_free(phys, size);
 	memblock_remove(phys, size);
 	ceu0_dma_membase = phys;
 
@@ -1510,7 +1510,7 @@ static void __init ecovec_mv_mem_reserve
 	if (!phys)
 		panic("Failed to allocate CEU1 memory\n");
 
-	memblock_free(phys, size);
+	memblock_phys_free(phys, size);
 	memblock_remove(phys, size);
 	ceu1_dma_membase = phys;
 }
--- a/arch/sh/boards/mach-kfr2r09/setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/sh/boards/mach-kfr2r09/setup.c
@@ -633,7 +633,7 @@ static void __init kfr2r09_mv_mem_reserv
 	if (!phys)
 		panic("Failed to allocate CEU memory\n");
 
-	memblock_free(phys, size);
+	memblock_phys_free(phys, size);
 	memblock_remove(phys, size);
 
 	ceu_dma_membase = phys;
--- a/arch/sh/boards/mach-migor/setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/sh/boards/mach-migor/setup.c
@@ -633,7 +633,7 @@ static void __init migor_mv_mem_reserve(
 	if (!phys)
 		panic("Failed to allocate CEU memory\n");
 
-	memblock_free(phys, size);
+	memblock_phys_free(phys, size);
 	memblock_remove(phys, size);
 
 	ceu_dma_membase = phys;
--- a/arch/sh/boards/mach-se/7724/setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/sh/boards/mach-se/7724/setup.c
@@ -966,7 +966,7 @@ static void __init ms7724se_mv_mem_reser
 	if (!phys)
 		panic("Failed to allocate CEU0 memory\n");
 
-	memblock_free(phys, size);
+	memblock_phys_free(phys, size);
 	memblock_remove(phys, size);
 	ceu0_dma_membase = phys;
 
@@ -974,7 +974,7 @@ static void __init ms7724se_mv_mem_reser
 	if (!phys)
 		panic("Failed to allocate CEU1 memory\n");
 
-	memblock_free(phys, size);
+	memblock_phys_free(phys, size);
 	memblock_remove(phys, size);
 	ceu1_dma_membase = phys;
 }
--- a/arch/sparc/kernel/smp_64.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/sparc/kernel/smp_64.c
@@ -1567,7 +1567,7 @@ static void * __init pcpu_alloc_bootmem(
 
 static void __init pcpu_free_bootmem(void *ptr, size_t size)
 {
-	memblock_free(__pa(ptr), size);
+	memblock_phys_free(__pa(ptr), size);
 }
 
 static int __init pcpu_cpu_distance(unsigned int from, unsigned int to)
--- a/arch/um/kernel/mem.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/um/kernel/mem.c
@@ -47,7 +47,7 @@ void __init mem_init(void)
 	 */
 	brk_end = (unsigned long) UML_ROUND_UP(sbrk(0));
 	map_memory(brk_end, __pa(brk_end), uml_reserved - brk_end, 1, 1, 0);
-	memblock_free(__pa(brk_end), uml_reserved - brk_end);
+	memblock_phys_free(__pa(brk_end), uml_reserved - brk_end);
 	uml_reserved = brk_end;
 
 	/* this will put all low memory onto the freelists */
--- a/arch/x86/kernel/setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/x86/kernel/setup.c
@@ -322,7 +322,7 @@ static void __init reserve_initrd(void)
 
 	relocate_initrd();
 
-	memblock_free(ramdisk_image, ramdisk_end - ramdisk_image);
+	memblock_phys_free(ramdisk_image, ramdisk_end - ramdisk_image);
 }
 
 #else
@@ -521,7 +521,7 @@ static void __init reserve_crashkernel(v
 	}
 
 	if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
-		memblock_free(crash_base, crash_size);
+		memblock_phys_free(crash_base, crash_size);
 		return;
 	}
 
--- a/arch/x86/mm/init.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/x86/mm/init.c
@@ -618,7 +618,7 @@ static void __init memory_map_top_down(u
 	 */
 	addr = memblock_phys_alloc_range(PMD_SIZE, PMD_SIZE, map_start,
 					 map_end);
-	memblock_free(addr, PMD_SIZE);
+	memblock_phys_free(addr, PMD_SIZE);
 	real_end = addr + PMD_SIZE;
 
 	/* step_size need to be small so pgt_buf from BRK could cover it */
--- a/arch/x86/xen/mmu_pv.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/x86/xen/mmu_pv.c
@@ -1025,7 +1025,7 @@ static void __init xen_free_ro_pages(uns
 	for (; vaddr < vaddr_end; vaddr += PAGE_SIZE)
 		make_lowmem_page_readwrite(vaddr);
 
-	memblock_free(paddr, size);
+	memblock_phys_free(paddr, size);
 }
 
 static void __init xen_cleanmfnmap_free_pgtbl(void *pgtbl, bool unpin)
@@ -1151,7 +1151,7 @@ static void __init xen_pagetable_p2m_fre
 		xen_cleanhighmap(addr, addr + size);
 		size = PAGE_ALIGN(xen_start_info->nr_pages *
 				  sizeof(unsigned long));
-		memblock_free(__pa(addr), size);
+		memblock_phys_free(__pa(addr), size);
 	} else {
 		xen_cleanmfnmap(addr);
 	}
@@ -1955,7 +1955,7 @@ void __init xen_relocate_p2m(void)
 		pfn_end = p2m_pfn_end;
 	}
 
-	memblock_free(PFN_PHYS(pfn), PAGE_SIZE * (pfn_end - pfn));
+	memblock_phys_free(PFN_PHYS(pfn), PAGE_SIZE * (pfn_end - pfn));
 	while (pfn < pfn_end) {
 		if (pfn == p2m_pfn) {
 			pfn = p2m_pfn_end;
--- a/arch/x86/xen/setup.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/arch/x86/xen/setup.c
@@ -153,7 +153,7 @@ static void __init xen_del_extra_mem(uns
 			break;
 		}
 	}
-	memblock_free(PFN_PHYS(start_pfn), PFN_PHYS(n_pfns));
+	memblock_phys_free(PFN_PHYS(start_pfn), PFN_PHYS(n_pfns));
 }
 
 /*
@@ -719,7 +719,7 @@ static void __init xen_reserve_xen_mfnli
 		return;
 
 	xen_relocate_p2m();
-	memblock_free(start, size);
+	memblock_phys_free(start, size);
 }
 
 /**
@@ -885,7 +885,7 @@ char * __init xen_memory_setup(void)
 		xen_phys_memcpy(new_area, start, size);
 		pr_info("initrd moved from [mem %#010llx-%#010llx] to [mem %#010llx-%#010llx]\n",
 			start, start + size, new_area, new_area + size);
-		memblock_free(start, size);
+		memblock_phys_free(start, size);
 		boot_params.hdr.ramdisk_image = new_area;
 		boot_params.ext_ramdisk_image = new_area >> 32;
 	}
--- a/drivers/base/arch_numa.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/drivers/base/arch_numa.c
@@ -166,7 +166,7 @@ static void * __init pcpu_fc_alloc(unsig
 
 static void __init pcpu_fc_free(void *ptr, size_t size)
 {
-	memblock_free(__pa(ptr), size);
+	memblock_phys_free(__pa(ptr), size);
 }
 
 #ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
--- a/drivers/firmware/efi/memmap.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/drivers/firmware/efi/memmap.c
@@ -35,7 +35,7 @@ void __init __efi_memmap_free(u64 phys,
 		if (slab_is_available())
 			memblock_free_late(phys, size);
 		else
-			memblock_free(phys, size);
+			memblock_phys_free(phys, size);
 	} else if (flags & EFI_MEMMAP_SLAB) {
 		struct page *p = pfn_to_page(PHYS_PFN(phys));
 		unsigned int order = get_order(size);
--- a/drivers/of/kexec.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/drivers/of/kexec.c
@@ -171,8 +171,7 @@ int ima_free_kexec_buffer(void)
 	if (ret)
 		return ret;
 
-	return memblock_free(addr, size);
-
+	return memblock_phys_free(addr, size);
 }
 
 /**
--- a/drivers/of/of_reserved_mem.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/drivers/of/of_reserved_mem.c
@@ -46,7 +46,7 @@ static int __init early_init_dt_alloc_re
 	if (nomap) {
 		err = memblock_mark_nomap(base, size);
 		if (err)
-			memblock_free(base, size);
+			memblock_phys_free(base, size);
 		kmemleak_ignore_phys(base);
 	}
 
@@ -284,7 +284,8 @@ void __init fdt_init_reserved_mem(void)
 				if (nomap)
 					memblock_clear_nomap(rmem->base, rmem->size);
 				else
-					memblock_free(rmem->base, rmem->size);
+					memblock_phys_free(rmem->base,
+							   rmem->size);
 			}
 		}
 	}
--- a/drivers/s390/char/sclp_early.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/drivers/s390/char/sclp_early.c
@@ -139,7 +139,7 @@ int __init sclp_early_get_core_info(stru
 	}
 	sclp_fill_core_info(info, sccb);
 out:
-	memblock_free((unsigned long)sccb, length);
+	memblock_phys_free((unsigned long)sccb, length);
 	return rc;
 }
 
--- a/drivers/usb/early/xhci-dbc.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/drivers/usb/early/xhci-dbc.c
@@ -185,7 +185,7 @@ static void __init xdbc_free_ring(struct
 	if (!seg)
 		return;
 
-	memblock_free(seg->dma, PAGE_SIZE);
+	memblock_phys_free(seg->dma, PAGE_SIZE);
 	ring->segment = NULL;
 }
 
@@ -665,10 +665,10 @@ int __init early_xdbc_setup_hardware(voi
 		xdbc_free_ring(&xdbc.in_ring);
 
 		if (xdbc.table_dma)
-			memblock_free(xdbc.table_dma, PAGE_SIZE);
+			memblock_phys_free(xdbc.table_dma, PAGE_SIZE);
 
 		if (xdbc.out_dma)
-			memblock_free(xdbc.out_dma, PAGE_SIZE);
+			memblock_phys_free(xdbc.out_dma, PAGE_SIZE);
 
 		xdbc.table_base = NULL;
 		xdbc.out_buf = NULL;
@@ -987,8 +987,8 @@ free_and_quit:
 	xdbc_free_ring(&xdbc.evt_ring);
 	xdbc_free_ring(&xdbc.out_ring);
 	xdbc_free_ring(&xdbc.in_ring);
-	memblock_free(xdbc.table_dma, PAGE_SIZE);
-	memblock_free(xdbc.out_dma, PAGE_SIZE);
+	memblock_phys_free(xdbc.table_dma, PAGE_SIZE);
+	memblock_phys_free(xdbc.out_dma, PAGE_SIZE);
 	writel(0, &xdbc.xdbc_reg->control);
 	early_iounmap(xdbc.xhci_base, xdbc.xhci_length);
 
--- a/drivers/xen/swiotlb-xen.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/drivers/xen/swiotlb-xen.c
@@ -241,7 +241,7 @@ retry:
 	 */
 	rc = xen_swiotlb_fixup(start, nslabs);
 	if (rc) {
-		memblock_free(__pa(start), PAGE_ALIGN(bytes));
+		memblock_phys_free(__pa(start), PAGE_ALIGN(bytes));
 		if (nslabs > 1024 && repeat--) {
 			/* Min is 2MB */
 			nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
--- a/include/linux/memblock.h~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/include/linux/memblock.h
@@ -103,7 +103,7 @@ void memblock_allow_resize(void);
 int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid);
 int memblock_add(phys_addr_t base, phys_addr_t size);
 int memblock_remove(phys_addr_t base, phys_addr_t size);
-int memblock_free(phys_addr_t base, phys_addr_t size);
+int memblock_phys_free(phys_addr_t base, phys_addr_t size);
 int memblock_reserve(phys_addr_t base, phys_addr_t size);
 #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
 int memblock_physmem_add(phys_addr_t base, phys_addr_t size);
--- a/init/initramfs.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/init/initramfs.c
@@ -607,7 +607,7 @@ void __weak __init free_initrd_mem(unsig
 	unsigned long aligned_start = ALIGN_DOWN(start, PAGE_SIZE);
 	unsigned long aligned_end = ALIGN(end, PAGE_SIZE);
 
-	memblock_free(__pa(aligned_start), aligned_end - aligned_start);
+	memblock_phys_free(__pa(aligned_start), aligned_end - aligned_start);
 #endif
 
 	free_reserved_area((void *)start, (void *)end, POISON_FREE_INITMEM,
--- a/kernel/dma/swiotlb.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/kernel/dma/swiotlb.c
@@ -247,7 +247,7 @@ swiotlb_init(int verbose)
 	return;
 
 fail_free_mem:
-	memblock_free(__pa(tlb), bytes);
+	memblock_phys_free(__pa(tlb), bytes);
 fail:
 	pr_warn("Cannot allocate buffer");
 }
--- a/lib/cpumask.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/lib/cpumask.c
@@ -188,7 +188,7 @@ EXPORT_SYMBOL(free_cpumask_var);
  */
 void __init free_bootmem_cpumask_var(cpumask_var_t mask)
 {
-	memblock_free(__pa(mask), cpumask_size());
+	memblock_phys_free(__pa(mask), cpumask_size());
 }
 #endif
 
--- a/mm/cma.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/mm/cma.c
@@ -378,7 +378,7 @@ int __init cma_declare_contiguous_nid(ph
 	return 0;
 
 free_mem:
-	memblock_free(base, size);
+	memblock_phys_free(base, size);
 err:
 	pr_err("Failed to reserve %ld MiB\n", (unsigned long)size / SZ_1M);
 	return ret;
--- a/mm/memblock.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/mm/memblock.c
@@ -806,18 +806,18 @@ int __init_memblock memblock_remove(phys
 void __init_memblock memblock_free_ptr(void *ptr, size_t size)
 {
 	if (ptr)
-		memblock_free(__pa(ptr), size);
+		memblock_phys_free(__pa(ptr), size);
 }
 
 /**
- * memblock_free - free boot memory block
+ * memblock_phys_free - free boot memory block
  * @base: phys starting address of the  boot memory block
  * @size: size of the boot memory block in bytes
  *
  * Free boot memory block previously allocated by memblock_alloc_xx() API.
  * The freeing memory will not be released to the buddy allocator.
  */
-int __init_memblock memblock_free(phys_addr_t base, phys_addr_t size)
+int __init_memblock memblock_phys_free(phys_addr_t base, phys_addr_t size)
 {
 	phys_addr_t end = base + size - 1;
 
@@ -1937,7 +1937,7 @@ static void __init free_memmap(unsigned
 	 * memmap array.
 	 */
 	if (pg < pgend)
-		memblock_free(pg, pgend - pg);
+		memblock_phys_free(pg, pgend - pg);
 }
 
 /*
--- a/mm/memory_hotplug.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/mm/memory_hotplug.c
@@ -2204,7 +2204,7 @@ static int __ref try_remove_memory(u64 s
 	arch_remove_memory(start, size, altmap);
 
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
-		memblock_free(start, size);
+		memblock_phys_free(start, size);
 		memblock_remove(start, size);
 	}
 
--- a/mm/percpu.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/mm/percpu.c
@@ -2472,7 +2472,7 @@ struct pcpu_alloc_info * __init pcpu_all
  */
 void __init pcpu_free_alloc_info(struct pcpu_alloc_info *ai)
 {
-	memblock_free(__pa(ai), ai->__ai_size);
+	memblock_phys_free(__pa(ai), ai->__ai_size);
 }
 
 /**
@@ -3134,7 +3134,7 @@ out_free_areas:
 out_free:
 	pcpu_free_alloc_info(ai);
 	if (areas)
-		memblock_free(__pa(areas), areas_size);
+		memblock_phys_free(__pa(areas), areas_size);
 	return rc;
 }
 #endif /* BUILD_EMBED_FIRST_CHUNK */
@@ -3256,7 +3256,7 @@ enomem:
 		free_fn(page_address(pages[j]), PAGE_SIZE);
 	rc = -ENOMEM;
 out_free_ar:
-	memblock_free(__pa(pages), pages_size);
+	memblock_phys_free(__pa(pages), pages_size);
 	pcpu_free_alloc_info(ai);
 	return rc;
 }
@@ -3286,7 +3286,7 @@ static void * __init pcpu_dfl_fc_alloc(u
 
 static void __init pcpu_dfl_fc_free(void *ptr, size_t size)
 {
-	memblock_free(__pa(ptr), size);
+	memblock_phys_free(__pa(ptr), size);
 }
 
 void __init setup_per_cpu_areas(void)
--- a/mm/sparse.c~memblock-rename-memblock_free-to-memblock_phys_free
+++ a/mm/sparse.c
@@ -451,7 +451,7 @@ static void *sparsemap_buf_end __meminit
 static inline void __meminit sparse_buffer_free(unsigned long size)
 {
 	WARN_ON(!sparsemap_buf || size == 0);
-	memblock_free(__pa(sparsemap_buf), size);
+	memblock_phys_free(__pa(sparsemap_buf), size);
 }
 
 static void __init sparse_buffer_init(unsigned long size, int nid)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 167/262] memblock: use memblock_free for freeing virtual pointers
  2021-11-05 20:34 incoming Andrew Morton
                   ` (165 preceding siblings ...)
  2021-11-05 20:43 ` [patch 166/262] memblock: rename memblock_free to memblock_phys_free Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 168/262] mm: mark the OOM reaper thread as freezable Andrew Morton
                   ` (94 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, christophe.leroy, jgross, linux-mm, mm-commits, rppt, sfr,
	Shahab.Vahedi, torvalds

From: Mike Rapoport <rppt@linux.ibm.com>
Subject: memblock: use memblock_free for freeing virtual pointers

Rename memblock_free_ptr() to memblock_free() and use memblock_free() when
freeing a virtual pointer so that memblock_free() will be a counterpart of
memblock_alloc()

The callers are updated with the below semantic patch and manual addition
of (void *) casting to pointers that are represented by unsigned long
variables.

@@
identifier vaddr;
expression size;
@@
(
- memblock_phys_free(__pa(vaddr), size);
+ memblock_free(vaddr, size);
|
- memblock_free_ptr(vaddr, size);
+ memblock_free(vaddr, size);
)

[sfr@canb.auug.org.au: fixup]
  Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au
Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/kernel/core_irongate.c         |    3 +--
 arch/mips/mm/init.c                       |    2 +-
 arch/powerpc/kernel/dt_cpu_ftrs.c         |    4 ++--
 arch/powerpc/kernel/setup-common.c        |    2 +-
 arch/powerpc/kernel/setup_64.c            |    2 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |    2 +-
 arch/powerpc/platforms/pseries/svm.c      |    3 +--
 arch/riscv/kernel/setup.c                 |    5 ++---
 arch/sparc/kernel/smp_64.c                |    2 +-
 arch/um/kernel/mem.c                      |    2 +-
 arch/x86/kernel/setup_percpu.c            |    2 +-
 arch/x86/mm/kasan_init_64.c               |    4 ++--
 arch/x86/mm/numa.c                        |    2 +-
 arch/x86/mm/numa_emulation.c              |    2 +-
 arch/x86/xen/mmu_pv.c                     |    2 +-
 arch/x86/xen/p2m.c                        |    2 +-
 drivers/base/arch_numa.c                  |    4 ++--
 drivers/macintosh/smu.c                   |    2 +-
 drivers/xen/swiotlb-xen.c                 |    2 +-
 include/linux/memblock.h                  |    2 +-
 init/initramfs.c                          |    2 +-
 init/main.c                               |    4 ++--
 kernel/dma/swiotlb.c                      |    2 +-
 kernel/printk/printk.c                    |    4 ++--
 lib/bootconfig.c                          |    2 +-
 lib/cpumask.c                             |    2 +-
 mm/memblock.c                             |    6 +++---
 mm/percpu.c                               |    8 ++++----
 mm/sparse.c                               |    2 +-
 29 files changed, 40 insertions(+), 43 deletions(-)

--- a/arch/alpha/kernel/core_irongate.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/alpha/kernel/core_irongate.c
@@ -233,8 +233,7 @@ albacore_init_arch(void)
 			unsigned long size;
 
 			size = initrd_end - initrd_start;
-			memblock_phys_free(__pa(initrd_start),
-					   PAGE_ALIGN(size));
+			memblock_free((void *)initrd_start, PAGE_ALIGN(size));
 			if (!move_initrd(pci_mem))
 				printk("irongate_init_arch: initrd too big "
 				       "(%ldK)\ndisabling initrd\n",
--- a/arch/mips/mm/init.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/mips/mm/init.c
@@ -529,7 +529,7 @@ static void * __init pcpu_fc_alloc(unsig
 
 static void __init pcpu_fc_free(void *ptr, size_t size)
 {
-	memblock_phys_free(__pa(ptr), size);
+	memblock_free(ptr, size);
 }
 
 void __init setup_per_cpu_areas(void)
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -1095,8 +1095,8 @@ static int __init dt_cpu_ftrs_scan_callb
 
 	cpufeatures_setup_finished();
 
-	memblock_phys_free(__pa(dt_cpu_features),
-			   sizeof(struct dt_cpu_feature) * nr_dt_cpu_features);
+	memblock_free(dt_cpu_features,
+		      sizeof(struct dt_cpu_feature) * nr_dt_cpu_features);
 
 	return 0;
 }
--- a/arch/powerpc/kernel/setup_64.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/powerpc/kernel/setup_64.c
@@ -812,7 +812,7 @@ static void * __init pcpu_alloc_bootmem(
 
 static void __init pcpu_free_bootmem(void *ptr, size_t size)
 {
-	memblock_phys_free(__pa(ptr), size);
+	memblock_free(ptr, size);
 }
 
 static int pcpu_cpu_distance(unsigned int from, unsigned int to)
--- a/arch/powerpc/kernel/setup-common.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/powerpc/kernel/setup-common.c
@@ -825,7 +825,7 @@ static void __init smp_setup_pacas(void)
 		set_hard_smp_processor_id(cpu, cpu_to_phys_id[cpu]);
 	}
 
-	memblock_phys_free(__pa(cpu_to_phys_id), nr_cpu_ids * sizeof(u32));
+	memblock_free(cpu_to_phys_id, nr_cpu_ids * sizeof(u32));
 	cpu_to_phys_id = NULL;
 }
 #endif
--- a/arch/powerpc/platforms/powernv/pci-ioda.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2981,7 +2981,7 @@ static void __init pnv_pci_init_ioda_phb
 	if (!phb->hose) {
 		pr_err("  Can't allocate PCI controller for %pOF\n",
 		       np);
-		memblock_phys_free(__pa(phb), sizeof(struct pnv_phb));
+		memblock_free(phb, sizeof(struct pnv_phb));
 		return;
 	}
 
--- a/arch/powerpc/platforms/pseries/svm.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/powerpc/platforms/pseries/svm.c
@@ -56,8 +56,7 @@ void __init svm_swiotlb_init(void)
 		return;
 
 
-	memblock_phys_free(__pa(vstart),
-			   PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
+	memblock_free(vstart, PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
 	panic("SVM: Cannot allocate SWIOTLB buffer");
 }
 
--- a/arch/riscv/kernel/setup.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/riscv/kernel/setup.c
@@ -230,14 +230,13 @@ static void __init init_resources(void)
 
 	/* Clean-up any unused pre-allocated resources */
 	if (res_idx >= 0)
-		memblock_phys_free(__pa(mem_res),
-				   (res_idx + 1) * sizeof(*mem_res));
+		memblock_free(mem_res, (res_idx + 1) * sizeof(*mem_res));
 	return;
 
  error:
 	/* Better an empty resource tree than an inconsistent one */
 	release_child_resources(&iomem_resource);
-	memblock_phys_free(__pa(mem_res), mem_res_sz);
+	memblock_free(mem_res, mem_res_sz);
 }
 
 
--- a/arch/sparc/kernel/smp_64.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/sparc/kernel/smp_64.c
@@ -1567,7 +1567,7 @@ static void * __init pcpu_alloc_bootmem(
 
 static void __init pcpu_free_bootmem(void *ptr, size_t size)
 {
-	memblock_phys_free(__pa(ptr), size);
+	memblock_free(ptr, size);
 }
 
 static int __init pcpu_cpu_distance(unsigned int from, unsigned int to)
--- a/arch/um/kernel/mem.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/um/kernel/mem.c
@@ -47,7 +47,7 @@ void __init mem_init(void)
 	 */
 	brk_end = (unsigned long) UML_ROUND_UP(sbrk(0));
 	map_memory(brk_end, __pa(brk_end), uml_reserved - brk_end, 1, 1, 0);
-	memblock_phys_free(__pa(brk_end), uml_reserved - brk_end);
+	memblock_free((void *)brk_end, uml_reserved - brk_end);
 	uml_reserved = brk_end;
 
 	/* this will put all low memory onto the freelists */
--- a/arch/x86/kernel/setup_percpu.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/x86/kernel/setup_percpu.c
@@ -135,7 +135,7 @@ static void * __init pcpu_fc_alloc(unsig
 
 static void __init pcpu_fc_free(void *ptr, size_t size)
 {
-	memblock_free_ptr(ptr, size);
+	memblock_free(ptr, size);
 }
 
 static int __init pcpu_cpu_distance(unsigned int from, unsigned int to)
--- a/arch/x86/mm/kasan_init_64.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/x86/mm/kasan_init_64.c
@@ -49,7 +49,7 @@ static void __init kasan_populate_pmd(pm
 			p = early_alloc(PMD_SIZE, nid, false);
 			if (p && pmd_set_huge(pmd, __pa(p), PAGE_KERNEL))
 				return;
-			memblock_free_ptr(p, PMD_SIZE);
+			memblock_free(p, PMD_SIZE);
 		}
 
 		p = early_alloc(PAGE_SIZE, nid, true);
@@ -85,7 +85,7 @@ static void __init kasan_populate_pud(pu
 			p = early_alloc(PUD_SIZE, nid, false);
 			if (p && pud_set_huge(pud, __pa(p), PAGE_KERNEL))
 				return;
-			memblock_free_ptr(p, PUD_SIZE);
+			memblock_free(p, PUD_SIZE);
 		}
 
 		p = early_alloc(PAGE_SIZE, nid, true);
--- a/arch/x86/mm/numa.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/x86/mm/numa.c
@@ -355,7 +355,7 @@ void __init numa_reset_distance(void)
 
 	/* numa_distance could be 1LU marking allocation failure, test cnt */
 	if (numa_distance_cnt)
-		memblock_free_ptr(numa_distance, size);
+		memblock_free(numa_distance, size);
 	numa_distance_cnt = 0;
 	numa_distance = NULL;	/* enable table creation */
 }
--- a/arch/x86/mm/numa_emulation.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/x86/mm/numa_emulation.c
@@ -517,7 +517,7 @@ void __init numa_emulation(struct numa_m
 	}
 
 	/* free the copied physical distance table */
-	memblock_free_ptr(phys_dist, phys_size);
+	memblock_free(phys_dist, phys_size);
 	return;
 
 no_emu:
--- a/arch/x86/xen/mmu_pv.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/x86/xen/mmu_pv.c
@@ -1151,7 +1151,7 @@ static void __init xen_pagetable_p2m_fre
 		xen_cleanhighmap(addr, addr + size);
 		size = PAGE_ALIGN(xen_start_info->nr_pages *
 				  sizeof(unsigned long));
-		memblock_phys_free(__pa(addr), size);
+		memblock_free((void *)addr, size);
 	} else {
 		xen_cleanmfnmap(addr);
 	}
--- a/arch/x86/xen/p2m.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/arch/x86/xen/p2m.c
@@ -197,7 +197,7 @@ static void * __ref alloc_p2m_page(void)
 static void __ref free_p2m_page(void *p)
 {
 	if (unlikely(!slab_is_available())) {
-		memblock_free_ptr(p, PAGE_SIZE);
+		memblock_free(p, PAGE_SIZE);
 		return;
 	}
 
--- a/drivers/base/arch_numa.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/drivers/base/arch_numa.c
@@ -166,7 +166,7 @@ static void * __init pcpu_fc_alloc(unsig
 
 static void __init pcpu_fc_free(void *ptr, size_t size)
 {
-	memblock_phys_free(__pa(ptr), size);
+	memblock_free(ptr, size);
 }
 
 #ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
@@ -326,7 +326,7 @@ void __init numa_free_distance(void)
 	size = numa_distance_cnt * numa_distance_cnt *
 		sizeof(numa_distance[0]);
 
-	memblock_free_ptr(numa_distance, size);
+	memblock_free(numa_distance, size);
 	numa_distance_cnt = 0;
 	numa_distance = NULL;
 }
--- a/drivers/macintosh/smu.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/drivers/macintosh/smu.c
@@ -570,7 +570,7 @@ fail_msg_node:
 fail_db_node:
 	of_node_put(smu->db_node);
 fail_bootmem:
-	memblock_free_ptr(smu, sizeof(struct smu_device));
+	memblock_free(smu, sizeof(struct smu_device));
 	smu = NULL;
 fail_np:
 	of_node_put(np);
--- a/drivers/xen/swiotlb-xen.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/drivers/xen/swiotlb-xen.c
@@ -241,7 +241,7 @@ retry:
 	 */
 	rc = xen_swiotlb_fixup(start, nslabs);
 	if (rc) {
-		memblock_phys_free(__pa(start), PAGE_ALIGN(bytes));
+		memblock_free(start, PAGE_ALIGN(bytes));
 		if (nslabs > 1024 && repeat--) {
 			/* Min is 2MB */
 			nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
--- a/include/linux/memblock.h~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/include/linux/memblock.h
@@ -118,7 +118,7 @@ int memblock_mark_nomap(phys_addr_t base
 int memblock_clear_nomap(phys_addr_t base, phys_addr_t size);
 
 void memblock_free_all(void);
-void memblock_free_ptr(void *ptr, size_t size);
+void memblock_free(void *ptr, size_t size);
 void reset_node_managed_pages(pg_data_t *pgdat);
 void reset_all_zones_managed_pages(void);
 
--- a/init/initramfs.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/init/initramfs.c
@@ -607,7 +607,7 @@ void __weak __init free_initrd_mem(unsig
 	unsigned long aligned_start = ALIGN_DOWN(start, PAGE_SIZE);
 	unsigned long aligned_end = ALIGN(end, PAGE_SIZE);
 
-	memblock_phys_free(__pa(aligned_start), aligned_end - aligned_start);
+	memblock_free((void *)aligned_start, aligned_end - aligned_start);
 #endif
 
 	free_reserved_area((void *)start, (void *)end, POISON_FREE_INITMEM,
--- a/init/main.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/init/main.c
@@ -382,7 +382,7 @@ static char * __init xbc_make_cmdline(co
 	ret = xbc_snprint_cmdline(new_cmdline, len + 1, root);
 	if (ret < 0 || ret > len) {
 		pr_err("Failed to print extra kernel cmdline.\n");
-		memblock_free_ptr(new_cmdline, len + 1);
+		memblock_free(new_cmdline, len + 1);
 		return NULL;
 	}
 
@@ -925,7 +925,7 @@ static void __init print_unknown_bootopt
 		end += sprintf(end, " %s", *p);
 
 	pr_notice("Unknown command line parameters:%s\n", unknown_options);
-	memblock_free_ptr(unknown_options, len);
+	memblock_free(unknown_options, len);
 }
 
 asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
--- a/kernel/dma/swiotlb.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/kernel/dma/swiotlb.c
@@ -247,7 +247,7 @@ swiotlb_init(int verbose)
 	return;
 
 fail_free_mem:
-	memblock_phys_free(__pa(tlb), bytes);
+	memblock_free(tlb, bytes);
 fail:
 	pr_warn("Cannot allocate buffer");
 }
--- a/kernel/printk/printk.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/kernel/printk/printk.c
@@ -1166,9 +1166,9 @@ void __init setup_log_buf(int early)
 	return;
 
 err_free_descs:
-	memblock_free_ptr(new_descs, new_descs_size);
+	memblock_free(new_descs, new_descs_size);
 err_free_log_buf:
-	memblock_free_ptr(new_log_buf, new_log_buf_len);
+	memblock_free(new_log_buf, new_log_buf_len);
 }
 
 static bool __read_mostly ignore_loglevel;
--- a/lib/bootconfig.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/lib/bootconfig.c
@@ -792,7 +792,7 @@ void __init xbc_destroy_all(void)
 	xbc_data = NULL;
 	xbc_data_size = 0;
 	xbc_node_num = 0;
-	memblock_free_ptr(xbc_nodes, sizeof(struct xbc_node) * XBC_NODE_MAX);
+	memblock_free(xbc_nodes, sizeof(struct xbc_node) * XBC_NODE_MAX);
 	xbc_nodes = NULL;
 	brace_index = 0;
 }
--- a/lib/cpumask.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/lib/cpumask.c
@@ -188,7 +188,7 @@ EXPORT_SYMBOL(free_cpumask_var);
  */
 void __init free_bootmem_cpumask_var(cpumask_var_t mask)
 {
-	memblock_phys_free(__pa(mask), cpumask_size());
+	memblock_free(mask, cpumask_size());
 }
 #endif
 
--- a/mm/memblock.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/mm/memblock.c
@@ -472,7 +472,7 @@ static int __init_memblock memblock_doub
 		kfree(old_array);
 	else if (old_array != memblock_memory_init_regions &&
 		 old_array != memblock_reserved_init_regions)
-		memblock_free_ptr(old_array, old_alloc_size);
+		memblock_free(old_array, old_alloc_size);
 
 	/*
 	 * Reserve the new array if that comes from the memblock.  Otherwise, we
@@ -796,14 +796,14 @@ int __init_memblock memblock_remove(phys
 }
 
 /**
- * memblock_free_ptr - free boot memory allocation
+ * memblock_free - free boot memory allocation
  * @ptr: starting address of the  boot memory allocation
  * @size: size of the boot memory block in bytes
  *
  * Free boot memory block previously allocated by memblock_alloc_xx() API.
  * The freeing memory will not be released to the buddy allocator.
  */
-void __init_memblock memblock_free_ptr(void *ptr, size_t size)
+void __init_memblock memblock_free(void *ptr, size_t size)
 {
 	if (ptr)
 		memblock_phys_free(__pa(ptr), size);
--- a/mm/percpu.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/mm/percpu.c
@@ -2472,7 +2472,7 @@ struct pcpu_alloc_info * __init pcpu_all
  */
 void __init pcpu_free_alloc_info(struct pcpu_alloc_info *ai)
 {
-	memblock_phys_free(__pa(ai), ai->__ai_size);
+	memblock_free(ai, ai->__ai_size);
 }
 
 /**
@@ -3134,7 +3134,7 @@ out_free_areas:
 out_free:
 	pcpu_free_alloc_info(ai);
 	if (areas)
-		memblock_phys_free(__pa(areas), areas_size);
+		memblock_free(areas, areas_size);
 	return rc;
 }
 #endif /* BUILD_EMBED_FIRST_CHUNK */
@@ -3256,7 +3256,7 @@ enomem:
 		free_fn(page_address(pages[j]), PAGE_SIZE);
 	rc = -ENOMEM;
 out_free_ar:
-	memblock_phys_free(__pa(pages), pages_size);
+	memblock_free(pages, pages_size);
 	pcpu_free_alloc_info(ai);
 	return rc;
 }
@@ -3286,7 +3286,7 @@ static void * __init pcpu_dfl_fc_alloc(u
 
 static void __init pcpu_dfl_fc_free(void *ptr, size_t size)
 {
-	memblock_phys_free(__pa(ptr), size);
+	memblock_free(ptr, size);
 }
 
 void __init setup_per_cpu_areas(void)
--- a/mm/sparse.c~memblock-use-memblock_free-for-freeing-virtual-pointers
+++ a/mm/sparse.c
@@ -451,7 +451,7 @@ static void *sparsemap_buf_end __meminit
 static inline void __meminit sparse_buffer_free(unsigned long size)
 {
 	WARN_ON(!sparsemap_buf || size == 0);
-	memblock_phys_free(__pa(sparsemap_buf), size);
+	memblock_free(sparsemap_buf, size);
 }
 
 static void __init sparse_buffer_init(unsigned long size, int nid)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 168/262] mm: mark the OOM reaper thread as freezable
  2021-11-05 20:34 incoming Andrew Morton
                   ` (166 preceding siblings ...)
  2021-11-05 20:43 ` [patch 167/262] memblock: use memblock_free for freeing virtual pointers Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 169/262] hugetlbfs: extend the definition of hugepages parameter to support node allocation Andrew Morton
                   ` (93 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, linux-mm, mgorman, mhocko, mm-commits, rientjes, sultan, torvalds

From: Sultan Alsawaf <sultan@kerneltoast.com>
Subject: mm: mark the OOM reaper thread as freezable

The OOM reaper alters user address space which might theoretically
alter the snapshot if reaping is allowed to happen after the freezer
quiescent state.  To this end, the reaper kthread uses
wait_event_freezable() while waiting for any work so that it cannot run
while the system freezes.

However, the current implementation doesn't respect the freezer because
all kernel threads are created with the PF_NOFREEZE flag, so they are
automatically excluded from freezing operations.  This means that the
OOM reaper can race with system snapshotting if it has work to do while
the system is being frozen.

Fix this by adding a set_freezable() call which will clear the
PF_NOFREEZE flag and thus make the OOM reaper visible to the freezer.

Please note that the OOM reaper altering the snapshot this way is
mostly a theoretical concern and has not been observed in practice.

Link: https://lkml.kernel.org/r/20210921165758.6154-1-sultan@kerneltoast.com
Link: https://lkml.kernel.org/r/20210918233920.9174-1-sultan@kerneltoast.com
Fixes: aac453635549 ("mm, oom: introduce oom reaper")
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/oom_kill.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/mm/oom_kill.c~mm-mark-the-oom-reaper-thread-as-freezable
+++ a/mm/oom_kill.c
@@ -641,6 +641,8 @@ done:
 
 static int oom_reaper(void *unused)
 {
+	set_freezable();
+
 	while (true) {
 		struct task_struct *tsk = NULL;
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 169/262] hugetlbfs: extend the definition of hugepages parameter to support node allocation
  2021-11-05 20:34 incoming Andrew Morton
                   ` (167 preceding siblings ...)
  2021-11-05 20:43 ` [patch 168/262] mm: mark the OOM reaper thread as freezable Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 170/262] mm/migrate: de-duplicate migrate_reason strings Andrew Morton
                   ` (92 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, benh, corbet, dan.carpenter, linux-mm, mike.kravetz,
	mm-commits, mpe, nathan, paulus, rppt, torvalds, willy,
	yaozhenguo1

From: Zhenguo Yao <yaozhenguo1@gmail.com>
Subject: hugetlbfs: extend the definition of hugepages parameter to support node allocation

We can specify the number of hugepages to allocate at boot.  But the
hugepages is balanced in all nodes at present.  In some scenarios, we only
need hugepages in one node.  For example: DPDK needs hugepages which are
in the same node as NIC.

If DPDK needs four hugepages of 1G size in node1 and system has 16 numa
nodes we must reserve 64 hugepages on the kernel cmdline.  But only four
hugepages are used.  The others should be free after boot.  If the system
memory is low(for example: 64G), it will be an impossible task.

So extend the hugepages parameter to support specifying hugepages on a
specific node.  For example add following parameter:

hugepagesz=1G hugepages=0:1,1:3

It will allocate 1 hugepage in node0 and 3 hugepages in node1.

Link: https://lkml.kernel.org/r/20211005054729.86457-1-yaozhenguo1@gmail.com
Signed-off-by: Zhenguo Yao <yaozhenguo1@gmail.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zhenguo Yao <yaozhenguo1@gmail.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/kernel-parameters.txt |    8 
 Documentation/admin-guide/mm/hugetlbpage.rst    |   12 +
 arch/powerpc/mm/hugetlbpage.c                   |    9 
 include/linux/hugetlb.h                         |    6 
 mm/hugetlb.c                                    |  153 +++++++++++---
 5 files changed, 155 insertions(+), 33 deletions(-)

--- a/arch/powerpc/mm/hugetlbpage.c~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation
+++ a/arch/powerpc/mm/hugetlbpage.c
@@ -229,17 +229,22 @@ static int __init pseries_alloc_bootmem_
 	m->hstate = hstate;
 	return 1;
 }
+
+bool __init hugetlb_node_alloc_supported(void)
+{
+	return false;
+}
 #endif
 
 
-int __init alloc_bootmem_huge_page(struct hstate *h)
+int __init alloc_bootmem_huge_page(struct hstate *h, int nid)
 {
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	if (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled())
 		return pseries_alloc_bootmem_huge_page(h);
 #endif
-	return __alloc_bootmem_huge_page(h);
+	return __alloc_bootmem_huge_page(h, nid);
 }
 
 #ifndef CONFIG_PPC_BOOK3S_64
--- a/Documentation/admin-guide/kernel-parameters.txt~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation
+++ a/Documentation/admin-guide/kernel-parameters.txt
@@ -1601,9 +1601,11 @@
 			the number of pages of hugepagesz to be allocated.
 			If this is the first HugeTLB parameter on the command
 			line, it specifies the number of pages to allocate for
-			the default huge page size.  See also
-			Documentation/admin-guide/mm/hugetlbpage.rst.
-			Format: <integer>
+			the default huge page size. If using node format, the
+			number of pages to allocate per-node can be specified.
+			See also Documentation/admin-guide/mm/hugetlbpage.rst.
+			Format: <integer> or (node format)
+				<node>:<integer>[,<node>:<integer>]
 
 	hugepagesz=
 			[HW] The size of the HugeTLB pages.  This is used in
--- a/Documentation/admin-guide/mm/hugetlbpage.rst~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation
+++ a/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -128,7 +128,9 @@ hugepages
 	implicitly specifies the number of huge pages of default size to
 	allocate.  If the number of huge pages of default size is implicitly
 	specified, it can not be overwritten by a hugepagesz,hugepages
-	parameter pair for the default size.
+	parameter pair for the default size.  This parameter also has a
+	node format.  The node format specifies the number of huge pages
+	to allocate on specific nodes.
 
 	For example, on an architecture with 2M default huge page size::
 
@@ -138,6 +140,14 @@ hugepages
 	indicating that the hugepages=512 parameter is ignored.  If a hugepages
 	parameter is preceded by an invalid hugepagesz parameter, it will
 	be ignored.
+
+	Node format example::
+
+		hugepagesz=2M hugepages=0:1,1:2
+
+	It will allocate 1 2M hugepage on node0 and 2 2M hugepages on node1.
+	If the node number is invalid,  the parameter will be ignored.
+
 default_hugepagesz
 	Specify the default huge page size.  This parameter can
 	only be specified once on the command line.  default_hugepagesz can
--- a/include/linux/hugetlb.h~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation
+++ a/include/linux/hugetlb.h
@@ -615,6 +615,7 @@ struct hstate {
 	unsigned long nr_overcommit_huge_pages;
 	struct list_head hugepage_activelist;
 	struct list_head hugepage_freelists[MAX_NUMNODES];
+	unsigned int max_huge_pages_node[MAX_NUMNODES];
 	unsigned int nr_huge_pages_node[MAX_NUMNODES];
 	unsigned int free_huge_pages_node[MAX_NUMNODES];
 	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
@@ -647,8 +648,9 @@ void restore_reserve_on_error(struct hst
 				unsigned long address, struct page *page);
 
 /* arch callback */
-int __init __alloc_bootmem_huge_page(struct hstate *h);
-int __init alloc_bootmem_huge_page(struct hstate *h);
+int __init __alloc_bootmem_huge_page(struct hstate *h, int nid);
+int __init alloc_bootmem_huge_page(struct hstate *h, int nid);
+bool __init hugetlb_node_alloc_supported(void);
 
 void __init hugetlb_add_hstate(unsigned order);
 bool __init arch_hugetlb_valid_size(unsigned long size);
--- a/mm/hugetlb.c~hugetlbfs-extend-the-definition-of-hugepages-parameter-to-support-node-allocation
+++ a/mm/hugetlb.c
@@ -77,6 +77,7 @@ static struct hstate * __initdata parsed
 static unsigned long __initdata default_hstate_max_huge_pages;
 static bool __initdata parsed_valid_hugepagesz = true;
 static bool __initdata parsed_default_hugepagesz;
+static unsigned int default_hugepages_in_node[MAX_NUMNODES] __initdata;
 
 /*
  * Protects updates to hugepage_freelists, hugepage_activelist, nr_huge_pages,
@@ -2963,33 +2964,39 @@ out_subpool_put:
 	return ERR_PTR(-ENOSPC);
 }
 
-int alloc_bootmem_huge_page(struct hstate *h)
+int alloc_bootmem_huge_page(struct hstate *h, int nid)
 	__attribute__ ((weak, alias("__alloc_bootmem_huge_page")));
-int __alloc_bootmem_huge_page(struct hstate *h)
+int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 {
-	struct huge_bootmem_page *m;
+	struct huge_bootmem_page *m = NULL; /* initialize for clang */
 	int nr_nodes, node;
 
+	if (nid >= nr_online_nodes)
+		return 0;
+	/* do node specific alloc */
+	if (nid != NUMA_NO_NODE) {
+		m = memblock_alloc_try_nid_raw(huge_page_size(h), huge_page_size(h),
+				0, MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+		if (!m)
+			return 0;
+		goto found;
+	}
+	/* allocate from next node when distributing huge pages */
 	for_each_node_mask_to_alloc(h, nr_nodes, node, &node_states[N_MEMORY]) {
-		void *addr;
-
-		addr = memblock_alloc_try_nid_raw(
+		m = memblock_alloc_try_nid_raw(
 				huge_page_size(h), huge_page_size(h),
 				0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
-		if (addr) {
-			/*
-			 * Use the beginning of the huge page to store the
-			 * huge_bootmem_page struct (until gather_bootmem
-			 * puts them into the mem_map).
-			 */
-			m = addr;
-			goto found;
-		}
+		/*
+		 * Use the beginning of the huge page to store the
+		 * huge_bootmem_page struct (until gather_bootmem
+		 * puts them into the mem_map).
+		 */
+		if (!m)
+			return 0;
+		goto found;
 	}
-	return 0;
 
 found:
-	BUG_ON(!IS_ALIGNED(virt_to_phys(m), huge_page_size(h)));
 	/* Put them into a private list first because mem_map is not up yet */
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages);
@@ -3029,12 +3036,61 @@ static void __init gather_bootmem_preall
 		cond_resched();
 	}
 }
+static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
+{
+	unsigned long i;
+	char buf[32];
+
+	for (i = 0; i < h->max_huge_pages_node[nid]; ++i) {
+		if (hstate_is_gigantic(h)) {
+			if (!alloc_bootmem_huge_page(h, nid))
+				break;
+		} else {
+			struct page *page;
+			gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
+
+			page = alloc_fresh_huge_page(h, gfp_mask, nid,
+					&node_states[N_MEMORY], NULL);
+			if (!page)
+				break;
+			put_page(page); /* free it into the hugepage allocator */
+		}
+		cond_resched();
+	}
+	if (i == h->max_huge_pages_node[nid])
+		return;
+
+	string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
+	pr_warn("HugeTLB: allocating %u of page size %s failed node%d.  Only allocated %lu hugepages.\n",
+		h->max_huge_pages_node[nid], buf, nid, i);
+	h->max_huge_pages -= (h->max_huge_pages_node[nid] - i);
+	h->max_huge_pages_node[nid] = i;
+}
 
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long i;
 	nodemask_t *node_alloc_noretry;
+	bool node_specific_alloc = false;
+
+	/* skip gigantic hugepages allocation if hugetlb_cma enabled */
+	if (hstate_is_gigantic(h) && hugetlb_cma_size) {
+		pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
+		return;
+	}
+
+	/* do node specific alloc */
+	for (i = 0; i < nr_online_nodes; i++) {
+		if (h->max_huge_pages_node[i] > 0) {
+			hugetlb_hstate_alloc_pages_onenode(h, i);
+			node_specific_alloc = true;
+		}
+	}
 
+	if (node_specific_alloc)
+		return;
+
+	/* below will do all node balanced alloc */
 	if (!hstate_is_gigantic(h)) {
 		/*
 		 * Bit mask controlling how hard we retry per-node allocations.
@@ -3055,11 +3111,7 @@ static void __init hugetlb_hstate_alloc_
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
-			if (hugetlb_cma_size) {
-				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
-				goto free;
-			}
-			if (!alloc_bootmem_huge_page(h))
+			if (!alloc_bootmem_huge_page(h, NUMA_NO_NODE))
 				break;
 		} else if (!alloc_pool_huge_page(h,
 					 &node_states[N_MEMORY],
@@ -3075,7 +3127,6 @@ static void __init hugetlb_hstate_alloc_
 			h->max_huge_pages, buf, i);
 		h->max_huge_pages = i;
 	}
-free:
 	kfree(node_alloc_noretry);
 }
 
@@ -3990,6 +4041,10 @@ static int __init hugetlb_init(void)
 			}
 			default_hstate.max_huge_pages =
 				default_hstate_max_huge_pages;
+
+			for (i = 0; i < nr_online_nodes; i++)
+				default_hstate.max_huge_pages_node[i] =
+					default_hugepages_in_node[i];
 		}
 	}
 
@@ -4050,6 +4105,10 @@ void __init hugetlb_add_hstate(unsigned
 	parsed_hstate = h;
 }
 
+bool __init __weak hugetlb_node_alloc_supported(void)
+{
+	return true;
+}
 /*
  * hugepages command line processing
  * hugepages normally follows a valid hugepagsz or default_hugepagsz
@@ -4061,6 +4120,10 @@ static int __init hugepages_setup(char *
 {
 	unsigned long *mhp;
 	static unsigned long *last_mhp;
+	int node = NUMA_NO_NODE;
+	int count;
+	unsigned long tmp;
+	char *p = s;
 
 	if (!parsed_valid_hugepagesz) {
 		pr_warn("HugeTLB: hugepages=%s does not follow a valid hugepagesz, ignoring\n", s);
@@ -4084,8 +4147,40 @@ static int __init hugepages_setup(char *
 		return 0;
 	}
 
-	if (sscanf(s, "%lu", mhp) <= 0)
-		*mhp = 0;
+	while (*p) {
+		count = 0;
+		if (sscanf(p, "%lu%n", &tmp, &count) != 1)
+			goto invalid;
+		/* Parameter is node format */
+		if (p[count] == ':') {
+			if (!hugetlb_node_alloc_supported()) {
+				pr_warn("HugeTLB: architecture can't support node specific alloc, ignoring!\n");
+				return 0;
+			}
+			node = tmp;
+			p += count + 1;
+			if (node < 0 || node >= nr_online_nodes)
+				goto invalid;
+			/* Parse hugepages */
+			if (sscanf(p, "%lu%n", &tmp, &count) != 1)
+				goto invalid;
+			if (!hugetlb_max_hstate)
+				default_hugepages_in_node[node] = tmp;
+			else
+				parsed_hstate->max_huge_pages_node[node] = tmp;
+			*mhp += tmp;
+			/* Go to parse next node*/
+			if (p[count] == ',')
+				p += count + 1;
+			else
+				break;
+		} else {
+			if (p != s)
+				goto invalid;
+			*mhp = tmp;
+			break;
+		}
+	}
 
 	/*
 	 * Global state is always initialized later in hugetlb_init.
@@ -4098,6 +4193,10 @@ static int __init hugepages_setup(char *
 	last_mhp = mhp;
 
 	return 1;
+
+invalid:
+	pr_warn("HugeTLB: Invalid hugepages parameter %s\n", p);
+	return 0;
 }
 __setup("hugepages=", hugepages_setup);
 
@@ -4159,6 +4258,7 @@ __setup("hugepagesz=", hugepagesz_setup)
 static int __init default_hugepagesz_setup(char *s)
 {
 	unsigned long size;
+	int i;
 
 	parsed_valid_hugepagesz = false;
 	if (parsed_default_hugepagesz) {
@@ -4187,6 +4287,9 @@ static int __init default_hugepagesz_set
 	 */
 	if (default_hstate_max_huge_pages) {
 		default_hstate.max_huge_pages = default_hstate_max_huge_pages;
+		for (i = 0; i < nr_online_nodes; i++)
+			default_hstate.max_huge_pages_node[i] =
+				default_hugepages_in_node[i];
 		if (hstate_is_gigantic(&default_hstate))
 			hugetlb_hstate_alloc_pages(&default_hstate);
 		default_hstate_max_huge_pages = 0;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 170/262] mm/migrate: de-duplicate migrate_reason strings
  2021-11-05 20:34 incoming Andrew Morton
                   ` (168 preceding siblings ...)
  2021-11-05 20:43 ` [patch 169/262] hugetlbfs: extend the definition of hugepages parameter to support node allocation Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 171/262] mm: migrate: make demotion knob depend on migration Andrew Morton
                   ` (91 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, jhubbard, linux-mm, mm-commits, o451686892, torvalds, ying.huang

From: John Hubbard <jhubbard@nvidia.com>
Subject: mm/migrate: de-duplicate migrate_reason strings

In order to remove the need to manually keep three different files in
synch, provide a common definition of the mapping between enum
migrate_reason, and the associated strings for each enum item.

1. Use the tracing system's mapping of enums to strings, by redefining
   and reusing the MIGRATE_REASON and supporting macros, and using that to
   populate the string array in mm/debug.c.

2. Move enum migrate_reason to migrate_mode.h.  This is not strictly
   necessary for this patch, but migrate mode and migrate reason go
   together, so this will slightly clarify things.

Link: https://lkml.kernel.org/r/20210922041755.141817-2-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Weizhao Ouyang <o451686892@gmail.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/migrate.h      |   19 +------------------
 include/linux/migrate_mode.h |   13 +++++++++++++
 mm/debug.c                   |   20 +++++++++++---------
 3 files changed, 25 insertions(+), 27 deletions(-)

--- a/include/linux/migrate.h~mm-migrate-de-duplicate-migrate_reason-strings
+++ a/include/linux/migrate.h
@@ -19,24 +19,7 @@ struct migration_target_control;
  */
 #define MIGRATEPAGE_SUCCESS		0
 
-/*
- * Keep sync with:
- * - macro MIGRATE_REASON in include/trace/events/migrate.h
- * - migrate_reason_names[MR_TYPES] in mm/debug.c
- */
-enum migrate_reason {
-	MR_COMPACTION,
-	MR_MEMORY_FAILURE,
-	MR_MEMORY_HOTPLUG,
-	MR_SYSCALL,		/* also applies to cpusets */
-	MR_MEMPOLICY_MBIND,
-	MR_NUMA_MISPLACED,
-	MR_CONTIG_RANGE,
-	MR_LONGTERM_PIN,
-	MR_DEMOTION,
-	MR_TYPES
-};
-
+/* Defined in mm/debug.c: */
 extern const char *migrate_reason_names[MR_TYPES];
 
 #ifdef CONFIG_MIGRATION
--- a/include/linux/migrate_mode.h~mm-migrate-de-duplicate-migrate_reason-strings
+++ a/include/linux/migrate_mode.h
@@ -19,4 +19,17 @@ enum migrate_mode {
 	MIGRATE_SYNC_NO_COPY,
 };
 
+enum migrate_reason {
+	MR_COMPACTION,
+	MR_MEMORY_FAILURE,
+	MR_MEMORY_HOTPLUG,
+	MR_SYSCALL,		/* also applies to cpusets */
+	MR_MEMPOLICY_MBIND,
+	MR_NUMA_MISPLACED,
+	MR_CONTIG_RANGE,
+	MR_LONGTERM_PIN,
+	MR_DEMOTION,
+	MR_TYPES
+};
+
 #endif		/* MIGRATE_MODE_H_INCLUDED */
--- a/mm/debug.c~mm-migrate-de-duplicate-migrate_reason-strings
+++ a/mm/debug.c
@@ -16,17 +16,19 @@
 #include <linux/ctype.h>
 
 #include "internal.h"
+#include <trace/events/migrate.h>
+
+/*
+ * Define EM() and EMe() so that MIGRATE_REASON from trace/events/migrate.h can
+ * be used to populate migrate_reason_names[].
+ */
+#undef EM
+#undef EMe
+#define EM(a, b)	b,
+#define EMe(a, b)	b
 
 const char *migrate_reason_names[MR_TYPES] = {
-	"compaction",
-	"memory_failure",
-	"memory_hotplug",
-	"syscall_or_cpuset",
-	"mempolicy_mbind",
-	"numa_misplaced",
-	"contig_range",
-	"longterm_pin",
-	"demotion",
+	MIGRATE_REASON
 };
 
 const struct trace_print_flags pageflag_names[] = {
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 171/262] mm: migrate: make demotion knob depend on migration
  2021-11-05 20:34 incoming Andrew Morton
                   ` (169 preceding siblings ...)
  2021-11-05 20:43 ` [patch 170/262] mm/migrate: de-duplicate migrate_reason strings Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 172/262] selftests/vm/transhuge-stress: fix ram size thinko Andrew Morton
                   ` (90 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, dave.hansen, linux-mm, mm-commits, shy828301, torvalds, ying.huang

From: Yang Shi <shy828301@gmail.com>
Subject: mm: migrate: make demotion knob depend on migration

The memory demotion needs to call migrate_pages() to do the jobs.  And it
is controlled by a knob, however, the knob doesn't depend on
CONFIG_MIGRATION.  The knob could be truned on even though MIGRATION is
disabled, this will not cause any crash since migrate_pages() would just
return -ENOSYS.  But it is definitely not optimal to go through demotion
path then retry regular swap every time.

And it doesn't make too much sense to have the knob visible to the users
when !MIGRATION.  Move the related code from mempolicy.[h|c] to
migrate.[h|c].

Link: https://lkml.kernel.org/r/20211015005559.246709-1-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mempolicy.h |    4 --
 include/linux/migrate.h   |    4 ++
 mm/mempolicy.c            |   61 ------------------------------------
 mm/migrate.c              |   61 ++++++++++++++++++++++++++++++++++++
 4 files changed, 65 insertions(+), 65 deletions(-)

--- a/include/linux/mempolicy.h~mm-migrate-make-demotion-knob-depend-on-migration
+++ a/include/linux/mempolicy.h
@@ -183,8 +183,6 @@ extern bool vma_migratable(struct vm_are
 extern int mpol_misplaced(struct page *, struct vm_area_struct *, unsigned long);
 extern void mpol_put_task_policy(struct task_struct *);
 
-extern bool numa_demotion_enabled;
-
 static inline bool mpol_is_preferred_many(struct mempolicy *pol)
 {
 	return  (pol->mode == MPOL_PREFERRED_MANY);
@@ -300,8 +298,6 @@ static inline nodemask_t *policy_nodemas
 	return NULL;
 }
 
-#define numa_demotion_enabled	false
-
 static inline bool mpol_is_preferred_many(struct mempolicy *pol)
 {
 	return  false;
--- a/include/linux/migrate.h~mm-migrate-make-demotion-knob-depend-on-migration
+++ a/include/linux/migrate.h
@@ -40,6 +40,8 @@ extern int migrate_huge_page_move_mappin
 				  struct page *newpage, struct page *page);
 extern int migrate_page_move_mapping(struct address_space *mapping,
 		struct page *newpage, struct page *page, int extra_count);
+
+extern bool numa_demotion_enabled;
 #else
 
 static inline void putback_movable_pages(struct list_head *l) {}
@@ -65,6 +67,8 @@ static inline int migrate_huge_page_move
 {
 	return -ENOSYS;
 }
+
+#define numa_demotion_enabled	false
 #endif /* CONFIG_MIGRATION */
 
 #ifdef CONFIG_COMPACTION
--- a/mm/mempolicy.c~mm-migrate-make-demotion-knob-depend-on-migration
+++ a/mm/mempolicy.c
@@ -3057,64 +3057,3 @@ void mpol_to_str(char *buffer, int maxle
 		p += scnprintf(p, buffer + maxlen - p, ":%*pbl",
 			       nodemask_pr_args(&nodes));
 }
-
-bool numa_demotion_enabled = false;
-
-#ifdef CONFIG_SYSFS
-static ssize_t numa_demotion_enabled_show(struct kobject *kobj,
-					  struct kobj_attribute *attr, char *buf)
-{
-	return sysfs_emit(buf, "%s\n",
-			  numa_demotion_enabled? "true" : "false");
-}
-
-static ssize_t numa_demotion_enabled_store(struct kobject *kobj,
-					   struct kobj_attribute *attr,
-					   const char *buf, size_t count)
-{
-	if (!strncmp(buf, "true", 4) || !strncmp(buf, "1", 1))
-		numa_demotion_enabled = true;
-	else if (!strncmp(buf, "false", 5) || !strncmp(buf, "0", 1))
-		numa_demotion_enabled = false;
-	else
-		return -EINVAL;
-
-	return count;
-}
-
-static struct kobj_attribute numa_demotion_enabled_attr =
-	__ATTR(demotion_enabled, 0644, numa_demotion_enabled_show,
-	       numa_demotion_enabled_store);
-
-static struct attribute *numa_attrs[] = {
-	&numa_demotion_enabled_attr.attr,
-	NULL,
-};
-
-static const struct attribute_group numa_attr_group = {
-	.attrs = numa_attrs,
-};
-
-static int __init numa_init_sysfs(void)
-{
-	int err;
-	struct kobject *numa_kobj;
-
-	numa_kobj = kobject_create_and_add("numa", mm_kobj);
-	if (!numa_kobj) {
-		pr_err("failed to create numa kobject\n");
-		return -ENOMEM;
-	}
-	err = sysfs_create_group(numa_kobj, &numa_attr_group);
-	if (err) {
-		pr_err("failed to register numa group\n");
-		goto delete_obj;
-	}
-	return 0;
-
-delete_obj:
-	kobject_put(numa_kobj);
-	return err;
-}
-subsys_initcall(numa_init_sysfs);
-#endif
--- a/mm/migrate.c~mm-migrate-make-demotion-knob-depend-on-migration
+++ a/mm/migrate.c
@@ -3306,3 +3306,64 @@ static int __init migrate_on_reclaim_ini
 }
 late_initcall(migrate_on_reclaim_init);
 #endif /* CONFIG_HOTPLUG_CPU */
+
+bool numa_demotion_enabled = false;
+
+#ifdef CONFIG_SYSFS
+static ssize_t numa_demotion_enabled_show(struct kobject *kobj,
+					  struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%s\n",
+			  numa_demotion_enabled ? "true" : "false");
+}
+
+static ssize_t numa_demotion_enabled_store(struct kobject *kobj,
+					   struct kobj_attribute *attr,
+					   const char *buf, size_t count)
+{
+	if (!strncmp(buf, "true", 4) || !strncmp(buf, "1", 1))
+		numa_demotion_enabled = true;
+	else if (!strncmp(buf, "false", 5) || !strncmp(buf, "0", 1))
+		numa_demotion_enabled = false;
+	else
+		return -EINVAL;
+
+	return count;
+}
+
+static struct kobj_attribute numa_demotion_enabled_attr =
+	__ATTR(demotion_enabled, 0644, numa_demotion_enabled_show,
+	       numa_demotion_enabled_store);
+
+static struct attribute *numa_attrs[] = {
+	&numa_demotion_enabled_attr.attr,
+	NULL,
+};
+
+static const struct attribute_group numa_attr_group = {
+	.attrs = numa_attrs,
+};
+
+static int __init numa_init_sysfs(void)
+{
+	int err;
+	struct kobject *numa_kobj;
+
+	numa_kobj = kobject_create_and_add("numa", mm_kobj);
+	if (!numa_kobj) {
+		pr_err("failed to create numa kobject\n");
+		return -ENOMEM;
+	}
+	err = sysfs_create_group(numa_kobj, &numa_attr_group);
+	if (err) {
+		pr_err("failed to register numa group\n");
+		goto delete_obj;
+	}
+	return 0;
+
+delete_obj:
+	kobject_put(numa_kobj);
+	return err;
+}
+subsys_initcall(numa_init_sysfs);
+#endif
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 172/262] selftests/vm/transhuge-stress: fix ram size thinko
  2021-11-05 20:34 incoming Andrew Morton
                   ` (170 preceding siblings ...)
  2021-11-05 20:43 ` [patch 171/262] mm: migrate: make demotion knob depend on migration Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 173/262] mm, thp: lock filemap when truncating page cache Andrew Morton
                   ` (89 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, davis.george, erosca, koct9i, linux-mm, mm-commits, skhan,
	torvalds

From: "George G. Davis" <davis.george@siemens.com>
Subject: selftests/vm/transhuge-stress: fix ram size thinko

When executing transhuge-stress with an argument to specify the virtual
memory size for testing, the ram size is reported as 0, e.g.

transhuge-stress 384
thp-mmap: allocate 192 transhuge pages, using 384 MiB virtual memory and 0 MiB of ram
thp-mmap: 0.184 s/loop, 0.957 ms/page,   2090.265 MiB/s  192 succeed,    0 failed

This appears to be due to a thinko in commit 0085d61fe05e
("selftests/vm/transhuge-stress: stress test for memory compaction"),
where, at a guess, the intent was to base "xyz MiB of ram" on `ram` size. 
Here are results after using `ram` size:

thp-mmap: allocate 192 transhuge pages, using 384 MiB virtual memory and 14 MiB of ram

Link: https://lkml.kernel.org/r/20210825135843.29052-1-george_davis@mentor.com
Fixes: 0085d61fe05e ("selftests/vm/transhuge-stress: stress test for memory compaction")
Signed-off-by: George G. Davis <davis.george@siemens.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Eugeniu Rosca <erosca@de.adit-jv.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/transhuge-stress.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/testing/selftests/vm/transhuge-stress.c~selftests-vm-transhuge-stress-fix-ram-size-thinko
+++ a/tools/testing/selftests/vm/transhuge-stress.c
@@ -79,7 +79,7 @@ int main(int argc, char **argv)
 
 	warnx("allocate %zd transhuge pages, using %zd MiB virtual memory"
 	      " and %zd MiB of ram", len >> HPAGE_SHIFT, len >> 20,
-	      len >> (20 + HPAGE_SHIFT - PAGE_SHIFT - 1));
+	      ram >> (20 + HPAGE_SHIFT - PAGE_SHIFT - 1));
 
 	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
 	if (pagemap_fd < 0)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 173/262] mm, thp: lock filemap when truncating page cache
  2021-11-05 20:34 incoming Andrew Morton
                   ` (171 preceding siblings ...)
  2021-11-05 20:43 ` [patch 172/262] selftests/vm/transhuge-stress: fix ram size thinko Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 174/262] mm, thp: fix incorrect unmap behavior for private pages Andrew Morton
                   ` (88 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, cfijalkovich, hughd, linux-mm, mike.kravetz, mm-commits,
	rongwei.wang, shy828301, song, stable, torvalds,
	william.kucharski, willy, xuyu

From: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Subject: mm, thp: lock filemap when truncating page cache

Patch series "fix two bugs for file THP".


This patch (of 2):

Transparent huge page has supported read-only non-shmem files.  The file-
backed THP is collapsed by khugepaged and truncated when written (for
shared libraries).

However, there is a race when multiple writers truncate the same page
cache concurrently.

In that case, subpage(s) of file THP can be revealed by find_get_entry in
truncate_inode_pages_range, which will trigger PageTail BUG_ON in
truncate_inode_page, as follows.

page:000000009e420ff2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff pfn:0x50c3ff
head:0000000075ff816d order:9 compound_mapcount:0 compound_pincount:0
flags: 0x37fffe0000010815(locked|uptodate|lru|arch_1|head)
raw: 37fffe0000000000 fffffe0013108001 dead000000000122 dead000000000400
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
head: 37fffe0000010815 fffffe001066bd48 ffff000404183c20 0000000000000000
head: 0000000000000600 0000000000000000 00000001ffffffff ffff000c0345a000
page dumped because: VM_BUG_ON_PAGE(PageTail(page))
------------[ cut here ]------------
kernel BUG at mm/truncate.c:213!
Internal error: Oops - BUG: 0 [#1] SMP
Modules linked in: xfs(E) libcrc32c(E) rfkill(E) ...
CPU: 14 PID: 11394 Comm: check_madvise_d Kdump: ...
Hardware name: ECS, BIOS 0.0.0 02/06/2015
pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
pc : truncate_inode_page+0x64/0x70
lr : truncate_inode_page+0x64/0x70
sp : ffff80001b60b900
x29: ffff80001b60b900 x28: 00000000000007ff
x27: ffff80001b60b9a0 x26: 0000000000000000
x25: 000000000000000f x24: ffff80001b60b9a0
x23: ffff80001b60ba18 x22: ffff0001e0999ea8
x21: ffff0000c21db300 x20: ffffffffffffffff
x19: fffffe001310ffc0 x18: 0000000000000020
x17: 0000000000000000 x16: 0000000000000000
x15: ffff0000c21db960 x14: 3030306666666620
x13: 6666666666666666 x12: 3130303030303030
x11: ffff8000117b69b8 x10: 00000000ffff8000
x9 : ffff80001012690c x8 : 0000000000000000
x7 : ffff8000114f69b8 x6 : 0000000000017ffd
x5 : ffff0007fffbcbc8 x4 : ffff80001b60b5c0
x3 : 0000000000000001 x2 : 0000000000000000
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
 truncate_inode_page+0x64/0x70
 truncate_inode_pages_range+0x550/0x7e4
 truncate_pagecache+0x58/0x80
 do_dentry_open+0x1e4/0x3c0
 vfs_open+0x38/0x44
 do_open+0x1f0/0x310
 path_openat+0x114/0x1dc
 do_filp_open+0x84/0x134
 do_sys_openat2+0xbc/0x164
 __arm64_sys_openat+0x74/0xc0
 el0_svc_common.constprop.0+0x88/0x220
 do_el0_svc+0x30/0xa0
 el0_svc+0x20/0x30
 el0_sync_handler+0x1a4/0x1b0
 el0_sync+0x180/0x1c0
Code: aa0103e0 900061e1 910ec021 9400d300 (d4210000)
---[ end trace f70cdb42cb7c2d42 ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception

This patch mainly to lock filemap when one enter truncate_pagecache(),
avoiding truncating the same page cache concurrently.

Link: https://lkml.kernel.org/r/20211025092134.18562-1-rongwei.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/20211025092134.18562-2-rongwei.wang@linux.alibaba.com
Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs")
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Song Liu <song@kernel.org>
Cc: Collin Fijalkovich <cfijalkovich@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/open.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/fs/open.c~mm-thp-lock-filemap-when-truncating-page-cache
+++ a/fs/open.c
@@ -856,8 +856,11 @@ static int do_dentry_open(struct file *f
 		 * of THPs into the page cache will fail.
 		 */
 		smp_mb();
-		if (filemap_nr_thps(inode->i_mapping))
+		if (filemap_nr_thps(inode->i_mapping)) {
+			filemap_invalidate_lock(inode->i_mapping);
 			truncate_pagecache(inode, 0);
+			filemap_invalidate_unlock(inode->i_mapping);
+		}
 	}
 
 	return 0;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 174/262] mm, thp: fix incorrect unmap behavior for private pages
  2021-11-05 20:34 incoming Andrew Morton
                   ` (172 preceding siblings ...)
  2021-11-05 20:43 ` [patch 173/262] mm, thp: lock filemap when truncating page cache Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 175/262] mm/readahead.c: fix incorrect comments for get_init_ra_size Andrew Morton
                   ` (87 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, cfijalkovich, hughd, linux-mm, mike.kravetz, mm-commits,
	rongwei.wang, shy828301, song, stable, torvalds,
	william.kucharski, willy, xuyu

From: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Subject: mm, thp: fix incorrect unmap behavior for private pages

When truncating pagecache on file THP, the private pages of a process
should not be unmapped mapping.  This incorrect behavior on a dynamic
shared libraries which will cause related processes to happen core dump.

A simple test for a DSO (Prerequisite is the DSO mapped in file THP):

int main(int argc, char *argv[])
{
	int fd;

	fd = open(argv[1], O_WRONLY);
	if (fd < 0) {
		perror("open");
	}

	close(fd);
	return 0;
}

The test only to open a target DSO, and do nothing.  But this operation
will lead one or more process to happen core dump.  This patch mainly to
fix this bug.

Link: https://lkml.kernel.org/r/20211025092134.18562-3-rongwei.wang@linux.alibaba.com
Fixes: eb6ecbed0aa2 ("mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs")
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Tested-by: Xu Yu <xuyu@linux.alibaba.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Collin Fijalkovich <cfijalkovich@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/open.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/fs/open.c~mm-thp-fix-incorrect-unmap-behavior-for-private-pages
+++ a/fs/open.c
@@ -857,8 +857,17 @@ static int do_dentry_open(struct file *f
 		 */
 		smp_mb();
 		if (filemap_nr_thps(inode->i_mapping)) {
+			struct address_space *mapping = inode->i_mapping;
+
 			filemap_invalidate_lock(inode->i_mapping);
-			truncate_pagecache(inode, 0);
+			/*
+			 * unmap_mapping_range just need to be called once
+			 * here, because the private pages is not need to be
+			 * unmapped mapping (e.g. data segment of dynamic
+			 * shared libraries here).
+			 */
+			unmap_mapping_range(mapping, 0, 0, 0);
+			truncate_inode_pages(mapping, 0);
 			filemap_invalidate_unlock(inode->i_mapping);
 		}
 	}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 175/262] mm/readahead.c: fix incorrect comments for get_init_ra_size
  2021-11-05 20:34 incoming Andrew Morton
                   ` (173 preceding siblings ...)
  2021-11-05 20:43 ` [patch 174/262] mm, thp: fix incorrect unmap behavior for private pages Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 176/262] mm: nommu: kill arch_get_unmapped_area() Andrew Morton
                   ` (86 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, linf, linux-mm, mm-commits, torvalds

From: Lin Feng <linf@wangsu.com>
Subject: mm/readahead.c: fix incorrect comments for get_init_ra_size

In fact, formated values returned by get_init_ra_size are not that
intuitive.  This patch make the comments reflect its truth.

Link: https://lkml.kernel.org/r/20211019104812.135602-1-linf@wangsu.com
Signed-off-by: Lin Feng <linf@wangsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/readahead.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/readahead.c~mm-readaheadc-fix-incorrect-comments-for-get_init_ra_size
+++ a/mm/readahead.c
@@ -309,7 +309,7 @@ void force_page_cache_ra(struct readahea
  * Set the initial window size, round to next power of 2 and square
  * for small size, x 4 for medium, and x 2 for large
  * for 128k (32 page) max ra
- * 1-8 page = 32k initial, > 8 page = 128k initial
+ * 1-2 page = 16k, 3-4 page 32k, 5-8 page = 64k, > 8 page = 128k initial
  */
 static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
 {
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 176/262] mm: nommu: kill arch_get_unmapped_area()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (174 preceding siblings ...)
  2021-11-05 20:43 ` [patch 175/262] mm/readahead.c: fix incorrect comments for get_init_ra_size Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 177/262] selftest/vm: fix ksm selftest to run with different NUMA topologies Andrew Morton
                   ` (85 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, torvalds, wangkefeng.wang

From: Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: mm: nommu: kill arch_get_unmapped_area()

When nommu, the arch_get_unmapped_area() will not be called, just kill it.

Link: https://lkml.kernel.org/r/20210910061906.36299-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/nommu.c |    6 ------
 1 file changed, 6 deletions(-)

--- a/mm/nommu.c~mm-nommu-kill-arch_get_unmapped_area
+++ a/mm/nommu.c
@@ -1639,12 +1639,6 @@ int remap_vmalloc_range(struct vm_area_s
 }
 EXPORT_SYMBOL(remap_vmalloc_range);
 
-unsigned long arch_get_unmapped_area(struct file *file, unsigned long addr,
-	unsigned long len, unsigned long pgoff, unsigned long flags)
-{
-	return -ENOMEM;
-}
-
 vm_fault_t filemap_fault(struct vm_fault *vmf)
 {
 	BUG();
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 177/262] selftest/vm: fix ksm selftest to run with different NUMA topologies
  2021-11-05 20:34 incoming Andrew Morton
                   ` (175 preceding siblings ...)
  2021-11-05 20:43 ` [patch 176/262] mm: nommu: kill arch_get_unmapped_area() Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 178/262] selftests: vm: add KSM huge pages merging time test Andrew Morton
                   ` (84 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, aneesh.kumar, hughd, linux-mm, mm-commits, pasha.tatashin,
	shuah, torvalds, tyhicks, zhansayabagdaulet

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: selftest/vm: fix ksm selftest to run with different NUMA topologies

Platforms can have non-contiguous NUMA nodes like below

 #numactl  -H
available: 2 nodes (0,8)
.....
node distances:
node   0   8
  0:  10  40
  8:  40  10

 #numactl  -H
available: 1 nodes (1)
....
node distances:
node   1
  1:  10

Hence update the test to not assume the presence of Node 0 and 1 and also
use numa_num_configured_nodes() instead of numa_max_node for finding
whether to skip the test.

Link: https://lkml.kernel.org/r/20210914141414.350759-1-aneesh.kumar@linux.ibm.com
Fixes: 82e717ad3501 ("selftests: vm: add KSM merging across nodes test")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/ksm_tests.c |   29 ++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

--- a/tools/testing/selftests/vm/ksm_tests.c~selftest-vm-fix-ksm-selftest-to-run-with-different-numa-topologies
+++ a/tools/testing/selftests/vm/ksm_tests.c
@@ -354,12 +354,34 @@ err_out:
 	return KSFT_FAIL;
 }
 
+static int get_next_mem_node(int node)
+{
+
+	long node_size;
+	int mem_node = 0;
+	int i, max_node = numa_max_node();
+
+	for (i = node + 1; i <= max_node + node; i++) {
+		mem_node = i % (max_node + 1);
+		node_size = numa_node_size(mem_node, NULL);
+		if (node_size > 0)
+			break;
+	}
+	return mem_node;
+}
+
+static int get_first_mem_node(void)
+{
+	return get_next_mem_node(numa_max_node());
+}
+
 static int check_ksm_numa_merge(int mapping, int prot, int timeout, bool merge_across_nodes,
 				size_t page_size)
 {
 	void *numa1_map_ptr, *numa2_map_ptr;
 	struct timespec start_time;
 	int page_count = 2;
+	int first_node;
 
 	if (clock_gettime(CLOCK_MONOTONIC_RAW, &start_time)) {
 		perror("clock_gettime");
@@ -370,7 +392,7 @@ static int check_ksm_numa_merge(int mapp
 		perror("NUMA support not enabled");
 		return KSFT_SKIP;
 	}
-	if (numa_max_node() < 1) {
+	if (numa_num_configured_nodes() <= 1) {
 		printf("At least 2 NUMA nodes must be available\n");
 		return KSFT_SKIP;
 	}
@@ -378,8 +400,9 @@ static int check_ksm_numa_merge(int mapp
 		return KSFT_FAIL;
 
 	/* allocate 2 pages in 2 different NUMA nodes and fill them with the same data */
-	numa1_map_ptr = numa_alloc_onnode(page_size, 0);
-	numa2_map_ptr = numa_alloc_onnode(page_size, 1);
+	first_node = get_first_mem_node();
+	numa1_map_ptr = numa_alloc_onnode(page_size, first_node);
+	numa2_map_ptr = numa_alloc_onnode(page_size, get_next_mem_node(first_node));
 	if (!numa1_map_ptr || !numa2_map_ptr) {
 		perror("numa_alloc_onnode");
 		return KSFT_FAIL;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 178/262] selftests: vm: add KSM huge pages merging time test
  2021-11-05 20:34 incoming Andrew Morton
                   ` (176 preceding siblings ...)
  2021-11-05 20:43 ` [patch 177/262] selftest/vm: fix ksm selftest to run with different NUMA topologies Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:43 ` [patch 179/262] mm/vmstat: annotate data race for zone->free_area[order].nr_free Andrew Morton
                   ` (83 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, pedrodemargomes, torvalds, zhansayabagdaulet

From: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Subject: selftests: vm: add KSM huge pages merging time test

Add test case of KSM merging time using mostly huge pages

Link: https://lkml.kernel.org/r/20211013044045.360251-1-pedrodemargomes@gmail.com
Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Cc: Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/ksm_tests.c |  125 ++++++++++++++++++++++-
 1 file changed, 124 insertions(+), 1 deletion(-)

--- a/tools/testing/selftests/vm/ksm_tests.c~selftests-vm-add-ksm-huge-pages-merging-time-test
+++ a/tools/testing/selftests/vm/ksm_tests.c
@@ -5,6 +5,10 @@
 #include <time.h>
 #include <string.h>
 #include <numa.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <stdint.h>
+#include <err.h>
 
 #include "../kselftest.h"
 #include "../../../../include/vdso/time64.h"
@@ -18,6 +22,15 @@
 #define KSM_MERGE_ACROSS_NODES_DEFAULT true
 #define MB (1ul << 20)
 
+#define PAGE_SHIFT 12
+#define HPAGE_SHIFT 21
+
+#define PAGE_SIZE (1 << PAGE_SHIFT)
+#define HPAGE_SIZE (1 << HPAGE_SHIFT)
+
+#define PAGEMAP_PRESENT(ent)	(((ent) & (1ull << 63)) != 0)
+#define PAGEMAP_PFN(ent)	((ent) & ((1ull << 55) - 1))
+
 struct ksm_sysfs {
 	unsigned long max_page_sharing;
 	unsigned long merge_across_nodes;
@@ -34,6 +47,7 @@ enum ksm_test_name {
 	CHECK_KSM_ZERO_PAGE_MERGE,
 	CHECK_KSM_NUMA_MERGE,
 	KSM_MERGE_TIME,
+	KSM_MERGE_TIME_HUGE_PAGES,
 	KSM_COW_TIME
 };
 
@@ -100,6 +114,9 @@ static void print_help(void)
 	       " -P evaluate merging time and speed.\n"
 	       "    For this test, the size of duplicated memory area (in MiB)\n"
 	       "    must be provided using -s option\n"
+				 " -H evaluate merging time and speed of area allocated mostly with huge pages\n"
+	       "    For this test, the size of duplicated memory area (in MiB)\n"
+	       "    must be provided using -s option\n"
 	       " -C evaluate the time required to break COW of merged pages.\n\n");
 
 	printf(" -a: specify the access protections of pages.\n"
@@ -439,6 +456,101 @@ err_out:
 	return KSFT_FAIL;
 }
 
+int64_t allocate_transhuge(void *ptr, int pagemap_fd)
+{
+	uint64_t ent[2];
+
+	/* drop pmd */
+	if (mmap(ptr, HPAGE_SIZE, PROT_READ | PROT_WRITE,
+				MAP_FIXED | MAP_ANONYMOUS |
+				MAP_NORESERVE | MAP_PRIVATE, -1, 0) != ptr)
+		errx(2, "mmap transhuge");
+
+	if (madvise(ptr, HPAGE_SIZE, MADV_HUGEPAGE))
+		err(2, "MADV_HUGEPAGE");
+
+	/* allocate transparent huge page */
+	*(volatile void **)ptr = ptr;
+
+	if (pread(pagemap_fd, ent, sizeof(ent),
+			(uintptr_t)ptr >> (PAGE_SHIFT - 3)) != sizeof(ent))
+		err(2, "read pagemap");
+
+	if (PAGEMAP_PRESENT(ent[0]) && PAGEMAP_PRESENT(ent[1]) &&
+	    PAGEMAP_PFN(ent[0]) + 1 == PAGEMAP_PFN(ent[1]) &&
+	    !(PAGEMAP_PFN(ent[0]) & ((1 << (HPAGE_SHIFT - PAGE_SHIFT)) - 1)))
+		return PAGEMAP_PFN(ent[0]);
+
+	return -1;
+}
+
+static int ksm_merge_hugepages_time(int mapping, int prot, int timeout, size_t map_size)
+{
+	void *map_ptr, *map_ptr_orig;
+	struct timespec start_time, end_time;
+	unsigned long scan_time_ns;
+	int pagemap_fd, n_normal_pages, n_huge_pages;
+
+	map_size *= MB;
+	size_t len = map_size;
+
+	len -= len % HPAGE_SIZE;
+	map_ptr_orig = mmap(NULL, len + HPAGE_SIZE, PROT_READ | PROT_WRITE,
+			MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);
+	map_ptr = map_ptr_orig + HPAGE_SIZE - (uintptr_t)map_ptr_orig % HPAGE_SIZE;
+
+	if (map_ptr_orig == MAP_FAILED)
+		err(2, "initial mmap");
+
+	if (madvise(map_ptr, len + HPAGE_SIZE, MADV_HUGEPAGE))
+		err(2, "MADV_HUGEPAGE");
+
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	if (pagemap_fd < 0)
+		err(2, "open pagemap");
+
+	n_normal_pages = 0;
+	n_huge_pages = 0;
+	for (void *p = map_ptr; p < map_ptr + len; p += HPAGE_SIZE) {
+		if (allocate_transhuge(p, pagemap_fd) < 0)
+			n_normal_pages++;
+		else
+			n_huge_pages++;
+	}
+	printf("Number of normal pages:    %d\n", n_normal_pages);
+	printf("Number of huge pages:    %d\n", n_huge_pages);
+
+	memset(map_ptr, '*', len);
+
+	if (clock_gettime(CLOCK_MONOTONIC_RAW, &start_time)) {
+		perror("clock_gettime");
+		goto err_out;
+	}
+	if (ksm_merge_pages(map_ptr, map_size, start_time, timeout))
+		goto err_out;
+	if (clock_gettime(CLOCK_MONOTONIC_RAW, &end_time)) {
+		perror("clock_gettime");
+		goto err_out;
+	}
+
+	scan_time_ns = (end_time.tv_sec - start_time.tv_sec) * NSEC_PER_SEC +
+		       (end_time.tv_nsec - start_time.tv_nsec);
+
+	printf("Total size:    %lu MiB\n", map_size / MB);
+	printf("Total time:    %ld.%09ld s\n", scan_time_ns / NSEC_PER_SEC,
+	       scan_time_ns % NSEC_PER_SEC);
+	printf("Average speed:  %.3f MiB/s\n", (map_size / MB) /
+					       ((double)scan_time_ns / NSEC_PER_SEC));
+
+	munmap(map_ptr_orig, len + HPAGE_SIZE);
+	return KSFT_PASS;
+
+err_out:
+	printf("Not OK\n");
+	munmap(map_ptr_orig, len + HPAGE_SIZE);
+	return KSFT_FAIL;
+}
+
 static int ksm_merge_time(int mapping, int prot, int timeout, size_t map_size)
 {
 	void *map_ptr;
@@ -564,7 +676,7 @@ int main(int argc, char *argv[])
 	bool merge_across_nodes = KSM_MERGE_ACROSS_NODES_DEFAULT;
 	long size_MB = 0;
 
-	while ((opt = getopt(argc, argv, "ha:p:l:z:m:s:MUZNPC")) != -1) {
+	while ((opt = getopt(argc, argv, "ha:p:l:z:m:s:MUZNPCH")) != -1) {
 		switch (opt) {
 		case 'a':
 			prot = str_to_prot(optarg);
@@ -618,6 +730,9 @@ int main(int argc, char *argv[])
 		case 'P':
 			test_name = KSM_MERGE_TIME;
 			break;
+		case 'H':
+			test_name = KSM_MERGE_TIME_HUGE_PAGES;
+			break;
 		case 'C':
 			test_name = KSM_COW_TIME;
 			break;
@@ -670,6 +785,14 @@ int main(int argc, char *argv[])
 		ret = ksm_merge_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, ksm_scan_limit_sec,
 				     size_MB);
 		break;
+	case KSM_MERGE_TIME_HUGE_PAGES:
+		if (size_MB == 0) {
+			printf("Option '-s' is required.\n");
+			return KSFT_FAIL;
+		}
+		ret = ksm_merge_hugepages_time(MAP_PRIVATE | MAP_ANONYMOUS, prot,
+				ksm_scan_limit_sec, size_MB);
+		break;
 	case KSM_COW_TIME:
 		ret = ksm_cow_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, ksm_scan_limit_sec,
 				   page_size);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 179/262] mm/vmstat: annotate data race for zone->free_area[order].nr_free
  2021-11-05 20:34 incoming Andrew Morton
                   ` (177 preceding siblings ...)
  2021-11-05 20:43 ` [patch 178/262] selftests: vm: add KSM huge pages merging time test Andrew Morton
@ 2021-11-05 20:43 ` Andrew Morton
  2021-11-05 20:44 ` [patch 180/262] mm: vmstat.c: make extfrag_index show more pretty Andrew Morton
                   ` (82 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:43 UTC (permalink / raw)
  To: akpm, linux-mm, liushixin2, mm-commits, paulmck, torvalds

From: Liu Shixin <liushixin2@huawei.com>
Subject: mm/vmstat: annotate data race for zone->free_area[order].nr_free

KCSAN reports a data-race on v5.10 which also exists on mainline:

==================================================================
BUG: KCSAN: data-race in extfrag_for_order+0x33/0x2d0

race at unknown origin, with read to 0xffff9ee9bfffab48 of 8 bytes by task 34 on cpu 1:
 extfrag_for_order+0x33/0x2d0
 kcompactd+0x5f0/0xce0
 kthread+0x1f9/0x220
 ret_from_fork+0x22/0x30

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 34 Comm: kcompactd0 Not tainted 5.10.0+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
==================================================================

Access to zone->free_area[order].nr_free in extfrag_for_order()/frag_show_print()
is lockless. That's intentional and the stats are a rough estimate anyway.
Annotate them with data_race().

[liushixin2@huawei.com: add comments]
  Link: https://lkml.kernel.org/r/20210918084655.2696522-1-liushixin2@huawei.com
Link: https://lkml.kernel.org/r/20210908015606.3999871-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmstat.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

--- a/mm/vmstat.c~mm-vmstat-annotate-data-race-for-zone-free_areanr_free
+++ a/mm/vmstat.c
@@ -1070,8 +1070,13 @@ static void fill_contig_page_info(struct
 	for (order = 0; order < MAX_ORDER; order++) {
 		unsigned long blocks;
 
-		/* Count number of free blocks */
-		blocks = zone->free_area[order].nr_free;
+		/*
+		 * Count number of free blocks.
+		 *
+		 * Access to nr_free is lockless as nr_free is used only for
+		 * diagnostic purposes. Use data_race to avoid KCSAN warning.
+		 */
+		blocks = data_race(zone->free_area[order].nr_free);
 		info->free_blocks_total += blocks;
 
 		/* Count free base pages */
@@ -1446,7 +1451,11 @@ static void frag_show_print(struct seq_f
 
 	seq_printf(m, "Node %d, zone %8s ", pgdat->node_id, zone->name);
 	for (order = 0; order < MAX_ORDER; ++order)
-		seq_printf(m, "%6lu ", zone->free_area[order].nr_free);
+		/*
+		 * Access to nr_free is lockless as nr_free is used only for
+		 * printing purposes. Use data_race to avoid KCSAN warning.
+		 */
+		seq_printf(m, "%6lu ", data_race(zone->free_area[order].nr_free));
 	seq_putc(m, '\n');
 }
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 180/262] mm: vmstat.c: make extfrag_index show more pretty
  2021-11-05 20:34 incoming Andrew Morton
                   ` (178 preceding siblings ...)
  2021-11-05 20:43 ` [patch 179/262] mm/vmstat: annotate data race for zone->free_area[order].nr_free Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 181/262] selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers Andrew Morton
                   ` (81 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, linf, linux-mm, mm-commits, torvalds

From: Lin Feng <linf@wangsu.com>
Subject: mm: vmstat.c: make extfrag_index show more pretty

fragmentation_index may return -1000 and the corresponding formated value
showed by seq_printf will take a negative signatrue, but other positive
formated values don't take a positive signatrue, so the output becomes
unaligned.

before:
Node 0, zone      DMA -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000
Node 0, zone    DMA32 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000
Node 0, zone   Normal -1.000 -1.000 -1.000 -1.000 0.931 0.966 0.983 0.992 0.996 0.998 0.999

after this patch:
Node 0, zone      DMA -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000
Node 0, zone    DMA32 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000 -1.000
Node 0, zone   Normal -1.000 -1.000 -1.000 -1.000  0.931  0.966  0.983  0.992  0.996  0.998  0.999

Link: https://lkml.kernel.org/r/20211019103241.134797-1-linf@wangsu.com
Signed-off-by: Lin Feng <linf@wangsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmstat.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/vmstat.c~mm-vmstatc-make-extfrag_index-show-more-pretty
+++ a/mm/vmstat.c
@@ -2191,7 +2191,7 @@ static void extfrag_show_print(struct se
 	for (order = 0; order < MAX_ORDER; ++order) {
 		fill_contig_page_info(zone, order, &info);
 		index = __fragmentation_index(order, &info);
-		seq_printf(m, "%d.%03d ", index / 1000, index % 1000);
+		seq_printf(m, "%2d.%03d ", index / 1000, index % 1000);
 	}
 
 	seq_putc(m, '\n');
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 181/262] selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers
  2021-11-05 20:34 incoming Andrew Morton
                   ` (179 preceding siblings ...)
  2021-11-05 20:44 ` [patch 180/262] mm: vmstat.c: make extfrag_index show more pretty Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 182/262] mm/memory_hotplug: add static qualifier for online_policy_to_str() Andrew Morton
                   ` (80 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, david, linux-mm, mm-commits, skhan, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers

The madv_populate selftest currently builds with a warning when the local
installed headers (via the distribution) don't include MADV_POPULATE_READ
and MADV_POPULATE_WRITE.  The warning is correct, because the test cannot
locate the necessary header.

Reason is that the in-tree installed headers (usr/include) have a "linux"
instead of a "sys" subdirectory.

Including "linux/mman.h" instead of "sys/mman.h" doesn't work (e.g.,
mmap() and madvise() are not defined that way).  The only thing that seems
to work is including "linux/mman.h" in addition to "sys/mman.h".

We can get rid of our availability check and simplify.

Link: https://lkml.kernel.org/r/20211015165758.41374-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/vm/madv_populate.c |   15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

--- a/tools/testing/selftests/vm/madv_populate.c~selftests-vm-make-madv_populate_readwrite-use-in-tree-headers
+++ a/tools/testing/selftests/vm/madv_populate.c
@@ -14,12 +14,11 @@
 #include <unistd.h>
 #include <errno.h>
 #include <fcntl.h>
+#include <linux/mman.h>
 #include <sys/mman.h>
 
 #include "../kselftest.h"
 
-#if defined(MADV_POPULATE_READ) && defined(MADV_POPULATE_WRITE)
-
 /*
  * For now, we're using 2 MiB of private anonymous memory for all tests.
  */
@@ -328,15 +327,3 @@ int main(int argc, char **argv)
 				   err, ksft_test_num());
 	return ksft_exit_pass();
 }
-
-#else /* defined(MADV_POPULATE_READ) && defined(MADV_POPULATE_WRITE) */
-
-#warning "missing MADV_POPULATE_READ or MADV_POPULATE_WRITE definition"
-
-int main(int argc, char **argv)
-{
-	ksft_print_header();
-	ksft_exit_skip("MADV_POPULATE_READ or MADV_POPULATE_WRITE not defined\n");
-}
-
-#endif /* defined(MADV_POPULATE_READ) && defined(MADV_POPULATE_WRITE) */
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 182/262] mm/memory_hotplug: add static qualifier for online_policy_to_str()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (180 preceding siblings ...)
  2021-11-05 20:44 ` [patch 181/262] selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 183/262] memory-hotplug.rst: fix two instances of "movablecore" that should be "movable_node" Andrew Morton
                   ` (79 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, david, linux-mm, mm-commits, songmuchun, tangyizhou, torvalds

From: Tang Yizhou <tangyizhou@huawei.com>
Subject: mm/memory_hotplug: add static qualifier for online_policy_to_str()

online_policy_to_str is only used in memory_hotplug.c and should be
defined as static.

Link: https://lkml.kernel.org/r/20210913024534.26161-1-tangyizhou@huawei.com
Signed-off-by: Tang Yizhou <tangyizhou@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory_hotplug.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/memory_hotplug.c~mm-memory_hotplug-add-static-qualifier-for-online_policy_to_str
+++ a/mm/memory_hotplug.c
@@ -57,7 +57,7 @@ enum {
 	ONLINE_POLICY_AUTO_MOVABLE,
 };
 
-const char *online_policy_to_str[] = {
+static const char * const online_policy_to_str[] = {
 	[ONLINE_POLICY_CONTIG_ZONES] = "contig-zones",
 	[ONLINE_POLICY_AUTO_MOVABLE] = "auto-movable",
 };
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 183/262] memory-hotplug.rst: fix two instances of "movablecore" that should be "movable_node"
  2021-11-05 20:34 incoming Andrew Morton
                   ` (181 preceding siblings ...)
  2021-11-05 20:44 ` [patch 182/262] mm/memory_hotplug: add static qualifier for online_policy_to_str() Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 184/262] memory-hotplug.rst: fix wrong /sys/module/memory_hotplug/parameters/ path Andrew Morton
                   ` (78 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, corbet, david, linux-mm, mhocko, mm-commits, osalvador,
	rppt, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: memory-hotplug.rst: fix two instances of "movablecore" that should be "movable_node"

Patch series "memory-hotplug.rst: document the "auto-movable" online policy".

Now that the memory-hotplug.rst overhaul is upstream, proper documentation
for the "auto-movable" online policy, documenting all new toggles and
options.  Along, two fixes for the original overhaul.


This patch (of 3):

We really want to refer to the "movable_node" kernel command line
parameter here.

Link: https://lkml.kernel.org/r/20210930144117.23641-2-david@redhat.com
Fixes: ac3332c44767 ("memory-hotplug.rst: complete admin-guide overhaul")
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/memory-hotplug.rst |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/Documentation/admin-guide/mm/memory-hotplug.rst~memory-hotplugrst-fix-two-instances-of-movablecore-that-should-be-movable_node
+++ a/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -166,7 +166,7 @@ Or alternatively::
 	% echo 1 > /sys/devices/system/memory/memoryXXX/online
 
 The kernel will select the target zone automatically, usually defaulting to
-``ZONE_NORMAL`` unless ``movablecore=1`` has been specified on the kernel
+``ZONE_NORMAL`` unless ``movable_node`` has been specified on the kernel
 command line or if the memory block would intersect the ZONE_MOVABLE already.
 
 One can explicitly request to associate an offline memory block with
@@ -393,7 +393,7 @@ command line parameters are relevant:
 ======================== =======================================================
 ``memhp_default_state``	 configure auto-onlining by essentially setting
                          ``/sys/devices/system/memory/auto_online_blocks``.
-``movablecore``		 configure automatic zone selection of the kernel. When
+``movable_node``	 configure automatic zone selection in the kernel. When
 			 set, the kernel will default to ZONE_MOVABLE, unless
 			 other zones can be kept contiguous.
 ======================== =======================================================
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 184/262] memory-hotplug.rst: fix wrong /sys/module/memory_hotplug/parameters/ path
  2021-11-05 20:34 incoming Andrew Morton
                   ` (182 preceding siblings ...)
  2021-11-05 20:44 ` [patch 183/262] memory-hotplug.rst: fix two instances of "movablecore" that should be "movable_node" Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 185/262] memory-hotplug.rst: document the "auto-movable" online policy Andrew Morton
                   ` (77 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, corbet, david, linux-mm, mhocko, mm-commits, osalvador,
	rppt, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: memory-hotplug.rst: fix wrong /sys/module/memory_hotplug/parameters/ path

We accidentially added a superfluous "s".

Link: https://lkml.kernel.org/r/20210930144117.23641-3-david@redhat.com
Fixes: ac3332c44767 ("memory-hotplug.rst: complete admin-guide overhaul")
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/memory-hotplug.rst |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/Documentation/admin-guide/mm/memory-hotplug.rst~memory-hotplugrst-fix-wrong-sys-module-memory_hotplug-parameters-path
+++ a/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -410,7 +410,7 @@ them with ``memory_hotplug.`` such as::
 
 and they can be observed (and some even modified at runtime) via::
 
-	/sys/modules/memory_hotplug/parameters/
+	/sys/module/memory_hotplug/parameters/
 
 The following module parameters are currently defined:
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 185/262] memory-hotplug.rst: document the "auto-movable" online policy
  2021-11-05 20:34 incoming Andrew Morton
                   ` (183 preceding siblings ...)
  2021-11-05 20:44 ` [patch 184/262] memory-hotplug.rst: fix wrong /sys/module/memory_hotplug/parameters/ path Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 186/262] mm/memory_hotplug: remove CONFIG_X86_64_ACPI_NUMA dependency from CONFIG_MEMORY_HOTPLUG Andrew Morton
                   ` (76 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, corbet, david, linux-mm, mhocko, mm-commits, osalvador,
	rppt, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: memory-hotplug.rst: document the "auto-movable" online policy

In commit e83a437faa62 ("mm/memory_hotplug: introduce "auto-movable"
online policy") we introduced a new memory online policy to automatically
select a zone for memory blocks to be onlined.  We added a way to set the
active online policy and tunables for the auto-movable online policy.  In
follow-up commits we tweaked the "auto-movable" policy to also consider
memory device details when selecting zones for memory blocks to be
onlined.

Let's document the new toggles and how the two online policies we have
work.

[david@redhat.com: updates]
  Link: https://lkml.kernel.org/r/20211011082058.6076-4-david@redhat.com
Link: https://lkml.kernel.org/r/20210930144117.23641-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/memory-hotplug.rst |  141 ++++++++++++--
 1 file changed, 121 insertions(+), 20 deletions(-)

--- a/Documentation/admin-guide/mm/memory-hotplug.rst~memory-hotplugrst-document-the-auto-movable-online-policy
+++ a/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -165,9 +165,8 @@ Or alternatively::
 
 	% echo 1 > /sys/devices/system/memory/memoryXXX/online
 
-The kernel will select the target zone automatically, usually defaulting to
-``ZONE_NORMAL`` unless ``movable_node`` has been specified on the kernel
-command line or if the memory block would intersect the ZONE_MOVABLE already.
+The kernel will select the target zone automatically, depending on the
+configured ``online_policy``.
 
 One can explicitly request to associate an offline memory block with
 ZONE_MOVABLE by::
@@ -198,6 +197,9 @@ Auto-onlining can be enabled by writing
 
 	% echo online > /sys/devices/system/memory/auto_online_blocks
 
+Similarly to manual onlining, with ``online`` the kernel will select the
+target zone automatically, depending on the configured ``online_policy``.
+
 Modifying the auto-online behavior will only affect all subsequently added
 memory blocks only.
 
@@ -393,11 +395,16 @@ command line parameters are relevant:
 ======================== =======================================================
 ``memhp_default_state``	 configure auto-onlining by essentially setting
                          ``/sys/devices/system/memory/auto_online_blocks``.
-``movable_node``	 configure automatic zone selection in the kernel. When
-			 set, the kernel will default to ZONE_MOVABLE, unless
-			 other zones can be kept contiguous.
+``movable_node``	 configure automatic zone selection in the kernel when
+			 using the ``contig-zones`` online policy. When
+			 set, the kernel will default to ZONE_MOVABLE when
+			 onlining a memory block, unless other zones can be kept
+			 contiguous.
 ======================== =======================================================
 
+See Documentation/admin-guide/kernel-parameters.txt for a more generic
+description of these command line parameters.
+
 Module Parameters
 ------------------
 
@@ -414,20 +421,114 @@ and they can be observed (and some even
 
 The following module parameters are currently defined:
 
-======================== =======================================================
-``memmap_on_memory``	 read-write: Allocate memory for the memmap from the
-			 added memory block itself. Even if enabled, actual
-			 support depends on various other system properties and
-			 should only be regarded as a hint whether the behavior
-			 would be desired.
-
-			 While allocating the memmap from the memory block
-			 itself makes memory hotplug less likely to fail and
-			 keeps the memmap on the same NUMA node in any case, it
-			 can fragment physical memory in a way that huge pages
-			 in bigger granularity cannot be formed on hotplugged
-			 memory.
-======================== =======================================================
+================================ ===============================================
+``memmap_on_memory``		 read-write: Allocate memory for the memmap from
+				 the added memory block itself. Even if enabled,
+				 actual support depends on various other system
+				 properties and should only be regarded as a
+				 hint whether the behavior would be desired.
+
+				 While allocating the memmap from the memory
+				 block itself makes memory hotplug less likely
+				 to fail and keeps the memmap on the same NUMA
+				 node in any case, it can fragment physical
+				 memory in a way that huge pages in bigger
+				 granularity cannot be formed on hotplugged
+				 memory.
+``online_policy``		 read-write: Set the basic policy used for
+				 automatic zone selection when onlining memory
+				 blocks without specifying a target zone.
+				 ``contig-zones`` has been the kernel default
+				 before this parameter was added. After an
+				 online policy was configured and memory was
+				 online, the policy should not be changed
+				 anymore.
+
+				 When set to ``contig-zones``, the kernel will
+				 try keeping zones contiguous. If a memory block
+				 intersects multiple zones or no zone, the
+				 behavior depends on the ``movable_node`` kernel
+				 command line parameter: default to ZONE_MOVABLE
+				 if set, default to the applicable kernel zone
+				 (usually ZONE_NORMAL) if not set.
+
+				 When set to ``auto-movable``, the kernel will
+				 try onlining memory blocks to ZONE_MOVABLE if
+				 possible according to the configuration and
+				 memory device details. With this policy, one
+				 can avoid zone imbalances when eventually
+				 hotplugging a lot of memory later and still
+				 wanting to be able to hotunplug as much as
+				 possible reliably, very desirable in
+				 virtualized environments. This policy ignores
+				 the ``movable_node`` kernel command line
+				 parameter and isn't really applicable in
+				 environments that require it (e.g., bare metal
+				 with hotunpluggable nodes) where hotplugged
+				 memory might be exposed via the
+				 firmware-provided memory map early during boot
+				 to the system instead of getting detected,
+				 added and onlined  later during boot (such as
+				 done by virtio-mem or by some hypervisors
+				 implementing emulated DIMMs). As one example, a
+				 hotplugged DIMM will be onlined either
+				 completely to ZONE_MOVABLE or completely to
+				 ZONE_NORMAL, not a mixture.
+				 As another example, as many memory blocks
+				 belonging to a virtio-mem device will be
+				 onlined to ZONE_MOVABLE as possible,
+				 special-casing units of memory blocks that can
+				 only get hotunplugged together. *This policy
+				 does not protect from setups that are
+				 problematic with ZONE_MOVABLE and does not
+				 change the zone of memory blocks dynamically
+				 after they were onlined.*
+``auto_movable_ratio``		 read-write: Set the maximum MOVABLE:KERNEL
+				 memory ratio in % for the ``auto-movable``
+				 online policy. Whether the ratio applies only
+				 for the system across all NUMA nodes or also
+				 per NUMA nodes depends on the
+				 ``auto_movable_numa_aware`` configuration.
+
+				 All accounting is based on present memory pages
+				 in the zones combined with accounting per
+				 memory device. Memory dedicated to the CMA
+				 allocator is accounted as MOVABLE, although
+				 residing on one of the kernel zones. The
+				 possible ratio depends on the actual workload.
+				 The kernel default is "301" %, for example,
+				 allowing for hotplugging 24 GiB to a 8 GiB VM
+				 and automatically onlining all hotplugged
+				 memory to ZONE_MOVABLE in many setups. The
+				 additional 1% deals with some pages being not
+				 present, for example, because of some firmware
+				 allocations.
+
+				 Note that ZONE_NORMAL memory provided by one
+				 memory device does not allow for more
+				 ZONE_MOVABLE memory for a different memory
+				 device. As one example, onlining memory of a
+				 hotplugged DIMM to ZONE_NORMAL will not allow
+				 for another hotplugged DIMM to get onlined to
+				 ZONE_MOVABLE automatically. In contrast, memory
+				 hotplugged by a virtio-mem device that got
+				 onlined to ZONE_NORMAL will allow for more
+				 ZONE_MOVABLE memory within *the same*
+				 virtio-mem device.
+``auto_movable_numa_aware``	 read-write: Configure whether the
+				 ``auto_movable_ratio`` in the ``auto-movable``
+				 online policy also applies per NUMA
+				 node in addition to the whole system across all
+				 NUMA nodes. The kernel default is "Y".
+
+				 Disabling NUMA awareness can be helpful when
+				 dealing with NUMA nodes that should be
+				 completely hotunpluggable, onlining the memory
+				 completely to ZONE_MOVABLE automatically if
+				 possible.
+
+				 Parameter availability depends on CONFIG_NUMA.
+================================ ===============================================
 
 ZONE_MOVABLE
 ============
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 186/262] mm/memory_hotplug: remove CONFIG_X86_64_ACPI_NUMA dependency from CONFIG_MEMORY_HOTPLUG
  2021-11-05 20:34 incoming Andrew Morton
                   ` (184 preceding siblings ...)
  2021-11-05 20:44 ` [patch 185/262] memory-hotplug.rst: document the "auto-movable" online policy Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 187/262] mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE Andrew Morton
                   ` (75 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, alexs, benh, bp, corbet, dave.hansen, david, gregkh, hpa,
	jasowang, linux-mm, luto, mhocko, mingo, mm-commits, mpe, mst,
	osalvador, paulus, peterz, rafael, rppt, shuah, tglx, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: mm/memory_hotplug: remove CONFIG_X86_64_ACPI_NUMA dependency from CONFIG_MEMORY_HOTPLUG

Patch series "mm/memory_hotplug: Kconfig and 32 bit cleanups".

Some cleanups around CONFIG_MEMORY_HOTPLUG, including removing 32 bit
leftovers of memory hotplug support.


This patch (of 6):

SPARSEMEM is the only possible memory model for x86-64, FLATMEM is not
possible:
	config ARCH_FLATMEM_ENABLE
		def_bool y
		depends on X86_32 && !NUMA

And X86_64_ACPI_NUMA (obviously) only supports x86-64:
	config X86_64_ACPI_NUMA
		def_bool y
		depends on X86_64 && NUMA && ACPI && PCI

Let's just remove the CONFIG_X86_64_ACPI_NUMA dependency, as it does no
longer make sense.

Link: https://lkml.kernel.org/r/20210929143600.49379-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alex Shi <alexs@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/Kconfig |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/Kconfig~mm-memory_hotplug-remove-config_x86_64_acpi_numa-dependency-from-config_memory_hotplug
+++ a/mm/Kconfig
@@ -123,7 +123,7 @@ config ARCH_ENABLE_MEMORY_HOTPLUG
 config MEMORY_HOTPLUG
 	bool "Allow for memory hot-add"
 	select MEMORY_ISOLATION
-	depends on SPARSEMEM || X86_64_ACPI_NUMA
+	depends on SPARSEMEM
 	depends on ARCH_ENABLE_MEMORY_HOTPLUG
 	depends on 64BIT || BROKEN
 	select NUMA_KEEP_MEMINFO if NUMA
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 187/262] mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE
  2021-11-05 20:34 incoming Andrew Morton
                   ` (185 preceding siblings ...)
  2021-11-05 20:44 ` [patch 186/262] mm/memory_hotplug: remove CONFIG_X86_64_ACPI_NUMA dependency from CONFIG_MEMORY_HOTPLUG Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 188/262] mm/memory_hotplug: restrict CONFIG_MEMORY_HOTPLUG to 64 bit Andrew Morton
                   ` (74 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, alexs, benh, bp, corbet, dave.hansen, david, gregkh, hpa,
	jasowang, linux-mm, luto, mhocko, mingo, mm-commits, mpe, mst,
	osalvador, paulus, peterz, rafael, rppt, skhan, tglx, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE

CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, so there is no need for
CONFIG_MEMORY_HOTPLUG_SPARSE anymore; adjust all instances to use
CONFIG_MEMORY_HOTPLUG and remove CONFIG_MEMORY_HOTPLUG_SPARSE.

Link: https://lkml.kernel.org/r/20210929143600.49379-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>	[kselftest]
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/include/asm/machdep.h            |    2 -
 arch/powerpc/kernel/setup_64.c                |    2 -
 arch/powerpc/platforms/powernv/setup.c        |    4 +-
 arch/powerpc/platforms/pseries/setup.c        |    2 -
 drivers/base/Makefile                         |    2 -
 drivers/base/node.c                           |    9 ++----
 drivers/virtio/Kconfig                        |    2 -
 include/linux/memory.h                        |   24 ++++++----------
 include/linux/node.h                          |    4 +-
 lib/Kconfig.debug                             |    2 -
 mm/Kconfig                                    |    4 --
 mm/memory_hotplug.c                           |    2 -
 tools/testing/selftests/memory-hotplug/config |    1 
 13 files changed, 24 insertions(+), 36 deletions(-)

--- a/arch/powerpc/include/asm/machdep.h~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/arch/powerpc/include/asm/machdep.h
@@ -32,7 +32,7 @@ struct machdep_calls {
 	void		(*iommu_save)(void);
 	void		(*iommu_restore)(void);
 #endif
-#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+#ifdef CONFIG_MEMORY_HOTPLUG
 	unsigned long	(*memory_block_size)(void);
 #endif
 #endif /* CONFIG_PPC64 */
--- a/arch/powerpc/kernel/setup_64.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/arch/powerpc/kernel/setup_64.c
@@ -912,7 +912,7 @@ void __init setup_per_cpu_areas(void)
 }
 #endif
 
-#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+#ifdef CONFIG_MEMORY_HOTPLUG
 unsigned long memory_block_size_bytes(void)
 {
 	if (ppc_md.memory_block_size)
--- a/arch/powerpc/platforms/powernv/setup.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/arch/powerpc/platforms/powernv/setup.c
@@ -440,7 +440,7 @@ static void pnv_kexec_cpu_down(int crash
 }
 #endif /* CONFIG_KEXEC_CORE */
 
-#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+#ifdef CONFIG_MEMORY_HOTPLUG
 static unsigned long pnv_memory_block_size(void)
 {
 	/*
@@ -553,7 +553,7 @@ define_machine(powernv) {
 #ifdef CONFIG_KEXEC_CORE
 	.kexec_cpu_down		= pnv_kexec_cpu_down,
 #endif
-#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+#ifdef CONFIG_MEMORY_HOTPLUG
 	.memory_block_size	= pnv_memory_block_size,
 #endif
 };
--- a/arch/powerpc/platforms/pseries/setup.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/arch/powerpc/platforms/pseries/setup.c
@@ -1089,7 +1089,7 @@ define_machine(pseries) {
 	.machine_kexec          = pSeries_machine_kexec,
 	.kexec_cpu_down         = pseries_kexec_cpu_down,
 #endif
-#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+#ifdef CONFIG_MEMORY_HOTPLUG
 	.memory_block_size	= pseries_memory_block_size,
 #endif
 };
--- a/drivers/base/Makefile~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/drivers/base/Makefile
@@ -13,7 +13,7 @@ obj-y			+= power/
 obj-$(CONFIG_ISA_BUS_API)	+= isa.o
 obj-y				+= firmware_loader/
 obj-$(CONFIG_NUMA)	+= node.o
-obj-$(CONFIG_MEMORY_HOTPLUG_SPARSE) += memory.o
+obj-$(CONFIG_MEMORY_HOTPLUG) += memory.o
 ifeq ($(CONFIG_SYSFS),y)
 obj-$(CONFIG_MODULES)	+= module.o
 endif
--- a/drivers/base/node.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/drivers/base/node.c
@@ -629,7 +629,7 @@ static void node_device_release(struct d
 {
 	struct node *node = to_node(dev);
 
-#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HUGETLBFS)
+#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_HUGETLBFS)
 	/*
 	 * We schedule the work only when a memory section is
 	 * onlined/offlined on this node. When we come here,
@@ -782,7 +782,7 @@ int unregister_cpu_under_node(unsigned i
 	return 0;
 }
 
-#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
+#ifdef CONFIG_MEMORY_HOTPLUG
 static int __ref get_nid_for_pfn(unsigned long pfn)
 {
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
@@ -958,10 +958,9 @@ static int node_memory_callback(struct n
 	return NOTIFY_OK;
 }
 #endif	/* CONFIG_HUGETLBFS */
-#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
+#endif /* CONFIG_MEMORY_HOTPLUG */
 
-#if !defined(CONFIG_MEMORY_HOTPLUG_SPARSE) || \
-    !defined(CONFIG_HUGETLBFS)
+#if !defined(CONFIG_MEMORY_HOTPLUG) || !defined(CONFIG_HUGETLBFS)
 static inline int node_memory_callback(struct notifier_block *self,
 				unsigned long action, void *arg)
 {
--- a/drivers/virtio/Kconfig~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/drivers/virtio/Kconfig
@@ -98,7 +98,7 @@ config VIRTIO_MEM
 	default m
 	depends on X86_64
 	depends on VIRTIO
-	depends on MEMORY_HOTPLUG_SPARSE
+	depends on MEMORY_HOTPLUG
 	depends on MEMORY_HOTREMOVE
 	depends on CONTIG_ALLOC
 	help
--- a/include/linux/memory.h~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/include/linux/memory.h
@@ -110,7 +110,7 @@ struct mem_section;
 #define SLAB_CALLBACK_PRI       1
 #define IPC_CALLBACK_PRI        10
 
-#ifndef CONFIG_MEMORY_HOTPLUG_SPARSE
+#ifndef CONFIG_MEMORY_HOTPLUG
 static inline void memory_dev_init(void)
 {
 	return;
@@ -126,7 +126,14 @@ static inline int memory_notify(unsigned
 {
 	return 0;
 }
-#else
+static inline int hotplug_memory_notifier(notifier_fn_t fn, int pri)
+{
+	return 0;
+}
+/* These aren't inline functions due to a GCC bug. */
+#define register_hotmemory_notifier(nb)    ({ (void)(nb); 0; })
+#define unregister_hotmemory_notifier(nb)  ({ (void)(nb); })
+#else /* CONFIG_MEMORY_HOTPLUG */
 extern int register_memory_notifier(struct notifier_block *nb);
 extern void unregister_memory_notifier(struct notifier_block *nb);
 int create_memory_block_devices(unsigned long start, unsigned long size,
@@ -148,9 +155,6 @@ struct memory_group *memory_group_find_b
 typedef int (*walk_memory_groups_func_t)(struct memory_group *, void *);
 int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
 			       struct memory_group *excluded, void *arg);
-#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
-
-#ifdef CONFIG_MEMORY_HOTPLUG
 #define hotplug_memory_notifier(fn, pri) ({		\
 	static __meminitdata struct notifier_block fn##_mem_nb =\
 		{ .notifier_call = fn, .priority = pri };\
@@ -158,15 +162,7 @@ int walk_dynamic_memory_groups(int nid,
 })
 #define register_hotmemory_notifier(nb)		register_memory_notifier(nb)
 #define unregister_hotmemory_notifier(nb) 	unregister_memory_notifier(nb)
-#else
-static inline int hotplug_memory_notifier(notifier_fn_t fn, int pri)
-{
-	return 0;
-}
-/* These aren't inline functions due to a GCC bug. */
-#define register_hotmemory_notifier(nb)    ({ (void)(nb); 0; })
-#define unregister_hotmemory_notifier(nb)  ({ (void)(nb); })
-#endif
+#endif	/* CONFIG_MEMORY_HOTPLUG */
 
 /*
  * Kernel text modification mutex, used for code patching. Users of this lock
--- a/include/linux/node.h~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/include/linux/node.h
@@ -85,7 +85,7 @@ struct node {
 	struct device	dev;
 	struct list_head access_list;
 
-#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HUGETLBFS)
+#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_HUGETLBFS)
 	struct work_struct	node_work;
 #endif
 #ifdef CONFIG_HMEM_REPORTING
@@ -98,7 +98,7 @@ struct memory_block;
 extern struct node *node_devices[];
 typedef  void (*node_registration_func_t)(struct node *);
 
-#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_NUMA)
+#if defined(CONFIG_MEMORY_HOTPLUG) && defined(CONFIG_NUMA)
 void link_mem_sections(int nid, unsigned long start_pfn,
 		       unsigned long end_pfn,
 		       enum meminit_context context);
--- a/lib/Kconfig.debug~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/lib/Kconfig.debug
@@ -877,7 +877,7 @@ config DEBUG_MEMORY_INIT
 
 config MEMORY_NOTIFIER_ERROR_INJECT
 	tristate "Memory hotplug notifier error injection module"
-	depends on MEMORY_HOTPLUG_SPARSE && NOTIFIER_ERROR_INJECTION
+	depends on MEMORY_HOTPLUG && NOTIFIER_ERROR_INJECTION
 	help
 	  This option provides the ability to inject artificial errors to
 	  memory hotplug notifier chain callbacks.  It is controlled through
--- a/mm/Kconfig~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/mm/Kconfig
@@ -128,10 +128,6 @@ config MEMORY_HOTPLUG
 	depends on 64BIT || BROKEN
 	select NUMA_KEEP_MEMINFO if NUMA
 
-config MEMORY_HOTPLUG_SPARSE
-	def_bool y
-	depends on SPARSEMEM && MEMORY_HOTPLUG
-
 config MEMORY_HOTPLUG_DEFAULT_ONLINE
 	bool "Online the newly added memory blocks by default"
 	depends on MEMORY_HOTPLUG
--- a/mm/memory_hotplug.c~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/mm/memory_hotplug.c
@@ -220,7 +220,6 @@ static void release_memory_resource(stru
 	kfree(res);
 }
 
-#ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
 static int check_pfn_span(unsigned long pfn, unsigned long nr_pages,
 		const char *reason)
 {
@@ -1163,7 +1162,6 @@ failed_addition:
 	mem_hotplug_done();
 	return ret;
 }
-#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
 
 static void reset_node_present_pages(pg_data_t *pgdat)
 {
--- a/tools/testing/selftests/memory-hotplug/config~mm-memory_hotplug-remove-config_memory_hotplug_sparse
+++ a/tools/testing/selftests/memory-hotplug/config
@@ -1,5 +1,4 @@
 CONFIG_MEMORY_HOTPLUG=y
-CONFIG_MEMORY_HOTPLUG_SPARSE=y
 CONFIG_NOTIFIER_ERROR_INJECTION=y
 CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m
 CONFIG_MEMORY_HOTREMOVE=y
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 188/262] mm/memory_hotplug: restrict CONFIG_MEMORY_HOTPLUG to 64 bit
  2021-11-05 20:34 incoming Andrew Morton
                   ` (186 preceding siblings ...)
  2021-11-05 20:44 ` [patch 187/262] mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 189/262] mm/memory_hotplug: remove HIGHMEM leftovers Andrew Morton
                   ` (73 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, alexs, benh, bp, corbet, dave.hansen, david, gregkh, hpa,
	jasowang, linux-mm, luto, mhocko, mingo, mm-commits, mpe, mst,
	osalvador, paulus, peterz, rafael, rppt, shuah, tglx, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: mm/memory_hotplug: restrict CONFIG_MEMORY_HOTPLUG to 64 bit

32 bit support is broken in various ways: for example, we can online
memory that should actually go to ZONE_HIGHMEM to ZONE_MOVABLE or in some
cases even to one of the other kernel zones.

We marked it BROKEN in commit b59d02ed0869 ("mm/memory_hotplug: disable
the functionality for 32b") almost one year ago.  According to that commit
it might be broken at least since 2017.  Further, there is hardly a sane
use case nowadays.

Let's just depend completely on 64bit, dropping the "BROKEN" dependency to
make clear that we are not going to support it again.  Next, we'll remove
some HIGHMEM leftovers from memory hotplug code to clean up.

Link: https://lkml.kernel.org/r/20210929143600.49379-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/Kconfig |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/Kconfig~mm-memory_hotplug-restrict-config_memory_hotplug-to-64-bit
+++ a/mm/Kconfig
@@ -125,7 +125,7 @@ config MEMORY_HOTPLUG
 	select MEMORY_ISOLATION
 	depends on SPARSEMEM
 	depends on ARCH_ENABLE_MEMORY_HOTPLUG
-	depends on 64BIT || BROKEN
+	depends on 64BIT
 	select NUMA_KEEP_MEMINFO if NUMA
 
 config MEMORY_HOTPLUG_DEFAULT_ONLINE
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 189/262] mm/memory_hotplug: remove HIGHMEM leftovers
  2021-11-05 20:34 incoming Andrew Morton
                   ` (187 preceding siblings ...)
  2021-11-05 20:44 ` [patch 188/262] mm/memory_hotplug: restrict CONFIG_MEMORY_HOTPLUG to 64 bit Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 190/262] mm/memory_hotplug: remove stale function declarations Andrew Morton
                   ` (72 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, alexs, benh, bp, corbet, dave.hansen, david, gregkh, hpa,
	jasowang, linux-mm, luto, mhocko, mingo, mm-commits, mpe, mst,
	osalvador, paulus, peterz, rafael, rppt, shuah, tglx, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: mm/memory_hotplug: remove HIGHMEM leftovers

We don't support CONFIG_MEMORY_HOTPLUG on 32 bit and consequently not
HIGHMEM.  Let's remove any leftover code -- including the unused
"status_change_nid_high" field part of the memory notifier.

Link: https://lkml.kernel.org/r/20210929143600.49379-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/core-api/memory-hotplug.rst                    |    3 
 Documentation/translations/zh_CN/core-api/memory-hotplug.rst |    4 -
 include/linux/memory.h                                       |    1 
 mm/memory_hotplug.c                                          |   36 ----------
 4 files changed, 2 insertions(+), 42 deletions(-)

--- a/Documentation/core-api/memory-hotplug.rst~mm-memory_hotplug-remove-highmem-leftovers
+++ a/Documentation/core-api/memory-hotplug.rst
@@ -57,7 +57,6 @@ The third argument (arg) passes a pointe
 		unsigned long start_pfn;
 		unsigned long nr_pages;
 		int status_change_nid_normal;
-		int status_change_nid_high;
 		int status_change_nid;
 	}
 
@@ -65,8 +64,6 @@ The third argument (arg) passes a pointe
 - nr_pages is # of pages of online/offline memory.
 - status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
   is (will be) set/clear, if this is -1, then nodemask status is not changed.
-- status_change_nid_high is set node id when N_HIGH_MEMORY of nodemask
-  is (will be) set/clear, if this is -1, then nodemask status is not changed.
 - status_change_nid is set node id when N_MEMORY of nodemask is (will be)
   set/clear. It means a new(memoryless) node gets new memory by online and a
   node loses all memory. If this is -1, then nodemask status is not changed.
--- a/Documentation/translations/zh_CN/core-api/memory-hotplug.rst~mm-memory_hotplug-remove-highmem-leftovers
+++ a/Documentation/translations/zh_CN/core-api/memory-hotplug.rst
@@ -63,7 +63,6 @@ memory_notify结构体的指针::
 		unsigned long start_pfn;
 		unsigned long nr_pages;
 		int status_change_nid_normal;
-		int status_change_nid_high;
 		int status_change_nid;
 	}
 
@@ -74,9 +73,6 @@ memory_notify结构体的指针::
 - status_change_nid_normal是当nodemask的N_NORMAL_MEMORY被设置/清除时设置节
   点id,如果是-1,则nodemask状态不改变。
 
-- status_change_nid_high是当nodemask的N_HIGH_MEMORY被设置/清除时设置的节点
-  id,如果这个值为-1,那么nodemask状态不会改变。
-
 - status_change_nid是当nodemask的N_MEMORY被(将)设置/清除时设置的节点id。这
   意味着一个新的(没上线的)节点通过联机获得新的内存,而一个节点失去了所有的内
   存。如果这个值为-1,那么nodemask的状态就不会改变。
--- a/include/linux/memory.h~mm-memory_hotplug-remove-highmem-leftovers
+++ a/include/linux/memory.h
@@ -96,7 +96,6 @@ struct memory_notify {
 	unsigned long start_pfn;
 	unsigned long nr_pages;
 	int status_change_nid_normal;
-	int status_change_nid_high;
 	int status_change_nid;
 };
 
--- a/mm/memory_hotplug.c~mm-memory_hotplug-remove-highmem-leftovers
+++ a/mm/memory_hotplug.c
@@ -21,7 +21,6 @@
 #include <linux/memory.h>
 #include <linux/memremap.h>
 #include <linux/memory_hotplug.h>
-#include <linux/highmem.h>
 #include <linux/vmalloc.h>
 #include <linux/ioport.h>
 #include <linux/delay.h>
@@ -585,10 +584,6 @@ void generic_online_page(struct page *pa
 	debug_pagealloc_map_pages(page, 1 << order);
 	__free_pages_core(page, order);
 	totalram_pages_add(1UL << order);
-#ifdef CONFIG_HIGHMEM
-	if (PageHighMem(page))
-		totalhigh_pages_add(1UL << order);
-#endif
 }
 EXPORT_SYMBOL_GPL(generic_online_page);
 
@@ -625,16 +620,11 @@ static void node_states_check_changes_on
 
 	arg->status_change_nid = NUMA_NO_NODE;
 	arg->status_change_nid_normal = NUMA_NO_NODE;
-	arg->status_change_nid_high = NUMA_NO_NODE;
 
 	if (!node_state(nid, N_MEMORY))
 		arg->status_change_nid = nid;
 	if (zone_idx(zone) <= ZONE_NORMAL && !node_state(nid, N_NORMAL_MEMORY))
 		arg->status_change_nid_normal = nid;
-#ifdef CONFIG_HIGHMEM
-	if (zone_idx(zone) <= ZONE_HIGHMEM && !node_state(nid, N_HIGH_MEMORY))
-		arg->status_change_nid_high = nid;
-#endif
 }
 
 static void node_states_set_node(int node, struct memory_notify *arg)
@@ -642,9 +632,6 @@ static void node_states_set_node(int nod
 	if (arg->status_change_nid_normal >= 0)
 		node_set_state(node, N_NORMAL_MEMORY);
 
-	if (arg->status_change_nid_high >= 0)
-		node_set_state(node, N_HIGH_MEMORY);
-
 	if (arg->status_change_nid >= 0)
 		node_set_state(node, N_MEMORY);
 }
@@ -1801,7 +1788,6 @@ static void node_states_check_changes_of
 
 	arg->status_change_nid = NUMA_NO_NODE;
 	arg->status_change_nid_normal = NUMA_NO_NODE;
-	arg->status_change_nid_high = NUMA_NO_NODE;
 
 	/*
 	 * Check whether node_states[N_NORMAL_MEMORY] will be changed.
@@ -1816,24 +1802,9 @@ static void node_states_check_changes_of
 	if (zone_idx(zone) <= ZONE_NORMAL && nr_pages >= present_pages)
 		arg->status_change_nid_normal = zone_to_nid(zone);
 
-#ifdef CONFIG_HIGHMEM
 	/*
-	 * node_states[N_HIGH_MEMORY] contains nodes which
-	 * have normal memory or high memory.
-	 * Here we add the present_pages belonging to ZONE_HIGHMEM.
-	 * If the zone is within the range of [0..ZONE_HIGHMEM), and
-	 * we determine that the zones in that range become empty,
-	 * we need to clear the node for N_HIGH_MEMORY.
-	 */
-	present_pages += pgdat->node_zones[ZONE_HIGHMEM].present_pages;
-	if (zone_idx(zone) <= ZONE_HIGHMEM && nr_pages >= present_pages)
-		arg->status_change_nid_high = zone_to_nid(zone);
-#endif
-
-	/*
-	 * We have accounted the pages from [0..ZONE_NORMAL), and
-	 * in case of CONFIG_HIGHMEM the pages from ZONE_HIGHMEM
-	 * as well.
+	 * We have accounted the pages from [0..ZONE_NORMAL); ZONE_HIGHMEM
+	 * does not apply as we don't support 32bit.
 	 * Here we count the possible pages from ZONE_MOVABLE.
 	 * If after having accounted all the pages, we see that the nr_pages
 	 * to be offlined is over or equal to the accounted pages,
@@ -1851,9 +1822,6 @@ static void node_states_clear_node(int n
 	if (arg->status_change_nid_normal >= 0)
 		node_clear_state(node, N_NORMAL_MEMORY);
 
-	if (arg->status_change_nid_high >= 0)
-		node_clear_state(node, N_HIGH_MEMORY);
-
 	if (arg->status_change_nid >= 0)
 		node_clear_state(node, N_MEMORY);
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 190/262] mm/memory_hotplug: remove stale function declarations
  2021-11-05 20:34 incoming Andrew Morton
                   ` (188 preceding siblings ...)
  2021-11-05 20:44 ` [patch 189/262] mm/memory_hotplug: remove HIGHMEM leftovers Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 191/262] x86: remove memory hotplug support on X86_32 Andrew Morton
                   ` (71 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, alexs, benh, bp, corbet, dave.hansen, david, gregkh, hpa,
	jasowang, linux-mm, luto, mhocko, mingo, mm-commits, mpe, mst,
	osalvador, paulus, peterz, rafael, rppt, shuah, tglx, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: mm/memory_hotplug: remove stale function declarations

These functions no longer exist.

Link: https://lkml.kernel.org/r/20210929143600.49379-6-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memory_hotplug.h |    3 ---
 1 file changed, 3 deletions(-)

--- a/include/linux/memory_hotplug.h~mm-memory_hotplug-remove-stale-function-declarations
+++ a/include/linux/memory_hotplug.h
@@ -98,9 +98,6 @@ static inline void zone_seqlock_init(str
 {
 	seqlock_init(&zone->span_seqlock);
 }
-extern int zone_grow_free_lists(struct zone *zone, unsigned long new_nr_pages);
-extern int zone_grow_waitqueues(struct zone *zone, unsigned long nr_pages);
-extern int add_one_highpage(struct page *page, int pfn, int bad_ppro);
 extern void adjust_present_page_count(struct page *page,
 				      struct memory_group *group,
 				      long nr_pages);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 191/262] x86: remove memory hotplug support on X86_32
  2021-11-05 20:34 incoming Andrew Morton
                   ` (189 preceding siblings ...)
  2021-11-05 20:44 ` [patch 190/262] mm/memory_hotplug: remove stale function declarations Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 192/262] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource() Andrew Morton
                   ` (70 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, alexs, benh, bp, corbet, dave.hansen, david, gregkh, hpa,
	jasowang, linux-mm, luto, mhocko, mingo, mm-commits, mpe, mst,
	osalvador, paulus, peterz, rafael, rppt, shuah, tglx, torvalds

From: David Hildenbrand <david@redhat.com>
Subject: x86: remove memory hotplug support on X86_32

CONFIG_MEMORY_HOTPLUG was marked BROKEN over one year and we just
restricted it to 64 bit.  Let's remove the unused x86 32bit implementation
and simplify the Kconfig.

Link: https://lkml.kernel.org/r/20210929143600.49379-7-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/Kconfig      |    6 +++---
 arch/x86/mm/init_32.c |   31 -------------------------------
 2 files changed, 3 insertions(+), 34 deletions(-)

--- a/arch/x86/Kconfig~x86-remove-memory-hotplug-support-on-x86_32
+++ a/arch/x86/Kconfig
@@ -62,7 +62,7 @@ config X86
 	select ARCH_32BIT_OFF_T			if X86_32
 	select ARCH_CLOCKSOURCE_INIT
 	select ARCH_ENABLE_HUGEPAGE_MIGRATION if X86_64 && HUGETLB_PAGE && MIGRATION
-	select ARCH_ENABLE_MEMORY_HOTPLUG if X86_64 || (X86_32 && HIGHMEM)
+	select ARCH_ENABLE_MEMORY_HOTPLUG if X86_64
 	select ARCH_ENABLE_MEMORY_HOTREMOVE if MEMORY_HOTPLUG
 	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if (PGTABLE_LEVELS > 2) && (X86_64 || X86_PAE)
 	select ARCH_ENABLE_THP_MIGRATION if X86_64 && TRANSPARENT_HUGEPAGE
@@ -1614,7 +1614,7 @@ config ARCH_SELECT_MEMORY_MODEL
 
 config ARCH_MEMORY_PROBE
 	bool "Enable sysfs memory/probe interface"
-	depends on X86_64 && MEMORY_HOTPLUG
+	depends on MEMORY_HOTPLUG
 	help
 	  This option enables a sysfs memory/probe interface for testing.
 	  See Documentation/admin-guide/mm/memory-hotplug.rst for more information.
@@ -2394,7 +2394,7 @@ endmenu
 
 config ARCH_HAS_ADD_PAGES
 	def_bool y
-	depends on X86_64 && ARCH_ENABLE_MEMORY_HOTPLUG
+	depends on ARCH_ENABLE_MEMORY_HOTPLUG
 
 config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
 	def_bool y
--- a/arch/x86/mm/init_32.c~x86-remove-memory-hotplug-support-on-x86_32
+++ a/arch/x86/mm/init_32.c
@@ -779,37 +779,6 @@ void __init mem_init(void)
 	test_wp_bit();
 }
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-int arch_add_memory(int nid, u64 start, u64 size,
-		    struct mhp_params *params)
-{
-	unsigned long start_pfn = start >> PAGE_SHIFT;
-	unsigned long nr_pages = size >> PAGE_SHIFT;
-	int ret;
-
-	/*
-	 * The page tables were already mapped at boot so if the caller
-	 * requests a different mapping type then we must change all the
-	 * pages with __set_memory_prot().
-	 */
-	if (params->pgprot.pgprot != PAGE_KERNEL.pgprot) {
-		ret = __set_memory_prot(start, nr_pages, params->pgprot);
-		if (ret)
-			return ret;
-	}
-
-	return __add_pages(nid, start_pfn, nr_pages, params);
-}
-
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
-{
-	unsigned long start_pfn = start >> PAGE_SHIFT;
-	unsigned long nr_pages = size >> PAGE_SHIFT;
-
-	__remove_pages(start_pfn, nr_pages, altmap);
-}
-#endif
-
 int kernel_set_to_readonly __read_mostly;
 
 static void mark_nxdata_nx(void)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 192/262] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (190 preceding siblings ...)
  2021-11-05 20:44 ` [patch 191/262] x86: remove memory hotplug support on X86_32 Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 193/262] memblock: improve MEMBLOCK_HOTPLUG documentation Andrew Morton
                   ` (69 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, aneesh.kumar, arnd, borntraeger, chenhuacai, david,
	ebiederm, geert, gor, hca, Jianyong.Wu, jiaxun.yang, linux-mm,
	mhocko, mm-commits, osalvador, rppt, shahab, torvalds, tsbogend,
	vgupta

From: David Hildenbrand <david@redhat.com>
Subject: mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource()

Patch series "mm/memory_hotplug: full support for add_memory_driver_managed() with CONFIG_ARCH_KEEP_MEMBLOCK", v2.

Architectures that require CONFIG_ARCH_KEEP_MEMBLOCK=y, such as arm64,
don't cleanly support add_memory_driver_managed() yet.  Most prominently,
kexec_file can still end up placing kexec images on such driver-managed
memory, resulting in undesired behavior, for example, having kexec images
located on memory not part of the firmware-provided memory map.

Teaching kexec to not place images on driver-managed memory is especially
relevant for virtio-mem.  Details can be found in commit 7b7b27214bba
("mm/memory_hotplug: introduce add_memory_driver_managed()").

Extend memblock with a new flag and set it from memory hotplug code when
applicable.  This is required to fully support virtio-mem on arm64, making
also kexec_file behave like on x86-64.


This patch (of 2):

If memblock_add_node() fails, we're most probably running out of memory. 
While this is unlikely to happen, it can happen and having memory added
without a memblock can be problematic for architectures that use memblock
to detect valid memory.  Let's fail in a nice way instead of silently
ignoring the error.

Link: https://lkml.kernel.org/r/20211004093605.5830-1-david@redhat.com
Link: https://lkml.kernel.org/r/20211004093605.5830-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Jianyong Wu <Jianyong.Wu@arm.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Shahab Vahedi <shahab@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory_hotplug.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/mm/memory_hotplug.c~mm-memory_hotplug-handle-memblock_add_node-failures-in-add_memory_resource
+++ a/mm/memory_hotplug.c
@@ -1369,8 +1369,11 @@ int __ref add_memory_resource(int nid, s
 
 	mem_hotplug_begin();
 
-	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
-		memblock_add_node(start, size, nid);
+	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
+		ret = memblock_add_node(start, size, nid);
+		if (ret)
+			goto error_mem_hotplug_end;
+	}
 
 	ret = __try_online_node(nid, false);
 	if (ret < 0)
@@ -1443,6 +1446,7 @@ error:
 		rollback_node_hotadd(nid);
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
 		memblock_remove(start, size);
+error_mem_hotplug_end:
 	mem_hotplug_done();
 	return ret;
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 193/262] memblock: improve MEMBLOCK_HOTPLUG documentation
  2021-11-05 20:34 incoming Andrew Morton
                   ` (191 preceding siblings ...)
  2021-11-05 20:44 ` [patch 192/262] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource() Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 194/262] memblock: allow to specify flags with memblock_add_node() Andrew Morton
                   ` (68 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, aneesh.kumar, arnd, borntraeger, chenhuacai, david,
	ebiederm, geert, gor, hca, Jianyong.Wu, jiaxun.yang, linux-mm,
	mhocko, mm-commits, osalvador, rppt, shahab, torvalds, tsbogend,
	vgupta

From: David Hildenbrand <david@redhat.com>
Subject: memblock: improve MEMBLOCK_HOTPLUG documentation

The description of MEMBLOCK_HOTPLUG is currently short and consequently
misleading: we're actually dealing with a memory region that might get
hotunplugged later (i.e., the platform+firmware supports it), yet it is
indicated in the firmware-provided memory map as system ram that will just
get used by the system for any purpose when not taking special care.  The
firmware marked this memory region as a hot(un)plugged (e.g., hotplugged
before reboot), implying that it might get hotunplugged again later.

Whether we consider this information depends on the "movable_node" kernel
commandline parameter: only with "movable_node" set, we'll try keeping
this memory hotunpluggable, for example, by not serving early allocations
from this memory region and by letting the buddy manage it using the
ZONE_MOVABLE.

Let's make this clearer by extending the documentation.

Note: kexec *has to* indicate this memory to the second kernel.  With
"movable_node" set, we don't want to place kexec-images on this memory. 
Without "movable_node" set, we don't care and can place kexec-images on
this memory.  In both cases, after successful memory hotunplug, kexec has
to be re-armed to update the memory map for the second kernel and to place
the kexec-images somewhere else.

Link: https://lkml.kernel.org/r/20211004093605.5830-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jianyong Wu <Jianyong.Wu@arm.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shahab Vahedi <shahab@synopsys.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memblock.h |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/include/linux/memblock.h~memblock-improve-memblock_hotplug-documentation
+++ a/include/linux/memblock.h
@@ -28,7 +28,11 @@ extern unsigned long long max_possible_p
 /**
  * enum memblock_flags - definition of memory region attributes
  * @MEMBLOCK_NONE: no special request
- * @MEMBLOCK_HOTPLUG: hotpluggable region
+ * @MEMBLOCK_HOTPLUG: memory region indicated in the firmware-provided memory
+ * map during early boot as hot(un)pluggable system RAM (e.g., memory range
+ * that might get hotunplugged later). With "movable_node" set on the kernel
+ * commandline, try keeping this memory region hotunpluggable. Does not apply
+ * to memblocks added ("hotplugged") after early boot.
  * @MEMBLOCK_MIRROR: mirrored region
  * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as
  * reserved in the memory map; refer to memblock_mark_nomap() description
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 194/262] memblock: allow to specify flags with memblock_add_node()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (192 preceding siblings ...)
  2021-11-05 20:44 ` [patch 193/262] memblock: improve MEMBLOCK_HOTPLUG documentation Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 195/262] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED Andrew Morton
                   ` (67 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, aneesh.kumar, arnd, borntraeger, chenhuacai, david,
	ebiederm, geert, gor, hca, Jianyong.Wu, jiaxun.yang, linux-mm,
	mhocko, mm-commits, osalvador, rppt, shahab, torvalds, tsbogend,
	vgupta

From: David Hildenbrand <david@redhat.com>
Subject: memblock: allow to specify flags with memblock_add_node()

We want to specify flags when hotplugging memory.  Let's prepare to pass
flags to memblock_add_node() by adjusting all existing users.

Note that when hotplugging memory the system is already up and running and
we might have concurrent memblock users: for example, while we're
hotplugging memory, kexec_file code might search for suitable memory
regions to place kexec images.  It's important to add the memory directly
to memblock via a single call with the right flags, instead of adding the
memory first and apply flags later: otherwise, concurrent memblock users
might temporarily stumble over memblocks with wrong flags, which will be
important in a follow-up patch that introduces a new flag to properly
handle add_memory_driver_managed().

Link: https://lkml.kernel.org/r/20211004093605.5830-4-david@redhat.com
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Shahab Vahedi <shahab@synopsys.com>	[arch/arc]
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jianyong Wu <Jianyong.Wu@arm.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arc/mm/init.c               |    4 ++--
 arch/ia64/mm/contig.c            |    2 +-
 arch/ia64/mm/init.c              |    2 +-
 arch/m68k/mm/mcfmmu.c            |    3 ++-
 arch/m68k/mm/motorola.c          |    6 ++++--
 arch/mips/loongson64/init.c      |    4 +++-
 arch/mips/sgi-ip27/ip27-memory.c |    3 ++-
 arch/s390/kernel/setup.c         |    3 ++-
 include/linux/memblock.h         |    3 ++-
 include/linux/mm.h               |    2 +-
 mm/memblock.c                    |    9 +++++----
 mm/memory_hotplug.c              |    2 +-
 12 files changed, 26 insertions(+), 17 deletions(-)

--- a/arch/arc/mm/init.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/arch/arc/mm/init.c
@@ -59,13 +59,13 @@ void __init early_init_dt_add_memory_arc
 
 		low_mem_sz = size;
 		in_use = 1;
-		memblock_add_node(base, size, 0);
+		memblock_add_node(base, size, 0, MEMBLOCK_NONE);
 	} else {
 #ifdef CONFIG_HIGHMEM
 		high_mem_start = base;
 		high_mem_sz = size;
 		in_use = 1;
-		memblock_add_node(base, size, 1);
+		memblock_add_node(base, size, 1, MEMBLOCK_NONE);
 		memblock_reserve(base, size);
 #endif
 	}
--- a/arch/ia64/mm/contig.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/arch/ia64/mm/contig.c
@@ -153,7 +153,7 @@ find_memory (void)
 	efi_memmap_walk(find_max_min_low_pfn, NULL);
 	max_pfn = max_low_pfn;
 
-	memblock_add_node(0, PFN_PHYS(max_low_pfn), 0);
+	memblock_add_node(0, PFN_PHYS(max_low_pfn), 0, MEMBLOCK_NONE);
 
 	find_initrd();
 
--- a/arch/ia64/mm/init.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/arch/ia64/mm/init.c
@@ -378,7 +378,7 @@ int __init register_active_ranges(u64 st
 #endif
 
 	if (start < end)
-		memblock_add_node(__pa(start), end - start, nid);
+		memblock_add_node(__pa(start), end - start, nid, MEMBLOCK_NONE);
 	return 0;
 }
 
--- a/arch/m68k/mm/mcfmmu.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/arch/m68k/mm/mcfmmu.c
@@ -174,7 +174,8 @@ void __init cf_bootmem_alloc(void)
 	m68k_memory[0].addr = _rambase;
 	m68k_memory[0].size = _ramend - _rambase;
 
-	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0);
+	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0,
+			  MEMBLOCK_NONE);
 
 	/* compute total pages in system */
 	num_pages = PFN_DOWN(_ramend - _rambase);
--- a/arch/m68k/mm/motorola.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/arch/m68k/mm/motorola.c
@@ -410,7 +410,8 @@ void __init paging_init(void)
 
 	min_addr = m68k_memory[0].addr;
 	max_addr = min_addr + m68k_memory[0].size;
-	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0);
+	memblock_add_node(m68k_memory[0].addr, m68k_memory[0].size, 0,
+			  MEMBLOCK_NONE);
 	for (i = 1; i < m68k_num_memory;) {
 		if (m68k_memory[i].addr < min_addr) {
 			printk("Ignoring memory chunk at 0x%lx:0x%lx before the first chunk\n",
@@ -421,7 +422,8 @@ void __init paging_init(void)
 				(m68k_num_memory - i) * sizeof(struct m68k_mem_info));
 			continue;
 		}
-		memblock_add_node(m68k_memory[i].addr, m68k_memory[i].size, i);
+		memblock_add_node(m68k_memory[i].addr, m68k_memory[i].size, i,
+				  MEMBLOCK_NONE);
 		addr = m68k_memory[i].addr + m68k_memory[i].size;
 		if (addr > max_addr)
 			max_addr = addr;
--- a/arch/mips/loongson64/init.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/arch/mips/loongson64/init.c
@@ -77,7 +77,9 @@ void __init szmem(unsigned int node)
 				(u32)node_id, mem_type, mem_start, mem_size);
 			pr_info("       start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
 				start_pfn, end_pfn, num_physpages);
-			memblock_add_node(PFN_PHYS(start_pfn), PFN_PHYS(node_psize), node);
+			memblock_add_node(PFN_PHYS(start_pfn),
+					  PFN_PHYS(node_psize), node,
+					  MEMBLOCK_NONE);
 			break;
 		case SYSTEM_RAM_RESERVED:
 			pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
--- a/arch/mips/sgi-ip27/ip27-memory.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/arch/mips/sgi-ip27/ip27-memory.c
@@ -341,7 +341,8 @@ static void __init szmem(void)
 				continue;
 			}
 			memblock_add_node(PFN_PHYS(slot_getbasepfn(node, slot)),
-					  PFN_PHYS(slot_psize), node);
+					  PFN_PHYS(slot_psize), node,
+					  MEMBLOCK_NONE);
 		}
 	}
 }
--- a/arch/s390/kernel/setup.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/arch/s390/kernel/setup.c
@@ -593,7 +593,8 @@ static void __init setup_resources(void)
 	 * part of the System RAM resource.
 	 */
 	if (crashk_res.end) {
-		memblock_add_node(crashk_res.start, resource_size(&crashk_res), 0);
+		memblock_add_node(crashk_res.start, resource_size(&crashk_res),
+				  0, MEMBLOCK_NONE);
 		memblock_reserve(crashk_res.start, resource_size(&crashk_res));
 		insert_resource(&iomem_resource, &crashk_res);
 	}
--- a/include/linux/memblock.h~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/include/linux/memblock.h
@@ -104,7 +104,8 @@ static inline void memblock_discard(void
 #endif
 
 void memblock_allow_resize(void);
-int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid);
+int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid,
+		      enum memblock_flags flags);
 int memblock_add(phys_addr_t base, phys_addr_t size);
 int memblock_remove(phys_addr_t base, phys_addr_t size);
 int memblock_phys_free(phys_addr_t base, phys_addr_t size);
--- a/include/linux/mm.h~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/include/linux/mm.h
@@ -2425,7 +2425,7 @@ static inline unsigned long get_num_phys
  * unsigned long max_zone_pfns[MAX_NR_ZONES] = {max_dma, max_normal_pfn,
  * 							 max_highmem_pfn};
  * for_each_valid_physical_page_range()
- * 	memblock_add_node(base, size, nid)
+ *	memblock_add_node(base, size, nid, MEMBLOCK_NONE)
  * free_area_init(max_zone_pfns);
  */
 void free_area_init(unsigned long *max_zone_pfn);
--- a/mm/memblock.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/mm/memblock.c
@@ -655,6 +655,7 @@ repeat:
  * @base: base address of the new region
  * @size: size of the new region
  * @nid: nid of the new region
+ * @flags: flags of the new region
  *
  * Add new memblock region [@base, @base + @size) to the "memory"
  * type. See memblock_add_range() description for mode details
@@ -663,14 +664,14 @@ repeat:
  * 0 on success, -errno on failure.
  */
 int __init_memblock memblock_add_node(phys_addr_t base, phys_addr_t size,
-				       int nid)
+				      int nid, enum memblock_flags flags)
 {
 	phys_addr_t end = base + size - 1;
 
-	memblock_dbg("%s: [%pa-%pa] nid=%d %pS\n", __func__,
-		     &base, &end, nid, (void *)_RET_IP_);
+	memblock_dbg("%s: [%pa-%pa] nid=%d flags=%x %pS\n", __func__,
+		     &base, &end, nid, flags, (void *)_RET_IP_);
 
-	return memblock_add_range(&memblock.memory, base, size, nid, 0);
+	return memblock_add_range(&memblock.memory, base, size, nid, flags);
 }
 
 /**
--- a/mm/memory_hotplug.c~memblock-allow-to-specify-flags-with-memblock_add_node
+++ a/mm/memory_hotplug.c
@@ -1370,7 +1370,7 @@ int __ref add_memory_resource(int nid, s
 	mem_hotplug_begin();
 
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
-		ret = memblock_add_node(start, size, nid);
+		ret = memblock_add_node(start, size, nid, MEMBLOCK_NONE);
 		if (ret)
 			goto error_mem_hotplug_end;
 	}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 195/262] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED
  2021-11-05 20:34 incoming Andrew Morton
                   ` (193 preceding siblings ...)
  2021-11-05 20:44 ` [patch 194/262] memblock: allow to specify flags with memblock_add_node() Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:44 ` [patch 196/262] mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED Andrew Morton
                   ` (66 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, aneesh.kumar, arnd, borntraeger, chenhuacai, david,
	ebiederm, geert, gor, hca, Jianyong.Wu, jiaxun.yang, linux-mm,
	mhocko, mm-commits, osalvador, rppt, shahab, torvalds, tsbogend,
	vgupta

From: David Hildenbrand <david@redhat.com>
Subject: memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED

Let's add a flag that corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED,
indicating that we're dealing with a memory region that is never indicated
in the firmware-provided memory map, but always detected and added by a
driver.

Similar to MEMBLOCK_HOTPLUG, most infrastructure has to treat such memory
regions like ordinary MEMBLOCK_NONE memory regions -- for example, when
selecting memory regions to add to the vmcore for dumping in the
crashkernel via for_each_mem_range().

However, especially kexec_file is not supposed to select such memblocks
via for_each_free_mem_range() / for_each_free_mem_range_reverse() to place
kexec images, similar to how we handle IORESOURCE_SYSRAM_DRIVER_MANAGED
without CONFIG_ARCH_KEEP_MEMBLOCK.

We'll make sure that memory hotplug code sets the flag where applicable
(IORESOURCE_SYSRAM_DRIVER_MANAGED) next.  This prepares architectures that
need CONFIG_ARCH_KEEP_MEMBLOCK, such as arm64, for virtio-mem support.

Note that kexec *must not* indicate this memory to the second kernel and
*must not* place kexec-images on this memory.  Let's add a comment to
kexec_walk_memblock(), documenting how we handle MEMBLOCK_DRIVER_MANAGED
now just like using IORESOURCE_SYSRAM_DRIVER_MANAGED in
locate_mem_hole_callback() for kexec_walk_resources().

Also note that MEMBLOCK_HOTPLUG cannot be reused due to different
semantics:
	MEMBLOCK_HOTPLUG: memory is indicated as "System RAM" in the
	firmware-provided memory map and added to the system early during
	boot; kexec *has to* indicate this memory to the second kernel and
	can place kexec-images on this memory. After memory hotunplug,
	kexec has to be re-armed. We mostly ignore this flag when
	"movable_node" is not set on the kernel command line, because
	then we're told to not care about hotunpluggability of such
	memory regions.

	MEMBLOCK_DRIVER_MANAGED: memory is not indicated as "System RAM" in
	the firmware-provided memory map; this memory is always detected
	and added to the system by a driver; memory might not actually be
	physically hotunpluggable. kexec *must not* indicate this memory to
	the second kernel and *must not* place kexec-images on this memory.

Link: https://lkml.kernel.org/r/20211004093605.5830-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jianyong Wu <Jianyong.Wu@arm.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shahab Vahedi <shahab@synopsys.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memblock.h |   16 ++++++++++++++--
 kernel/kexec_file.c      |    5 +++++
 mm/memblock.c            |    4 ++++
 3 files changed, 23 insertions(+), 2 deletions(-)

--- a/include/linux/memblock.h~memblock-add-memblock_driver_managed-to-mimic-ioresource_sysram_driver_managed
+++ a/include/linux/memblock.h
@@ -37,12 +37,17 @@ extern unsigned long long max_possible_p
  * @MEMBLOCK_NOMAP: don't add to kernel direct mapping and treat as
  * reserved in the memory map; refer to memblock_mark_nomap() description
  * for further details
+ * @MEMBLOCK_DRIVER_MANAGED: memory region that is always detected and added
+ * via a driver, and never indicated in the firmware-provided memory map as
+ * system RAM. This corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED in the
+ * kernel resource tree.
  */
 enum memblock_flags {
 	MEMBLOCK_NONE		= 0x0,	/* No special request */
 	MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
 	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
 	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
+	MEMBLOCK_DRIVER_MANAGED = 0x8,	/* always detected via a driver */
 };
 
 /**
@@ -213,7 +218,8 @@ static inline void __next_physmem_range(
  */
 #define for_each_mem_range(i, p_start, p_end) \
 	__for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,	\
-			     MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
+			     MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED, \
+			     p_start, p_end, NULL)
 
 /**
  * for_each_mem_range_rev - reverse iterate through memblock areas from
@@ -224,7 +230,8 @@ static inline void __next_physmem_range(
  */
 #define for_each_mem_range_rev(i, p_start, p_end)			\
 	__for_each_mem_range_rev(i, &memblock.memory, NULL, NUMA_NO_NODE, \
-				 MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
+				 MEMBLOCK_HOTPLUG | MEMBLOCK_DRIVER_MANAGED,\
+				 p_start, p_end, NULL)
 
 /**
  * for_each_reserved_mem_range - iterate over all reserved memblock areas
@@ -254,6 +261,11 @@ static inline bool memblock_is_nomap(str
 	return m->flags & MEMBLOCK_NOMAP;
 }
 
+static inline bool memblock_is_driver_managed(struct memblock_region *m)
+{
+	return m->flags & MEMBLOCK_DRIVER_MANAGED;
+}
+
 int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
 			    unsigned long  *end_pfn);
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
--- a/kernel/kexec_file.c~memblock-add-memblock_driver_managed-to-mimic-ioresource_sysram_driver_managed
+++ a/kernel/kexec_file.c
@@ -556,6 +556,11 @@ static int kexec_walk_memblock(struct ke
 	if (kbuf->image->type == KEXEC_TYPE_CRASH)
 		return func(&crashk_res, kbuf);
 
+	/*
+	 * Using MEMBLOCK_NONE will properly skip MEMBLOCK_DRIVER_MANAGED. See
+	 * IORESOURCE_SYSRAM_DRIVER_MANAGED handling in
+	 * locate_mem_hole_callback().
+	 */
 	if (kbuf->top_down) {
 		for_each_free_mem_range_reverse(i, NUMA_NO_NODE, MEMBLOCK_NONE,
 						&mstart, &mend, NULL) {
--- a/mm/memblock.c~memblock-add-memblock_driver_managed-to-mimic-ioresource_sysram_driver_managed
+++ a/mm/memblock.c
@@ -982,6 +982,10 @@ static bool should_skip_region(struct me
 	if (!(flags & MEMBLOCK_NOMAP) && memblock_is_nomap(m))
 		return true;
 
+	/* skip driver-managed memory unless we were asked for it explicitly */
+	if (!(flags & MEMBLOCK_DRIVER_MANAGED) && memblock_is_driver_managed(m))
+		return true;
+
 	return false;
 }
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 196/262] mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED
  2021-11-05 20:34 incoming Andrew Morton
                   ` (194 preceding siblings ...)
  2021-11-05 20:44 ` [patch 195/262] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED Andrew Morton
@ 2021-11-05 20:44 ` Andrew Morton
  2021-11-05 20:45 ` [patch 197/262] mm/rmap.c: avoid double faults migrating device private pages Andrew Morton
                   ` (65 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:44 UTC (permalink / raw)
  To: akpm, aneesh.kumar, arnd, borntraeger, chenhuacai, david,
	ebiederm, geert, gor, hca, Jianyong.Wu, jiaxun.yang, linux-mm,
	mhocko, mm-commits, osalvador, rppt, shahab, torvalds, tsbogend,
	vgupta

From: David Hildenbrand <david@redhat.com>
Subject: mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED

Let's communicate driver-managed regions to memblock, to properly teach
kexec_file with CONFIG_ARCH_KEEP_MEMBLOCK to not place images on these
memory regions.

Link: https://lkml.kernel.org/r/20211004093605.5830-6-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Jianyong Wu <Jianyong.Wu@arm.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shahab Vahedi <shahab@synopsys.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory_hotplug.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/mm/memory_hotplug.c~mm-memory_hotplug-indicate-memblock_driver_managed-with-ioresource_sysram_driver_managed
+++ a/mm/memory_hotplug.c
@@ -1342,6 +1342,7 @@ bool mhp_supports_memmap_on_memory(unsig
 int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 {
 	struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };
+	enum memblock_flags memblock_flags = MEMBLOCK_NONE;
 	struct vmem_altmap mhp_altmap = {};
 	struct memory_group *group = NULL;
 	u64 start, size;
@@ -1370,7 +1371,9 @@ int __ref add_memory_resource(int nid, s
 	mem_hotplug_begin();
 
 	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
-		ret = memblock_add_node(start, size, nid, MEMBLOCK_NONE);
+		if (res->flags & IORESOURCE_SYSRAM_DRIVER_MANAGED)
+			memblock_flags = MEMBLOCK_DRIVER_MANAGED;
+		ret = memblock_add_node(start, size, nid, memblock_flags);
 		if (ret)
 			goto error_mem_hotplug_end;
 	}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 197/262] mm/rmap.c: avoid double faults migrating device private pages
  2021-11-05 20:34 incoming Andrew Morton
                   ` (195 preceding siblings ...)
  2021-11-05 20:44 ` [patch 196/262] mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 198/262] mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration() Andrew Morton
                   ` (64 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, apopple, jglisse, jhubbard, linux-mm, mm-commits,
	rcampbell, torvalds

From: Alistair Popple <apopple@nvidia.com>
Subject: mm/rmap.c: avoid double faults migrating device private pages

During migration special page table entries are installed for each page
being migrated.  These entries store the pfn and associated permissions of
ptes mapping the page being migarted.

Device-private pages use special swap pte entries to distinguish read-only
vs.  writeable pages which the migration code checks when creating
migration entries.  Normally this follows a fast path in
migrate_vma_collect_pmd() which correctly copies the permissions of
device-private pages over to migration entries when migrating pages back
to the CPU.

However the slow-path falls back to using try_to_migrate() which
unconditionally creates read-only migration entries for device-private
pages.  This leads to unnecessary double faults on the CPU as the new
pages are always mapped read-only even when they could be mapped
writeable.  Fix this by correctly copying device-private permissions in
try_to_migrate_one().

Link: https://lkml.kernel.org/r/20211018045247.3128058-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reported-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/rmap.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/mm/rmap.c~mm-rmapc-avoid-double-faults-migrating-device-private-pages
+++ a/mm/rmap.c
@@ -1807,6 +1807,7 @@ static bool try_to_migrate_one(struct pa
 		update_hiwater_rss(mm);
 
 		if (is_zone_device_page(page)) {
+			unsigned long pfn = page_to_pfn(page);
 			swp_entry_t entry;
 			pte_t swp_pte;
 
@@ -1815,8 +1816,11 @@ static bool try_to_migrate_one(struct pa
 			 * pte. do_swap_page() will wait until the migration
 			 * pte is removed and then restart fault handling.
 			 */
-			entry = make_readable_migration_entry(
-							page_to_pfn(page));
+			entry = pte_to_swp_entry(pteval);
+			if (is_writable_device_private_entry(entry))
+				entry = make_writable_migration_entry(pfn);
+			else
+				entry = make_readable_migration_entry(pfn);
 			swp_pte = swp_entry_to_pte(entry);
 
 			/*
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 198/262] mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (196 preceding siblings ...)
  2021-11-05 20:45 ` [patch 197/262] mm/rmap.c: avoid double faults migrating device private pages Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 199/262] mm/highmem: remove deprecated kmap_atomic Andrew Morton
                   ` (63 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, henryburns, linmiaohe, linux-mm, minchan, mm-commits,
	senozhatsky, torvalds

From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()

There is one possible race window between zs_pool_dec_isolated() and
zs_unregister_migration() because wait_for_isolated_drain() checks the
isolated count without holding class->lock and there is no order inside
zs_pool_dec_isolated().  Thus the below race window could be possible:

zs_pool_dec_isolated		zs_unregister_migration
  check pool->destroying != 0
				  pool->destroying = true;
				  smp_mb();
				  wait_for_isolated_drain()
				    wait for pool->isolated_pages == 0
  atomic_long_dec(&pool->isolated_pages);
  atomic_long_read(&pool->isolated_pages) == 0

Since we observe the pool->destroying (false) before atomic_long_dec() for
pool->isolated_pages, waking pool->migration_wait up is missed.

Fix this by ensure checking pool->destroying happens after the
atomic_long_dec(&pool->isolated_pages).

Link: https://lkml.kernel.org/r/20210708115027.7557-1-linmiaohe@huawei.com
Fixes: 701d678599d0 ("mm/zsmalloc.c: fix race condition in zs_destroy_pool")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Henry Burns <henryburns@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/zsmalloc.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/mm/zsmalloc.c~mm-zsmallocc-close-race-window-between-zs_pool_dec_isolated-and-zs_unregister_migration
+++ a/mm/zsmalloc.c
@@ -1830,10 +1830,11 @@ static inline void zs_pool_dec_isolated(
 	VM_BUG_ON(atomic_long_read(&pool->isolated_pages) <= 0);
 	atomic_long_dec(&pool->isolated_pages);
 	/*
-	 * There's no possibility of racing, since wait_for_isolated_drain()
-	 * checks the isolated count under &class->lock after enqueuing
-	 * on migration_wait.
+	 * Checking pool->destroying must happen after atomic_long_dec()
+	 * for pool->isolated_pages above. Paired with the smp_mb() in
+	 * zs_unregister_migration().
 	 */
+	smp_mb__after_atomic();
 	if (atomic_long_read(&pool->isolated_pages) == 0 && pool->destroying)
 		wake_up_all(&pool->migration_wait);
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 199/262] mm/highmem: remove deprecated kmap_atomic
  2021-11-05 20:34 incoming Andrew Morton
                   ` (197 preceding siblings ...)
  2021-11-05 20:45 ` [patch 198/262] mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration() Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 200/262] zram_drv: allow reclaim on bio_alloc Andrew Morton
                   ` (62 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, ira.weiny, linux-mm, mm-commits, peterz, prathu.baronia,
	rdunlap, tglx, torvalds, willy

From: Ira Weiny <ira.weiny@intel.com>
Subject: mm/highmem: remove deprecated kmap_atomic

kmap_atomic() is being deprecated in favor of kmap_local_page().

Replace the uses of kmap_atomic() within the highmem code.

On profiling clear_huge_page() using ftrace an improvement of 62% was
observed on the below setup.

Setup:-
Below data has been collected on Qualcomm's SM7250 SoC THP enabled
(kernel v4.19.113) with only CPU-0(Cortex-A55) and CPU-7(Cortex-A76)
switched on and set to max frequency, also DDR set to perf governor.

FTRACE Data:-

Base data:-
Number of iterations: 48
Mean of allocation time: 349.5 us
std deviation: 74.5 us

v4 data:-
Number of iterations: 48
Mean of allocation time: 131 us
std deviation: 32.7 us

The following simple userspace experiment to allocate
100MB(BUF_SZ) of pages and writing to it gave us a good insight,
we observed an improvement of 42% in allocation and writing timings.
-------------------------------------------------------------
Test code snippet
-------------------------------------------------------------
      clock_start();
      buf = malloc(BUF_SZ); /* Allocate 100 MB of memory */

        for(i=0; i < BUF_SZ_PAGES; i++)
        {
                *((int *)(buf + (i*PAGE_SIZE))) = 1;
        }
      clock_end();
-------------------------------------------------------------

Malloc test timings for 100MB anon allocation:-

Base data:-
Number of iterations: 100
Mean of allocation time: 31831 us
std deviation: 4286 us

v4 data:-
Number of iterations: 100
Mean of allocation time: 18193 us
std deviation: 4915 us

[willy@infradead.org: fix zero_user_segments()]
  Link: https://lkml.kernel.org/r/YYVhHCJcm2DM2G9u@casper.infradead.org
Link: https://lkml.kernel.org/r/20210204073255.20769-2-prathu.baronia@oneplus.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Prathu Baronia <prathu.baronia@oneplus.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/highmem.h |   28 ++++++++++++++--------------
 mm/highmem.c            |    6 +++---
 2 files changed, 17 insertions(+), 17 deletions(-)

--- a/include/linux/highmem.h~mm-highmem-remove-deprecated-kmap_atomic
+++ a/include/linux/highmem.h
@@ -143,9 +143,9 @@ static inline void invalidate_kernel_vma
 #ifndef clear_user_highpage
 static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
 {
-	void *addr = kmap_atomic(page);
+	void *addr = kmap_local_page(page);
 	clear_user_page(addr, vaddr, page);
-	kunmap_atomic(addr);
+	kunmap_local(addr);
 }
 #endif
 
@@ -177,9 +177,9 @@ alloc_zeroed_user_highpage_movable(struc
 
 static inline void clear_highpage(struct page *page)
 {
-	void *kaddr = kmap_atomic(page);
+	void *kaddr = kmap_local_page(page);
 	clear_page(kaddr);
-	kunmap_atomic(kaddr);
+	kunmap_local(kaddr);
 }
 
 #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGE
@@ -202,7 +202,7 @@ static inline void zero_user_segments(st
 		unsigned start1, unsigned end1,
 		unsigned start2, unsigned end2)
 {
-	void *kaddr = kmap_atomic(page);
+	void *kaddr = kmap_local_page(page);
 	unsigned int i;
 
 	BUG_ON(end1 > page_size(page) || end2 > page_size(page));
@@ -213,7 +213,7 @@ static inline void zero_user_segments(st
 	if (end2 > start2)
 		memset(kaddr + start2, 0, end2 - start2);
 
-	kunmap_atomic(kaddr);
+	kunmap_local(kaddr);
 	for (i = 0; i < compound_nr(page); i++)
 		flush_dcache_page(page + i);
 }
@@ -238,11 +238,11 @@ static inline void copy_user_highpage(st
 {
 	char *vfrom, *vto;
 
-	vfrom = kmap_atomic(from);
-	vto = kmap_atomic(to);
+	vfrom = kmap_local_page(from);
+	vto = kmap_local_page(to);
 	copy_user_page(vto, vfrom, vaddr, to);
-	kunmap_atomic(vto);
-	kunmap_atomic(vfrom);
+	kunmap_local(vto);
+	kunmap_local(vfrom);
 }
 
 #endif
@@ -253,11 +253,11 @@ static inline void copy_highpage(struct
 {
 	char *vfrom, *vto;
 
-	vfrom = kmap_atomic(from);
-	vto = kmap_atomic(to);
+	vfrom = kmap_local_page(from);
+	vto = kmap_local_page(to);
 	copy_page(vto, vfrom);
-	kunmap_atomic(vto);
-	kunmap_atomic(vfrom);
+	kunmap_local(vto);
+	kunmap_local(vfrom);
 }
 
 #endif
--- a/mm/highmem.c~mm-highmem-remove-deprecated-kmap_atomic
+++ a/mm/highmem.c
@@ -383,7 +383,7 @@ void zero_user_segments(struct page *pag
 			unsigned this_end = min_t(unsigned, end1, PAGE_SIZE);
 
 			if (end1 > start1) {
-				kaddr = kmap_atomic(page + i);
+				kaddr = kmap_local_page(page + i);
 				memset(kaddr + start1, 0, this_end - start1);
 			}
 			end1 -= this_end;
@@ -398,7 +398,7 @@ void zero_user_segments(struct page *pag
 
 			if (end2 > start2) {
 				if (!kaddr)
-					kaddr = kmap_atomic(page + i);
+					kaddr = kmap_local_page(page + i);
 				memset(kaddr + start2, 0, this_end - start2);
 			}
 			end2 -= this_end;
@@ -406,7 +406,7 @@ void zero_user_segments(struct page *pag
 		}
 
 		if (kaddr) {
-			kunmap_atomic(kaddr);
+			kunmap_local(kaddr);
 			flush_dcache_page(page + i);
 		}
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 200/262] zram_drv: allow reclaim on bio_alloc
  2021-11-05 20:34 incoming Andrew Morton
                   ` (198 preceding siblings ...)
  2021-11-05 20:45 ` [patch 199/262] mm/highmem: remove deprecated kmap_atomic Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 201/262] zram: off by one in read_block_state() Andrew Morton
                   ` (61 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, jaewon31.kim, linux-mm, minchan, mm-commits, torvalds, ytk.lee

From: Jaewon Kim <jaewon31.kim@samsung.com>
Subject: zram_drv: allow reclaim on bio_alloc

The read_from_bdev_async is not called on atomic context.  So GFP_NOIO is
available rather than GFP_ATOMIC.  If there were reclaimable pages with
GFP_NOIO, we can avoid allocation failure and page fault failure.

Link: https://lkml.kernel.org/r/20210908005241.28062-1-jaewon31.kim@samsung.com
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Reported-by: Yong-Taek Lee <ytk.lee@samsung.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/block/zram/zram_drv.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/block/zram/zram_drv.c~zram_drv-allow-reclaim-on-bio_alloc
+++ a/drivers/block/zram/zram_drv.c
@@ -587,7 +587,7 @@ static int read_from_bdev_async(struct z
 {
 	struct bio *bio;
 
-	bio = bio_alloc(GFP_ATOMIC, 1);
+	bio = bio_alloc(GFP_NOIO, 1);
 	if (!bio)
 		return -ENOMEM;
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 201/262] zram: off by one in read_block_state()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (199 preceding siblings ...)
  2021-11-05 20:45 ` [patch 200/262] zram_drv: allow reclaim on bio_alloc Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 202/262] zram: introduce an aged idle interface Andrew Morton
                   ` (60 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, dan.carpenter, linux-mm, minchan, mm-commits, senozhatsky,
	torvalds

From: Dan Carpenter <dan.carpenter@oracle.com>
Subject: zram: off by one in read_block_state()

snprintf() returns the number of bytes it would have printed if there were
space.  But it does not count the NUL terminator.  So that means that if
"count == copied" then this has already overflowed by one character.

This bug likely isn't super harmful in real life.

Link: https://lkml.kernel.org/r/20210916130404.GA25094@kili
Fixes: c0265342bff4 ("zram: introduce zram memory tracking")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/block/zram/zram_drv.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/block/zram/zram_drv.c~zram-off-by-one-in-read_block_state
+++ a/drivers/block/zram/zram_drv.c
@@ -910,7 +910,7 @@ static ssize_t read_block_state(struct f
 			zram_test_flag(zram, index, ZRAM_HUGE) ? 'h' : '.',
 			zram_test_flag(zram, index, ZRAM_IDLE) ? 'i' : '.');
 
-		if (count < copied) {
+		if (count <= copied) {
 			zram_slot_unlock(zram, index);
 			break;
 		}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 202/262] zram: introduce an aged idle interface
  2021-11-05 20:34 incoming Andrew Morton
                   ` (200 preceding siblings ...)
  2021-11-05 20:45 ` [patch 201/262] zram: off by one in read_block_state() Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 203/262] mm: remove HARDENED_USERCOPY_FALLBACK Andrew Morton
                   ` (59 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, bgeffon, corbet, jsbarnes, linux-mm, minchan, mm-commits,
	ngupta, senozhatsky, suleiman, torvalds

From: Brian Geffon <bgeffon@google.com>
Subject: zram: introduce an aged idle interface

This change introduces an aged idle interface to the existing idle sysfs
file for zram.

When CONFIG_ZRAM_MEMORY_TRACKING is enabled the idle file now also accepts
an integer argument.  This integer is the age (in seconds) of pages to
mark as idle.  The idle file still supports 'all' as it always has.  This
new approach allows for much more control over which pages get marked as
idle.

[bgeffon@google.com: use IS_ENABLED and cleanup comment]
  Link: https://lkml.kernel.org/r/20210924161128.1508015-1-bgeffon@google.com
[bgeffon@google.com: Sergey's cleanup suggestions]
  Link: https://lkml.kernel.org/r/20210929143056.13067-1-bgeffon@google.com
Link: https://lkml.kernel.org/r/20210923130115.1344361-1-bgeffon@google.com
Signed-off-by: Brian Geffon <bgeffon@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Jesse Barnes <jsbarnes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/blockdev/zram.rst |    8 ++
 drivers/block/zram/zram_drv.c               |   62 +++++++++++++-----
 2 files changed, 54 insertions(+), 16 deletions(-)

--- a/Documentation/admin-guide/blockdev/zram.rst~zram-introduce-an-aged-idle-interface
+++ a/Documentation/admin-guide/blockdev/zram.rst
@@ -328,6 +328,14 @@ as idle::
 From now on, any pages on zram are idle pages. The idle mark
 will be removed until someone requests access of the block.
 IOW, unless there is access request, those pages are still idle pages.
+Additionally, when CONFIG_ZRAM_MEMORY_TRACKING is enabled pages can be
+marked as idle based on how long (in seconds) it's been since they were
+last accessed::
+
+        echo 86400 > /sys/block/zramX/idle
+
+In this example all pages which haven't been accessed in more than 86400
+seconds (one day) will be marked idle.
 
 Admin can request writeback of those idle pages at right timing via::
 
--- a/drivers/block/zram/zram_drv.c~zram-introduce-an-aged-idle-interface
+++ a/drivers/block/zram/zram_drv.c
@@ -291,22 +291,16 @@ static ssize_t mem_used_max_store(struct
 	return len;
 }
 
-static ssize_t idle_store(struct device *dev,
-		struct device_attribute *attr, const char *buf, size_t len)
+/*
+ * Mark all pages which are older than or equal to cutoff as IDLE.
+ * Callers should hold the zram init lock in read mode
+ */
+static void mark_idle(struct zram *zram, ktime_t cutoff)
 {
-	struct zram *zram = dev_to_zram(dev);
+	int is_idle = 1;
 	unsigned long nr_pages = zram->disksize >> PAGE_SHIFT;
 	int index;
 
-	if (!sysfs_streq(buf, "all"))
-		return -EINVAL;
-
-	down_read(&zram->init_lock);
-	if (!init_done(zram)) {
-		up_read(&zram->init_lock);
-		return -EINVAL;
-	}
-
 	for (index = 0; index < nr_pages; index++) {
 		/*
 		 * Do not mark ZRAM_UNDER_WB slot as ZRAM_IDLE to close race.
@@ -314,14 +308,50 @@ static ssize_t idle_store(struct device
 		 */
 		zram_slot_lock(zram, index);
 		if (zram_allocated(zram, index) &&
-				!zram_test_flag(zram, index, ZRAM_UNDER_WB))
-			zram_set_flag(zram, index, ZRAM_IDLE);
+				!zram_test_flag(zram, index, ZRAM_UNDER_WB)) {
+#ifdef CONFIG_ZRAM_MEMORY_TRACKING
+			is_idle = !cutoff || ktime_after(cutoff, zram->table[index].ac_time);
+#endif
+			if (is_idle)
+				zram_set_flag(zram, index, ZRAM_IDLE);
+		}
 		zram_slot_unlock(zram, index);
 	}
+}
 
-	up_read(&zram->init_lock);
+static ssize_t idle_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t len)
+{
+	struct zram *zram = dev_to_zram(dev);
+	ktime_t cutoff_time = 0;
+	ssize_t rv = -EINVAL;
 
-	return len;
+	if (!sysfs_streq(buf, "all")) {
+		/*
+		 * If it did not parse as 'all' try to treat it as an integer when
+		 * we have memory tracking enabled.
+		 */
+		u64 age_sec;
+
+		if (IS_ENABLED(CONFIG_ZRAM_MEMORY_TRACKING) && !kstrtoull(buf, 0, &age_sec))
+			cutoff_time = ktime_sub(ktime_get_boottime(),
+					ns_to_ktime(age_sec * NSEC_PER_SEC));
+		else
+			goto out;
+	}
+
+	down_read(&zram->init_lock);
+	if (!init_done(zram))
+		goto out_unlock;
+
+	/* A cutoff_time of 0 marks everything as idle, this is the "all" behavior */
+	mark_idle(zram, cutoff_time);
+	rv = len;
+
+out_unlock:
+	up_read(&zram->init_lock);
+out:
+	return rv;
 }
 
 #ifdef CONFIG_ZRAM_WRITEBACK
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 203/262] mm: remove HARDENED_USERCOPY_FALLBACK
  2021-11-05 20:34 incoming Andrew Morton
                   ` (201 preceding siblings ...)
  2021-11-05 20:45 ` [patch 202/262] zram: introduce an aged idle interface Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 204/262] include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h Andrew Morton
                   ` (58 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, cl, iamjoonsoo.kim, jmorris, joel, keescook, linux-mm,
	mm-commits, penberg, rientjes, serge, steve, torvalds, vbabka

From: Stephen Kitt <steve@sk2.org>
Subject: mm: remove HARDENED_USERCOPY_FALLBACK

This has served its purpose and is no longer used.  All usercopy
violations appear to have been handled by now, any remaining instances (or
new bugs) will cause copies to be rejected.

This isn't a direct revert of commit 2d891fbc3bb6 ("usercopy: Allow strict
enforcement of whitelists"); since usercopy_fallback is effectively 0, the
fallback handling is removed too.

This also removes the usercopy_fallback module parameter on slab_common.

Link: https://github.com/KSPP/linux/issues/153
Link: https://lkml.kernel.org/r/20210921061149.1091163-1-steve@sk2.org
Signed-off-by: Stephen Kitt <steve@sk2.org>
Suggested-by: Kees Cook <keescook@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>	[defconfig change]
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E . Hallyn" <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/configs/skiroot_defconfig |    1 -
 include/linux/slab.h                   |    2 --
 mm/slab.c                              |   13 -------------
 mm/slab_common.c                       |    8 --------
 mm/slub.c                              |   14 --------------
 security/Kconfig                       |   14 --------------
 6 files changed, 52 deletions(-)

--- a/arch/powerpc/configs/skiroot_defconfig~mm-remove-hardened_usercopy_fallback
+++ a/arch/powerpc/configs/skiroot_defconfig
@@ -275,7 +275,6 @@ CONFIG_NLS_UTF8=y
 CONFIG_ENCRYPTED_KEYS=y
 CONFIG_SECURITY=y
 CONFIG_HARDENED_USERCOPY=y
-# CONFIG_HARDENED_USERCOPY_FALLBACK is not set
 CONFIG_HARDENED_USERCOPY_PAGESPAN=y
 CONFIG_FORTIFY_SOURCE=y
 CONFIG_SECURITY_LOCKDOWN_LSM=y
--- a/include/linux/slab.h~mm-remove-hardened_usercopy_fallback
+++ a/include/linux/slab.h
@@ -142,8 +142,6 @@ struct mem_cgroup;
 void __init kmem_cache_init(void);
 bool slab_is_available(void);
 
-extern bool usercopy_fallback;
-
 struct kmem_cache *kmem_cache_create(const char *name, unsigned int size,
 			unsigned int align, slab_flags_t flags,
 			void (*ctor)(void *));
--- a/mm/slab.c~mm-remove-hardened_usercopy_fallback
+++ a/mm/slab.c
@@ -4204,19 +4204,6 @@ void __check_heap_object(const void *ptr
 	    n <= cachep->useroffset - offset + cachep->usersize)
 		return;
 
-	/*
-	 * If the copy is still within the allocated object, produce
-	 * a warning instead of rejecting the copy. This is intended
-	 * to be a temporary method to find any missing usercopy
-	 * whitelists.
-	 */
-	if (usercopy_fallback &&
-	    offset <= cachep->object_size &&
-	    n <= cachep->object_size - offset) {
-		usercopy_warn("SLAB object", cachep->name, to_user, offset, n);
-		return;
-	}
-
 	usercopy_abort("SLAB object", cachep->name, to_user, offset, n);
 }
 #endif /* CONFIG_HARDENED_USERCOPY */
--- a/mm/slab_common.c~mm-remove-hardened_usercopy_fallback
+++ a/mm/slab_common.c
@@ -37,14 +37,6 @@ LIST_HEAD(slab_caches);
 DEFINE_MUTEX(slab_mutex);
 struct kmem_cache *kmem_cache;
 
-#ifdef CONFIG_HARDENED_USERCOPY
-bool usercopy_fallback __ro_after_init =
-		IS_ENABLED(CONFIG_HARDENED_USERCOPY_FALLBACK);
-module_param(usercopy_fallback, bool, 0400);
-MODULE_PARM_DESC(usercopy_fallback,
-		"WARN instead of reject usercopy whitelist violations");
-#endif
-
 static LIST_HEAD(slab_caches_to_rcu_destroy);
 static void slab_caches_to_rcu_destroy_workfn(struct work_struct *work);
 static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
--- a/mm/slub.c~mm-remove-hardened_usercopy_fallback
+++ a/mm/slub.c
@@ -4489,7 +4489,6 @@ void __check_heap_object(const void *ptr
 {
 	struct kmem_cache *s;
 	unsigned int offset;
-	size_t object_size;
 	bool is_kfence = is_kfence_address(ptr);
 
 	ptr = kasan_reset_tag(ptr);
@@ -4522,19 +4521,6 @@ void __check_heap_object(const void *ptr
 	    n <= s->useroffset - offset + s->usersize)
 		return;
 
-	/*
-	 * If the copy is still within the allocated object, produce
-	 * a warning instead of rejecting the copy. This is intended
-	 * to be a temporary method to find any missing usercopy
-	 * whitelists.
-	 */
-	object_size = slab_ksize(s);
-	if (usercopy_fallback &&
-	    offset <= object_size && n <= object_size - offset) {
-		usercopy_warn("SLUB object", s->name, to_user, offset, n);
-		return;
-	}
-
 	usercopy_abort("SLUB object", s->name, to_user, offset, n);
 }
 #endif /* CONFIG_HARDENED_USERCOPY */
--- a/security/Kconfig~mm-remove-hardened_usercopy_fallback
+++ a/security/Kconfig
@@ -163,20 +163,6 @@ config HARDENED_USERCOPY
 	  or are part of the kernel text. This kills entire classes
 	  of heap overflow exploits and similar kernel memory exposures.
 
-config HARDENED_USERCOPY_FALLBACK
-	bool "Allow usercopy whitelist violations to fallback to object size"
-	depends on HARDENED_USERCOPY
-	default y
-	help
-	  This is a temporary option that allows missing usercopy whitelists
-	  to be discovered via a WARN() to the kernel log, instead of
-	  rejecting the copy, falling back to non-whitelisted hardened
-	  usercopy that checks the slab allocation size instead of the
-	  whitelist size. This option will be removed once it seems like
-	  all missing usercopy whitelists have been identified and fixed.
-	  Booting with "slab_common.usercopy_fallback=Y/N" can change
-	  this setting.
-
 config HARDENED_USERCOPY_PAGESPAN
 	bool "Refuse to copy allocations that span multiple pages"
 	depends on HARDENED_USERCOPY
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 204/262] include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h
  2021-11-05 20:34 incoming Andrew Morton
                   ` (202 preceding siblings ...)
  2021-11-05 20:45 ` [patch 203/262] mm: remove HARDENED_USERCOPY_FALLBACK Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 205/262] stacktrace: move filter_irq_stacks() to kernel/stacktrace.c Andrew Morton
                   ` (57 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, davem, horms, kuba, linux-mm, liumh1, marcelo.leitner,
	mm-commits, pshelar, torvalds, ulf.hansson, vyasevich, willy

From: Mianhan Liu <liumh1@shanghaitech.edu.cn>
Subject: include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h

nr_free_buffer_pages could be exposed through mm.h instead of swap.h.  The
advantage of this change is that it can reduce the obsolete includes.  For
example, net/ipv4/tcp.c wouldn't need swap.h any more since it has already
included mm.h.  Similarly, after checking all the other files, it comes
that tcp.c, udp.c meter.c ,...  follow the same rule, so these files can
have swap.h removed too.

Moreover, after preprocessing all the files that use nr_free_buffer_pages,
it turns out that those files have already included mm.h.Thus, we can move
nr_free_buffer_pages from swap.h to mm.h safely.  This change will not
affect the compilation of other files.

Link: https://lkml.kernel.org/r/20210912133640.1624-1-liumh1@shanghaitech.edu.cn
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn>
Cc: Jakub Kicinski <kuba@kernel.org>
CC: Ulf Hansson <ulf.hansson@linaro.org>
Cc: "David S . Miller" <davem@davemloft.net>
Cc: Simon Horman <horms@verge.net.au>
Cc: Pravin B Shelar <pshelar@ovn.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/mmc/core/mmc_test.c    |    1 -
 include/linux/mm.h             |    2 ++
 include/linux/swap.h           |    1 -
 net/ipv4/tcp.c                 |    1 -
 net/ipv4/udp.c                 |    1 -
 net/netfilter/ipvs/ip_vs_ctl.c |    1 -
 net/openvswitch/meter.c        |    1 -
 net/sctp/protocol.c            |    1 -
 8 files changed, 2 insertions(+), 7 deletions(-)

--- a/drivers/mmc/core/mmc_test.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh
+++ a/drivers/mmc/core/mmc_test.c
@@ -10,7 +10,6 @@
 #include <linux/slab.h>
 
 #include <linux/scatterlist.h>
-#include <linux/swap.h>		/* For nr_free_buffer_pages() */
 #include <linux/list.h>
 
 #include <linux/debugfs.h>
--- a/include/linux/mm.h~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh
+++ a/include/linux/mm.h
@@ -875,6 +875,8 @@ void put_pages_list(struct list_head *pa
 void split_page(struct page *page, unsigned int order);
 void copy_huge_page(struct page *dst, struct page *src);
 
+unsigned long nr_free_buffer_pages(void);
+
 /*
  * Compound pages have a destructor function.  Provide a
  * prototype for that function and accessor functions.
--- a/include/linux/swap.h~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh
+++ a/include/linux/swap.h
@@ -335,7 +335,6 @@ void workingset_update_node(struct xa_no
 
 /* linux/mm/page_alloc.c */
 extern unsigned long totalreserve_pages;
-extern unsigned long nr_free_buffer_pages(void);
 
 /* Definition of global_zone_page_state not available yet */
 #define nr_free_pages() global_zone_page_state(NR_FREE_PAGES)
--- a/net/ipv4/tcp.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh
+++ a/net/ipv4/tcp.c
@@ -260,7 +260,6 @@
 #include <linux/random.h>
 #include <linux/memblock.h>
 #include <linux/highmem.h>
-#include <linux/swap.h>
 #include <linux/cache.h>
 #include <linux/err.h>
 #include <linux/time.h>
--- a/net/ipv4/udp.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh
+++ a/net/ipv4/udp.c
@@ -78,7 +78,6 @@
 #include <asm/ioctls.h>
 #include <linux/memblock.h>
 #include <linux/highmem.h>
-#include <linux/swap.h>
 #include <linux/types.h>
 #include <linux/fcntl.h>
 #include <linux/module.h>
--- a/net/netfilter/ipvs/ip_vs_ctl.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh
+++ a/net/netfilter/ipvs/ip_vs_ctl.c
@@ -24,7 +24,6 @@
 #include <linux/sysctl.h>
 #include <linux/proc_fs.h>
 #include <linux/workqueue.h>
-#include <linux/swap.h>
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 
--- a/net/openvswitch/meter.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh
+++ a/net/openvswitch/meter.c
@@ -12,7 +12,6 @@
 #include <linux/openvswitch.h>
 #include <linux/netlink.h>
 #include <linux/rculist.h>
-#include <linux/swap.h>
 
 #include <net/netlink.h>
 #include <net/genetlink.h>
--- a/net/sctp/protocol.c~include-linux-mmh-move-nr_free_buffer_pages-from-swaph-to-mmh
+++ a/net/sctp/protocol.c
@@ -33,7 +33,6 @@
 #include <linux/seq_file.h>
 #include <linux/memblock.h>
 #include <linux/highmem.h>
-#include <linux/swap.h>
 #include <linux/slab.h>
 #include <net/net_namespace.h>
 #include <net/protocol.h>
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 205/262] stacktrace: move filter_irq_stacks() to kernel/stacktrace.c
  2021-11-05 20:34 incoming Andrew Morton
                   ` (203 preceding siblings ...)
  2021-11-05 20:45 ` [patch 204/262] include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 206/262] kfence: count unexpectedly skipped allocations Andrew Morton
                   ` (56 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, jannh, linux-mm, mm-commits,
	nogikh, tarasmadan, torvalds

From: Marco Elver <elver@google.com>
Subject: stacktrace: move filter_irq_stacks() to kernel/stacktrace.c

filter_irq_stacks() has little to do with the stackdepot implementation,
except that it is usually used by users (such as KASAN) of stackdepot to
reduce the stack trace.

However, filter_irq_stacks() itself is not useful without a stack trace as
obtained by stack_trace_save() and friends.

Therefore, move filter_irq_stacks() to kernel/stacktrace.c, so that new
users of filter_irq_stacks() do not have to start depending on STACKDEPOT
only for filter_irq_stacks().

Link: https://lkml.kernel.org/r/20210923104803.2620285-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Taras Madan <tarasmadan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/stackdepot.h |    2 --
 include/linux/stacktrace.h |    1 +
 kernel/stacktrace.c        |   30 ++++++++++++++++++++++++++++++
 lib/stackdepot.c           |   24 ------------------------
 4 files changed, 31 insertions(+), 26 deletions(-)

--- a/include/linux/stackdepot.h~stacktrace-move-filter_irq_stacks-to-kernel-stacktracec
+++ a/include/linux/stackdepot.h
@@ -25,8 +25,6 @@ depot_stack_handle_t stack_depot_save(un
 unsigned int stack_depot_fetch(depot_stack_handle_t handle,
 			       unsigned long **entries);
 
-unsigned int filter_irq_stacks(unsigned long *entries, unsigned int nr_entries);
-
 #ifdef CONFIG_STACKDEPOT
 int stack_depot_init(void);
 #else
--- a/include/linux/stacktrace.h~stacktrace-move-filter_irq_stacks-to-kernel-stacktracec
+++ a/include/linux/stacktrace.h
@@ -21,6 +21,7 @@ unsigned int stack_trace_save_tsk(struct
 unsigned int stack_trace_save_regs(struct pt_regs *regs, unsigned long *store,
 				   unsigned int size, unsigned int skipnr);
 unsigned int stack_trace_save_user(unsigned long *store, unsigned int size);
+unsigned int filter_irq_stacks(unsigned long *entries, unsigned int nr_entries);
 
 /* Internal interfaces. Do not use in generic code */
 #ifdef CONFIG_ARCH_STACKWALK
--- a/kernel/stacktrace.c~stacktrace-move-filter_irq_stacks-to-kernel-stacktracec
+++ a/kernel/stacktrace.c
@@ -13,6 +13,7 @@
 #include <linux/export.h>
 #include <linux/kallsyms.h>
 #include <linux/stacktrace.h>
+#include <linux/interrupt.h>
 
 /**
  * stack_trace_print - Print the entries in the stack trace
@@ -373,3 +374,32 @@ unsigned int stack_trace_save_user(unsig
 #endif /* CONFIG_USER_STACKTRACE_SUPPORT */
 
 #endif /* !CONFIG_ARCH_STACKWALK */
+
+static inline bool in_irqentry_text(unsigned long ptr)
+{
+	return (ptr >= (unsigned long)&__irqentry_text_start &&
+		ptr < (unsigned long)&__irqentry_text_end) ||
+		(ptr >= (unsigned long)&__softirqentry_text_start &&
+		 ptr < (unsigned long)&__softirqentry_text_end);
+}
+
+/**
+ * filter_irq_stacks - Find first IRQ stack entry in trace
+ * @entries:	Pointer to stack trace array
+ * @nr_entries:	Number of entries in the storage array
+ *
+ * Return: Number of trace entries until IRQ stack starts.
+ */
+unsigned int filter_irq_stacks(unsigned long *entries, unsigned int nr_entries)
+{
+	unsigned int i;
+
+	for (i = 0; i < nr_entries; i++) {
+		if (in_irqentry_text(entries[i])) {
+			/* Include the irqentry function into the stack. */
+			return i + 1;
+		}
+	}
+	return nr_entries;
+}
+EXPORT_SYMBOL_GPL(filter_irq_stacks);
--- a/lib/stackdepot.c~stacktrace-move-filter_irq_stacks-to-kernel-stacktracec
+++ a/lib/stackdepot.c
@@ -20,7 +20,6 @@
  */
 
 #include <linux/gfp.h>
-#include <linux/interrupt.h>
 #include <linux/jhash.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
@@ -371,26 +370,3 @@ depot_stack_handle_t stack_depot_save(un
 	return __stack_depot_save(entries, nr_entries, alloc_flags, true);
 }
 EXPORT_SYMBOL_GPL(stack_depot_save);
-
-static inline int in_irqentry_text(unsigned long ptr)
-{
-	return (ptr >= (unsigned long)&__irqentry_text_start &&
-		ptr < (unsigned long)&__irqentry_text_end) ||
-		(ptr >= (unsigned long)&__softirqentry_text_start &&
-		 ptr < (unsigned long)&__softirqentry_text_end);
-}
-
-unsigned int filter_irq_stacks(unsigned long *entries,
-					     unsigned int nr_entries)
-{
-	unsigned int i;
-
-	for (i = 0; i < nr_entries; i++) {
-		if (in_irqentry_text(entries[i])) {
-			/* Include the irqentry function into the stack. */
-			return i + 1;
-		}
-	}
-	return nr_entries;
-}
-EXPORT_SYMBOL_GPL(filter_irq_stacks);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 206/262] kfence: count unexpectedly skipped allocations
  2021-11-05 20:34 incoming Andrew Morton
                   ` (204 preceding siblings ...)
  2021-11-05 20:45 ` [patch 205/262] stacktrace: move filter_irq_stacks() to kernel/stacktrace.c Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 207/262] kfence: move saving stack trace of allocations into __kfence_alloc() Andrew Morton
                   ` (55 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, jannh, linux-mm, mm-commits,
	nogikh, tarasmadan, torvalds

From: Marco Elver <elver@google.com>
Subject: kfence: count unexpectedly skipped allocations

Maintain a counter to count allocations that are skipped due to being
incompatible (oversized, incompatible gfp flags) or no capacity.

This is to compute the fraction of allocations that could not be serviced
by KFENCE, which we expect to be rare.

Link: https://lkml.kernel.org/r/20210923104803.2620285-2-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Taras Madan <tarasmadan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kfence/core.c |   16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

--- a/mm/kfence/core.c~kfence-count-unexpectedly-skipped-allocations
+++ a/mm/kfence/core.c
@@ -112,6 +112,8 @@ enum kfence_counter_id {
 	KFENCE_COUNTER_FREES,
 	KFENCE_COUNTER_ZOMBIES,
 	KFENCE_COUNTER_BUGS,
+	KFENCE_COUNTER_SKIP_INCOMPAT,
+	KFENCE_COUNTER_SKIP_CAPACITY,
 	KFENCE_COUNTER_COUNT,
 };
 static atomic_long_t counters[KFENCE_COUNTER_COUNT];
@@ -121,6 +123,8 @@ static const char *const counter_names[]
 	[KFENCE_COUNTER_FREES]		= "total frees",
 	[KFENCE_COUNTER_ZOMBIES]	= "zombie allocations",
 	[KFENCE_COUNTER_BUGS]		= "total bugs",
+	[KFENCE_COUNTER_SKIP_INCOMPAT]	= "skipped allocations (incompatible)",
+	[KFENCE_COUNTER_SKIP_CAPACITY]	= "skipped allocations (capacity)",
 };
 static_assert(ARRAY_SIZE(counter_names) == KFENCE_COUNTER_COUNT);
 
@@ -271,8 +275,10 @@ static void *kfence_guarded_alloc(struct
 		list_del_init(&meta->list);
 	}
 	raw_spin_unlock_irqrestore(&kfence_freelist_lock, flags);
-	if (!meta)
+	if (!meta) {
+		atomic_long_inc(&counters[KFENCE_COUNTER_SKIP_CAPACITY]);
 		return NULL;
+	}
 
 	if (unlikely(!raw_spin_trylock_irqsave(&meta->lock, flags))) {
 		/*
@@ -740,8 +746,10 @@ void *__kfence_alloc(struct kmem_cache *
 	 * Perform size check before switching kfence_allocation_gate, so that
 	 * we don't disable KFENCE without making an allocation.
 	 */
-	if (size > PAGE_SIZE)
+	if (size > PAGE_SIZE) {
+		atomic_long_inc(&counters[KFENCE_COUNTER_SKIP_INCOMPAT]);
 		return NULL;
+	}
 
 	/*
 	 * Skip allocations from non-default zones, including DMA. We cannot
@@ -749,8 +757,10 @@ void *__kfence_alloc(struct kmem_cache *
 	 * properties (e.g. reside in DMAable memory).
 	 */
 	if ((flags & GFP_ZONEMASK) ||
-	    (s->flags & (SLAB_CACHE_DMA | SLAB_CACHE_DMA32)))
+	    (s->flags & (SLAB_CACHE_DMA | SLAB_CACHE_DMA32))) {
+		atomic_long_inc(&counters[KFENCE_COUNTER_SKIP_INCOMPAT]);
 		return NULL;
+	}
 
 	/*
 	 * allocation_gate only needs to become non-zero, so it doesn't make
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 207/262] kfence: move saving stack trace of allocations into __kfence_alloc()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (205 preceding siblings ...)
  2021-11-05 20:45 ` [patch 206/262] kfence: count unexpectedly skipped allocations Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 208/262] kfence: limit currently covered allocations when pool nearly full Andrew Morton
                   ` (54 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, jannh, linux-mm, mm-commits,
	nogikh, tarasmadan, torvalds

From: Marco Elver <elver@google.com>
Subject: kfence: move saving stack trace of allocations into __kfence_alloc()

Move the saving of the stack trace of allocations into __kfence_alloc(),
so that the stack entries array can be used outside of
kfence_guarded_alloc() and we avoid potentially unwinding the stack
multiple times.

Link: https://lkml.kernel.org/r/20210923104803.2620285-3-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Taras Madan <tarasmadan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kfence/core.c |   35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

--- a/mm/kfence/core.c~kfence-move-saving-stack-trace-of-allocations-into-__kfence_alloc
+++ a/mm/kfence/core.c
@@ -187,19 +187,26 @@ static inline unsigned long metadata_to_
  * Update the object's metadata state, including updating the alloc/free stacks
  * depending on the state transition.
  */
-static noinline void metadata_update_state(struct kfence_metadata *meta,
-					   enum kfence_object_state next)
+static noinline void
+metadata_update_state(struct kfence_metadata *meta, enum kfence_object_state next,
+		      unsigned long *stack_entries, size_t num_stack_entries)
 {
 	struct kfence_track *track =
 		next == KFENCE_OBJECT_FREED ? &meta->free_track : &meta->alloc_track;
 
 	lockdep_assert_held(&meta->lock);
 
-	/*
-	 * Skip over 1 (this) functions; noinline ensures we do not accidentally
-	 * skip over the caller by never inlining.
-	 */
-	track->num_stack_entries = stack_trace_save(track->stack_entries, KFENCE_STACK_DEPTH, 1);
+	if (stack_entries) {
+		memcpy(track->stack_entries, stack_entries,
+		       num_stack_entries * sizeof(stack_entries[0]));
+	} else {
+		/*
+		 * Skip over 1 (this) functions; noinline ensures we do not
+		 * accidentally skip over the caller by never inlining.
+		 */
+		num_stack_entries = stack_trace_save(track->stack_entries, KFENCE_STACK_DEPTH, 1);
+	}
+	track->num_stack_entries = num_stack_entries;
 	track->pid = task_pid_nr(current);
 	track->cpu = raw_smp_processor_id();
 	track->ts_nsec = local_clock(); /* Same source as printk timestamps. */
@@ -261,7 +268,8 @@ static __always_inline void for_each_can
 	}
 }
 
-static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t gfp)
+static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t gfp,
+				  unsigned long *stack_entries, size_t num_stack_entries)
 {
 	struct kfence_metadata *meta = NULL;
 	unsigned long flags;
@@ -320,7 +328,7 @@ static void *kfence_guarded_alloc(struct
 	addr = (void *)meta->addr;
 
 	/* Update remaining metadata. */
-	metadata_update_state(meta, KFENCE_OBJECT_ALLOCATED);
+	metadata_update_state(meta, KFENCE_OBJECT_ALLOCATED, stack_entries, num_stack_entries);
 	/* Pairs with READ_ONCE() in kfence_shutdown_cache(). */
 	WRITE_ONCE(meta->cache, cache);
 	meta->size = size;
@@ -400,7 +408,7 @@ static void kfence_guarded_free(void *ad
 		memzero_explicit(addr, meta->size);
 
 	/* Mark the object as freed. */
-	metadata_update_state(meta, KFENCE_OBJECT_FREED);
+	metadata_update_state(meta, KFENCE_OBJECT_FREED, NULL, 0);
 
 	raw_spin_unlock_irqrestore(&meta->lock, flags);
 
@@ -742,6 +750,9 @@ void kfence_shutdown_cache(struct kmem_c
 
 void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
 {
+	unsigned long stack_entries[KFENCE_STACK_DEPTH];
+	size_t num_stack_entries;
+
 	/*
 	 * Perform size check before switching kfence_allocation_gate, so that
 	 * we don't disable KFENCE without making an allocation.
@@ -786,7 +797,9 @@ void *__kfence_alloc(struct kmem_cache *
 	if (!READ_ONCE(kfence_enabled))
 		return NULL;
 
-	return kfence_guarded_alloc(s, size, flags);
+	num_stack_entries = stack_trace_save(stack_entries, KFENCE_STACK_DEPTH, 0);
+
+	return kfence_guarded_alloc(s, size, flags, stack_entries, num_stack_entries);
 }
 
 size_t kfence_ksize(const void *addr)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 208/262] kfence: limit currently covered allocations when pool nearly full
  2021-11-05 20:34 incoming Andrew Morton
                   ` (206 preceding siblings ...)
  2021-11-05 20:45 ` [patch 207/262] kfence: move saving stack trace of allocations into __kfence_alloc() Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 209/262] kfence: add note to documentation about skipping covered allocations Andrew Morton
                   ` (53 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, jannh, linux-mm, mm-commits,
	nogikh, tarasmadan, torvalds

From: Marco Elver <elver@google.com>
Subject: kfence: limit currently covered allocations when pool nearly full

One of KFENCE's main design principles is that with increasing uptime,
allocation coverage increases sufficiently to detect previously undetected
bugs.

We have observed that frequent long-lived allocations of the same source
(e.g.  pagecache) tend to permanently fill up the KFENCE pool with
increasing system uptime, thus breaking the above requirement.  The
workaround thus far had been increasing the sample interval and/or
increasing the KFENCE pool size, but is no reliable solution.

To ensure diverse coverage of allocations, limit currently covered
allocations of the same source once pool utilization reaches 75%
(configurable via `kfence.skip_covered_thresh`) or above.  The effect is
retaining reasonable allocation coverage when the pool is close to full.

A side-effect is that this also limits frequent long-lived allocations of
the same source filling up the pool permanently.

Uniqueness of an allocation for coverage purposes is based on its
(partial) allocation stack trace (the source).  A Counting Bloom filter is
used to check if an allocation is covered; if the allocation is currently
covered, the allocation is skipped by KFENCE.

Testing was done using:

	(a) a synthetic workload that performs frequent long-lived
	    allocations (default config values; sample_interval=1;
	    num_objects=63), and

	(b) normal desktop workloads on an otherwise idle machine where
	    the problem was first reported after a few days of uptime
	    (default config values).

In both test cases the sampled allocation rate no longer drops to zero at
any point.  In the case of (b) we observe (after 2 days uptime) 15% unique
allocations in the pool, 77% pool utilization, with 20% "skipped
allocations (covered)".

[elver@google.com: simplify and just use hash_32(), use more random stack_hash_seed]
  Link: https://lkml.kernel.org/r/YU3MRGaCaJiYht5g@elver.google.com
[elver@google.com: fix 32 bit]
Link: https://lkml.kernel.org/r/20210923104803.2620285-4-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Taras Madan <tarasmadan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kfence/core.c   |  109 ++++++++++++++++++++++++++++++++++++++++++-
 mm/kfence/kfence.h |    2 
 2 files changed, 109 insertions(+), 2 deletions(-)

--- a/mm/kfence/core.c~kfence-limit-currently-covered-allocations-when-pool-nearly-full
+++ a/mm/kfence/core.c
@@ -10,12 +10,15 @@
 #include <linux/atomic.h>
 #include <linux/bug.h>
 #include <linux/debugfs.h>
+#include <linux/hash.h>
 #include <linux/irq_work.h>
+#include <linux/jhash.h>
 #include <linux/kcsan-checks.h>
 #include <linux/kfence.h>
 #include <linux/kmemleak.h>
 #include <linux/list.h>
 #include <linux/lockdep.h>
+#include <linux/log2.h>
 #include <linux/memblock.h>
 #include <linux/moduleparam.h>
 #include <linux/random.h>
@@ -82,6 +85,10 @@ static const struct kernel_param_ops sam
 };
 module_param_cb(sample_interval, &sample_interval_param_ops, &kfence_sample_interval, 0600);
 
+/* Pool usage% threshold when currently covered allocations are skipped. */
+static unsigned long kfence_skip_covered_thresh __read_mostly = 75;
+module_param_named(skip_covered_thresh, kfence_skip_covered_thresh, ulong, 0644);
+
 /* The pool of pages used for guard pages and objects. */
 char *__kfence_pool __ro_after_init;
 EXPORT_SYMBOL(__kfence_pool); /* Export for test modules. */
@@ -105,6 +112,32 @@ DEFINE_STATIC_KEY_FALSE(kfence_allocatio
 /* Gates the allocation, ensuring only one succeeds in a given period. */
 atomic_t kfence_allocation_gate = ATOMIC_INIT(1);
 
+/*
+ * A Counting Bloom filter of allocation coverage: limits currently covered
+ * allocations of the same source filling up the pool.
+ *
+ * Assuming a range of 15%-85% unique allocations in the pool at any point in
+ * time, the below parameters provide a probablity of 0.02-0.33 for false
+ * positive hits respectively:
+ *
+ *	P(alloc_traces) = (1 - e^(-HNUM * (alloc_traces / SIZE)) ^ HNUM
+ */
+#define ALLOC_COVERED_HNUM	2
+#define ALLOC_COVERED_ORDER	(const_ilog2(CONFIG_KFENCE_NUM_OBJECTS) + 2)
+#define ALLOC_COVERED_SIZE	(1 << ALLOC_COVERED_ORDER)
+#define ALLOC_COVERED_HNEXT(h)	hash_32(h, ALLOC_COVERED_ORDER)
+#define ALLOC_COVERED_MASK	(ALLOC_COVERED_SIZE - 1)
+static atomic_t alloc_covered[ALLOC_COVERED_SIZE];
+
+/* Stack depth used to determine uniqueness of an allocation. */
+#define UNIQUE_ALLOC_STACK_DEPTH ((size_t)8)
+
+/*
+ * Randomness for stack hashes, making the same collisions across reboots and
+ * different machines less likely.
+ */
+static u32 stack_hash_seed __ro_after_init;
+
 /* Statistics counters for debugfs. */
 enum kfence_counter_id {
 	KFENCE_COUNTER_ALLOCATED,
@@ -114,6 +147,7 @@ enum kfence_counter_id {
 	KFENCE_COUNTER_BUGS,
 	KFENCE_COUNTER_SKIP_INCOMPAT,
 	KFENCE_COUNTER_SKIP_CAPACITY,
+	KFENCE_COUNTER_SKIP_COVERED,
 	KFENCE_COUNTER_COUNT,
 };
 static atomic_long_t counters[KFENCE_COUNTER_COUNT];
@@ -125,11 +159,57 @@ static const char *const counter_names[]
 	[KFENCE_COUNTER_BUGS]		= "total bugs",
 	[KFENCE_COUNTER_SKIP_INCOMPAT]	= "skipped allocations (incompatible)",
 	[KFENCE_COUNTER_SKIP_CAPACITY]	= "skipped allocations (capacity)",
+	[KFENCE_COUNTER_SKIP_COVERED]	= "skipped allocations (covered)",
 };
 static_assert(ARRAY_SIZE(counter_names) == KFENCE_COUNTER_COUNT);
 
 /* === Internals ============================================================ */
 
+static inline bool should_skip_covered(void)
+{
+	unsigned long thresh = (CONFIG_KFENCE_NUM_OBJECTS * kfence_skip_covered_thresh) / 100;
+
+	return atomic_long_read(&counters[KFENCE_COUNTER_ALLOCATED]) > thresh;
+}
+
+static u32 get_alloc_stack_hash(unsigned long *stack_entries, size_t num_entries)
+{
+	num_entries = min(num_entries, UNIQUE_ALLOC_STACK_DEPTH);
+	num_entries = filter_irq_stacks(stack_entries, num_entries);
+	return jhash(stack_entries, num_entries * sizeof(stack_entries[0]), stack_hash_seed);
+}
+
+/*
+ * Adds (or subtracts) count @val for allocation stack trace hash
+ * @alloc_stack_hash from Counting Bloom filter.
+ */
+static void alloc_covered_add(u32 alloc_stack_hash, int val)
+{
+	int i;
+
+	for (i = 0; i < ALLOC_COVERED_HNUM; i++) {
+		atomic_add(val, &alloc_covered[alloc_stack_hash & ALLOC_COVERED_MASK]);
+		alloc_stack_hash = ALLOC_COVERED_HNEXT(alloc_stack_hash);
+	}
+}
+
+/*
+ * Returns true if the allocation stack trace hash @alloc_stack_hash is
+ * currently contained (non-zero count) in Counting Bloom filter.
+ */
+static bool alloc_covered_contains(u32 alloc_stack_hash)
+{
+	int i;
+
+	for (i = 0; i < ALLOC_COVERED_HNUM; i++) {
+		if (!atomic_read(&alloc_covered[alloc_stack_hash & ALLOC_COVERED_MASK]))
+			return false;
+		alloc_stack_hash = ALLOC_COVERED_HNEXT(alloc_stack_hash);
+	}
+
+	return true;
+}
+
 static bool kfence_protect(unsigned long addr)
 {
 	return !KFENCE_WARN_ON(!kfence_protect_page(ALIGN_DOWN(addr, PAGE_SIZE), true));
@@ -269,7 +349,8 @@ static __always_inline void for_each_can
 }
 
 static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t gfp,
-				  unsigned long *stack_entries, size_t num_stack_entries)
+				  unsigned long *stack_entries, size_t num_stack_entries,
+				  u32 alloc_stack_hash)
 {
 	struct kfence_metadata *meta = NULL;
 	unsigned long flags;
@@ -332,6 +413,8 @@ static void *kfence_guarded_alloc(struct
 	/* Pairs with READ_ONCE() in kfence_shutdown_cache(). */
 	WRITE_ONCE(meta->cache, cache);
 	meta->size = size;
+	meta->alloc_stack_hash = alloc_stack_hash;
+
 	for_each_canary(meta, set_canary_byte);
 
 	/* Set required struct page fields. */
@@ -344,6 +427,8 @@ static void *kfence_guarded_alloc(struct
 
 	raw_spin_unlock_irqrestore(&meta->lock, flags);
 
+	alloc_covered_add(alloc_stack_hash, 1);
+
 	/* Memory initialization. */
 
 	/*
@@ -412,6 +497,8 @@ static void kfence_guarded_free(void *ad
 
 	raw_spin_unlock_irqrestore(&meta->lock, flags);
 
+	alloc_covered_add(meta->alloc_stack_hash, -1);
+
 	/* Protect to detect use-after-frees. */
 	kfence_protect((unsigned long)addr);
 
@@ -677,6 +764,7 @@ void __init kfence_init(void)
 	if (!kfence_sample_interval)
 		return;
 
+	stack_hash_seed = (u32)random_get_entropy();
 	if (!kfence_init_pool()) {
 		pr_err("%s failed\n", __func__);
 		return;
@@ -752,6 +840,7 @@ void *__kfence_alloc(struct kmem_cache *
 {
 	unsigned long stack_entries[KFENCE_STACK_DEPTH];
 	size_t num_stack_entries;
+	u32 alloc_stack_hash;
 
 	/*
 	 * Perform size check before switching kfence_allocation_gate, so that
@@ -799,7 +888,23 @@ void *__kfence_alloc(struct kmem_cache *
 
 	num_stack_entries = stack_trace_save(stack_entries, KFENCE_STACK_DEPTH, 0);
 
-	return kfence_guarded_alloc(s, size, flags, stack_entries, num_stack_entries);
+	/*
+	 * Do expensive check for coverage of allocation in slow-path after
+	 * allocation_gate has already become non-zero, even though it might
+	 * mean not making any allocation within a given sample interval.
+	 *
+	 * This ensures reasonable allocation coverage when the pool is almost
+	 * full, including avoiding long-lived allocations of the same source
+	 * filling up the pool (e.g. pagecache allocations).
+	 */
+	alloc_stack_hash = get_alloc_stack_hash(stack_entries, num_stack_entries);
+	if (should_skip_covered() && alloc_covered_contains(alloc_stack_hash)) {
+		atomic_long_inc(&counters[KFENCE_COUNTER_SKIP_COVERED]);
+		return NULL;
+	}
+
+	return kfence_guarded_alloc(s, size, flags, stack_entries, num_stack_entries,
+				    alloc_stack_hash);
 }
 
 size_t kfence_ksize(const void *addr)
--- a/mm/kfence/kfence.h~kfence-limit-currently-covered-allocations-when-pool-nearly-full
+++ a/mm/kfence/kfence.h
@@ -87,6 +87,8 @@ struct kfence_metadata {
 	/* Allocation and free stack information. */
 	struct kfence_track alloc_track;
 	struct kfence_track free_track;
+	/* For updating alloc_covered on frees. */
+	u32 alloc_stack_hash;
 };
 
 extern struct kfence_metadata kfence_metadata[CONFIG_KFENCE_NUM_OBJECTS];
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 209/262] kfence: add note to documentation about skipping covered allocations
  2021-11-05 20:34 incoming Andrew Morton
                   ` (207 preceding siblings ...)
  2021-11-05 20:45 ` [patch 208/262] kfence: limit currently covered allocations when pool nearly full Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 210/262] kfence: test: use kunit_skip() to skip tests Andrew Morton
                   ` (52 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, jannh, linux-mm, mm-commits,
	nogikh, tarasmadan, torvalds

From: Marco Elver <elver@google.com>
Subject: kfence: add note to documentation about skipping covered allocations

Add a note briefly mentioning the new policy about "skipping currently
covered allocations if pool close to full." Since this has a notable
impact on KFENCE's bug-detection ability on systems with large uptimes, it
is worth pointing out the feature.

Link: https://lkml.kernel.org/r/20210923104803.2620285-5-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Taras Madan <tarasmadan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/dev-tools/kfence.rst |   11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/Documentation/dev-tools/kfence.rst~kfence-add-note-to-documentation-about-skipping-covered-allocations
+++ a/Documentation/dev-tools/kfence.rst
@@ -269,6 +269,17 @@ tail of KFENCE's freelist, so that the l
 first, and the chances of detecting use-after-frees of recently freed objects
 is increased.
 
+If pool utilization reaches 75% (default) or above, to reduce the risk of the
+pool eventually being fully occupied by allocated objects yet ensure diverse
+coverage of allocations, KFENCE limits currently covered allocations of the
+same source from further filling up the pool. The "source" of an allocation is
+based on its partial allocation stack trace. A side-effect is that this also
+limits frequent long-lived allocations (e.g. pagecache) of the same source
+filling up the pool permanently, which is the most common risk for the pool
+becoming full and the sampled allocation rate dropping to zero. The threshold
+at which to start limiting currently covered allocations can be configured via
+the boot parameter ``kfence.skip_covered_thresh`` (pool usage%).
+
 Interface
 ---------
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 210/262] kfence: test: use kunit_skip() to skip tests
  2021-11-05 20:34 incoming Andrew Morton
                   ` (208 preceding siblings ...)
  2021-11-05 20:45 ` [patch 209/262] kfence: add note to documentation about skipping covered allocations Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 211/262] kfence: shorten critical sections of alloc/free Andrew Morton
                   ` (51 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, davidgow, dvyukov, elver, glider, linux-mm, mm-commits,
	nogikh, tarasmadan, torvalds

From: Marco Elver <elver@google.com>
Subject: kfence: test: use kunit_skip() to skip tests

Use the new kunit_skip() to skip tests if requirements were not met.  It
makes it easier to see in KUnit's summary if there were skipped tests.

Link: https://lkml.kernel.org/r/20210922182541.1372400-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Aleksandr Nogikh <nogikh@google.com>
Cc: Taras Madan <tarasmadan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kfence/kfence_test.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

--- a/mm/kfence/kfence_test.c~kfence-test-use-kunit_skip-to-skip-tests
+++ a/mm/kfence/kfence_test.c
@@ -32,6 +32,11 @@
 #define arch_kfence_test_address(addr) (addr)
 #endif
 
+#define KFENCE_TEST_REQUIRES(test, cond) do {			\
+	if (!(cond))						\
+		kunit_skip((test), "Test requires: " #cond);	\
+} while (0)
+
 /* Report as observed from console. */
 static struct {
 	spinlock_t lock;
@@ -555,8 +560,7 @@ static void test_init_on_free(struct kun
 	};
 	int i;
 
-	if (!IS_ENABLED(CONFIG_INIT_ON_FREE_DEFAULT_ON))
-		return;
+	KFENCE_TEST_REQUIRES(test, IS_ENABLED(CONFIG_INIT_ON_FREE_DEFAULT_ON));
 	/* Assume it hasn't been disabled on command line. */
 
 	setup_test_cache(test, size, 0, NULL);
@@ -603,10 +607,8 @@ static void test_gfpzero(struct kunit *t
 	char *buf1, *buf2;
 	int i;
 
-	if (CONFIG_KFENCE_SAMPLE_INTERVAL > 100) {
-		kunit_warn(test, "skipping ... would take too long\n");
-		return;
-	}
+	/* Skip if we think it'd take too long. */
+	KFENCE_TEST_REQUIRES(test, CONFIG_KFENCE_SAMPLE_INTERVAL <= 100);
 
 	setup_test_cache(test, size, 0, NULL);
 	buf1 = test_alloc(test, size, GFP_KERNEL, ALLOCATE_ANY);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 211/262] kfence: shorten critical sections of alloc/free
  2021-11-05 20:34 incoming Andrew Morton
                   ` (209 preceding siblings ...)
  2021-11-05 20:45 ` [patch 210/262] kfence: test: use kunit_skip() to skip tests Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 212/262] kfence: always use static branches to guard kfence_alloc() Andrew Morton
                   ` (50 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, jannh, linux-mm, mm-commits, torvalds

From: Marco Elver <elver@google.com>
Subject: kfence: shorten critical sections of alloc/free

Initializing memory and setting/checking the canary bytes is relatively
expensive, and doing so in the meta->lock critical sections extends the
duration with preemption and interrupts disabled unnecessarily.

Any reads to meta->addr and meta->size in kfence_guarded_alloc() and
kfence_guarded_free() don't require locking meta->lock as long as the
object is removed from the freelist: only kfence_guarded_alloc() sets
meta->addr and meta->size after removing it from the freelist, which
requires a preceding kfence_guarded_free() returning it to the list or the
initial state.

Therefore move reads to meta->addr and meta->size, including expensive
memory initialization using them, out of meta->lock critical sections.

Link: https://lkml.kernel.org/r/20210930153706.2105471-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/kfence/core.c |   38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

--- a/mm/kfence/core.c~kfence-shorten-critical-sections-of-alloc-free
+++ a/mm/kfence/core.c
@@ -309,12 +309,19 @@ static inline bool set_canary_byte(u8 *a
 /* Check canary byte at @addr. */
 static inline bool check_canary_byte(u8 *addr)
 {
+	struct kfence_metadata *meta;
+	unsigned long flags;
+
 	if (likely(*addr == KFENCE_CANARY_PATTERN(addr)))
 		return true;
 
 	atomic_long_inc(&counters[KFENCE_COUNTER_BUGS]);
-	kfence_report_error((unsigned long)addr, false, NULL, addr_to_metadata((unsigned long)addr),
-			    KFENCE_ERROR_CORRUPTION);
+
+	meta = addr_to_metadata((unsigned long)addr);
+	raw_spin_lock_irqsave(&meta->lock, flags);
+	kfence_report_error((unsigned long)addr, false, NULL, meta, KFENCE_ERROR_CORRUPTION);
+	raw_spin_unlock_irqrestore(&meta->lock, flags);
+
 	return false;
 }
 
@@ -324,8 +331,6 @@ static __always_inline void for_each_can
 	const unsigned long pageaddr = ALIGN_DOWN(meta->addr, PAGE_SIZE);
 	unsigned long addr;
 
-	lockdep_assert_held(&meta->lock);
-
 	/*
 	 * We'll iterate over each canary byte per-side until fn() returns
 	 * false. However, we'll still iterate over the canary bytes to the
@@ -414,8 +419,9 @@ static void *kfence_guarded_alloc(struct
 	WRITE_ONCE(meta->cache, cache);
 	meta->size = size;
 	meta->alloc_stack_hash = alloc_stack_hash;
+	raw_spin_unlock_irqrestore(&meta->lock, flags);
 
-	for_each_canary(meta, set_canary_byte);
+	alloc_covered_add(alloc_stack_hash, 1);
 
 	/* Set required struct page fields. */
 	page = virt_to_page(meta->addr);
@@ -425,11 +431,8 @@ static void *kfence_guarded_alloc(struct
 	if (IS_ENABLED(CONFIG_SLAB))
 		page->s_mem = addr;
 
-	raw_spin_unlock_irqrestore(&meta->lock, flags);
-
-	alloc_covered_add(alloc_stack_hash, 1);
-
 	/* Memory initialization. */
+	for_each_canary(meta, set_canary_byte);
 
 	/*
 	 * We check slab_want_init_on_alloc() ourselves, rather than letting
@@ -454,6 +457,7 @@ static void kfence_guarded_free(void *ad
 {
 	struct kcsan_scoped_access assert_page_exclusive;
 	unsigned long flags;
+	bool init;
 
 	raw_spin_lock_irqsave(&meta->lock, flags);
 
@@ -481,6 +485,13 @@ static void kfence_guarded_free(void *ad
 		meta->unprotected_page = 0;
 	}
 
+	/* Mark the object as freed. */
+	metadata_update_state(meta, KFENCE_OBJECT_FREED, NULL, 0);
+	init = slab_want_init_on_free(meta->cache);
+	raw_spin_unlock_irqrestore(&meta->lock, flags);
+
+	alloc_covered_add(meta->alloc_stack_hash, -1);
+
 	/* Check canary bytes for memory corruption. */
 	for_each_canary(meta, check_canary_byte);
 
@@ -489,16 +500,9 @@ static void kfence_guarded_free(void *ad
 	 * data is still there, and after a use-after-free is detected, we
 	 * unprotect the page, so the data is still accessible.
 	 */
-	if (!zombie && unlikely(slab_want_init_on_free(meta->cache)))
+	if (!zombie && unlikely(init))
 		memzero_explicit(addr, meta->size);
 
-	/* Mark the object as freed. */
-	metadata_update_state(meta, KFENCE_OBJECT_FREED, NULL, 0);
-
-	raw_spin_unlock_irqrestore(&meta->lock, flags);
-
-	alloc_covered_add(meta->alloc_stack_hash, -1);
-
 	/* Protect to detect use-after-frees. */
 	kfence_protect((unsigned long)addr);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 212/262] kfence: always use static branches to guard kfence_alloc()
  2021-11-05 20:34 incoming Andrew Morton
                   ` (210 preceding siblings ...)
  2021-11-05 20:45 ` [patch 211/262] kfence: shorten critical sections of alloc/free Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 213/262] kfence: default to dynamic branch instead of static keys mode Andrew Morton
                   ` (49 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, jannh, linux-mm, mm-commits, torvalds

From: Marco Elver <elver@google.com>
Subject: kfence: always use static branches to guard kfence_alloc()

Regardless of KFENCE mode (CONFIG_KFENCE_STATIC_KEYS: either using static
keys to gate allocations, or using a simple dynamic branch), always use a
static branch to avoid the dynamic branch in kfence_alloc() if KFENCE was
disabled at boot.

For CONFIG_KFENCE_STATIC_KEYS=n, this now avoids the dynamic branch if
KFENCE was disabled at boot.

To simplify, also unifies the location where kfence_allocation_gate is
read-checked to just be inline in kfence_alloc().

Link: https://lkml.kernel.org/r/20211019102524.2807208-1-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/kfence.h |   21 +++++++++++----------
 mm/kfence/core.c       |   16 +++++++---------
 2 files changed, 18 insertions(+), 19 deletions(-)

--- a/include/linux/kfence.h~kfence-always-use-static-branches-to-guard-kfence_alloc
+++ a/include/linux/kfence.h
@@ -14,6 +14,9 @@
 
 #ifdef CONFIG_KFENCE
 
+#include <linux/atomic.h>
+#include <linux/static_key.h>
+
 /*
  * We allocate an even number of pages, as it simplifies calculations to map
  * address to metadata indices; effectively, the very first page serves as an
@@ -22,13 +25,8 @@
 #define KFENCE_POOL_SIZE ((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 * PAGE_SIZE)
 extern char *__kfence_pool;
 
-#ifdef CONFIG_KFENCE_STATIC_KEYS
-#include <linux/static_key.h>
 DECLARE_STATIC_KEY_FALSE(kfence_allocation_key);
-#else
-#include <linux/atomic.h>
 extern atomic_t kfence_allocation_gate;
-#endif
 
 /**
  * is_kfence_address() - check if an address belongs to KFENCE pool
@@ -116,13 +114,16 @@ void *__kfence_alloc(struct kmem_cache *
  */
 static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
 {
-#ifdef CONFIG_KFENCE_STATIC_KEYS
-	if (static_branch_unlikely(&kfence_allocation_key))
+#if defined(CONFIG_KFENCE_STATIC_KEYS) || CONFIG_KFENCE_SAMPLE_INTERVAL == 0
+	if (!static_branch_unlikely(&kfence_allocation_key))
+		return NULL;
 #else
-	if (unlikely(!atomic_read(&kfence_allocation_gate)))
+	if (!static_branch_likely(&kfence_allocation_key))
+		return NULL;
 #endif
-		return __kfence_alloc(s, size, flags);
-	return NULL;
+	if (likely(atomic_read(&kfence_allocation_gate)))
+		return NULL;
+	return __kfence_alloc(s, size, flags);
 }
 
 /**
--- a/mm/kfence/core.c~kfence-always-use-static-branches-to-guard-kfence_alloc
+++ a/mm/kfence/core.c
@@ -104,10 +104,11 @@ struct kfence_metadata kfence_metadata[C
 static struct list_head kfence_freelist = LIST_HEAD_INIT(kfence_freelist);
 static DEFINE_RAW_SPINLOCK(kfence_freelist_lock); /* Lock protecting freelist. */
 
-#ifdef CONFIG_KFENCE_STATIC_KEYS
-/* The static key to set up a KFENCE allocation. */
+/*
+ * The static key to set up a KFENCE allocation; or if static keys are not used
+ * to gate allocations, to avoid a load and compare if KFENCE is disabled.
+ */
 DEFINE_STATIC_KEY_FALSE(kfence_allocation_key);
-#endif
 
 /* Gates the allocation, ensuring only one succeeds in a given period. */
 atomic_t kfence_allocation_gate = ATOMIC_INIT(1);
@@ -774,6 +775,8 @@ void __init kfence_init(void)
 		return;
 	}
 
+	if (!IS_ENABLED(CONFIG_KFENCE_STATIC_KEYS))
+		static_branch_enable(&kfence_allocation_key);
 	WRITE_ONCE(kfence_enabled, true);
 	queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
 	pr_info("initialized - using %lu bytes for %d objects at 0x%p-0x%p\n", KFENCE_POOL_SIZE,
@@ -866,12 +869,7 @@ void *__kfence_alloc(struct kmem_cache *
 		return NULL;
 	}
 
-	/*
-	 * allocation_gate only needs to become non-zero, so it doesn't make
-	 * sense to continue writing to it and pay the associated contention
-	 * cost, in case we have a large number of concurrent allocations.
-	 */
-	if (atomic_read(&kfence_allocation_gate) || atomic_inc_return(&kfence_allocation_gate) > 1)
+	if (atomic_inc_return(&kfence_allocation_gate) > 1)
 		return NULL;
 #ifdef CONFIG_KFENCE_STATIC_KEYS
 	/*
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 213/262] kfence: default to dynamic branch instead of static keys mode
  2021-11-05 20:34 incoming Andrew Morton
                   ` (211 preceding siblings ...)
  2021-11-05 20:45 ` [patch 212/262] kfence: always use static branches to guard kfence_alloc() Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 214/262] mm/damon: grammar s/works/work/ Andrew Morton
                   ` (48 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, dvyukov, elver, glider, jannh, linux-mm, mm-commits, torvalds

From: Marco Elver <elver@google.com>
Subject: kfence: default to dynamic branch instead of static keys mode

We have observed that on very large machines with newer CPUs, the static
key/branch switching delay is on the order of milliseconds.  This is due
to the required broadcast IPIs, which simply does not scale well to
hundreds of CPUs (cores).  If done too frequently, this can adversely
affect tail latencies of various workloads.

One workaround is to increase the sample interval to several seconds,
while decreasing sampled allocation coverage, but the problem still exists
and could still increase tail latencies.

As already noted in the Kconfig help text, there are trade-offs: at lower
sample intervals the dynamic branch results in better performance;
however, at very large sample intervals, the static keys mode can result
in better performance -- careful benchmarking is recommended.

Our initial benchmarking showed that with large enough sample intervals
and workloads stressing the allocator, the static keys mode was slightly
better.  Evaluating and observing the possible system-wide side-effects of
the static-key-switching induced broadcast IPIs, however, was a blind spot
(in particular on large machines with 100s of cores).

Therefore, a major downside of the static keys mode is, unfortunately,
that it is hard to predict performance on new system architectures and
topologies, but also making conclusions about performance of new workloads
based on a limited set of benchmarks.

Most distributions will simply select the defaults, while targeting a
large variety of different workloads and system architectures.  As such,
the better default is CONFIG_KFENCE_STATIC_KEYS=n, and re-enabling it is
only recommended after careful evaluation.

For reference, on x86-64 the condition in kfence_alloc() generates exactly
2 instructions in the kmem_cache_alloc() fast-path:

 | ...
 | cmpl   $0x0,0x1a8021c(%rip)  # ffffffff82d560d0 <kfence_allocation_gate>
 | je     ffffffff812d6003      <kmem_cache_alloc+0x243>
 | ...

which, given kfence_allocation_gate is infrequently modified, should be
well predicted by most CPUs.

Link: https://lkml.kernel.org/r/20211019102524.2807208-2-elver@google.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/dev-tools/kfence.rst |   12 ++++++++----
 lib/Kconfig.kfence                 |   26 +++++++++++++++-----------
 2 files changed, 23 insertions(+), 15 deletions(-)

--- a/Documentation/dev-tools/kfence.rst~kfence-default-to-dynamic-branch-instead-of-static-keys-mode
+++ a/Documentation/dev-tools/kfence.rst
@@ -231,10 +231,14 @@ Guarded allocations are set up based on
 of the sample interval, the next allocation through the main allocator (SLAB or
 SLUB) returns a guarded allocation from the KFENCE object pool (allocation
 sizes up to PAGE_SIZE are supported). At this point, the timer is reset, and
-the next allocation is set up after the expiration of the interval. To "gate" a
-KFENCE allocation through the main allocator's fast-path without overhead,
-KFENCE relies on static branches via the static keys infrastructure. The static
-branch is toggled to redirect the allocation to KFENCE.
+the next allocation is set up after the expiration of the interval.
+
+When using ``CONFIG_KFENCE_STATIC_KEYS=y``, KFENCE allocations are "gated"
+through the main allocator's fast-path by relying on static branches via the
+static keys infrastructure. The static branch is toggled to redirect the
+allocation to KFENCE. Depending on sample interval, target workloads, and
+system architecture, this may perform better than the simple dynamic branch.
+Careful benchmarking is recommended.
 
 KFENCE objects each reside on a dedicated page, at either the left or right
 page boundaries selected at random. The pages to the left and right of the
--- a/lib/Kconfig.kfence~kfence-default-to-dynamic-branch-instead-of-static-keys-mode
+++ a/lib/Kconfig.kfence
@@ -25,17 +25,6 @@ menuconfig KFENCE
 
 if KFENCE
 
-config KFENCE_STATIC_KEYS
-	bool "Use static keys to set up allocations"
-	default y
-	depends on JUMP_LABEL # To ensure performance, require jump labels
-	help
-	  Use static keys (static branches) to set up KFENCE allocations. Using
-	  static keys is normally recommended, because it avoids a dynamic
-	  branch in the allocator's fast path. However, with very low sample
-	  intervals, or on systems that do not support jump labels, a dynamic
-	  branch may still be an acceptable performance trade-off.
-
 config KFENCE_SAMPLE_INTERVAL
 	int "Default sample interval in milliseconds"
 	default 100
@@ -56,6 +45,21 @@ config KFENCE_NUM_OBJECTS
 	  pages are required; with one containing the object and two adjacent
 	  ones used as guard pages.
 
+config KFENCE_STATIC_KEYS
+	bool "Use static keys to set up allocations" if EXPERT
+	depends on JUMP_LABEL
+	help
+	  Use static keys (static branches) to set up KFENCE allocations. This
+	  option is only recommended when using very large sample intervals, or
+	  performance has carefully been evaluated with this option.
+
+	  Using static keys comes with trade-offs that need to be carefully
+	  evaluated given target workloads and system architectures. Notably,
+	  enabling and disabling static keys invoke IPI broadcasts, the latency
+	  and impact of which is much harder to predict than a dynamic branch.
+
+	  Say N if you are unsure.
+
 config KFENCE_STRESS_TEST_FAULTS
 	int "Stress testing of fault handling and error reporting" if EXPERT
 	default 0
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 214/262] mm/damon: grammar s/works/work/
  2021-11-05 20:34 incoming Andrew Morton
                   ` (212 preceding siblings ...)
  2021-11-05 20:45 ` [patch 213/262] kfence: default to dynamic branch instead of static keys mode Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 215/262] Documentation/vm: move user guides to admin-guide/mm/ Andrew Morton
                   ` (47 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, geert, linux-mm, mm-commits, sjpark, torvalds

From: Geert Uytterhoeven <geert@linux-m68k.org>
Subject: mm/damon: grammar s/works/work/

Correct a singular versus plural grammar mistake in the help text for the
DAMON_VADDR config symbol.

Link: https://lkml.kernel.org/r/20210914073451.3883834-1-geert@linux-m68k.org
Fixes: 3f49584b262cf8f4 ("mm/damon: implement primitives for the virtual memory address spaces")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/Kconfig |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/damon/Kconfig~mm-damon-grammar-s-works-work
+++ a/mm/damon/Kconfig
@@ -30,7 +30,7 @@ config DAMON_VADDR
 	select PAGE_IDLE_FLAG
 	help
 	  This builds the default data access monitoring primitives for DAMON
-	  that works for virtual address spaces.
+	  that work for virtual address spaces.
 
 config DAMON_VADDR_KUNIT_TEST
 	bool "Test for DAMON primitives" if !KUNIT_ALL_TESTS
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 215/262] Documentation/vm: move user guides to admin-guide/mm/
  2021-11-05 20:34 incoming Andrew Morton
                   ` (213 preceding siblings ...)
  2021-11-05 20:45 ` [patch 214/262] mm/damon: grammar s/works/work/ Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:45 ` [patch 216/262] MAINTAINERS: update SeongJae's email address Andrew Morton
                   ` (46 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, sjpark, torvalds

From: SeongJae Park <sjpark@amazon.de>
Subject: Documentation/vm: move user guides to admin-guide/mm/

Most memory management user guide documents are in 'admin-guide/mm/', but
two of those are in 'vm/'.  This commit moves the two docs into
'admin-guide/mm' for easier documents finding.

Link: https://lkml.kernel.org/r/20210917123958.3819-2-sj@kernel.org
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/index.rst        |  2 ++
 .../{vm => admin-guide/mm}/swap_numa.rst      |  0
 .../{vm => admin-guide/mm}/zswap.rst          |  0
 Documentation/vm/index.rst                    | 26 ++++---------------
 4 files changed, 7 insertions(+), 21 deletions(-)
 rename Documentation/{vm => admin-guide/mm}/swap_numa.rst (100%)
 rename Documentation/{vm => admin-guide/mm}/zswap.rst (100%)

diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
index cbd19d5e625f..c21b5823f126 100644
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -37,5 +37,7 @@ the Linux memory management.
    numaperf
    pagemap
    soft-dirty
+   swap_numa
    transhuge
    userfaultfd
+   zswap
diff --git a/Documentation/vm/swap_numa.rst b/Documentation/admin-guide/mm/swap_numa.rst
similarity index 100%
rename from Documentation/vm/swap_numa.rst
rename to Documentation/admin-guide/mm/swap_numa.rst
diff --git a/Documentation/vm/zswap.rst b/Documentation/admin-guide/mm/zswap.rst
similarity index 100%
rename from Documentation/vm/zswap.rst
rename to Documentation/admin-guide/mm/zswap.rst
diff --git a/Documentation/vm/index.rst b/Documentation/vm/index.rst
index b51f0d8992f8..6f5ffef4b716 100644
--- a/Documentation/vm/index.rst
+++ b/Documentation/vm/index.rst
@@ -3,27 +3,11 @@ Linux Memory Management Documentation
 =====================================
 
 This is a collection of documents about the Linux memory management (mm)
-subsystem.  If you are looking for advice on simply allocating memory,
-see the :ref:`memory_allocation`.
-
-User guides for MM features
-===========================
-
-The following documents provide guides for controlling and tuning
-various features of the Linux memory management
-
-.. toctree::
-   :maxdepth: 1
-
-   swap_numa
-   zswap
-
-Kernel developers MM documentation
-==================================
-
-The below documents describe MM internals with different level of
-details ranging from notes and mailing list responses to elaborate
-descriptions of data structures and algorithms.
+subsystem internals with different level of details ranging from notes and
+mailing list responses for elaborating descriptions of data structures and
+algorithms.  If you are looking for advice on simply allocating memory, see the
+:ref:`memory_allocation`.  For controlling and tuning guides, see the
+:doc:`admin guide <../admin-guide/mm/index>`.
 
 .. toctree::
    :maxdepth: 1
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 485+ messages in thread

* [patch 216/262] MAINTAINERS: update SeongJae's email address
  2021-11-05 20:34 incoming Andrew Morton
                   ` (214 preceding siblings ...)
  2021-11-05 20:45 ` [patch 215/262] Documentation/vm: move user guides to admin-guide/mm/ Andrew Morton
@ 2021-11-05 20:45 ` Andrew Morton
  2021-11-05 20:46 ` [patch 217/262] docs/vm/damon: remove broken reference Andrew Morton
                   ` (45 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:45 UTC (permalink / raw)
  To: akpm, corbet, linux-mm, mm-commits, sj, sjpark, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: MAINTAINERS: update SeongJae's email address

This commit updates SeongJae's email address in MAINTAINERS file to his
preferred one.

Link: https://lkml.kernel.org/r/20210917123958.3819-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 MAINTAINERS |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/MAINTAINERS~maintainers-update-seongjaes-email-address
+++ a/MAINTAINERS
@@ -5161,7 +5161,7 @@ F:	net/ax25/ax25_timer.c
 F:	net/ax25/sysctl_net_ax25.c
 
 DATA ACCESS MONITOR
-M:	SeongJae Park <sjpark@amazon.de>
+M:	SeongJae Park <sj@kernel.org>
 L:	linux-mm@kvack.org
 S:	Maintained
 F:	Documentation/admin-guide/mm/damon/
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 217/262] docs/vm/damon: remove broken reference
  2021-11-05 20:34 incoming Andrew Morton
                   ` (215 preceding siblings ...)
  2021-11-05 20:45 ` [patch 216/262] MAINTAINERS: update SeongJae's email address Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 218/262] include/linux/damon.h: fix kernel-doc comments for 'damon_callback' Andrew Morton
                   ` (44 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, corbet, linux-mm, mm-commits, sjpark, torvalds

From: SeongJae Park <sjpark@amazon.de>
Subject: docs/vm/damon: remove broken reference

Building DAMON documents warns for a reference to nonexisting doc, as
below:

    $ time make htmldocs
    [...]
    Documentation/vm/damon/index.rst:24: WARNING: toctree contains reference to nonexisting document 'vm/damon/plans'

This commit fixes the warning by removing the wrong reference.

Link: https://lkml.kernel.org/r/20210917123958.3819-4-sj@kernel.org
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/vm/damon/index.rst |    1 -
 1 file changed, 1 deletion(-)

--- a/Documentation/vm/damon/index.rst~docs-vm-damon-remove-broken-reference
+++ a/Documentation/vm/damon/index.rst
@@ -27,4 +27,3 @@ workloads and systems.
    faq
    design
    api
-   plans
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 218/262] include/linux/damon.h: fix kernel-doc comments for 'damon_callback'
  2021-11-05 20:34 incoming Andrew Morton
                   ` (216 preceding siblings ...)
  2021-11-05 20:46 ` [patch 217/262] docs/vm/damon: remove broken reference Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 219/262] mm/damon/core: print kdamond start log in debug mode only Andrew Morton
                   ` (43 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, corbet, linux-mm, mm-commits, sjpark, torvalds

From: SeongJae Park <sjpark@amazon.de>
Subject: include/linux/damon.h: fix kernel-doc comments for 'damon_callback'

A few Kernel-doc comments in 'damon.h' are broken.  This commit fixes
those.

Link: https://lkml.kernel.org/r/20210917123958.3819-5-sj@kernel.org
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/include/linux/damon.h~include-linux-damonh-fix-kernel-doc-comments-for-damon_callback
+++ a/include/linux/damon.h
@@ -62,7 +62,7 @@ struct damon_target {
 struct damon_ctx;
 
 /**
- * struct damon_primitive	Monitoring primitives for given use cases.
+ * struct damon_primitive - Monitoring primitives for given use cases.
  *
  * @init:			Initialize primitive-internal data structures.
  * @update:			Update primitive-internal data structures.
@@ -108,8 +108,8 @@ struct damon_primitive {
 	void (*cleanup)(struct damon_ctx *context);
 };
 
-/*
- * struct damon_callback	Monitoring events notification callbacks.
+/**
+ * struct damon_callback - Monitoring events notification callbacks.
  *
  * @before_start:	Called before starting the monitoring.
  * @after_sampling:	Called after each sampling.
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 219/262] mm/damon/core: print kdamond start log in debug mode only
  2021-11-05 20:34 incoming Andrew Morton
                   ` (217 preceding siblings ...)
  2021-11-05 20:46 ` [patch 218/262] include/linux/damon.h: fix kernel-doc comments for 'damon_callback' Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 220/262] mm/damon: remove unnecessary do_exit() from kdamond Andrew Morton
                   ` (42 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, corbet, linux-mm, mm-commits, sj, sjpark, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/core: print kdamond start log in debug mode only

Logging of kdamond startup is using 'pr_info()' unnecessarily.  This
commit makes it to use 'pr_debug()' instead.

Link: https://lkml.kernel.org/r/20210917123958.3819-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/damon/core.c~mm-damon-core-print-kdamond-start-log-in-debug-mode-only
+++ a/mm/damon/core.c
@@ -653,7 +653,7 @@ static int kdamond_fn(void *data)
 	unsigned long sz_limit = 0;
 
 	mutex_lock(&ctx->kdamond_lock);
-	pr_info("kdamond (%d) starts\n", ctx->kdamond->pid);
+	pr_debug("kdamond (%d) starts\n", ctx->kdamond->pid);
 	mutex_unlock(&ctx->kdamond_lock);
 
 	if (ctx->primitive.init)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 220/262] mm/damon: remove unnecessary do_exit() from kdamond
  2021-11-05 20:34 incoming Andrew Morton
                   ` (218 preceding siblings ...)
  2021-11-05 20:46 ` [patch 219/262] mm/damon/core: print kdamond start log in debug mode only Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 221/262] mm/damon: needn't hold kdamond_lock to print pid of kdamond Andrew Morton
                   ` (41 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, changbin.du, linux-mm, mm-commits, sjpark, torvalds

From: Changbin Du <changbin.du@gmail.com>
Subject: mm/damon: remove unnecessary do_exit() from kdamond

Just return from the kthread function.

Link: https://lkml.kernel.org/r/20210927232421.17694-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Cc: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/damon/core.c~mm-damon-remove-unnecessary-do_exit-from-kdamond
+++ a/mm/damon/core.c
@@ -714,7 +714,7 @@ static int kdamond_fn(void *data)
 	nr_running_ctxs--;
 	mutex_unlock(&damon_lock);
 
-	do_exit(0);
+	return 0;
 }
 
 #include "core-test.h"
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 221/262] mm/damon: needn't hold kdamond_lock to print pid of kdamond
  2021-11-05 20:34 incoming Andrew Morton
                   ` (219 preceding siblings ...)
  2021-11-05 20:46 ` [patch 220/262] mm/damon: remove unnecessary do_exit() from kdamond Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 222/262] mm/damon/core: nullify pointer ctx->kdamond with a NULL Andrew Morton
                   ` (40 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, changbin.du, linux-mm, mm-commits, sj, torvalds

From: Changbin Du <changbin.du@gmail.com>
Subject: mm/damon: needn't hold kdamond_lock to print pid of kdamond

Just get the pid by 'current->pid'. Meanwhile, to be symmetrical make
the 'starts' and 'finishes' logs both use debug level.

Link: https://lkml.kernel.org/r/20210927232432.17750-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/core.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/mm/damon/core.c~mm-damon-neednt-hold-kdamond_lock-to-print-pid-of-kdamond
+++ a/mm/damon/core.c
@@ -652,9 +652,7 @@ static int kdamond_fn(void *data)
 	unsigned int max_nr_accesses = 0;
 	unsigned long sz_limit = 0;
 
-	mutex_lock(&ctx->kdamond_lock);
-	pr_debug("kdamond (%d) starts\n", ctx->kdamond->pid);
-	mutex_unlock(&ctx->kdamond_lock);
+	pr_debug("kdamond (%d) starts\n", current->pid);
 
 	if (ctx->primitive.init)
 		ctx->primitive.init(ctx);
@@ -705,7 +703,7 @@ static int kdamond_fn(void *data)
 	if (ctx->primitive.cleanup)
 		ctx->primitive.cleanup(ctx);
 
-	pr_debug("kdamond (%d) finishes\n", ctx->kdamond->pid);
+	pr_debug("kdamond (%d) finishes\n", current->pid);
 	mutex_lock(&ctx->kdamond_lock);
 	ctx->kdamond = NULL;
 	mutex_unlock(&ctx->kdamond_lock);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 222/262] mm/damon/core: nullify pointer ctx->kdamond with a NULL
  2021-11-05 20:34 incoming Andrew Morton
                   ` (220 preceding siblings ...)
  2021-11-05 20:46 ` [patch 221/262] mm/damon: needn't hold kdamond_lock to print pid of kdamond Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 223/262] mm/damon/core: account age of target regions Andrew Morton
                   ` (39 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, colin.king, linux-mm, mm-commits, sj, torvalds

From: Colin Ian King <colin.king@canonical.com>
Subject: mm/damon/core: nullify pointer ctx->kdamond with a NULL

Currently a plain integer is being used to nullify the pointer
ctx->kdamond.  Use NULL instead.  Cleans up sparse warning:

mm/damon/core.c:317:40: warning: Using plain integer as NULL pointer

Link: https://lkml.kernel.org/r/20210925215908.181226-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/damon/core.c~mm-damon-core-nullify-pointer-ctx-kdamond-with-a-null
+++ a/mm/damon/core.c
@@ -314,7 +314,7 @@ static int __damon_start(struct damon_ct
 				nr_running_ctxs);
 		if (IS_ERR(ctx->kdamond)) {
 			err = PTR_ERR(ctx->kdamond);
-			ctx->kdamond = 0;
+			ctx->kdamond = NULL;
 		}
 	}
 	mutex_unlock(&ctx->kdamond_lock);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 223/262] mm/damon/core: account age of target regions
  2021-11-05 20:34 incoming Andrew Morton
                   ` (221 preceding siblings ...)
  2021-11-05 20:46 ` [patch 222/262] mm/damon/core: nullify pointer ctx->kdamond with a NULL Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 224/262] mm/damon/core: implement DAMON-based Operation Schemes (DAMOS) Andrew Morton
                   ` (38 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/core: account age of target regions

Patch series "Implement Data Access Monitoring-based Memory Operation Schemes".

Introduction
============

DAMON[1] can be used as a primitive for data access aware memory
management optimizations.  For that, users who want such optimizations
should run DAMON, read the monitoring results, analyze it, plan a new
memory management scheme, and apply the new scheme by themselves.  Such
efforts will be inevitable for some complicated optimizations.

However, in many other cases, the users would simply want the system to
apply a memory management action to a memory region of a specific size
having a specific access frequency for a specific time.  For example,
"page out a memory region larger than 100 MiB keeping only rare accesses
more than 2 minutes", or "Do not use THP for a memory region larger than 2
MiB rarely accessed for more than 1 seconds".

To make the works easier and non-redundant, this patchset implements a new
feature of DAMON, which is called Data Access Monitoring-based Operation
Schemes (DAMOS).  Using the feature, users can describe the normal schemes
in a simple way and ask DAMON to execute those on its own.

[1] https://damonitor.github.io

Evaluations
===========

DAMOS is accurate and useful for memory management optimizations.  An
experimental DAMON-based operation scheme for THP, 'ethp', removes 76.15%
of THP memory overheads while preserving 51.25% of THP speedup.  Another
experimental DAMON-based 'proactive reclamation' implementation, 'prcl',
reduces 93.38% of residential sets and 23.63% of system memory footprint
while incurring only 1.22% runtime overhead in the best case
(parsec3/freqmine).

NOTE that the experimental THP optimization and proactive reclamation are
not for production but only for proof of concepts.

Please refer to the showcase web site's evaluation document[1] for
detailed evaluation setup and results.

[1] https://damonitor.github.io/doc/html/v34/vm/damon/eval.html

Long-term Support Trees
-----------------------

For people who want to test DAMON but using LTS kernels, there are another
couple of trees based on two latest LTS kernels respectively and containing the
'damon/master' backports.

- For v5.4.y: https://git.kernel.org/sj/h/damon/for-v5.4.y
- For v5.10.y: https://git.kernel.org/sj/h/damon/for-v5.10.y

Sequence Of Patches
===================

The 1st patch accounts age of each region.  The 2nd patch implements the
core of the DAMON-based operation schemes feature.  The 3rd patch makes
the default monitoring primitives for virtual address spaces to support
the schemes.  From this point, the kernel space users can use DAMOS.  The
4th patch exports the feature to the user space via the debugfs interface.
The 5th patch implements schemes statistics feature for easier tuning of
the schemes and runtime access pattern analysis, and the 6th patch adds
selftests for these changes.  Finally, the 7th patch documents this new
feature.


This patch (of 7):

DAMON can be used for data access pattern aware memory management
optimizations.  For that, users should run DAMON, read the monitoring
results, analyze it, plan a new memory management scheme, and apply the
new scheme by themselves.  It would not be too hard, but still require
some level of effort.  For complicated cases, this effort is inevitable.

That said, in many cases, users would simply want to apply an actions to a
memory region of a specific size having a specific access frequency for a
specific time.  For example, "page out a memory region larger than 100 MiB
but having a low access frequency more than 10 minutes", or "Use THP for a
memory region larger than 2 MiB having a high access frequency for more
than 2 seconds".

For such optimizations, users will need to first account the age of each
region themselves.  To reduce such efforts, this commit implements a
simple age account of each region in DAMON.  For each aggregation step,
DAMON compares the access frequency with that from last aggregation and
reset the age of the region if the change is significant.  Else, the age
is incremented.  Also, in case of the merge of regions, the region
size-weighted average of the ages is set as the age of merged new region.

Link: https://lkml.kernel.org/r/20211001125604.29660-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211001125604.29660-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Marco Elver <elver@google.com>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Greg Thelen <gthelen@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: David Rienjes <rientjes@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |   10 ++++++++++
 mm/damon/core.c       |   13 +++++++++++++
 2 files changed, 23 insertions(+)

--- a/include/linux/damon.h~mm-damon-core-account-age-of-target-regions
+++ a/include/linux/damon.h
@@ -31,12 +31,22 @@ struct damon_addr_range {
  * @sampling_addr:	Address of the sample for the next access check.
  * @nr_accesses:	Access frequency of this region.
  * @list:		List head for siblings.
+ * @age:		Age of this region.
+ *
+ * @age is initially zero, increased for each aggregation interval, and reset
+ * to zero again if the access frequency is significantly changed.  If two
+ * regions are merged into a new region, both @nr_accesses and @age of the new
+ * region are set as region size-weighted average of those of the two regions.
  */
 struct damon_region {
 	struct damon_addr_range ar;
 	unsigned long sampling_addr;
 	unsigned int nr_accesses;
 	struct list_head list;
+
+	unsigned int age;
+/* private: Internal value for age calculation. */
+	unsigned int last_nr_accesses;
 };
 
 /**
--- a/mm/damon/core.c~mm-damon-core-account-age-of-target-regions
+++ a/mm/damon/core.c
@@ -45,6 +45,9 @@ struct damon_region *damon_new_region(un
 	region->nr_accesses = 0;
 	INIT_LIST_HEAD(&region->list);
 
+	region->age = 0;
+	region->last_nr_accesses = 0;
+
 	return region;
 }
 
@@ -444,6 +447,7 @@ static void kdamond_reset_aggregated(str
 
 		damon_for_each_region(r, t) {
 			trace_damon_aggregated(t, r, damon_nr_regions(t));
+			r->last_nr_accesses = r->nr_accesses;
 			r->nr_accesses = 0;
 		}
 	}
@@ -461,6 +465,7 @@ static void damon_merge_two_regions(stru
 
 	l->nr_accesses = (l->nr_accesses * sz_l + r->nr_accesses * sz_r) /
 			(sz_l + sz_r);
+	l->age = (l->age * sz_l + r->age * sz_r) / (sz_l + sz_r);
 	l->ar.end = r->ar.end;
 	damon_destroy_region(r, t);
 }
@@ -480,6 +485,11 @@ static void damon_merge_regions_of(struc
 	struct damon_region *r, *prev = NULL, *next;
 
 	damon_for_each_region_safe(r, next, t) {
+		if (diff_of(r->nr_accesses, r->last_nr_accesses) > thres)
+			r->age = 0;
+		else
+			r->age++;
+
 		if (prev && prev->ar.end == r->ar.start &&
 		    diff_of(prev->nr_accesses, r->nr_accesses) <= thres &&
 		    sz_damon_region(prev) + sz_damon_region(r) <= sz_limit)
@@ -527,6 +537,9 @@ static void damon_split_region_at(struct
 
 	r->ar.end = new->ar.start;
 
+	new->age = r->age;
+	new->last_nr_accesses = r->last_nr_accesses;
+
 	damon_insert_region(new, r, damon_next_region(r), t);
 }
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 224/262] mm/damon/core: implement DAMON-based Operation Schemes (DAMOS)
  2021-11-05 20:34 incoming Andrew Morton
                   ` (222 preceding siblings ...)
  2021-11-05 20:46 ` [patch 223/262] mm/damon/core: account age of target regions Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 225/262] mm/damon/vaddr: support DAMON-based Operation Schemes Andrew Morton
                   ` (37 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/core: implement DAMON-based Operation Schemes (DAMOS)

In many cases, users might use DAMON for simple data access aware memory
management optimizations such as applying an operation scheme to a memory
region of a specific size having a specific access frequency for a
specific time.  For example, "page out a memory region larger than 100 MiB
but having a low access frequency more than 10 minutes", or "Use THP for a
memory region larger than 2 MiB having a high access frequency for more
than 2 seconds".

Most simple form of the solution would be doing offline data access
pattern profiling using DAMON and modifying the application source code or
system configuration based on the profiling results.  Or, developing a
daemon constructed with two modules (one for access monitoring and the
other for applying memory management actions via mlock(), madvise(),
sysctl, etc) is imaginable.

To avoid users spending their time for implementation of such simple data
access monitoring-based operation schemes, this commit makes DAMON to
handle such schemes directly.  With this commit, users can simply specify
their desired schemes to DAMON.  Then, DAMON will automatically apply the
schemes to the user-specified target processes.

Each of the schemes is composed with conditions for filtering of the
target memory regions and desired memory management action for the target.
Specifically, the format is::

    <min/max size> <min/max access frequency> <min/max age> <action>

The filtering conditions are size of memory region, number of accesses to
the region monitored by DAMON, and the age of the region.  The age of
region is incremented periodically but reset when its addresses or access
frequency has significantly changed or the action of a scheme was applied.
For the action, current implementation supports a few of madvise()-like
hints, ``WILLNEED``, ``COLD``, ``PAGEOUT``, ``HUGEPAGE``, and
``NOHUGEPAGE``.

Because DAMON supports various address spaces and application of the
actions to a monitoring target region is dependent to the type of the
target address space, the application code should be implemented by each
primitives and registered to the framework.  Note that this commit only
implements the framework part.  Following commit will implement the action
applications for virtual address spaces primitives.

Link: https://lkml.kernel.org/r/20211001125604.29660-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |   66 ++++++++++++++++++++++++
 mm/damon/core.c       |  109 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 175 insertions(+)

--- a/include/linux/damon.h~mm-damon-core-implement-damon-based-operation-schemes-damos
+++ a/include/linux/damon.h
@@ -69,6 +69,48 @@ struct damon_target {
 	struct list_head list;
 };
 
+/**
+ * enum damos_action - Represents an action of a Data Access Monitoring-based
+ * Operation Scheme.
+ *
+ * @DAMOS_WILLNEED:	Call ``madvise()`` for the region with MADV_WILLNEED.
+ * @DAMOS_COLD:		Call ``madvise()`` for the region with MADV_COLD.
+ * @DAMOS_PAGEOUT:	Call ``madvise()`` for the region with MADV_PAGEOUT.
+ * @DAMOS_HUGEPAGE:	Call ``madvise()`` for the region with MADV_HUGEPAGE.
+ * @DAMOS_NOHUGEPAGE:	Call ``madvise()`` for the region with MADV_NOHUGEPAGE.
+ */
+enum damos_action {
+	DAMOS_WILLNEED,
+	DAMOS_COLD,
+	DAMOS_PAGEOUT,
+	DAMOS_HUGEPAGE,
+	DAMOS_NOHUGEPAGE,
+};
+
+/**
+ * struct damos - Represents a Data Access Monitoring-based Operation Scheme.
+ * @min_sz_region:	Minimum size of target regions.
+ * @max_sz_region:	Maximum size of target regions.
+ * @min_nr_accesses:	Minimum ``->nr_accesses`` of target regions.
+ * @max_nr_accesses:	Maximum ``->nr_accesses`` of target regions.
+ * @min_age_region:	Minimum age of target regions.
+ * @max_age_region:	Maximum age of target regions.
+ * @action:		&damo_action to be applied to the target regions.
+ * @list:		List head for siblings.
+ *
+ * Note that both the minimums and the maximums are inclusive.
+ */
+struct damos {
+	unsigned long min_sz_region;
+	unsigned long max_sz_region;
+	unsigned int min_nr_accesses;
+	unsigned int max_nr_accesses;
+	unsigned int min_age_region;
+	unsigned int max_age_region;
+	enum damos_action action;
+	struct list_head list;
+};
+
 struct damon_ctx;
 
 /**
@@ -79,6 +121,7 @@ struct damon_ctx;
  * @prepare_access_checks:	Prepare next access check of target regions.
  * @check_accesses:		Check the accesses to target regions.
  * @reset_aggregated:		Reset aggregated accesses monitoring results.
+ * @apply_scheme:		Apply a DAMON-based operation scheme.
  * @target_valid:		Determine if the target is valid.
  * @cleanup:			Clean up the context.
  *
@@ -104,6 +147,9 @@ struct damon_ctx;
  * of its update.  The value will be used for regions adjustment threshold.
  * @reset_aggregated should reset the access monitoring results that aggregated
  * by @check_accesses.
+ * @apply_scheme is called from @kdamond when a region for user provided
+ * DAMON-based operation scheme is found.  It should apply the scheme's action
+ * to the region.  This is not used for &DAMON_ARBITRARY_TARGET case.
  * @target_valid should check whether the target is still valid for the
  * monitoring.
  * @cleanup is called from @kdamond just before its termination.
@@ -114,6 +160,8 @@ struct damon_primitive {
 	void (*prepare_access_checks)(struct damon_ctx *context);
 	unsigned int (*check_accesses)(struct damon_ctx *context);
 	void (*reset_aggregated)(struct damon_ctx *context);
+	int (*apply_scheme)(struct damon_ctx *context, struct damon_target *t,
+			struct damon_region *r, struct damos *scheme);
 	bool (*target_valid)(void *target);
 	void (*cleanup)(struct damon_ctx *context);
 };
@@ -192,6 +240,7 @@ struct damon_callback {
  * @min_nr_regions:	The minimum number of adaptive monitoring regions.
  * @max_nr_regions:	The maximum number of adaptive monitoring regions.
  * @adaptive_targets:	Head of monitoring targets (&damon_target) list.
+ * @schemes:		Head of schemes (&damos) list.
  */
 struct damon_ctx {
 	unsigned long sample_interval;
@@ -213,6 +262,7 @@ struct damon_ctx {
 	unsigned long min_nr_regions;
 	unsigned long max_nr_regions;
 	struct list_head adaptive_targets;
+	struct list_head schemes;
 };
 
 #define damon_next_region(r) \
@@ -233,6 +283,12 @@ struct damon_ctx {
 #define damon_for_each_target_safe(t, next, ctx)	\
 	list_for_each_entry_safe(t, next, &(ctx)->adaptive_targets, list)
 
+#define damon_for_each_scheme(s, ctx) \
+	list_for_each_entry(s, &(ctx)->schemes, list)
+
+#define damon_for_each_scheme_safe(s, next, ctx) \
+	list_for_each_entry_safe(s, next, &(ctx)->schemes, list)
+
 #ifdef CONFIG_DAMON
 
 struct damon_region *damon_new_region(unsigned long start, unsigned long end);
@@ -242,6 +298,14 @@ inline void damon_insert_region(struct d
 void damon_add_region(struct damon_region *r, struct damon_target *t);
 void damon_destroy_region(struct damon_region *r, struct damon_target *t);
 
+struct damos *damon_new_scheme(
+		unsigned long min_sz_region, unsigned long max_sz_region,
+		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
+		unsigned int min_age_region, unsigned int max_age_region,
+		enum damos_action action);
+void damon_add_scheme(struct damon_ctx *ctx, struct damos *s);
+void damon_destroy_scheme(struct damos *s);
+
 struct damon_target *damon_new_target(unsigned long id);
 void damon_add_target(struct damon_ctx *ctx, struct damon_target *t);
 void damon_free_target(struct damon_target *t);
@@ -255,6 +319,8 @@ int damon_set_targets(struct damon_ctx *
 int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
 		unsigned long aggr_int, unsigned long primitive_upd_int,
 		unsigned long min_nr_reg, unsigned long max_nr_reg);
+int damon_set_schemes(struct damon_ctx *ctx,
+			struct damos **schemes, ssize_t nr_schemes);
 int damon_nr_running_ctxs(void);
 
 int damon_start(struct damon_ctx **ctxs, int nr_ctxs);
--- a/mm/damon/core.c~mm-damon-core-implement-damon-based-operation-schemes-damos
+++ a/mm/damon/core.c
@@ -85,6 +85,50 @@ void damon_destroy_region(struct damon_r
 	damon_free_region(r);
 }
 
+struct damos *damon_new_scheme(
+		unsigned long min_sz_region, unsigned long max_sz_region,
+		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
+		unsigned int min_age_region, unsigned int max_age_region,
+		enum damos_action action)
+{
+	struct damos *scheme;
+
+	scheme = kmalloc(sizeof(*scheme), GFP_KERNEL);
+	if (!scheme)
+		return NULL;
+	scheme->min_sz_region = min_sz_region;
+	scheme->max_sz_region = max_sz_region;
+	scheme->min_nr_accesses = min_nr_accesses;
+	scheme->max_nr_accesses = max_nr_accesses;
+	scheme->min_age_region = min_age_region;
+	scheme->max_age_region = max_age_region;
+	scheme->action = action;
+	INIT_LIST_HEAD(&scheme->list);
+
+	return scheme;
+}
+
+void damon_add_scheme(struct damon_ctx *ctx, struct damos *s)
+{
+	list_add_tail(&s->list, &ctx->schemes);
+}
+
+static void damon_del_scheme(struct damos *s)
+{
+	list_del(&s->list);
+}
+
+static void damon_free_scheme(struct damos *s)
+{
+	kfree(s);
+}
+
+void damon_destroy_scheme(struct damos *s)
+{
+	damon_del_scheme(s);
+	damon_free_scheme(s);
+}
+
 /*
  * Construct a damon_target struct
  *
@@ -156,6 +200,7 @@ struct damon_ctx *damon_new_ctx(void)
 	ctx->max_nr_regions = 1000;
 
 	INIT_LIST_HEAD(&ctx->adaptive_targets);
+	INIT_LIST_HEAD(&ctx->schemes);
 
 	return ctx;
 }
@@ -175,7 +220,13 @@ static void damon_destroy_targets(struct
 
 void damon_destroy_ctx(struct damon_ctx *ctx)
 {
+	struct damos *s, *next_s;
+
 	damon_destroy_targets(ctx);
+
+	damon_for_each_scheme_safe(s, next_s, ctx)
+		damon_destroy_scheme(s);
+
 	kfree(ctx);
 }
 
@@ -251,6 +302,30 @@ int damon_set_attrs(struct damon_ctx *ct
 }
 
 /**
+ * damon_set_schemes() - Set data access monitoring based operation schemes.
+ * @ctx:	monitoring context
+ * @schemes:	array of the schemes
+ * @nr_schemes:	number of entries in @schemes
+ *
+ * This function should not be called while the kdamond of the context is
+ * running.
+ *
+ * Return: 0 if success, or negative error code otherwise.
+ */
+int damon_set_schemes(struct damon_ctx *ctx, struct damos **schemes,
+			ssize_t nr_schemes)
+{
+	struct damos *s, *next;
+	ssize_t i;
+
+	damon_for_each_scheme_safe(s, next, ctx)
+		damon_destroy_scheme(s);
+	for (i = 0; i < nr_schemes; i++)
+		damon_add_scheme(ctx, schemes[i]);
+	return 0;
+}
+
+/**
  * damon_nr_running_ctxs() - Return number of currently running contexts.
  */
 int damon_nr_running_ctxs(void)
@@ -453,6 +528,39 @@ static void kdamond_reset_aggregated(str
 	}
 }
 
+static void damon_do_apply_schemes(struct damon_ctx *c,
+				   struct damon_target *t,
+				   struct damon_region *r)
+{
+	struct damos *s;
+	unsigned long sz;
+
+	damon_for_each_scheme(s, c) {
+		sz = r->ar.end - r->ar.start;
+		if (sz < s->min_sz_region || s->max_sz_region < sz)
+			continue;
+		if (r->nr_accesses < s->min_nr_accesses ||
+				s->max_nr_accesses < r->nr_accesses)
+			continue;
+		if (r->age < s->min_age_region || s->max_age_region < r->age)
+			continue;
+		if (c->primitive.apply_scheme)
+			c->primitive.apply_scheme(c, t, r, s);
+		r->age = 0;
+	}
+}
+
+static void kdamond_apply_schemes(struct damon_ctx *c)
+{
+	struct damon_target *t;
+	struct damon_region *r;
+
+	damon_for_each_target(t, c) {
+		damon_for_each_region(r, t)
+			damon_do_apply_schemes(c, t, r);
+	}
+}
+
 #define sz_damon_region(r) (r->ar.end - r->ar.start)
 
 /*
@@ -693,6 +801,7 @@ static int kdamond_fn(void *data)
 			if (ctx->callback.after_aggregation &&
 					ctx->callback.after_aggregation(ctx))
 				set_kdamond_stop(ctx);
+			kdamond_apply_schemes(ctx);
 			kdamond_reset_aggregated(ctx);
 			kdamond_split_regions(ctx);
 			if (ctx->primitive.reset_aggregated)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 225/262] mm/damon/vaddr: support DAMON-based Operation Schemes
  2021-11-05 20:34 incoming Andrew Morton
                   ` (223 preceding siblings ...)
  2021-11-05 20:46 ` [patch 224/262] mm/damon/core: implement DAMON-based Operation Schemes (DAMOS) Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 226/262] mm/damon/dbgfs: " Andrew Morton
                   ` (36 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/vaddr: support DAMON-based Operation Schemes

This commit makes DAMON's default primitives for virtual address spaces to
support DAMON-based Operation Schemes (DAMOS) by implementing actions
application functions and registering it to the monitoring context.  The
implementation simply links 'madvise()' for related DAMOS actions.  That
is, 'madvise(MADV_WILLNEED)' is called for 'WILLNEED' DAMOS action and
similar for other actions ('COLD', 'PAGEOUT', 'HUGEPAGE', 'NOHUGEPAGE').

So, the kernel space DAMON users can now use the DAMON-based optimizations
with only small amount of code.

Link: https://lkml.kernel.org/r/20211001125604.29660-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |    2 +
 mm/damon/vaddr.c      |   56 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

--- a/include/linux/damon.h~mm-damon-vaddr-support-damon-based-operation-schemes
+++ a/include/linux/damon.h
@@ -337,6 +337,8 @@ void damon_va_prepare_access_checks(stru
 unsigned int damon_va_check_accesses(struct damon_ctx *ctx);
 bool damon_va_target_valid(void *t);
 void damon_va_cleanup(struct damon_ctx *ctx);
+int damon_va_apply_scheme(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme);
 void damon_va_set_primitives(struct damon_ctx *ctx);
 
 #endif	/* CONFIG_DAMON_VADDR */
--- a/mm/damon/vaddr.c~mm-damon-vaddr-support-damon-based-operation-schemes
+++ a/mm/damon/vaddr.c
@@ -7,6 +7,7 @@
 
 #define pr_fmt(fmt) "damon-va: " fmt
 
+#include <asm-generic/mman-common.h>
 #include <linux/damon.h>
 #include <linux/hugetlb.h>
 #include <linux/mm.h>
@@ -658,6 +659,60 @@ bool damon_va_target_valid(void *target)
 	return false;
 }
 
+#ifndef CONFIG_ADVISE_SYSCALLS
+static int damos_madvise(struct damon_target *target, struct damon_region *r,
+			int behavior)
+{
+	return -EINVAL;
+}
+#else
+static int damos_madvise(struct damon_target *target, struct damon_region *r,
+			int behavior)
+{
+	struct mm_struct *mm;
+	int ret = -ENOMEM;
+
+	mm = damon_get_mm(target);
+	if (!mm)
+		goto out;
+
+	ret = do_madvise(mm, PAGE_ALIGN(r->ar.start),
+			PAGE_ALIGN(r->ar.end - r->ar.start), behavior);
+	mmput(mm);
+out:
+	return ret;
+}
+#endif	/* CONFIG_ADVISE_SYSCALLS */
+
+int damon_va_apply_scheme(struct damon_ctx *ctx, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme)
+{
+	int madv_action;
+
+	switch (scheme->action) {
+	case DAMOS_WILLNEED:
+		madv_action = MADV_WILLNEED;
+		break;
+	case DAMOS_COLD:
+		madv_action = MADV_COLD;
+		break;
+	case DAMOS_PAGEOUT:
+		madv_action = MADV_PAGEOUT;
+		break;
+	case DAMOS_HUGEPAGE:
+		madv_action = MADV_HUGEPAGE;
+		break;
+	case DAMOS_NOHUGEPAGE:
+		madv_action = MADV_NOHUGEPAGE;
+		break;
+	default:
+		pr_warn("Wrong action %d\n", scheme->action);
+		return -EINVAL;
+	}
+
+	return damos_madvise(t, r, madv_action);
+}
+
 void damon_va_set_primitives(struct damon_ctx *ctx)
 {
 	ctx->primitive.init = damon_va_init;
@@ -667,6 +722,7 @@ void damon_va_set_primitives(struct damo
 	ctx->primitive.reset_aggregated = NULL;
 	ctx->primitive.target_valid = damon_va_target_valid;
 	ctx->primitive.cleanup = NULL;
+	ctx->primitive.apply_scheme = damon_va_apply_scheme;
 }
 
 #include "vaddr-test.h"
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 226/262] mm/damon/dbgfs: support DAMON-based Operation Schemes
  2021-11-05 20:34 incoming Andrew Morton
                   ` (224 preceding siblings ...)
  2021-11-05 20:46 ` [patch 225/262] mm/damon/vaddr: support DAMON-based Operation Schemes Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 227/262] mm/damon/schemes: implement statistics feature Andrew Morton
                   ` (35 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/dbgfs: support DAMON-based Operation Schemes

This commit makes 'damon-dbgfs' to support the data access monitoring
oriented memory management schemes.  Users can read and update the schemes
using ``<debugfs>/damon/schemes`` file.  The format is::

    <min/max size> <min/max access frequency> <min/max age> <action>

Link: https://lkml.kernel.org/r/20211001125604.29660-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/dbgfs.c |  165 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 162 insertions(+), 3 deletions(-)

--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-damon-based-operation-schemes
+++ a/mm/damon/dbgfs.c
@@ -98,6 +98,159 @@ out:
 	return ret;
 }
 
+static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len)
+{
+	struct damos *s;
+	int written = 0;
+	int rc;
+
+	damon_for_each_scheme(s, c) {
+		rc = scnprintf(&buf[written], len - written,
+				"%lu %lu %u %u %u %u %d\n",
+				s->min_sz_region, s->max_sz_region,
+				s->min_nr_accesses, s->max_nr_accesses,
+				s->min_age_region, s->max_age_region,
+				s->action);
+		if (!rc)
+			return -ENOMEM;
+
+		written += rc;
+	}
+	return written;
+}
+
+static ssize_t dbgfs_schemes_read(struct file *file, char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct damon_ctx *ctx = file->private_data;
+	char *kbuf;
+	ssize_t len;
+
+	kbuf = kmalloc(count, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	mutex_lock(&ctx->kdamond_lock);
+	len = sprint_schemes(ctx, kbuf, count);
+	mutex_unlock(&ctx->kdamond_lock);
+	if (len < 0)
+		goto out;
+	len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
+
+out:
+	kfree(kbuf);
+	return len;
+}
+
+static void free_schemes_arr(struct damos **schemes, ssize_t nr_schemes)
+{
+	ssize_t i;
+
+	for (i = 0; i < nr_schemes; i++)
+		kfree(schemes[i]);
+	kfree(schemes);
+}
+
+static bool damos_action_valid(int action)
+{
+	switch (action) {
+	case DAMOS_WILLNEED:
+	case DAMOS_COLD:
+	case DAMOS_PAGEOUT:
+	case DAMOS_HUGEPAGE:
+	case DAMOS_NOHUGEPAGE:
+		return true;
+	default:
+		return false;
+	}
+}
+
+/*
+ * Converts a string into an array of struct damos pointers
+ *
+ * Returns an array of struct damos pointers that converted if the conversion
+ * success, or NULL otherwise.
+ */
+static struct damos **str_to_schemes(const char *str, ssize_t len,
+				ssize_t *nr_schemes)
+{
+	struct damos *scheme, **schemes;
+	const int max_nr_schemes = 256;
+	int pos = 0, parsed, ret;
+	unsigned long min_sz, max_sz;
+	unsigned int min_nr_a, max_nr_a, min_age, max_age;
+	unsigned int action;
+
+	schemes = kmalloc_array(max_nr_schemes, sizeof(scheme),
+			GFP_KERNEL);
+	if (!schemes)
+		return NULL;
+
+	*nr_schemes = 0;
+	while (pos < len && *nr_schemes < max_nr_schemes) {
+		ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n",
+				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
+				&min_age, &max_age, &action, &parsed);
+		if (ret != 7)
+			break;
+		if (!damos_action_valid(action)) {
+			pr_err("wrong action %d\n", action);
+			goto fail;
+		}
+
+		pos += parsed;
+		scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a,
+				min_age, max_age, action);
+		if (!scheme)
+			goto fail;
+
+		schemes[*nr_schemes] = scheme;
+		*nr_schemes += 1;
+	}
+	return schemes;
+fail:
+	free_schemes_arr(schemes, *nr_schemes);
+	return NULL;
+}
+
+static ssize_t dbgfs_schemes_write(struct file *file, const char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct damon_ctx *ctx = file->private_data;
+	char *kbuf;
+	struct damos **schemes;
+	ssize_t nr_schemes = 0, ret = count;
+	int err;
+
+	kbuf = user_input_str(buf, count, ppos);
+	if (IS_ERR(kbuf))
+		return PTR_ERR(kbuf);
+
+	schemes = str_to_schemes(kbuf, ret, &nr_schemes);
+	if (!schemes) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	mutex_lock(&ctx->kdamond_lock);
+	if (ctx->kdamond) {
+		ret = -EBUSY;
+		goto unlock_out;
+	}
+
+	err = damon_set_schemes(ctx, schemes, nr_schemes);
+	if (err)
+		ret = err;
+	else
+		nr_schemes = 0;
+unlock_out:
+	mutex_unlock(&ctx->kdamond_lock);
+	free_schemes_arr(schemes, nr_schemes);
+out:
+	kfree(kbuf);
+	return ret;
+}
+
 static inline bool targetid_is_pid(const struct damon_ctx *ctx)
 {
 	return ctx->primitive.target_valid == damon_va_target_valid;
@@ -279,6 +432,12 @@ static const struct file_operations attr
 	.write = dbgfs_attrs_write,
 };
 
+static const struct file_operations schemes_fops = {
+	.open = damon_dbgfs_open,
+	.read = dbgfs_schemes_read,
+	.write = dbgfs_schemes_write,
+};
+
 static const struct file_operations target_ids_fops = {
 	.open = damon_dbgfs_open,
 	.read = dbgfs_target_ids_read,
@@ -292,10 +451,10 @@ static const struct file_operations kdam
 
 static void dbgfs_fill_ctx_dir(struct dentry *dir, struct damon_ctx *ctx)
 {
-	const char * const file_names[] = {"attrs", "target_ids",
+	const char * const file_names[] = {"attrs", "schemes", "target_ids",
 		"kdamond_pid"};
-	const struct file_operations *fops[] = {&attrs_fops, &target_ids_fops,
-		&kdamond_pid_fops};
+	const struct file_operations *fops[] = {&attrs_fops, &schemes_fops,
+		&target_ids_fops, &kdamond_pid_fops};
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(file_names); i++)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 227/262] mm/damon/schemes: implement statistics feature
  2021-11-05 20:34 incoming Andrew Morton
                   ` (225 preceding siblings ...)
  2021-11-05 20:46 ` [patch 226/262] mm/damon/dbgfs: " Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 228/262] selftests/damon: add 'schemes' debugfs tests Andrew Morton
                   ` (34 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/schemes: implement statistics feature

To tune the DAMON-based operation schemes, knowing how many and how large
regions are affected by each of the schemes will be helful.  Those stats
could be used for not only the tuning, but also monitoring of the working
set size and the number of regions, if the scheme does not change the
program behavior too much.

For the reason, this commit implements the statistics for the schemes. 
The total number and size of the regions that each scheme is applied are
exported to users via '->stat_count' and '->stat_sz' of 'struct damos'. 
Admins can also check the number by reading 'schemes' debugfs file.  The
last two integers now represents the stats.  To allow collecting the stats
without changing the program behavior, this commit also adds new scheme
action, 'DAMOS_STAT'.  Note that 'DAMOS_STAT' is not only making no memory
operation actions, but also does not reset the age of regions.

Link: https://lkml.kernel.org/r/20211001125604.29660-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |   10 +++++++++-
 mm/damon/core.c       |    7 ++++++-
 mm/damon/dbgfs.c      |    5 +++--
 mm/damon/vaddr.c      |    2 ++
 4 files changed, 20 insertions(+), 4 deletions(-)

--- a/include/linux/damon.h~mm-damon-schemes-implement-statistics-feature
+++ a/include/linux/damon.h
@@ -78,6 +78,7 @@ struct damon_target {
  * @DAMOS_PAGEOUT:	Call ``madvise()`` for the region with MADV_PAGEOUT.
  * @DAMOS_HUGEPAGE:	Call ``madvise()`` for the region with MADV_HUGEPAGE.
  * @DAMOS_NOHUGEPAGE:	Call ``madvise()`` for the region with MADV_NOHUGEPAGE.
+ * @DAMOS_STAT:		Do nothing but count the stat.
  */
 enum damos_action {
 	DAMOS_WILLNEED,
@@ -85,6 +86,7 @@ enum damos_action {
 	DAMOS_PAGEOUT,
 	DAMOS_HUGEPAGE,
 	DAMOS_NOHUGEPAGE,
+	DAMOS_STAT,		/* Do nothing but only record the stat */
 };
 
 /**
@@ -96,9 +98,13 @@ enum damos_action {
  * @min_age_region:	Minimum age of target regions.
  * @max_age_region:	Maximum age of target regions.
  * @action:		&damo_action to be applied to the target regions.
+ * @stat_count:		Total number of regions that this scheme is applied.
+ * @stat_sz:		Total size of regions that this scheme is applied.
  * @list:		List head for siblings.
  *
- * Note that both the minimums and the maximums are inclusive.
+ * For each aggregation interval, DAMON applies @action to monitoring target
+ * regions fit in the condition and updates the statistics.  Note that both
+ * the minimums and the maximums are inclusive.
  */
 struct damos {
 	unsigned long min_sz_region;
@@ -108,6 +114,8 @@ struct damos {
 	unsigned int min_age_region;
 	unsigned int max_age_region;
 	enum damos_action action;
+	unsigned long stat_count;
+	unsigned long stat_sz;
 	struct list_head list;
 };
 
--- a/mm/damon/core.c~mm-damon-schemes-implement-statistics-feature
+++ a/mm/damon/core.c
@@ -103,6 +103,8 @@ struct damos *damon_new_scheme(
 	scheme->min_age_region = min_age_region;
 	scheme->max_age_region = max_age_region;
 	scheme->action = action;
+	scheme->stat_count = 0;
+	scheme->stat_sz = 0;
 	INIT_LIST_HEAD(&scheme->list);
 
 	return scheme;
@@ -544,9 +546,12 @@ static void damon_do_apply_schemes(struc
 			continue;
 		if (r->age < s->min_age_region || s->max_age_region < r->age)
 			continue;
+		s->stat_count++;
+		s->stat_sz += sz;
 		if (c->primitive.apply_scheme)
 			c->primitive.apply_scheme(c, t, r, s);
-		r->age = 0;
+		if (s->action != DAMOS_STAT)
+			r->age = 0;
 	}
 }
 
--- a/mm/damon/dbgfs.c~mm-damon-schemes-implement-statistics-feature
+++ a/mm/damon/dbgfs.c
@@ -106,11 +106,11 @@ static ssize_t sprint_schemes(struct dam
 
 	damon_for_each_scheme(s, c) {
 		rc = scnprintf(&buf[written], len - written,
-				"%lu %lu %u %u %u %u %d\n",
+				"%lu %lu %u %u %u %u %d %lu %lu\n",
 				s->min_sz_region, s->max_sz_region,
 				s->min_nr_accesses, s->max_nr_accesses,
 				s->min_age_region, s->max_age_region,
-				s->action);
+				s->action, s->stat_count, s->stat_sz);
 		if (!rc)
 			return -ENOMEM;
 
@@ -159,6 +159,7 @@ static bool damos_action_valid(int actio
 	case DAMOS_PAGEOUT:
 	case DAMOS_HUGEPAGE:
 	case DAMOS_NOHUGEPAGE:
+	case DAMOS_STAT:
 		return true;
 	default:
 		return false;
--- a/mm/damon/vaddr.c~mm-damon-schemes-implement-statistics-feature
+++ a/mm/damon/vaddr.c
@@ -705,6 +705,8 @@ int damon_va_apply_scheme(struct damon_c
 	case DAMOS_NOHUGEPAGE:
 		madv_action = MADV_NOHUGEPAGE;
 		break;
+	case DAMOS_STAT:
+		return 0;
 	default:
 		pr_warn("Wrong action %d\n", scheme->action);
 		return -EINVAL;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 228/262] selftests/damon: add 'schemes' debugfs tests
  2021-11-05 20:34 incoming Andrew Morton
                   ` (226 preceding siblings ...)
  2021-11-05 20:46 ` [patch 227/262] mm/damon/schemes: implement statistics feature Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 229/262] Docs/admin-guide/mm/damon: document DAMON-based Operation Schemes Andrew Morton
                   ` (33 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: selftests/damon: add 'schemes' debugfs tests

This commit adds simple selftets for 'schemes' debugfs file of DAMON.

Link: https://lkml.kernel.org/r/20211001125604.29660-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/damon/debugfs_attrs.sh |   13 +++++++++++++
 1 file changed, 13 insertions(+)

--- a/tools/testing/selftests/damon/debugfs_attrs.sh~selftests-damon-add-schemes-debugfs-tests
+++ a/tools/testing/selftests/damon/debugfs_attrs.sh
@@ -57,6 +57,19 @@ test_write_fail "$file" "1 2 3 5 4" "$or
 test_content "$file" "$orig_content" "1 2 3 4 5" "successfully written"
 echo "$orig_content" > "$file"
 
+# Test schemes file
+# =================
+
+file="$DBGFS/schemes"
+orig_content=$(cat "$file")
+
+test_write_succ "$file" "1 2 3 4 5 6 4" \
+	"$orig_content" "valid input"
+test_write_fail "$file" "1 2
+3 4 5 6 3" "$orig_content" "multi lines"
+test_write_succ "$file" "" "$orig_content" "disabling"
+echo "$orig_content" > "$file"
+
 # Test target_ids file
 # ====================
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 229/262] Docs/admin-guide/mm/damon: document DAMON-based Operation Schemes
  2021-11-05 20:34 incoming Andrew Morton
                   ` (227 preceding siblings ...)
  2021-11-05 20:46 ` [patch 228/262] selftests/damon: add 'schemes' debugfs tests Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 230/262] mm/damon/dbgfs: allow users to set initial monitoring target regions Andrew Morton
                   ` (32 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: Docs/admin-guide/mm/damon: document DAMON-based Operation Schemes

This commit add description of DAMON-based operation schemes in the DAMON
documents.

Link: https://lkml.kernel.org/r/20211001125604.29660-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/damon/start.rst |   11 +++
 Documentation/admin-guide/mm/damon/usage.rst |   51 ++++++++++++++++-
 2 files changed, 60 insertions(+), 2 deletions(-)

--- a/Documentation/admin-guide/mm/damon/start.rst~docs-admin-guide-mm-damon-document-damon-based-operation-schemes
+++ a/Documentation/admin-guide/mm/damon/start.rst
@@ -108,6 +108,17 @@ the results as separate image files. ::
 You can view the visualizations of this example workload at [1]_.
 Visualizations of other realistic workloads are available at [2]_ [3]_ [4]_.
 
+
+Data Access Pattern Aware Memory Management
+===========================================
+
+Below three commands make every memory region of size >=4K that doesn't
+accessed for >=60 seconds in your workload to be swapped out. ::
+
+    $ echo "#min-size max-size min-acc max-acc min-age max-age action" > scheme
+    $ echo "4K        max      0       0       60s     max     pageout" >> scheme
+    $ damo schemes -c my_thp_scheme <pid of your workload>
+
 .. [1] https://damonitor.github.io/doc/html/v17/admin-guide/mm/damon/start.html#visualizing-recorded-patterns
 .. [2] https://damonitor.github.io/test/result/visual/latest/rec.heatmap.1.png.html
 .. [3] https://damonitor.github.io/test/result/visual/latest/rec.wss_sz.png.html
--- a/Documentation/admin-guide/mm/damon/usage.rst~docs-admin-guide-mm-damon-document-damon-based-operation-schemes
+++ a/Documentation/admin-guide/mm/damon/usage.rst
@@ -34,8 +34,8 @@ the reason, this document describes only
 debugfs Interface
 =================
 
-DAMON exports three files, ``attrs``, ``target_ids``, and ``monitor_on`` under
-its debugfs directory, ``<debugfs>/damon/``.
+DAMON exports four files, ``attrs``, ``target_ids``, ``schemes`` and
+``monitor_on`` under its debugfs directory, ``<debugfs>/damon/``.
 
 
 Attributes
@@ -74,6 +74,53 @@ check it again::
 Note that setting the target ids doesn't start the monitoring.
 
 
+Schemes
+-------
+
+For usual DAMON-based data access aware memory management optimizations, users
+would simply want the system to apply a memory management action to a memory
+region of a specific size having a specific access frequency for a specific
+time.  DAMON receives such formalized operation schemes from the user and
+applies those to the target processes.  It also counts the total number and
+size of regions that each scheme is applied.  This statistics can be used for
+online analysis or tuning of the schemes.
+
+Users can get and set the schemes by reading from and writing to ``schemes``
+debugfs file.  Reading the file also shows the statistics of each scheme.  To
+the file, each of the schemes should be represented in each line in below form:
+
+    min-size max-size min-acc max-acc min-age max-age action
+
+Note that the ranges are closed interval.  Bytes for the size of regions
+(``min-size`` and ``max-size``), number of monitored accesses per aggregate
+interval for access frequency (``min-acc`` and ``max-acc``), number of
+aggregate intervals for the age of regions (``min-age`` and ``max-age``), and a
+predefined integer for memory management actions should be used.  The supported
+numbers and their meanings are as below.
+
+ - 0: Call ``madvise()`` for the region with ``MADV_WILLNEED``
+ - 1: Call ``madvise()`` for the region with ``MADV_COLD``
+ - 2: Call ``madvise()`` for the region with ``MADV_PAGEOUT``
+ - 3: Call ``madvise()`` for the region with ``MADV_HUGEPAGE``
+ - 4: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE``
+ - 5: Do nothing but count the statistics
+
+You can disable schemes by simply writing an empty string to the file.  For
+example, below commands applies a scheme saying "If a memory region of size in
+[4KiB, 8KiB] is showing accesses per aggregate interval in [0, 5] for aggregate
+interval in [10, 20], page out the region", check the entered scheme again, and
+finally remove the scheme. ::
+
+    # cd <debugfs>/damon
+    # echo "4096 8192    0 5    10 20    2" > schemes
+    # cat schemes
+    4096 8192 0 5 10 20 2 0 0
+    # echo > schemes
+
+The last two integers in the 4th line of above example is the total number and
+the total size of the regions that the scheme is applied.
+
+
 Turning On/Off
 --------------
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 230/262] mm/damon/dbgfs: allow users to set initial monitoring target regions
  2021-11-05 20:34 incoming Andrew Morton
                   ` (228 preceding siblings ...)
  2021-11-05 20:46 ` [patch 229/262] Docs/admin-guide/mm/damon: document DAMON-based Operation Schemes Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 231/262] mm/damon/dbgfs-test: add a unit test case for 'init_regions' Andrew Morton
                   ` (31 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, brendanhiggins, corbet, david, dwmw, elver,
	foersleo, gthelen, Jonathan.Cameron, linux-mm, markubo,
	mm-commits, rientjes, shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/dbgfs: allow users to set initial monitoring target regions

Patch series "DAMON: Support Physical Memory Address Space Monitoring:.

DAMON currently supports only virtual address spaces monitoring.  It can
be easily extended for various use cases and address spaces by configuring
its monitoring primitives layer to use appropriate primitives
implementations, though.  This patchset implements monitoring primitives
for the physical address space monitoring using the structure.

The first 3 patches allow the user space users manually set the monitoring
regions.  The 1st patch implements the feature in the 'damon-dbgfs'. 
Then, patches for adding a unit tests (the 2nd patch) and updating the
documentation (the 3rd patch) follow.

Following 4 patches implement the physical address space monitoring
primitives.  The 4th patch makes some primitive functions for the virtual
address spaces primitives reusable.  The 5th patch implements the physical
address space monitoring primitives.  The 6th patch links the primitives
to the 'damon-dbgfs'.  Finally, 7th patch documents this new features.


This patch (of 7):

Some 'damon-dbgfs' users would want to monitor only a part of the entire
virtual memory address space.  The program interface users in the kernel
space could use '->before_start()' callback or set the regions inside the
context struct as they want, but 'damon-dbgfs' users cannot.

For the reason, this commit introduces a new debugfs file called
'init_region'.  'damon-dbgfs' users can specify which initial monitoring
target address regions they want by writing special input to the file. 
The input should describe each region in each line in the below form:

    <pid> <start address> <end address>

Note that the regions will be updated to cover entire memory mapped
regions after a 'regions update interval' is passed.  If you want the
regions to not be updated after the initial setting, you could set the
interval as a very long time, say, a few decades.

Link: https://lkml.kernel.org/r/20211012205711.29216-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211012205711.29216-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Marco Elver <elver@google.com>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Greg Thelen <gthelen@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: David Rienjes <rientjes@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/dbgfs.c |  156 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 154 insertions(+), 2 deletions(-)

--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-allow-users-to-set-initial-monitoring-target-regions
+++ a/mm/damon/dbgfs.c
@@ -394,6 +394,152 @@ out:
 	return ret;
 }
 
+static ssize_t sprint_init_regions(struct damon_ctx *c, char *buf, ssize_t len)
+{
+	struct damon_target *t;
+	struct damon_region *r;
+	int written = 0;
+	int rc;
+
+	damon_for_each_target(t, c) {
+		damon_for_each_region(r, t) {
+			rc = scnprintf(&buf[written], len - written,
+					"%lu %lu %lu\n",
+					t->id, r->ar.start, r->ar.end);
+			if (!rc)
+				return -ENOMEM;
+			written += rc;
+		}
+	}
+	return written;
+}
+
+static ssize_t dbgfs_init_regions_read(struct file *file, char __user *buf,
+		size_t count, loff_t *ppos)
+{
+	struct damon_ctx *ctx = file->private_data;
+	char *kbuf;
+	ssize_t len;
+
+	kbuf = kmalloc(count, GFP_KERNEL);
+	if (!kbuf)
+		return -ENOMEM;
+
+	mutex_lock(&ctx->kdamond_lock);
+	if (ctx->kdamond) {
+		mutex_unlock(&ctx->kdamond_lock);
+		len = -EBUSY;
+		goto out;
+	}
+
+	len = sprint_init_regions(ctx, kbuf, count);
+	mutex_unlock(&ctx->kdamond_lock);
+	if (len < 0)
+		goto out;
+	len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
+
+out:
+	kfree(kbuf);
+	return len;
+}
+
+static int add_init_region(struct damon_ctx *c,
+			 unsigned long target_id, struct damon_addr_range *ar)
+{
+	struct damon_target *t;
+	struct damon_region *r, *prev;
+	unsigned long id;
+	int rc = -EINVAL;
+
+	if (ar->start >= ar->end)
+		return -EINVAL;
+
+	damon_for_each_target(t, c) {
+		id = t->id;
+		if (targetid_is_pid(c))
+			id = (unsigned long)pid_vnr((struct pid *)id);
+		if (id == target_id) {
+			r = damon_new_region(ar->start, ar->end);
+			if (!r)
+				return -ENOMEM;
+			damon_add_region(r, t);
+			if (damon_nr_regions(t) > 1) {
+				prev = damon_prev_region(r);
+				if (prev->ar.end > r->ar.start) {
+					damon_destroy_region(r, t);
+					return -EINVAL;
+				}
+			}
+			rc = 0;
+		}
+	}
+	return rc;
+}
+
+static int set_init_regions(struct damon_ctx *c, const char *str, ssize_t len)
+{
+	struct damon_target *t;
+	struct damon_region *r, *next;
+	int pos = 0, parsed, ret;
+	unsigned long target_id;
+	struct damon_addr_range ar;
+	int err;
+
+	damon_for_each_target(t, c) {
+		damon_for_each_region_safe(r, next, t)
+			damon_destroy_region(r, t);
+	}
+
+	while (pos < len) {
+		ret = sscanf(&str[pos], "%lu %lu %lu%n",
+				&target_id, &ar.start, &ar.end, &parsed);
+		if (ret != 3)
+			break;
+		err = add_init_region(c, target_id, &ar);
+		if (err)
+			goto fail;
+		pos += parsed;
+	}
+
+	return 0;
+
+fail:
+	damon_for_each_target(t, c) {
+		damon_for_each_region_safe(r, next, t)
+			damon_destroy_region(r, t);
+	}
+	return err;
+}
+
+static ssize_t dbgfs_init_regions_write(struct file *file,
+					  const char __user *buf, size_t count,
+					  loff_t *ppos)
+{
+	struct damon_ctx *ctx = file->private_data;
+	char *kbuf;
+	ssize_t ret = count;
+	int err;
+
+	kbuf = user_input_str(buf, count, ppos);
+	if (IS_ERR(kbuf))
+		return PTR_ERR(kbuf);
+
+	mutex_lock(&ctx->kdamond_lock);
+	if (ctx->kdamond) {
+		ret = -EBUSY;
+		goto unlock_out;
+	}
+
+	err = set_init_regions(ctx, kbuf, ret);
+	if (err)
+		ret = err;
+
+unlock_out:
+	mutex_unlock(&ctx->kdamond_lock);
+	kfree(kbuf);
+	return ret;
+}
+
 static ssize_t dbgfs_kdamond_pid_read(struct file *file,
 		char __user *buf, size_t count, loff_t *ppos)
 {
@@ -445,6 +591,12 @@ static const struct file_operations targ
 	.write = dbgfs_target_ids_write,
 };
 
+static const struct file_operations init_regions_fops = {
+	.open = damon_dbgfs_open,
+	.read = dbgfs_init_regions_read,
+	.write = dbgfs_init_regions_write,
+};
+
 static const struct file_operations kdamond_pid_fops = {
 	.open = damon_dbgfs_open,
 	.read = dbgfs_kdamond_pid_read,
@@ -453,9 +605,9 @@ static const struct file_operations kdam
 static void dbgfs_fill_ctx_dir(struct dentry *dir, struct damon_ctx *ctx)
 {
 	const char * const file_names[] = {"attrs", "schemes", "target_ids",
-		"kdamond_pid"};
+		"init_regions", "kdamond_pid"};
 	const struct file_operations *fops[] = {&attrs_fops, &schemes_fops,
-		&target_ids_fops, &kdamond_pid_fops};
+		&target_ids_fops, &init_regions_fops, &kdamond_pid_fops};
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(file_names); i++)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 231/262] mm/damon/dbgfs-test: add a unit test case for 'init_regions'
  2021-11-05 20:34 incoming Andrew Morton
                   ` (229 preceding siblings ...)
  2021-11-05 20:46 ` [patch 230/262] mm/damon/dbgfs: allow users to set initial monitoring target regions Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 232/262] Docs/admin-guide/mm/damon: document 'init_regions' feature Andrew Morton
                   ` (30 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, brendanhiggins, corbet, david, dwmw, elver,
	foersleo, gthelen, Jonathan.Cameron, linux-mm, markubo,
	mm-commits, rientjes, shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/dbgfs-test: add a unit test case for 'init_regions'

This commit adds another test case for the new feature, 'init_regions'.

Link: https://lkml.kernel.org/r/20211012205711.29216-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/dbgfs-test.h |   54 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

--- a/mm/damon/dbgfs-test.h~mm-damon-dbgfs-test-add-a-unit-test-case-for-init_regions
+++ a/mm/damon/dbgfs-test.h
@@ -109,9 +109,63 @@ static void damon_dbgfs_test_set_targets
 	dbgfs_destroy_ctx(ctx);
 }
 
+static void damon_dbgfs_test_set_init_regions(struct kunit *test)
+{
+	struct damon_ctx *ctx = damon_new_ctx();
+	unsigned long ids[] = {1, 2, 3};
+	/* Each line represents one region in ``<target id> <start> <end>`` */
+	char * const valid_inputs[] = {"2 10 20\n 2   20 30\n2 35 45",
+		"2 10 20\n",
+		"2 10 20\n1 39 59\n1 70 134\n  2  20 25\n",
+		""};
+	/* Reading the file again will show sorted, clean output */
+	char * const valid_expects[] = {"2 10 20\n2 20 30\n2 35 45\n",
+		"2 10 20\n",
+		"1 39 59\n1 70 134\n2 10 20\n2 20 25\n",
+		""};
+	char * const invalid_inputs[] = {"4 10 20\n",	/* target not exists */
+		"2 10 20\n 2 14 26\n",		/* regions overlap */
+		"1 10 20\n2 30 40\n 1 5 8"};	/* not sorted by address */
+	char *input, *expect;
+	int i, rc;
+	char buf[256];
+
+	damon_set_targets(ctx, ids, 3);
+
+	/* Put valid inputs and check the results */
+	for (i = 0; i < ARRAY_SIZE(valid_inputs); i++) {
+		input = valid_inputs[i];
+		expect = valid_expects[i];
+
+		rc = set_init_regions(ctx, input, strnlen(input, 256));
+		KUNIT_EXPECT_EQ(test, rc, 0);
+
+		memset(buf, 0, 256);
+		sprint_init_regions(ctx, buf, 256);
+
+		KUNIT_EXPECT_STREQ(test, (char *)buf, expect);
+	}
+	/* Put invlid inputs and check the return error code */
+	for (i = 0; i < ARRAY_SIZE(invalid_inputs); i++) {
+		input = invalid_inputs[i];
+		pr_info("input: %s\n", input);
+		rc = set_init_regions(ctx, input, strnlen(input, 256));
+		KUNIT_EXPECT_EQ(test, rc, -EINVAL);
+
+		memset(buf, 0, 256);
+		sprint_init_regions(ctx, buf, 256);
+
+		KUNIT_EXPECT_STREQ(test, (char *)buf, "");
+	}
+
+	damon_set_targets(ctx, NULL, 0);
+	damon_destroy_ctx(ctx);
+}
+
 static struct kunit_case damon_test_cases[] = {
 	KUNIT_CASE(damon_dbgfs_test_str_to_target_ids),
 	KUNIT_CASE(damon_dbgfs_test_set_targets),
+	KUNIT_CASE(damon_dbgfs_test_set_init_regions),
 	{},
 };
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 232/262] Docs/admin-guide/mm/damon: document 'init_regions' feature
  2021-11-05 20:34 incoming Andrew Morton
                   ` (230 preceding siblings ...)
  2021-11-05 20:46 ` [patch 231/262] mm/damon/dbgfs-test: add a unit test case for 'init_regions' Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 233/262] mm/damon/vaddr: separate commonly usable functions Andrew Morton
                   ` (29 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, brendanhiggins, corbet, david, dwmw, elver,
	foersleo, gthelen, Jonathan.Cameron, linux-mm, markubo,
	mm-commits, rientjes, shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: Docs/admin-guide/mm/damon: document 'init_regions' feature

This commit adds description of the 'init_regions' feature in the DAMON
usage document.

Link: https://lkml.kernel.org/r/20211012205711.29216-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/damon/usage.rst |   41 ++++++++++++++++-
 1 file changed, 39 insertions(+), 2 deletions(-)

--- a/Documentation/admin-guide/mm/damon/usage.rst~docs-admin-guide-mm-damon-document-init_regions-feature
+++ a/Documentation/admin-guide/mm/damon/usage.rst
@@ -34,8 +34,9 @@ the reason, this document describes only
 debugfs Interface
 =================
 
-DAMON exports four files, ``attrs``, ``target_ids``, ``schemes`` and
-``monitor_on`` under its debugfs directory, ``<debugfs>/damon/``.
+DAMON exports five files, ``attrs``, ``target_ids``, ``init_regions``,
+``schemes`` and ``monitor_on`` under its debugfs directory,
+``<debugfs>/damon/``.
 
 
 Attributes
@@ -74,6 +75,42 @@ check it again::
 Note that setting the target ids doesn't start the monitoring.
 
 
+Initial Monitoring Target Regions
+---------------------------------
+
+In case of the debugfs based monitoring, DAMON automatically sets and updates
+the monitoring target regions so that entire memory mappings of target
+processes can be covered.  However, users can want to limit the monitoring
+region to specific address ranges, such as the heap, the stack, or specific
+file-mapped area.  Or, some users can know the initial access pattern of their
+workloads and therefore want to set optimal initial regions for the 'adaptive
+regions adjustment'.
+
+In such cases, users can explicitly set the initial monitoring target regions
+as they want, by writing proper values to the ``init_regions`` file.  Each line
+of the input should represent one region in below form.::
+
+    <target id> <start address> <end address>
+
+The ``target id`` should already in ``target_ids`` file, and the regions should
+be passed in address order.  For example, below commands will set a couple of
+address ranges, ``1-100`` and ``100-200`` as the initial monitoring target
+region of process 42, and another couple of address ranges, ``20-40`` and
+``50-100`` as that of process 4242.::
+
+    # cd <debugfs>/damon
+    # echo "42   1       100
+            42   100     200
+            4242 20      40
+            4242 50      100" > init_regions
+
+Note that this sets the initial monitoring target regions only.  In case of
+virtual memory monitoring, DAMON will automatically updates the boundary of the
+regions after one ``regions update interval``.  Therefore, users should set the
+``regions update interval`` large enough in this case, if they don't want the
+update.
+
+
 Schemes
 -------
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 233/262] mm/damon/vaddr: separate commonly usable functions
  2021-11-05 20:34 incoming Andrew Morton
                   ` (231 preceding siblings ...)
  2021-11-05 20:46 ` [patch 232/262] Docs/admin-guide/mm/damon: document 'init_regions' feature Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:46 ` [patch 234/262] mm/damon: implement primitives for physical address space monitoring Andrew Morton
                   ` (28 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, brendanhiggins, corbet, david, dwmw, elver,
	foersleo, gthelen, Jonathan.Cameron, linux-mm, markubo,
	mm-commits, rientjes, shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/vaddr: separate commonly usable functions

This commit moves functions in the default virtual address spaces
monitoring primitives that commonly usable from other address spaces like
physical address space into a header file.  Those will be reused by the
physical address space monitoring primitives which will be implemented by
the following commit.

[sj@kernel.org: include 'highmem.h' to fix a build failure]
  Link: https://lkml.kernel.org/r/20211014110848.5204-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211012205711.29216-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/Makefile       |    2 
 mm/damon/prmtv-common.c |   87 +++++++++++++++++++++++++++++++++++++
 mm/damon/prmtv-common.h |   17 +++++++
 mm/damon/vaddr.c        |   88 +-------------------------------------
 4 files changed, 108 insertions(+), 86 deletions(-)

--- a/mm/damon/Makefile~mm-damon-vaddr-separate-commonly-usable-functions
+++ a/mm/damon/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_DAMON)		:= core.o
-obj-$(CONFIG_DAMON_VADDR)	+= vaddr.o
+obj-$(CONFIG_DAMON_VADDR)	+= prmtv-common.o vaddr.o
 obj-$(CONFIG_DAMON_DBGFS)	+= dbgfs.o
--- /dev/null
+++ a/mm/damon/prmtv-common.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Common Primitives for Data Access Monitoring
+ *
+ * Author: SeongJae Park <sj@kernel.org>
+ */
+
+#include <linux/mmu_notifier.h>
+#include <linux/page_idle.h>
+#include <linux/pagemap.h>
+#include <linux/rmap.h>
+
+#include "prmtv-common.h"
+
+/*
+ * Get an online page for a pfn if it's in the LRU list.  Otherwise, returns
+ * NULL.
+ *
+ * The body of this function is stolen from the 'page_idle_get_page()'.  We
+ * steal rather than reuse it because the code is quite simple.
+ */
+struct page *damon_get_page(unsigned long pfn)
+{
+	struct page *page = pfn_to_online_page(pfn);
+
+	if (!page || !PageLRU(page) || !get_page_unless_zero(page))
+		return NULL;
+
+	if (unlikely(!PageLRU(page))) {
+		put_page(page);
+		page = NULL;
+	}
+	return page;
+}
+
+void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
+{
+	bool referenced = false;
+	struct page *page = damon_get_page(pte_pfn(*pte));
+
+	if (!page)
+		return;
+
+	if (pte_young(*pte)) {
+		referenced = true;
+		*pte = pte_mkold(*pte);
+	}
+
+#ifdef CONFIG_MMU_NOTIFIER
+	if (mmu_notifier_clear_young(mm, addr, addr + PAGE_SIZE))
+		referenced = true;
+#endif /* CONFIG_MMU_NOTIFIER */
+
+	if (referenced)
+		set_page_young(page);
+
+	set_page_idle(page);
+	put_page(page);
+}
+
+void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr)
+{
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	bool referenced = false;
+	struct page *page = damon_get_page(pmd_pfn(*pmd));
+
+	if (!page)
+		return;
+
+	if (pmd_young(*pmd)) {
+		referenced = true;
+		*pmd = pmd_mkold(*pmd);
+	}
+
+#ifdef CONFIG_MMU_NOTIFIER
+	if (mmu_notifier_clear_young(mm, addr,
+				addr + ((1UL) << HPAGE_PMD_SHIFT)))
+		referenced = true;
+#endif /* CONFIG_MMU_NOTIFIER */
+
+	if (referenced)
+		set_page_young(page);
+
+	set_page_idle(page);
+	put_page(page);
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+}
--- /dev/null
+++ a/mm/damon/prmtv-common.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Common Primitives for Data Access Monitoring
+ *
+ * Author: SeongJae Park <sj@kernel.org>
+ */
+
+#include <linux/damon.h>
+#include <linux/random.h>
+
+/* Get a random number in [l, r) */
+#define damon_rand(l, r) (l + prandom_u32_max(r - l))
+
+struct page *damon_get_page(unsigned long pfn);
+
+void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr);
+void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr);
--- a/mm/damon/vaddr.c~mm-damon-vaddr-separate-commonly-usable-functions
+++ a/mm/damon/vaddr.c
@@ -8,25 +8,19 @@
 #define pr_fmt(fmt) "damon-va: " fmt
 
 #include <asm-generic/mman-common.h>
-#include <linux/damon.h>
+#include <linux/highmem.h>
 #include <linux/hugetlb.h>
-#include <linux/mm.h>
 #include <linux/mmu_notifier.h>
-#include <linux/highmem.h>
 #include <linux/page_idle.h>
 #include <linux/pagewalk.h>
-#include <linux/random.h>
-#include <linux/sched/mm.h>
-#include <linux/slab.h>
+
+#include "prmtv-common.h"
 
 #ifdef CONFIG_DAMON_VADDR_KUNIT_TEST
 #undef DAMON_MIN_REGION
 #define DAMON_MIN_REGION 1
 #endif
 
-/* Get a random number in [l, r) */
-#define damon_rand(l, r) (l + prandom_u32_max(r - l))
-
 /*
  * 't->id' should be the pointer to the relevant 'struct pid' having reference
  * count.  Caller must put the returned task, unless it is NULL.
@@ -373,82 +367,6 @@ void damon_va_update(struct damon_ctx *c
 	}
 }
 
-/*
- * Get an online page for a pfn if it's in the LRU list.  Otherwise, returns
- * NULL.
- *
- * The body of this function is stolen from the 'page_idle_get_page()'.  We
- * steal rather than reuse it because the code is quite simple.
- */
-static struct page *damon_get_page(unsigned long pfn)
-{
-	struct page *page = pfn_to_online_page(pfn);
-
-	if (!page || !PageLRU(page) || !get_page_unless_zero(page))
-		return NULL;
-
-	if (unlikely(!PageLRU(page))) {
-		put_page(page);
-		page = NULL;
-	}
-	return page;
-}
-
-static void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm,
-			     unsigned long addr)
-{
-	bool referenced = false;
-	struct page *page = damon_get_page(pte_pfn(*pte));
-
-	if (!page)
-		return;
-
-	if (pte_young(*pte)) {
-		referenced = true;
-		*pte = pte_mkold(*pte);
-	}
-
-#ifdef CONFIG_MMU_NOTIFIER
-	if (mmu_notifier_clear_young(mm, addr, addr + PAGE_SIZE))
-		referenced = true;
-#endif /* CONFIG_MMU_NOTIFIER */
-
-	if (referenced)
-		set_page_young(page);
-
-	set_page_idle(page);
-	put_page(page);
-}
-
-static void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm,
-			     unsigned long addr)
-{
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	bool referenced = false;
-	struct page *page = damon_get_page(pmd_pfn(*pmd));
-
-	if (!page)
-		return;
-
-	if (pmd_young(*pmd)) {
-		referenced = true;
-		*pmd = pmd_mkold(*pmd);
-	}
-
-#ifdef CONFIG_MMU_NOTIFIER
-	if (mmu_notifier_clear_young(mm, addr,
-				addr + ((1UL) << HPAGE_PMD_SHIFT)))
-		referenced = true;
-#endif /* CONFIG_MMU_NOTIFIER */
-
-	if (referenced)
-		set_page_young(page);
-
-	set_page_idle(page);
-	put_page(page);
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
-}
-
 static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
 		unsigned long next, struct mm_walk *walk)
 {
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 234/262] mm/damon: implement primitives for physical address space monitoring
  2021-11-05 20:34 incoming Andrew Morton
                   ` (232 preceding siblings ...)
  2021-11-05 20:46 ` [patch 233/262] mm/damon/vaddr: separate commonly usable functions Andrew Morton
@ 2021-11-05 20:46 ` Andrew Morton
  2021-11-05 20:47 ` [patch 235/262] mm/damon/dbgfs: support physical memory monitoring Andrew Morton
                   ` (27 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:46 UTC (permalink / raw)
  To: akpm, amit, benh, brendanhiggins, corbet, david, dwmw, elver,
	foersleo, gthelen, Jonathan.Cameron, linux-mm, markubo,
	mm-commits, rientjes, shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon: implement primitives for physical address space monitoring

This commit implements the monitoring primitives for the physical memory
address space.  Internally, it uses the PTE Accessed bit, similar to that
of the virtual address spaces monitoring primitives.  It supports only
user memory pages, as idle pages tracking does.  If the monitoring target
physical memory address range contains non-user memory pages, access check
of the pages will do nothing but simply treat the pages as not accessed.

Link: https://lkml.kernel.org/r/20211012205711.29216-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |   10 +
 mm/damon/Kconfig      |    8 +
 mm/damon/Makefile     |    1 
 mm/damon/paddr.c      |  224 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 243 insertions(+)

--- a/include/linux/damon.h~mm-damon-implement-primitives-for-physical-address-space-monitoring
+++ a/include/linux/damon.h
@@ -351,4 +351,14 @@ void damon_va_set_primitives(struct damo
 
 #endif	/* CONFIG_DAMON_VADDR */
 
+#ifdef CONFIG_DAMON_PADDR
+
+/* Monitoring primitives for the physical memory address space */
+void damon_pa_prepare_access_checks(struct damon_ctx *ctx);
+unsigned int damon_pa_check_accesses(struct damon_ctx *ctx);
+bool damon_pa_target_valid(void *t);
+void damon_pa_set_primitives(struct damon_ctx *ctx);
+
+#endif	/* CONFIG_DAMON_PADDR */
+
 #endif	/* _DAMON_H */
--- a/mm/damon/Kconfig~mm-damon-implement-primitives-for-physical-address-space-monitoring
+++ a/mm/damon/Kconfig
@@ -32,6 +32,14 @@ config DAMON_VADDR
 	  This builds the default data access monitoring primitives for DAMON
 	  that work for virtual address spaces.
 
+config DAMON_PADDR
+	bool "Data access monitoring primitives for the physical address space"
+	depends on DAMON && MMU
+	select PAGE_IDLE_FLAG
+	help
+	  This builds the default data access monitoring primitives for DAMON
+	  that works for the physical address space.
+
 config DAMON_VADDR_KUNIT_TEST
 	bool "Test for DAMON primitives" if !KUNIT_ALL_TESTS
 	depends on DAMON_VADDR && KUNIT=y
--- a/mm/damon/Makefile~mm-damon-implement-primitives-for-physical-address-space-monitoring
+++ a/mm/damon/Makefile
@@ -2,4 +2,5 @@
 
 obj-$(CONFIG_DAMON)		:= core.o
 obj-$(CONFIG_DAMON_VADDR)	+= prmtv-common.o vaddr.o
+obj-$(CONFIG_DAMON_PADDR)	+= prmtv-common.o paddr.o
 obj-$(CONFIG_DAMON_DBGFS)	+= dbgfs.o
--- /dev/null
+++ a/mm/damon/paddr.c
@@ -0,0 +1,224 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DAMON Primitives for The Physical Address Space
+ *
+ * Author: SeongJae Park <sj@kernel.org>
+ */
+
+#define pr_fmt(fmt) "damon-pa: " fmt
+
+#include <linux/mmu_notifier.h>
+#include <linux/page_idle.h>
+#include <linux/pagemap.h>
+#include <linux/rmap.h>
+
+#include "prmtv-common.h"
+
+static bool __damon_pa_mkold(struct page *page, struct vm_area_struct *vma,
+		unsigned long addr, void *arg)
+{
+	struct page_vma_mapped_walk pvmw = {
+		.page = page,
+		.vma = vma,
+		.address = addr,
+	};
+
+	while (page_vma_mapped_walk(&pvmw)) {
+		addr = pvmw.address;
+		if (pvmw.pte)
+			damon_ptep_mkold(pvmw.pte, vma->vm_mm, addr);
+		else
+			damon_pmdp_mkold(pvmw.pmd, vma->vm_mm, addr);
+	}
+	return true;
+}
+
+static void damon_pa_mkold(unsigned long paddr)
+{
+	struct page *page = damon_get_page(PHYS_PFN(paddr));
+	struct rmap_walk_control rwc = {
+		.rmap_one = __damon_pa_mkold,
+		.anon_lock = page_lock_anon_vma_read,
+	};
+	bool need_lock;
+
+	if (!page)
+		return;
+
+	if (!page_mapped(page) || !page_rmapping(page)) {
+		set_page_idle(page);
+		goto out;
+	}
+
+	need_lock = !PageAnon(page) || PageKsm(page);
+	if (need_lock && !trylock_page(page))
+		goto out;
+
+	rmap_walk(page, &rwc);
+
+	if (need_lock)
+		unlock_page(page);
+
+out:
+	put_page(page);
+}
+
+static void __damon_pa_prepare_access_check(struct damon_ctx *ctx,
+					    struct damon_region *r)
+{
+	r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
+
+	damon_pa_mkold(r->sampling_addr);
+}
+
+void damon_pa_prepare_access_checks(struct damon_ctx *ctx)
+{
+	struct damon_target *t;
+	struct damon_region *r;
+
+	damon_for_each_target(t, ctx) {
+		damon_for_each_region(r, t)
+			__damon_pa_prepare_access_check(ctx, r);
+	}
+}
+
+struct damon_pa_access_chk_result {
+	unsigned long page_sz;
+	bool accessed;
+};
+
+static bool __damon_pa_young(struct page *page, struct vm_area_struct *vma,
+		unsigned long addr, void *arg)
+{
+	struct damon_pa_access_chk_result *result = arg;
+	struct page_vma_mapped_walk pvmw = {
+		.page = page,
+		.vma = vma,
+		.address = addr,
+	};
+
+	result->accessed = false;
+	result->page_sz = PAGE_SIZE;
+	while (page_vma_mapped_walk(&pvmw)) {
+		addr = pvmw.address;
+		if (pvmw.pte) {
+			result->accessed = pte_young(*pvmw.pte) ||
+				!page_is_idle(page) ||
+				mmu_notifier_test_young(vma->vm_mm, addr);
+		} else {
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+			result->accessed = pmd_young(*pvmw.pmd) ||
+				!page_is_idle(page) ||
+				mmu_notifier_test_young(vma->vm_mm, addr);
+			result->page_sz = ((1UL) << HPAGE_PMD_SHIFT);
+#else
+			WARN_ON_ONCE(1);
+#endif	/* CONFIG_TRANSPARENT_HUGEPAGE */
+		}
+		if (result->accessed) {
+			page_vma_mapped_walk_done(&pvmw);
+			break;
+		}
+	}
+
+	/* If accessed, stop walking */
+	return !result->accessed;
+}
+
+static bool damon_pa_young(unsigned long paddr, unsigned long *page_sz)
+{
+	struct page *page = damon_get_page(PHYS_PFN(paddr));
+	struct damon_pa_access_chk_result result = {
+		.page_sz = PAGE_SIZE,
+		.accessed = false,
+	};
+	struct rmap_walk_control rwc = {
+		.arg = &result,
+		.rmap_one = __damon_pa_young,
+		.anon_lock = page_lock_anon_vma_read,
+	};
+	bool need_lock;
+
+	if (!page)
+		return false;
+
+	if (!page_mapped(page) || !page_rmapping(page)) {
+		if (page_is_idle(page))
+			result.accessed = false;
+		else
+			result.accessed = true;
+		put_page(page);
+		goto out;
+	}
+
+	need_lock = !PageAnon(page) || PageKsm(page);
+	if (need_lock && !trylock_page(page)) {
+		put_page(page);
+		return NULL;
+	}
+
+	rmap_walk(page, &rwc);
+
+	if (need_lock)
+		unlock_page(page);
+	put_page(page);
+
+out:
+	*page_sz = result.page_sz;
+	return result.accessed;
+}
+
+static void __damon_pa_check_access(struct damon_ctx *ctx,
+				    struct damon_region *r)
+{
+	static unsigned long last_addr;
+	static unsigned long last_page_sz = PAGE_SIZE;
+	static bool last_accessed;
+
+	/* If the region is in the last checked page, reuse the result */
+	if (ALIGN_DOWN(last_addr, last_page_sz) ==
+				ALIGN_DOWN(r->sampling_addr, last_page_sz)) {
+		if (last_accessed)
+			r->nr_accesses++;
+		return;
+	}
+
+	last_accessed = damon_pa_young(r->sampling_addr, &last_page_sz);
+	if (last_accessed)
+		r->nr_accesses++;
+
+	last_addr = r->sampling_addr;
+}
+
+unsigned int damon_pa_check_accesses(struct damon_ctx *ctx)
+{
+	struct damon_target *t;
+	struct damon_region *r;
+	unsigned int max_nr_accesses = 0;
+
+	damon_for_each_target(t, ctx) {
+		damon_for_each_region(r, t) {
+			__damon_pa_check_access(ctx, r);
+			max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
+		}
+	}
+
+	return max_nr_accesses;
+}
+
+bool damon_pa_target_valid(void *t)
+{
+	return true;
+}
+
+void damon_pa_set_primitives(struct damon_ctx *ctx)
+{
+	ctx->primitive.init = NULL;
+	ctx->primitive.update = NULL;
+	ctx->primitive.prepare_access_checks = damon_pa_prepare_access_checks;
+	ctx->primitive.check_accesses = damon_pa_check_accesses;
+	ctx->primitive.reset_aggregated = NULL;
+	ctx->primitive.target_valid = damon_pa_target_valid;
+	ctx->primitive.cleanup = NULL;
+	ctx->primitive.apply_scheme = NULL;
+}
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 235/262] mm/damon/dbgfs: support physical memory monitoring
  2021-11-05 20:34 incoming Andrew Morton
                   ` (233 preceding siblings ...)
  2021-11-05 20:46 ` [patch 234/262] mm/damon: implement primitives for physical address space monitoring Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 236/262] Docs/DAMON: document physical memory monitoring support Andrew Morton
                   ` (26 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, brendanhiggins, corbet, david, dwmw, elver,
	foersleo, gthelen, Jonathan.Cameron, linux-mm, markubo,
	mm-commits, rientjes, shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/dbgfs: support physical memory monitoring

This commit makes the 'damon-dbgfs' to support the physical memory
monitoring, in addition to the virtual memory monitoring.

Users can do the physical memory monitoring by writing a special keyword,
'paddr' to the 'target_ids' debugfs file.  Then, DAMON will check the
special keyword and configure the monitoring context to run with the
primitives for the physical address space.

Unlike the virtual memory monitoring, the monitoring target region will
not be automatically set.  Therefore, users should also set the monitoring
target address region using the 'init_regions' debugfs file.

Also, note that the physical memory monitoring will not automatically
terminated.  The user should explicitly turn off the monitoring by writing
'off' to the 'monitor_on' debugfs file.

Link: https://lkml.kernel.org/r/20211012205711.29216-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/Kconfig |    2 +-
 mm/damon/dbgfs.c |   21 ++++++++++++++++++---
 2 files changed, 19 insertions(+), 4 deletions(-)

--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-physical-memory-monitoring
+++ a/mm/damon/dbgfs.c
@@ -339,6 +339,7 @@ static ssize_t dbgfs_target_ids_write(st
 		const char __user *buf, size_t count, loff_t *ppos)
 {
 	struct damon_ctx *ctx = file->private_data;
+	bool id_is_pid = true;
 	char *kbuf, *nrs;
 	unsigned long *targets;
 	ssize_t nr_targets;
@@ -351,6 +352,11 @@ static ssize_t dbgfs_target_ids_write(st
 		return PTR_ERR(kbuf);
 
 	nrs = kbuf;
+	if (!strncmp(kbuf, "paddr\n", count)) {
+		id_is_pid = false;
+		/* target id is meaningless here, but we set it just for fun */
+		scnprintf(kbuf, count, "42    ");
+	}
 
 	targets = str_to_target_ids(nrs, ret, &nr_targets);
 	if (!targets) {
@@ -358,7 +364,7 @@ static ssize_t dbgfs_target_ids_write(st
 		goto out;
 	}
 
-	if (targetid_is_pid(ctx)) {
+	if (id_is_pid) {
 		for (i = 0; i < nr_targets; i++) {
 			targets[i] = (unsigned long)find_get_pid(
 					(int)targets[i]);
@@ -372,15 +378,24 @@ static ssize_t dbgfs_target_ids_write(st
 
 	mutex_lock(&ctx->kdamond_lock);
 	if (ctx->kdamond) {
-		if (targetid_is_pid(ctx))
+		if (id_is_pid)
 			dbgfs_put_pids(targets, nr_targets);
 		ret = -EBUSY;
 		goto unlock_out;
 	}
 
+	/* remove targets with previously-set primitive */
+	damon_set_targets(ctx, NULL, 0);
+
+	/* Configure the context for the address space type */
+	if (id_is_pid)
+		damon_va_set_primitives(ctx);
+	else
+		damon_pa_set_primitives(ctx);
+
 	err = damon_set_targets(ctx, targets, nr_targets);
 	if (err) {
-		if (targetid_is_pid(ctx))
+		if (id_is_pid)
 			dbgfs_put_pids(targets, nr_targets);
 		ret = err;
 	}
--- a/mm/damon/Kconfig~mm-damon-dbgfs-support-physical-memory-monitoring
+++ a/mm/damon/Kconfig
@@ -54,7 +54,7 @@ config DAMON_VADDR_KUNIT_TEST
 
 config DAMON_DBGFS
 	bool "DAMON debugfs interface"
-	depends on DAMON_VADDR && DEBUG_FS
+	depends on DAMON_VADDR && DAMON_PADDR && DEBUG_FS
 	help
 	  This builds the debugfs interface for DAMON.  The user space admins
 	  can use the interface for arbitrary data access monitoring.
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 236/262] Docs/DAMON: document physical memory monitoring support
  2021-11-05 20:34 incoming Andrew Morton
                   ` (234 preceding siblings ...)
  2021-11-05 20:47 ` [patch 235/262] mm/damon/dbgfs: support physical memory monitoring Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 237/262] mm/damon/vaddr: constify static mm_walk_ops Andrew Morton
                   ` (25 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, brendanhiggins, corbet, david, dwmw, elver,
	foersleo, gthelen, Jonathan.Cameron, linux-mm, markubo,
	mm-commits, rientjes, shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: Docs/DAMON: document physical memory monitoring support

This commit updates the DAMON documents for the physical memory address
space monitoring support.

Link: https://lkml.kernel.org/r/20211012205711.29216-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rienjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/damon/usage.rst |   25 +++++++++++---
 Documentation/vm/damon/design.rst            |   29 ++++++++++-------
 Documentation/vm/damon/faq.rst               |    5 +-
 3 files changed, 40 insertions(+), 19 deletions(-)

--- a/Documentation/admin-guide/mm/damon/usage.rst~docs-damon-document-physical-memory-monitoring-support
+++ a/Documentation/admin-guide/mm/damon/usage.rst
@@ -10,15 +10,16 @@ DAMON provides below three interfaces fo
   This is for privileged people such as system administrators who want a
   just-working human-friendly interface.  Using this, users can use the DAMON’s
   major features in a human-friendly way.  It may not be highly tuned for
-  special cases, though.  It supports only virtual address spaces monitoring.
+  special cases, though.  It supports both virtual and physical address spaces
+  monitoring.
 - *debugfs interface.*
   This is for privileged user space programmers who want more optimized use of
   DAMON.  Using this, users can use DAMON’s major features by reading
   from and writing to special debugfs files.  Therefore, you can write and use
   your personalized DAMON debugfs wrapper programs that reads/writes the
   debugfs files instead of you.  The DAMON user space tool is also a reference
-  implementation of such programs.  It supports only virtual address spaces
-  monitoring.
+  implementation of such programs.  It supports both virtual and physical
+  address spaces monitoring.
 - *Kernel Space Programming Interface.*
   This is for kernel space programmers.  Using this, users can utilize every
   feature of DAMON most flexibly and efficiently by writing kernel space
@@ -72,20 +73,34 @@ check it again::
     # cat target_ids
     42 4242
 
+Users can also monitor the physical memory address space of the system by
+writing a special keyword, "``paddr\n``" to the file.  Because physical address
+space monitoring doesn't support multiple targets, reading the file will show a
+fake value, ``42``, as below::
+
+    # cd <debugfs>/damon
+    # echo paddr > target_ids
+    # cat target_ids
+    42
+
 Note that setting the target ids doesn't start the monitoring.
 
 
 Initial Monitoring Target Regions
 ---------------------------------
 
-In case of the debugfs based monitoring, DAMON automatically sets and updates
-the monitoring target regions so that entire memory mappings of target
+In case of the virtual address space monitoring, DAMON automatically sets and
+updates the monitoring target regions so that entire memory mappings of target
 processes can be covered.  However, users can want to limit the monitoring
 region to specific address ranges, such as the heap, the stack, or specific
 file-mapped area.  Or, some users can know the initial access pattern of their
 workloads and therefore want to set optimal initial regions for the 'adaptive
 regions adjustment'.
 
+In contrast, DAMON do not automatically sets and updates the monitoring target
+regions in case of physical memory monitoring.  Therefore, users should set the
+monitoring target regions by themselves.
+
 In such cases, users can explicitly set the initial monitoring target regions
 as they want, by writing proper values to the ``init_regions`` file.  Each line
 of the input should represent one region in below form.::
--- a/Documentation/vm/damon/design.rst~docs-damon-document-physical-memory-monitoring-support
+++ a/Documentation/vm/damon/design.rst
@@ -35,13 +35,17 @@ two parts:
 1. Identification of the monitoring target address range for the address space.
 2. Access check of specific address range in the target space.
 
-DAMON currently provides the implementation of the primitives for only the
-virtual address spaces. Below two subsections describe how it works.
+DAMON currently provides the implementations of the primitives for the physical
+and virtual address spaces. Below two subsections describe how those work.
 
 
 VMA-based Target Address Range Construction
 -------------------------------------------
 
+This is only for the virtual address space primitives implementation.  That for
+the physical address space simply asks users to manually set the monitoring
+target address ranges.
+
 Only small parts in the super-huge virtual address space of the processes are
 mapped to the physical memory and accessed.  Thus, tracking the unmapped
 address regions is just wasteful.  However, because DAMON can deal with some
@@ -71,15 +75,18 @@ to make a reasonable trade-off.  Below s
 PTE Accessed-bit Based Access Check
 -----------------------------------
 
-The implementation for the virtual address space uses PTE Accessed-bit for
-basic access checks.  It finds the relevant PTE Accessed bit from the address
-by walking the page table for the target task of the address.  In this way, the
-implementation finds and clears the bit for next sampling target address and
-checks whether the bit set again after one sampling period.  This could disturb
-other kernel subsystems using the Accessed bits, namely Idle page tracking and
-the reclaim logic.  To avoid such disturbances, DAMON makes it mutually
-exclusive with Idle page tracking and uses ``PG_idle`` and ``PG_young`` page
-flags to solve the conflict with the reclaim logic, as Idle page tracking does.
+Both of the implementations for physical and virtual address spaces use PTE
+Accessed-bit for basic access checks.  Only one difference is the way of
+finding the relevant PTE Accessed bit(s) from the address.  While the
+implementation for the virtual address walks the page table for the target task
+of the address, the implementation for the physical address walks every page
+table having a mapping to the address.  In this way, the implementations find
+and clear the bit(s) for next sampling target address and checks whether the
+bit(s) set again after one sampling period.  This could disturb other kernel
+subsystems using the Accessed bits, namely Idle page tracking and the reclaim
+logic.  To avoid such disturbances, DAMON makes it mutually exclusive with Idle
+page tracking and uses ``PG_idle`` and ``PG_young`` page flags to solve the
+conflict with the reclaim logic, as Idle page tracking does.
 
 
 Address Space Independent Core Mechanisms
--- a/Documentation/vm/damon/faq.rst~docs-damon-document-physical-memory-monitoring-support
+++ a/Documentation/vm/damon/faq.rst
@@ -36,10 +36,9 @@ constructions and actual access checks c
 DAMON core by the users.  In this way, DAMON users can monitor any address
 space with any access check technique.
 
-Nonetheless, DAMON provides vma tracking and PTE Accessed bit check based
+Nonetheless, DAMON provides vma/rmap tracking and PTE Accessed bit check based
 implementations of the address space dependent functions for the virtual memory
-by default, for a reference and convenient use.  In near future, we will
-provide those for physical memory address space.
+and the physical memory by default, for a reference and convenient use.
 
 
 Can I simply monitor page granularity?
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 237/262] mm/damon/vaddr: constify static mm_walk_ops
  2021-11-05 20:34 incoming Andrew Morton
                   ` (235 preceding siblings ...)
  2021-11-05 20:47 ` [patch 236/262] Docs/DAMON: document physical memory monitoring support Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 238/262] mm/damon/dbgfs: remove unnecessary variables Andrew Morton
                   ` (24 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, anshuman.khandual, linux-mm, mm-commits, rikard.falkeborn,
	sj, torvalds

From: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Subject: mm/damon/vaddr: constify static mm_walk_ops

The only usage of these structs is to pass their addresses to
walk_page_range(), which takes a pointer to const mm_walk_ops as argument.
Make them const to allow the compiler to put them in read-only memory.

Link: https://lkml.kernel.org/r/20211014075042.17174-2-rikard.falkeborn@gmail.com
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/vaddr.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/damon/vaddr.c~mm-damon-vaddr-constify-static-mm_walk_ops
+++ a/mm/damon/vaddr.c
@@ -394,7 +394,7 @@ out:
 	return 0;
 }
 
-static struct mm_walk_ops damon_mkold_ops = {
+static const struct mm_walk_ops damon_mkold_ops = {
 	.pmd_entry = damon_mkold_pmd_entry,
 };
 
@@ -490,7 +490,7 @@ out:
 	return 0;
 }
 
-static struct mm_walk_ops damon_young_ops = {
+static const struct mm_walk_ops damon_young_ops = {
 	.pmd_entry = damon_young_pmd_entry,
 };
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 238/262] mm/damon/dbgfs: remove unnecessary variables
  2021-11-05 20:34 incoming Andrew Morton
                   ` (236 preceding siblings ...)
  2021-11-05 20:47 ` [patch 237/262] mm/damon/vaddr: constify static mm_walk_ops Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 239/262] mm/damon/paddr: support the pageout scheme Andrew Morton
                   ` (23 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, rongwei.wang, sj, torvalds

From: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Subject: mm/damon/dbgfs: remove unnecessary variables

In some functions, it's unnecessary to declare 'err' and 'ret' variables
at the same time.  This patch mainly to simplify the issue of such
declarations by reusing one variable.

Link: https://lkml.kernel.org/r/20211014073014.35754-1-sj@kernel.org
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/dbgfs.c |   66 +++++++++++++++++++++------------------------
 1 file changed, 31 insertions(+), 35 deletions(-)

--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-remove-unnecessary-variables
+++ a/mm/damon/dbgfs.c
@@ -69,8 +69,7 @@ static ssize_t dbgfs_attrs_write(struct
 	struct damon_ctx *ctx = file->private_data;
 	unsigned long s, a, r, minr, maxr;
 	char *kbuf;
-	ssize_t ret = count;
-	int err;
+	ssize_t ret;
 
 	kbuf = user_input_str(buf, count, ppos);
 	if (IS_ERR(kbuf))
@@ -88,9 +87,9 @@ static ssize_t dbgfs_attrs_write(struct
 		goto unlock_out;
 	}
 
-	err = damon_set_attrs(ctx, s, a, r, minr, maxr);
-	if (err)
-		ret = err;
+	ret = damon_set_attrs(ctx, s, a, r, minr, maxr);
+	if (!ret)
+		ret = count;
 unlock_out:
 	mutex_unlock(&ctx->kdamond_lock);
 out:
@@ -220,14 +219,13 @@ static ssize_t dbgfs_schemes_write(struc
 	struct damon_ctx *ctx = file->private_data;
 	char *kbuf;
 	struct damos **schemes;
-	ssize_t nr_schemes = 0, ret = count;
-	int err;
+	ssize_t nr_schemes = 0, ret;
 
 	kbuf = user_input_str(buf, count, ppos);
 	if (IS_ERR(kbuf))
 		return PTR_ERR(kbuf);
 
-	schemes = str_to_schemes(kbuf, ret, &nr_schemes);
+	schemes = str_to_schemes(kbuf, count, &nr_schemes);
 	if (!schemes) {
 		ret = -EINVAL;
 		goto out;
@@ -239,11 +237,12 @@ static ssize_t dbgfs_schemes_write(struc
 		goto unlock_out;
 	}
 
-	err = damon_set_schemes(ctx, schemes, nr_schemes);
-	if (err)
-		ret = err;
-	else
+	ret = damon_set_schemes(ctx, schemes, nr_schemes);
+	if (!ret) {
+		ret = count;
 		nr_schemes = 0;
+	}
+
 unlock_out:
 	mutex_unlock(&ctx->kdamond_lock);
 	free_schemes_arr(schemes, nr_schemes);
@@ -343,9 +342,8 @@ static ssize_t dbgfs_target_ids_write(st
 	char *kbuf, *nrs;
 	unsigned long *targets;
 	ssize_t nr_targets;
-	ssize_t ret = count;
+	ssize_t ret;
 	int i;
-	int err;
 
 	kbuf = user_input_str(buf, count, ppos);
 	if (IS_ERR(kbuf))
@@ -358,7 +356,7 @@ static ssize_t dbgfs_target_ids_write(st
 		scnprintf(kbuf, count, "42    ");
 	}
 
-	targets = str_to_target_ids(nrs, ret, &nr_targets);
+	targets = str_to_target_ids(nrs, count, &nr_targets);
 	if (!targets) {
 		ret = -ENOMEM;
 		goto out;
@@ -393,11 +391,12 @@ static ssize_t dbgfs_target_ids_write(st
 	else
 		damon_pa_set_primitives(ctx);
 
-	err = damon_set_targets(ctx, targets, nr_targets);
-	if (err) {
+	ret = damon_set_targets(ctx, targets, nr_targets);
+	if (ret) {
 		if (id_is_pid)
 			dbgfs_put_pids(targets, nr_targets);
-		ret = err;
+	} else {
+		ret = count;
 	}
 
 unlock_out:
@@ -715,8 +714,7 @@ static ssize_t dbgfs_mk_context_write(st
 {
 	char *kbuf;
 	char *ctx_name;
-	ssize_t ret = count;
-	int err;
+	ssize_t ret;
 
 	kbuf = user_input_str(buf, count, ppos);
 	if (IS_ERR(kbuf))
@@ -734,9 +732,9 @@ static ssize_t dbgfs_mk_context_write(st
 	}
 
 	mutex_lock(&damon_dbgfs_lock);
-	err = dbgfs_mk_context(ctx_name);
-	if (err)
-		ret = err;
+	ret = dbgfs_mk_context(ctx_name);
+	if (!ret)
+		ret = count;
 	mutex_unlock(&damon_dbgfs_lock);
 
 out:
@@ -805,8 +803,7 @@ static ssize_t dbgfs_rm_context_write(st
 		const char __user *buf, size_t count, loff_t *ppos)
 {
 	char *kbuf;
-	ssize_t ret = count;
-	int err;
+	ssize_t ret;
 	char *ctx_name;
 
 	kbuf = user_input_str(buf, count, ppos);
@@ -825,9 +822,9 @@ static ssize_t dbgfs_rm_context_write(st
 	}
 
 	mutex_lock(&damon_dbgfs_lock);
-	err = dbgfs_rm_context(ctx_name);
-	if (err)
-		ret = err;
+	ret = dbgfs_rm_context(ctx_name);
+	if (!ret)
+		ret = count;
 	mutex_unlock(&damon_dbgfs_lock);
 
 out:
@@ -851,9 +848,8 @@ static ssize_t dbgfs_monitor_on_read(str
 static ssize_t dbgfs_monitor_on_write(struct file *file,
 		const char __user *buf, size_t count, loff_t *ppos)
 {
-	ssize_t ret = count;
+	ssize_t ret;
 	char *kbuf;
-	int err;
 
 	kbuf = user_input_str(buf, count, ppos);
 	if (IS_ERR(kbuf))
@@ -866,14 +862,14 @@ static ssize_t dbgfs_monitor_on_write(st
 	}
 
 	if (!strncmp(kbuf, "on", count))
-		err = damon_start(dbgfs_ctxs, dbgfs_nr_ctxs);
+		ret = damon_start(dbgfs_ctxs, dbgfs_nr_ctxs);
 	else if (!strncmp(kbuf, "off", count))
-		err = damon_stop(dbgfs_ctxs, dbgfs_nr_ctxs);
+		ret = damon_stop(dbgfs_ctxs, dbgfs_nr_ctxs);
 	else
-		err = -EINVAL;
+		ret = -EINVAL;
 
-	if (err)
-		ret = err;
+	if (!ret)
+		ret = count;
 	kfree(kbuf);
 	return ret;
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 239/262] mm/damon/paddr: support the pageout scheme
  2021-11-05 20:34 incoming Andrew Morton
                   ` (237 preceding siblings ...)
  2021-11-05 20:47 ` [patch 238/262] mm/damon/dbgfs: remove unnecessary variables Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 240/262] mm/damon/schemes: implement size quota for schemes application speed control Andrew Morton
                   ` (22 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/paddr: support the pageout scheme

Introduction
============

This patchset 1) makes the engine for general data access pattern-oriented
memory management (DAMOS) be more useful for production environments, and
2) implements a static kernel module for lightweight proactive reclamation
using the engine.

Proactive Reclamation
---------------------

On general memory over-committed systems, proactively reclaiming cold
pages helps saving memory and reducing latency spikes that incurred by the
direct reclaim or the CPU consumption of kswapd, while incurring only
minimal performance degradation[2].

A Free Pages Reporting[8] based memory over-commit virtualization system
would be one more specific use case.  In the system, the guest VMs reports
their free memory to host, and the host reallocates the reported memory to
other guests.  As a result, the system's memory utilization can be
maximized.  However, the guests could be not so memory-frugal, because
some kernel subsystems and user-space applications are designed to use as
much memory as available.  Then, guests would report only small amount of
free memory to host, results in poor memory utilization.  Running the
proactive reclamation in such guests could help mitigating this problem.

Google has also implemented this idea and using it in their data center. 
They further proposed upstreaming it in LSFMM'19, and "the general
consensus was that, while this sort of proactive reclaim would be useful
for a number of users, the cost of this particular solution was too high
to consider merging it upstream"[3].  The cost mainly comes from the
coldness tracking.  Roughly speaking, the implementation periodically
scans the 'Accessed' bit of each page.  For the reason, the overhead
linearly increases as the size of the memory and the scanning frequency
grows.  As a result, Google is known to dedicating one CPU for the work. 
That's a reasonable option to someone like Google, but it wouldn't be so
to some others.

DAMON and DAMOS: An engine for data access pattern-oriented memory management
-----------------------------------------------------------------------------

DAMON[4] is a framework for general data access monitoring.  Its adaptive
monitoring overhead control feature minimizes its monitoring overhead.  It
also let the upper-bound of the overhead be configurable by clients,
regardless of the size of the monitoring target memory.  While monitoring
70 GiB memory of a production system every 5 milliseconds, it consumes
less than 1% single CPU time.  For this, it could sacrify some of the
quality of the monitoring results.  Nevertheless, the lower-bound of the
quality is configurable, and it uses a best-effort algorithm for better
quality.  Our test results[5] show the quality is practical enough.  From
the production system monitoring, we were able to find a 4 KiB region in
the 70 GiB memory that shows highest access frequency.

We normally don't monitor the data access pattern just for fun but to
improve something like memory management.  Proactive reclamation is one
such usage.  For such general cases, DAMON provides a feature called
DAMon-based Operation Schemes (DAMOS)[6].  It makes DAMON an engine for
general data access pattern oriented memory management.  Using this,
clients can ask DAMON to find memory regions of specific data access
pattern and apply some memory management action (e.g., page out, move to
head of the LRU list, use huge page, ...).  We call the request 'scheme'.

Proactive Reclamation on top of DAMON/DAMOS
-------------------------------------------

Therefore, by using DAMON for the cold pages detection, the proactive
reclamation's monitoring overhead issue can be solved.  Actually, we
previously implemented a version of proactive reclamation using DAMOS and
achieved noticeable improvements with our evaluation setup[5]. 
Nevertheless, it more for a proof-of-concept, rather than production uses.
It supports only virtual address spaces of processes, and require
additional tuning efforts for given workloads and the hardware.  For the
tuning, we introduced a simple auto-tuning user space tool[8].  Google is
also known to using a ML-based similar approach for their fleets[2].  But,
making it just works with intuitive knobs in the kernel would be helpful
for general users.

To this end, this patchset improves DAMOS to be ready for such production
usages, and implements another version of the proactive reclamation,
namely DAMON_RECLAIM, on top of it.

DAMOS Improvements: Aggressiveness Control, Prioritization, and Watermarks
--------------------------------------------------------------------------

First of all, the current version of DAMOS supports only virtual address
spaces.  This patchset makes it supports the physical address space for
the page out action.

Next major problem of the current version of DAMOS is the lack of the
aggressiveness control, which can results in arbitrary overhead.  For
example, if huge memory regions having the data access pattern of interest
are found, applying the requested action to all of the regions could incur
significant overhead.  It can be controlled by tuning the target data
access pattern with manual or automated approaches[2,7].  But, some people
would prefer the kernel to just work with only intuitive tuning or default
values.

For such cases, this patchset implements a safeguard, namely time/size
quota.  Using this, the clients can specify up to how much time can be
used for applying the action, and/or up to how much memory regions the
action can be applied within a user-specified time duration.  A followup
question is, to which memory regions should the action applied within the
limits?  We implement a simple regions prioritization mechanism for each
action and make DAMOS to apply the action to high priority regions first. 
It also allows clients tune the prioritization mechanism to use different
weights for size, access frequency, and age of memory regions.  This means
we could use not only LRU but also LFU or some fancy algorithms like
CAR[9] with lightweight overhead.

Though DAMON is lightweight, someone would want to remove even the cold
pages monitoring overhead when it is unnecessary.  Currently, it should
manually turned on and off by clients, but some clients would simply want
to turn it on and off based on some metrics like free memory ratio or
memory fragmentation.  For such cases, this patchset implements a
watermarks-based automatic activation feature.  It allows the clients
configure the metric of their interest, and three watermarks of the
metric.  If the metric is higher than the high watermark or lower than the
low watermark, the scheme is deactivated.  If the metric is lower than the
mid watermark but higher than the low watermark, the scheme is activated.

DAMON-based Reclaim
-------------------

Using the improved version of DAMOS, this patchset implements a static
kernel module called 'damon_reclaim'.  It finds memory regions that didn't
accessed for specific time duration and page out.  Consuming too much CPU
for the paging out operations, or doing pageout too frequently can be
critical for systems configuring their swap devices with software-defined
in-memory block devices like zram/zswap or total number of writes limited
devices like SSDs, respectively.  To avoid the problems, the time/size
quotas can be configured.  Under the quotas, it pages out memory regions
that didn't accessed longer first.  Also, to remove the monitoring
overhead under peaceful situation, and to fall back to the LRU-list based
page granularity reclamation when it doesn't make progress, the three
watermarks based activation mechanism is used, with the free memory ratio
as the watermark metric.

For convenient configurations, it provides several module parameters. 
Using these, sysadmins can enable/disable it, and tune its parameters
including the coldness identification time threshold, the time/size quotas
and the three watermarks.

Evaluation
==========

In short, DAMON_RECLAIM with 50ms/s time quota and regions prioritization
on v5.15-rc5 Linux kernel with ZRAM swap device achieves 38.58% memory
saving with only 1.94% runtime overhead.  For this, DAMON_RECLAIM consumes
only 4.97% of single CPU time.

Setup
-----

We evaluate DAMON_RECLAIM to show how each of the DAMOS improvements make
effect.  For this, we measure DAMON_RECLAIM's CPU consumption, entire
system memory footprint, total number of major page faults, and runtime of
24 realistic workloads in PARSEC3 and SPLASH-2X benchmark suites on my
QEMU/KVM based virtual machine.  The virtual machine runs on an i3.metal
AWS instance, has 130GiB memory, and runs a linux kernel built on latest
-mm tree[1] plus this patchset.  It also utilizes a 4 GiB ZRAM swap
device.  We repeats the measurement 5 times and use averages.

[1] https://github.com/hnaz/linux-mm/tree/v5.15-rc5-mmots-2021-10-13-19-55

Detailed Results
----------------

The results are summarized in the below table.

With coldness identification threshold of 5 seconds, DAMON_RECLAIM without
the time quota-based speed limit achieves 47.21% memory saving, but incur
4.59% runtime slowdown to the workloads on average.  For this,
DAMON_RECLAIM consumes about 11.28% single CPU time.

Applying time quotas of 200ms/s, 50ms/s, and 10ms/s without the regions
prioritization reduces the slowdown to 4.89%, 2.65%, and 1.5%,
respectively.  Time quota of 200ms/s (20%) makes no real change compared
to the quota unapplied version, because the quota unapplied version
consumes only 11.28% CPU time.  DAMON_RECLAIM's CPU utilization also
similarly reduced: 11.24%, 5.51%, and 2.01% of single CPU time.  That is,
the overhead is proportional to the speed limit.  Nevertheless, it also
reduces the memory saving because it becomes less aggressive.  In detail,
the three variants show 48.76%, 37.83%, and 7.85% memory saving,
respectively.

Applying the regions prioritization (page out regions that not accessed
longer first within the time quota) further reduces the performance
degradation.  Runtime slowdowns and total number of major page faults
increase has been 4.89%/218,690% -> 4.39%/166,136% (200ms/s),
2.65%/111,886% -> 1.94%/59,053% (50ms/s), and 1.5%/34,973.40% ->
2.08%/8,781.75% (10ms/s).  The runtime under 10ms/s time quota has
increased with prioritization, but apparently that's under the margin of
error.

    time quota   prioritization  memory_saving  cpu_util  slowdown  pgmajfaults overhead
    N            N               47.21%         11.28%    4.59%     194,802%
    200ms/s      N               48.76%         11.24%    4.89%     218,690%
    50ms/s       N               37.83%         5.51%     2.65%     111,886%
    10ms/s       N               7.85%          2.01%     1.5%      34,793.40%
    200ms/s      Y               50.08%         10.38%    4.39%     166,136%
    50ms/s       Y               38.58%         4.97%     1.94%     59,053%
    10ms/s       Y               3.63%          1.73%     2.08%     8,781.75%

Baseline and Complete Git Trees
===============================

The patches are based on the latest -mm tree
(v5.15-rc5-mmots-2021-10-13-19-55).  You can also clone the complete git tree
from:

    $ git clone git://github.com/sjp38/linux -b damon_reclaim/patches/v1

The web is also available:
https://git.kernel.org/pub/scm/linux/kernel/git/sj/linux.git/tag/?h=damon_reclaim/patches/v1

Sequence Of Patches
===================

The first patch makes DAMOS support the physical address space for the page out
action.  Following five patches (patches 2-6) implement the time/size quotas.
Next four patches (patches 7-10) implement the memory regions prioritization
within the limit.  Then, three following patches (patches 11-13) implement the
watermarks-based schemes activation.  Finally, the last two patches (patches
14-15) implement and document the DAMON-based reclamation using the advanced
DAMOS.

[1] https://www.kernel.org/doc/html/v5.15-rc1/vm/damon/index.html
[2] https://research.google/pubs/pub48551/
[3] https://lwn.net/Articles/787611/
[4] https://damonitor.github.io
[5] https://damonitor.github.io/doc/html/latest/vm/damon/eval.html
[6] https://lore.kernel.org/linux-mm/20211001125604.29660-1-sj@kernel.org/
[7] https://github.com/awslabs/damoos
[8] https://www.kernel.org/doc/html/latest/vm/free_page_reporting.html
[9] https://www.usenix.org/conference/fast-04/car-clock-adaptive-replacement


This patch (of 15):

This commit makes the DAMON primitives for physical address space support
the pageout action for DAMON-based Operation Schemes.  With this commit,
hence, users can easily implement system-level data access-aware
reclamations using DAMOS.

[sj@kernel.org: fix missing-prototype build warning]
  Link: https://lkml.kernel.org/r/20211025064220.13904-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211019150731.16699-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211019150731.16699-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Marco Elver <elver@google.com>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Greg Thelen <gthelen@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |    2 ++
 mm/damon/paddr.c      |   37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 38 insertions(+), 1 deletion(-)

--- a/include/linux/damon.h~mm-damon-paddr-support-the-pageout-scheme
+++ a/include/linux/damon.h
@@ -357,6 +357,8 @@ void damon_va_set_primitives(struct damo
 void damon_pa_prepare_access_checks(struct damon_ctx *ctx);
 unsigned int damon_pa_check_accesses(struct damon_ctx *ctx);
 bool damon_pa_target_valid(void *t);
+int damon_pa_apply_scheme(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme);
 void damon_pa_set_primitives(struct damon_ctx *ctx);
 
 #endif	/* CONFIG_DAMON_PADDR */
--- a/mm/damon/paddr.c~mm-damon-paddr-support-the-pageout-scheme
+++ a/mm/damon/paddr.c
@@ -11,7 +11,9 @@
 #include <linux/page_idle.h>
 #include <linux/pagemap.h>
 #include <linux/rmap.h>
+#include <linux/swap.h>
 
+#include "../internal.h"
 #include "prmtv-common.h"
 
 static bool __damon_pa_mkold(struct page *page, struct vm_area_struct *vma,
@@ -211,6 +213,39 @@ bool damon_pa_target_valid(void *t)
 	return true;
 }
 
+int damon_pa_apply_scheme(struct damon_ctx *ctx, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme)
+{
+	unsigned long addr;
+	LIST_HEAD(page_list);
+
+	if (scheme->action != DAMOS_PAGEOUT)
+		return -EINVAL;
+
+	for (addr = r->ar.start; addr < r->ar.end; addr += PAGE_SIZE) {
+		struct page *page = damon_get_page(PHYS_PFN(addr));
+
+		if (!page)
+			continue;
+
+		ClearPageReferenced(page);
+		test_and_clear_page_young(page);
+		if (isolate_lru_page(page)) {
+			put_page(page);
+			continue;
+		}
+		if (PageUnevictable(page)) {
+			putback_lru_page(page);
+		} else {
+			list_add(&page->lru, &page_list);
+			put_page(page);
+		}
+	}
+	reclaim_pages(&page_list);
+	cond_resched();
+	return 0;
+}
+
 void damon_pa_set_primitives(struct damon_ctx *ctx)
 {
 	ctx->primitive.init = NULL;
@@ -220,5 +255,5 @@ void damon_pa_set_primitives(struct damo
 	ctx->primitive.reset_aggregated = NULL;
 	ctx->primitive.target_valid = damon_pa_target_valid;
 	ctx->primitive.cleanup = NULL;
-	ctx->primitive.apply_scheme = NULL;
+	ctx->primitive.apply_scheme = damon_pa_apply_scheme;
 }
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 240/262] mm/damon/schemes: implement size quota for schemes application speed control
  2021-11-05 20:34 incoming Andrew Morton
                   ` (238 preceding siblings ...)
  2021-11-05 20:47 ` [patch 239/262] mm/damon/paddr: support the pageout scheme Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 241/262] mm/damon/schemes: skip already charged targets and regions Andrew Morton
                   ` (21 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/schemes: implement size quota for schemes application speed control

There could be arbitrarily large memory regions fulfilling the target data
access pattern of a DAMON-based operation scheme.  In the case, applying
the action of the scheme could incur too high overhead.  To provide an
intuitive way for avoiding it, this commit implements a feature called
size quota.  If the quota is set, DAMON tries to apply the action only up
to the given amount of memory regions within a given time window.

Link: https://lkml.kernel.org/r/20211019150731.16699-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |   36 +++++++++++++++++++++---
 mm/damon/core.c       |   60 ++++++++++++++++++++++++++++++++++------
 mm/damon/dbgfs.c      |    4 ++
 3 files changed, 87 insertions(+), 13 deletions(-)

--- a/include/linux/damon.h~mm-damon-schemes-implement-size-quota-for-schemes-application-speed-control
+++ a/include/linux/damon.h
@@ -90,6 +90,26 @@ enum damos_action {
 };
 
 /**
+ * struct damos_quota - Controls the aggressiveness of the given scheme.
+ * @sz:			Maximum bytes of memory that the action can be applied.
+ * @reset_interval:	Charge reset interval in milliseconds.
+ *
+ * To avoid consuming too much CPU time or IO resources for applying the
+ * &struct damos->action to large memory, DAMON allows users to set a size
+ * quota.  The quota can be set by writing non-zero values to &sz.  If the size
+ * quota is set, DAMON tries to apply the action only up to &sz bytes within
+ * &reset_interval.
+ */
+struct damos_quota {
+	unsigned long sz;
+	unsigned long reset_interval;
+
+/* private: For charging the quota */
+	unsigned long charged_sz;
+	unsigned long charged_from;
+};
+
+/**
  * struct damos - Represents a Data Access Monitoring-based Operation Scheme.
  * @min_sz_region:	Minimum size of target regions.
  * @max_sz_region:	Maximum size of target regions.
@@ -98,13 +118,20 @@ enum damos_action {
  * @min_age_region:	Minimum age of target regions.
  * @max_age_region:	Maximum age of target regions.
  * @action:		&damo_action to be applied to the target regions.
+ * @quota:		Control the aggressiveness of this scheme.
  * @stat_count:		Total number of regions that this scheme is applied.
  * @stat_sz:		Total size of regions that this scheme is applied.
  * @list:		List head for siblings.
  *
- * For each aggregation interval, DAMON applies @action to monitoring target
- * regions fit in the condition and updates the statistics.  Note that both
- * the minimums and the maximums are inclusive.
+ * For each aggregation interval, DAMON finds regions which fit in the
+ * condition (&min_sz_region, &max_sz_region, &min_nr_accesses,
+ * &max_nr_accesses, &min_age_region, &max_age_region) and applies &action to
+ * those.  To avoid consuming too much CPU time or IO resources for the
+ * &action, &quota is used.
+ *
+ * After applying the &action to each region, &stat_count and &stat_sz is
+ * updated to reflect the number of regions and total size of regions that the
+ * &action is applied.
  */
 struct damos {
 	unsigned long min_sz_region;
@@ -114,6 +141,7 @@ struct damos {
 	unsigned int min_age_region;
 	unsigned int max_age_region;
 	enum damos_action action;
+	struct damos_quota quota;
 	unsigned long stat_count;
 	unsigned long stat_sz;
 	struct list_head list;
@@ -310,7 +338,7 @@ struct damos *damon_new_scheme(
 		unsigned long min_sz_region, unsigned long max_sz_region,
 		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
 		unsigned int min_age_region, unsigned int max_age_region,
-		enum damos_action action);
+		enum damos_action action, struct damos_quota *quota);
 void damon_add_scheme(struct damon_ctx *ctx, struct damos *s);
 void damon_destroy_scheme(struct damos *s);
 
--- a/mm/damon/core.c~mm-damon-schemes-implement-size-quota-for-schemes-application-speed-control
+++ a/mm/damon/core.c
@@ -89,7 +89,7 @@ struct damos *damon_new_scheme(
 		unsigned long min_sz_region, unsigned long max_sz_region,
 		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
 		unsigned int min_age_region, unsigned int max_age_region,
-		enum damos_action action)
+		enum damos_action action, struct damos_quota *quota)
 {
 	struct damos *scheme;
 
@@ -107,6 +107,11 @@ struct damos *damon_new_scheme(
 	scheme->stat_sz = 0;
 	INIT_LIST_HEAD(&scheme->list);
 
+	scheme->quota.sz = quota->sz;
+	scheme->quota.reset_interval = quota->reset_interval;
+	scheme->quota.charged_sz = 0;
+	scheme->quota.charged_from = 0;
+
 	return scheme;
 }
 
@@ -530,15 +535,25 @@ static void kdamond_reset_aggregated(str
 	}
 }
 
+static void damon_split_region_at(struct damon_ctx *ctx,
+		struct damon_target *t, struct damon_region *r,
+		unsigned long sz_r);
+
 static void damon_do_apply_schemes(struct damon_ctx *c,
 				   struct damon_target *t,
 				   struct damon_region *r)
 {
 	struct damos *s;
-	unsigned long sz;
 
 	damon_for_each_scheme(s, c) {
-		sz = r->ar.end - r->ar.start;
+		struct damos_quota *quota = &s->quota;
+		unsigned long sz = r->ar.end - r->ar.start;
+
+		/* Check the quota */
+		if (quota->sz && quota->charged_sz >= quota->sz)
+			continue;
+
+		/* Check the target regions condition */
 		if (sz < s->min_sz_region || s->max_sz_region < sz)
 			continue;
 		if (r->nr_accesses < s->min_nr_accesses ||
@@ -546,22 +561,51 @@ static void damon_do_apply_schemes(struc
 			continue;
 		if (r->age < s->min_age_region || s->max_age_region < r->age)
 			continue;
-		s->stat_count++;
-		s->stat_sz += sz;
-		if (c->primitive.apply_scheme)
+
+		/* Apply the scheme */
+		if (c->primitive.apply_scheme) {
+			if (quota->sz && quota->charged_sz + sz > quota->sz) {
+				sz = ALIGN_DOWN(quota->sz - quota->charged_sz,
+						DAMON_MIN_REGION);
+				if (!sz)
+					goto update_stat;
+				damon_split_region_at(c, t, r, sz);
+			}
 			c->primitive.apply_scheme(c, t, r, s);
+			quota->charged_sz += sz;
+		}
 		if (s->action != DAMOS_STAT)
 			r->age = 0;
+
+update_stat:
+		s->stat_count++;
+		s->stat_sz += sz;
 	}
 }
 
 static void kdamond_apply_schemes(struct damon_ctx *c)
 {
 	struct damon_target *t;
-	struct damon_region *r;
+	struct damon_region *r, *next_r;
+	struct damos *s;
+
+	damon_for_each_scheme(s, c) {
+		struct damos_quota *quota = &s->quota;
+
+		if (!quota->sz)
+			continue;
+
+		/* New charge window starts */
+		if (time_after_eq(jiffies, quota->charged_from +
+					msecs_to_jiffies(
+						quota->reset_interval))) {
+			quota->charged_from = jiffies;
+			quota->charged_sz = 0;
+		}
+	}
 
 	damon_for_each_target(t, c) {
-		damon_for_each_region(r, t)
+		damon_for_each_region_safe(r, next_r, t)
 			damon_do_apply_schemes(c, t, r);
 	}
 }
--- a/mm/damon/dbgfs.c~mm-damon-schemes-implement-size-quota-for-schemes-application-speed-control
+++ a/mm/damon/dbgfs.c
@@ -188,6 +188,8 @@ static struct damos **str_to_schemes(con
 
 	*nr_schemes = 0;
 	while (pos < len && *nr_schemes < max_nr_schemes) {
+		struct damos_quota quota = {};
+
 		ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n",
 				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
 				&min_age, &max_age, &action, &parsed);
@@ -200,7 +202,7 @@ static struct damos **str_to_schemes(con
 
 		pos += parsed;
 		scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a,
-				min_age, max_age, action);
+				min_age, max_age, action, &quota);
 		if (!scheme)
 			goto fail;
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 241/262] mm/damon/schemes: skip already charged targets and regions
  2021-11-05 20:34 incoming Andrew Morton
                   ` (239 preceding siblings ...)
  2021-11-05 20:47 ` [patch 240/262] mm/damon/schemes: implement size quota for schemes application speed control Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 242/262] mm/damon/schemes: implement time quota Andrew Morton
                   ` (20 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/schemes: skip already charged targets and regions

If DAMOS has stopped applying action in the middle of a group of memory
regions due to its size quota, it starts the work again from the beginning
of the address space in the next charge window.  If there is a huge memory
region at the beginning of the address space and it fulfills the scheme's
target data access pattern always, the action will applied to only the
region.

This commit mitigates the case by skipping memory regions that charged in
current charge window at the beginning of next charge window.

Link: https://lkml.kernel.org/r/20211019150731.16699-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |    5 +++++
 mm/damon/core.c       |   37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

--- a/include/linux/damon.h~mm-damon-schemes-skip-already-charged-targets-and-regions
+++ a/include/linux/damon.h
@@ -107,6 +107,8 @@ struct damos_quota {
 /* private: For charging the quota */
 	unsigned long charged_sz;
 	unsigned long charged_from;
+	struct damon_target *charge_target_from;
+	unsigned long charge_addr_from;
 };
 
 /**
@@ -307,6 +309,9 @@ struct damon_ctx {
 #define damon_prev_region(r) \
 	(container_of(r->list.prev, struct damon_region, list))
 
+#define damon_last_region(t) \
+	(list_last_entry(&t->regions_list, struct damon_region, list))
+
 #define damon_for_each_region(r, t) \
 	list_for_each_entry(r, &t->regions_list, list)
 
--- a/mm/damon/core.c~mm-damon-schemes-skip-already-charged-targets-and-regions
+++ a/mm/damon/core.c
@@ -111,6 +111,8 @@ struct damos *damon_new_scheme(
 	scheme->quota.reset_interval = quota->reset_interval;
 	scheme->quota.charged_sz = 0;
 	scheme->quota.charged_from = 0;
+	scheme->quota.charge_target_from = NULL;
+	scheme->quota.charge_addr_from = 0;
 
 	return scheme;
 }
@@ -553,6 +555,37 @@ static void damon_do_apply_schemes(struc
 		if (quota->sz && quota->charged_sz >= quota->sz)
 			continue;
 
+		/* Skip previously charged regions */
+		if (quota->charge_target_from) {
+			if (t != quota->charge_target_from)
+				continue;
+			if (r == damon_last_region(t)) {
+				quota->charge_target_from = NULL;
+				quota->charge_addr_from = 0;
+				continue;
+			}
+			if (quota->charge_addr_from &&
+					r->ar.end <= quota->charge_addr_from)
+				continue;
+
+			if (quota->charge_addr_from && r->ar.start <
+					quota->charge_addr_from) {
+				sz = ALIGN_DOWN(quota->charge_addr_from -
+						r->ar.start, DAMON_MIN_REGION);
+				if (!sz) {
+					if (r->ar.end - r->ar.start <=
+							DAMON_MIN_REGION)
+						continue;
+					sz = DAMON_MIN_REGION;
+				}
+				damon_split_region_at(c, t, r, sz);
+				r = damon_next_region(r);
+				sz = r->ar.end - r->ar.start;
+			}
+			quota->charge_target_from = NULL;
+			quota->charge_addr_from = 0;
+		}
+
 		/* Check the target regions condition */
 		if (sz < s->min_sz_region || s->max_sz_region < sz)
 			continue;
@@ -573,6 +606,10 @@ static void damon_do_apply_schemes(struc
 			}
 			c->primitive.apply_scheme(c, t, r, s);
 			quota->charged_sz += sz;
+			if (quota->sz && quota->charged_sz >= quota->sz) {
+				quota->charge_target_from = t;
+				quota->charge_addr_from = r->ar.end + 1;
+			}
 		}
 		if (s->action != DAMOS_STAT)
 			r->age = 0;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 242/262] mm/damon/schemes: implement time quota
  2021-11-05 20:34 incoming Andrew Morton
                   ` (240 preceding siblings ...)
  2021-11-05 20:47 ` [patch 241/262] mm/damon/schemes: skip already charged targets and regions Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 243/262] mm/damon/dbgfs: support quotas of schemes Andrew Morton
                   ` (19 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/schemes: implement time quota

The size quota feature of DAMOS is useful for IO resource-critical
systems, but not so intuitive for CPU time-critical systems.  Systems
using zram or zswap-like swap device would be examples.

To provide another intuitive ways for such systems, this commit implements
time-based quota for DAMON-based Operation Schemes.  If the quota is set,
DAMOS tries to use only up to the user-defined quota of CPU time within a
given time window.

Link: https://lkml.kernel.org/r/20211019150731.16699-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |   25 +++++++++++++++++-----
 mm/damon/core.c       |   45 +++++++++++++++++++++++++++++++++++-----
 2 files changed, 60 insertions(+), 10 deletions(-)

--- a/include/linux/damon.h~mm-damon-schemes-implement-time-quota
+++ a/include/linux/damon.h
@@ -91,20 +91,35 @@ enum damos_action {
 
 /**
  * struct damos_quota - Controls the aggressiveness of the given scheme.
+ * @ms:			Maximum milliseconds that the scheme can use.
  * @sz:			Maximum bytes of memory that the action can be applied.
  * @reset_interval:	Charge reset interval in milliseconds.
  *
  * To avoid consuming too much CPU time or IO resources for applying the
- * &struct damos->action to large memory, DAMON allows users to set a size
- * quota.  The quota can be set by writing non-zero values to &sz.  If the size
- * quota is set, DAMON tries to apply the action only up to &sz bytes within
- * &reset_interval.
+ * &struct damos->action to large memory, DAMON allows users to set time and/or
+ * size quotas.  The quotas can be set by writing non-zero values to &ms and
+ * &sz, respectively.  If the time quota is set, DAMON tries to use only up to
+ * &ms milliseconds within &reset_interval for applying the action.  If the
+ * size quota is set, DAMON tries to apply the action only up to &sz bytes
+ * within &reset_interval.
+ *
+ * Internally, the time quota is transformed to a size quota using estimated
+ * throughput of the scheme's action.  DAMON then compares it against &sz and
+ * uses smaller one as the effective quota.
  */
 struct damos_quota {
+	unsigned long ms;
 	unsigned long sz;
 	unsigned long reset_interval;
 
-/* private: For charging the quota */
+/* private: */
+	/* For throughput estimation */
+	unsigned long total_charged_sz;
+	unsigned long total_charged_ns;
+
+	unsigned long esz;	/* Effective size quota in bytes */
+
+	/* For charging the quota */
 	unsigned long charged_sz;
 	unsigned long charged_from;
 	struct damon_target *charge_target_from;
--- a/mm/damon/core.c~mm-damon-schemes-implement-time-quota
+++ a/mm/damon/core.c
@@ -107,8 +107,12 @@ struct damos *damon_new_scheme(
 	scheme->stat_sz = 0;
 	INIT_LIST_HEAD(&scheme->list);
 
+	scheme->quota.ms = quota->ms;
 	scheme->quota.sz = quota->sz;
 	scheme->quota.reset_interval = quota->reset_interval;
+	scheme->quota.total_charged_sz = 0;
+	scheme->quota.total_charged_ns = 0;
+	scheme->quota.esz = 0;
 	scheme->quota.charged_sz = 0;
 	scheme->quota.charged_from = 0;
 	scheme->quota.charge_target_from = NULL;
@@ -550,9 +554,10 @@ static void damon_do_apply_schemes(struc
 	damon_for_each_scheme(s, c) {
 		struct damos_quota *quota = &s->quota;
 		unsigned long sz = r->ar.end - r->ar.start;
+		struct timespec64 begin, end;
 
 		/* Check the quota */
-		if (quota->sz && quota->charged_sz >= quota->sz)
+		if (quota->esz && quota->charged_sz >= quota->esz)
 			continue;
 
 		/* Skip previously charged regions */
@@ -597,16 +602,21 @@ static void damon_do_apply_schemes(struc
 
 		/* Apply the scheme */
 		if (c->primitive.apply_scheme) {
-			if (quota->sz && quota->charged_sz + sz > quota->sz) {
-				sz = ALIGN_DOWN(quota->sz - quota->charged_sz,
+			if (quota->esz &&
+					quota->charged_sz + sz > quota->esz) {
+				sz = ALIGN_DOWN(quota->esz - quota->charged_sz,
 						DAMON_MIN_REGION);
 				if (!sz)
 					goto update_stat;
 				damon_split_region_at(c, t, r, sz);
 			}
+			ktime_get_coarse_ts64(&begin);
 			c->primitive.apply_scheme(c, t, r, s);
+			ktime_get_coarse_ts64(&end);
+			quota->total_charged_ns += timespec64_to_ns(&end) -
+				timespec64_to_ns(&begin);
 			quota->charged_sz += sz;
-			if (quota->sz && quota->charged_sz >= quota->sz) {
+			if (quota->esz && quota->charged_sz >= quota->esz) {
 				quota->charge_target_from = t;
 				quota->charge_addr_from = r->ar.end + 1;
 			}
@@ -620,6 +630,29 @@ update_stat:
 	}
 }
 
+/* Shouldn't be called if quota->ms and quota->sz are zero */
+static void damos_set_effective_quota(struct damos_quota *quota)
+{
+	unsigned long throughput;
+	unsigned long esz;
+
+	if (!quota->ms) {
+		quota->esz = quota->sz;
+		return;
+	}
+
+	if (quota->total_charged_ns)
+		throughput = quota->total_charged_sz * 1000000 /
+			quota->total_charged_ns;
+	else
+		throughput = PAGE_SIZE * 1024;
+	esz = throughput * quota->ms;
+
+	if (quota->sz && quota->sz < esz)
+		esz = quota->sz;
+	quota->esz = esz;
+}
+
 static void kdamond_apply_schemes(struct damon_ctx *c)
 {
 	struct damon_target *t;
@@ -629,15 +662,17 @@ static void kdamond_apply_schemes(struct
 	damon_for_each_scheme(s, c) {
 		struct damos_quota *quota = &s->quota;
 
-		if (!quota->sz)
+		if (!quota->ms && !quota->sz)
 			continue;
 
 		/* New charge window starts */
 		if (time_after_eq(jiffies, quota->charged_from +
 					msecs_to_jiffies(
 						quota->reset_interval))) {
+			quota->total_charged_sz += quota->charged_sz;
 			quota->charged_from = jiffies;
 			quota->charged_sz = 0;
+			damos_set_effective_quota(quota);
 		}
 	}
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 243/262] mm/damon/dbgfs: support quotas of schemes
  2021-11-05 20:34 incoming Andrew Morton
                   ` (241 preceding siblings ...)
  2021-11-05 20:47 ` [patch 242/262] mm/damon/schemes: implement time quota Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 244/262] mm/damon/selftests: support schemes quotas Andrew Morton
                   ` (18 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/dbgfs: support quotas of schemes

This commit makes the debugfs interface of DAMON support the scheme quotas
by chaning the format of the input for the schemes file.

Link: https://lkml.kernel.org/r/20211019150731.16699-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/dbgfs.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-quotas-of-schemes
+++ a/mm/damon/dbgfs.c
@@ -105,11 +105,14 @@ static ssize_t sprint_schemes(struct dam
 
 	damon_for_each_scheme(s, c) {
 		rc = scnprintf(&buf[written], len - written,
-				"%lu %lu %u %u %u %u %d %lu %lu\n",
+				"%lu %lu %u %u %u %u %d %lu %lu %lu %lu %lu\n",
 				s->min_sz_region, s->max_sz_region,
 				s->min_nr_accesses, s->max_nr_accesses,
 				s->min_age_region, s->max_age_region,
-				s->action, s->stat_count, s->stat_sz);
+				s->action,
+				s->quota.ms, s->quota.sz,
+				s->quota.reset_interval,
+				s->stat_count, s->stat_sz);
 		if (!rc)
 			return -ENOMEM;
 
@@ -190,10 +193,11 @@ static struct damos **str_to_schemes(con
 	while (pos < len && *nr_schemes < max_nr_schemes) {
 		struct damos_quota quota = {};
 
-		ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n",
+		ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u %lu %lu %lu%n",
 				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
-				&min_age, &max_age, &action, &parsed);
-		if (ret != 7)
+				&min_age, &max_age, &action, &quota.ms,
+				&quota.sz, &quota.reset_interval, &parsed);
+		if (ret != 10)
 			break;
 		if (!damos_action_valid(action)) {
 			pr_err("wrong action %d\n", action);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 244/262] mm/damon/selftests: support schemes quotas
  2021-11-05 20:34 incoming Andrew Morton
                   ` (242 preceding siblings ...)
  2021-11-05 20:47 ` [patch 243/262] mm/damon/dbgfs: support quotas of schemes Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 245/262] mm/damon/schemes: prioritize regions within the quotas Andrew Morton
                   ` (17 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/selftests: support schemes quotas

This commit updates DAMON selftests to support updated schemes debugfs
file format for the quotas.

Link: https://lkml.kernel.org/r/20211019150731.16699-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/damon/debugfs_attrs.sh |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/tools/testing/selftests/damon/debugfs_attrs.sh~mm-damon-selftests-support-schemes-quotas
+++ a/tools/testing/selftests/damon/debugfs_attrs.sh
@@ -63,10 +63,10 @@ echo "$orig_content" > "$file"
 file="$DBGFS/schemes"
 orig_content=$(cat "$file")
 
-test_write_succ "$file" "1 2 3 4 5 6 4" \
+test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0" \
 	"$orig_content" "valid input"
 test_write_fail "$file" "1 2
-3 4 5 6 3" "$orig_content" "multi lines"
+3 4 5 6 3 0 0 0" "$orig_content" "multi lines"
 test_write_succ "$file" "" "$orig_content" "disabling"
 echo "$orig_content" > "$file"
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 245/262] mm/damon/schemes: prioritize regions within the quotas
  2021-11-05 20:34 incoming Andrew Morton
                   ` (243 preceding siblings ...)
  2021-11-05 20:47 ` [patch 244/262] mm/damon/selftests: support schemes quotas Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 246/262] mm/damon/vaddr,paddr: support pageout prioritization Andrew Morton
                   ` (16 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/schemes: prioritize regions within the quotas

This commit makes DAMON apply schemes to regions having higher priority
first, if it cannot apply schemes to all regions due to the quotas.

The prioritization function should be implemented in the monitoring
primitives.  Those would commonly calculate the priority of the region
using attributes of regions, namely 'size', 'nr_accesses', and 'age'.  For
example, some primitive would calculate the priority of each region using
a weighted sum of 'nr_accesses' and 'age' of the region.

The optimal weights would depend on give environments, so this commit
makes those customizable.  Nevertheless, the score calculation functions
are only encouraged to respect the weights, not mandated.

Link: https://lkml.kernel.org/r/20211019150731.16699-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |   26 ++++++++++++++++
 mm/damon/core.c       |   62 +++++++++++++++++++++++++++++++++++-----
 2 files changed, 81 insertions(+), 7 deletions(-)

--- a/include/linux/damon.h~mm-damon-schemes-prioritize-regions-within-the-quotas
+++ a/include/linux/damon.h
@@ -14,6 +14,8 @@
 
 /* Minimal region size.  Every damon_region is aligned by this. */
 #define DAMON_MIN_REGION	PAGE_SIZE
+/* Max priority score for DAMON-based operation schemes */
+#define DAMOS_MAX_SCORE		(99)
 
 /**
  * struct damon_addr_range - Represents an address region of [@start, @end).
@@ -95,6 +97,10 @@ enum damos_action {
  * @sz:			Maximum bytes of memory that the action can be applied.
  * @reset_interval:	Charge reset interval in milliseconds.
  *
+ * @weight_sz:		Weight of the region's size for prioritization.
+ * @weight_nr_accesses:	Weight of the region's nr_accesses for prioritization.
+ * @weight_age:		Weight of the region's age for prioritization.
+ *
  * To avoid consuming too much CPU time or IO resources for applying the
  * &struct damos->action to large memory, DAMON allows users to set time and/or
  * size quotas.  The quotas can be set by writing non-zero values to &ms and
@@ -106,12 +112,22 @@ enum damos_action {
  * Internally, the time quota is transformed to a size quota using estimated
  * throughput of the scheme's action.  DAMON then compares it against &sz and
  * uses smaller one as the effective quota.
+ *
+ * For selecting regions within the quota, DAMON prioritizes current scheme's
+ * target memory regions using the &struct damon_primitive->get_scheme_score.
+ * You could customize the prioritization logic by setting &weight_sz,
+ * &weight_nr_accesses, and &weight_age, because monitoring primitives are
+ * encouraged to respect those.
  */
 struct damos_quota {
 	unsigned long ms;
 	unsigned long sz;
 	unsigned long reset_interval;
 
+	unsigned int weight_sz;
+	unsigned int weight_nr_accesses;
+	unsigned int weight_age;
+
 /* private: */
 	/* For throughput estimation */
 	unsigned long total_charged_sz;
@@ -124,6 +140,10 @@ struct damos_quota {
 	unsigned long charged_from;
 	struct damon_target *charge_target_from;
 	unsigned long charge_addr_from;
+
+	/* For prioritization */
+	unsigned long histogram[DAMOS_MAX_SCORE + 1];
+	unsigned int min_score;
 };
 
 /**
@@ -174,6 +194,7 @@ struct damon_ctx;
  * @prepare_access_checks:	Prepare next access check of target regions.
  * @check_accesses:		Check the accesses to target regions.
  * @reset_aggregated:		Reset aggregated accesses monitoring results.
+ * @get_scheme_score:		Get the score of a region for a scheme.
  * @apply_scheme:		Apply a DAMON-based operation scheme.
  * @target_valid:		Determine if the target is valid.
  * @cleanup:			Clean up the context.
@@ -200,6 +221,8 @@ struct damon_ctx;
  * of its update.  The value will be used for regions adjustment threshold.
  * @reset_aggregated should reset the access monitoring results that aggregated
  * by @check_accesses.
+ * @get_scheme_score should return the priority score of a region for a scheme
+ * as an integer in [0, &DAMOS_MAX_SCORE].
  * @apply_scheme is called from @kdamond when a region for user provided
  * DAMON-based operation scheme is found.  It should apply the scheme's action
  * to the region.  This is not used for &DAMON_ARBITRARY_TARGET case.
@@ -213,6 +236,9 @@ struct damon_primitive {
 	void (*prepare_access_checks)(struct damon_ctx *context);
 	unsigned int (*check_accesses)(struct damon_ctx *context);
 	void (*reset_aggregated)(struct damon_ctx *context);
+	int (*get_scheme_score)(struct damon_ctx *context,
+			struct damon_target *t, struct damon_region *r,
+			struct damos *scheme);
 	int (*apply_scheme)(struct damon_ctx *context, struct damon_target *t,
 			struct damon_region *r, struct damos *scheme);
 	bool (*target_valid)(void *target);
--- a/mm/damon/core.c~mm-damon-schemes-prioritize-regions-within-the-quotas
+++ a/mm/damon/core.c
@@ -12,6 +12,7 @@
 #include <linux/kthread.h>
 #include <linux/random.h>
 #include <linux/slab.h>
+#include <linux/string.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/damon.h>
@@ -110,6 +111,9 @@ struct damos *damon_new_scheme(
 	scheme->quota.ms = quota->ms;
 	scheme->quota.sz = quota->sz;
 	scheme->quota.reset_interval = quota->reset_interval;
+	scheme->quota.weight_sz = quota->weight_sz;
+	scheme->quota.weight_nr_accesses = quota->weight_nr_accesses;
+	scheme->quota.weight_age = quota->weight_age;
 	scheme->quota.total_charged_sz = 0;
 	scheme->quota.total_charged_ns = 0;
 	scheme->quota.esz = 0;
@@ -545,6 +549,28 @@ static void damon_split_region_at(struct
 		struct damon_target *t, struct damon_region *r,
 		unsigned long sz_r);
 
+static bool __damos_valid_target(struct damon_region *r, struct damos *s)
+{
+	unsigned long sz;
+
+	sz = r->ar.end - r->ar.start;
+	return s->min_sz_region <= sz && sz <= s->max_sz_region &&
+		s->min_nr_accesses <= r->nr_accesses &&
+		r->nr_accesses <= s->max_nr_accesses &&
+		s->min_age_region <= r->age && r->age <= s->max_age_region;
+}
+
+static bool damos_valid_target(struct damon_ctx *c, struct damon_target *t,
+		struct damon_region *r, struct damos *s)
+{
+	bool ret = __damos_valid_target(r, s);
+
+	if (!ret || !s->quota.esz || !c->primitive.get_scheme_score)
+		return ret;
+
+	return c->primitive.get_scheme_score(c, t, r, s) >= s->quota.min_score;
+}
+
 static void damon_do_apply_schemes(struct damon_ctx *c,
 				   struct damon_target *t,
 				   struct damon_region *r)
@@ -591,13 +617,7 @@ static void damon_do_apply_schemes(struc
 			quota->charge_addr_from = 0;
 		}
 
-		/* Check the target regions condition */
-		if (sz < s->min_sz_region || s->max_sz_region < sz)
-			continue;
-		if (r->nr_accesses < s->min_nr_accesses ||
-				s->max_nr_accesses < r->nr_accesses)
-			continue;
-		if (r->age < s->min_age_region || s->max_age_region < r->age)
+		if (!damos_valid_target(c, t, r, s))
 			continue;
 
 		/* Apply the scheme */
@@ -661,6 +681,8 @@ static void kdamond_apply_schemes(struct
 
 	damon_for_each_scheme(s, c) {
 		struct damos_quota *quota = &s->quota;
+		unsigned long cumulated_sz;
+		unsigned int score, max_score = 0;
 
 		if (!quota->ms && !quota->sz)
 			continue;
@@ -674,6 +696,32 @@ static void kdamond_apply_schemes(struct
 			quota->charged_sz = 0;
 			damos_set_effective_quota(quota);
 		}
+
+		if (!c->primitive.get_scheme_score)
+			continue;
+
+		/* Fill up the score histogram */
+		memset(quota->histogram, 0, sizeof(quota->histogram));
+		damon_for_each_target(t, c) {
+			damon_for_each_region(r, t) {
+				if (!__damos_valid_target(r, s))
+					continue;
+				score = c->primitive.get_scheme_score(
+						c, t, r, s);
+				quota->histogram[score] +=
+					r->ar.end - r->ar.start;
+				if (score > max_score)
+					max_score = score;
+			}
+		}
+
+		/* Set the min score limit */
+		for (cumulated_sz = 0, score = max_score; ; score--) {
+			cumulated_sz += quota->histogram[score];
+			if (cumulated_sz >= quota->esz || !score)
+				break;
+		}
+		quota->min_score = score;
 	}
 
 	damon_for_each_target(t, c) {
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 246/262] mm/damon/vaddr,paddr: support pageout prioritization
  2021-11-05 20:34 incoming Andrew Morton
                   ` (244 preceding siblings ...)
  2021-11-05 20:47 ` [patch 245/262] mm/damon/schemes: prioritize regions within the quotas Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 247/262] mm/damon/dbgfs: support prioritization weights Andrew Morton
                   ` (15 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/vaddr,paddr: support pageout prioritization

This commit makes the default monitoring primitives for virtual address
spaces and the physical address sapce to support memory regions
prioritization for 'PAGEOUT' DAMOS action.  It calculates hotness of each
region as weighted sum of 'nr_accesses' and 'age' of the region and get
the priority score as reverse of the hotness, so that cold regions can be
paged out first.

Link: https://lkml.kernel.org/r/20211019150731.16699-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h   |    4 +++
 mm/damon/paddr.c        |   14 +++++++++++
 mm/damon/prmtv-common.c |   46 ++++++++++++++++++++++++++++++++++++++
 mm/damon/prmtv-common.h |    3 ++
 mm/damon/vaddr.c        |   15 ++++++++++++
 5 files changed, 82 insertions(+)

--- a/include/linux/damon.h~mm-damon-vaddrpaddr-support-pageout-prioritization
+++ a/include/linux/damon.h
@@ -421,6 +421,8 @@ bool damon_va_target_valid(void *t);
 void damon_va_cleanup(struct damon_ctx *ctx);
 int damon_va_apply_scheme(struct damon_ctx *context, struct damon_target *t,
 		struct damon_region *r, struct damos *scheme);
+int damon_va_scheme_score(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme);
 void damon_va_set_primitives(struct damon_ctx *ctx);
 
 #endif	/* CONFIG_DAMON_VADDR */
@@ -433,6 +435,8 @@ unsigned int damon_pa_check_accesses(str
 bool damon_pa_target_valid(void *t);
 int damon_pa_apply_scheme(struct damon_ctx *context, struct damon_target *t,
 		struct damon_region *r, struct damos *scheme);
+int damon_pa_scheme_score(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme);
 void damon_pa_set_primitives(struct damon_ctx *ctx);
 
 #endif	/* CONFIG_DAMON_PADDR */
--- a/mm/damon/paddr.c~mm-damon-vaddrpaddr-support-pageout-prioritization
+++ a/mm/damon/paddr.c
@@ -246,6 +246,19 @@ int damon_pa_apply_scheme(struct damon_c
 	return 0;
 }
 
+int damon_pa_scheme_score(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme)
+{
+	switch (scheme->action) {
+	case DAMOS_PAGEOUT:
+		return damon_pageout_score(context, r, scheme);
+	default:
+		break;
+	}
+
+	return DAMOS_MAX_SCORE;
+}
+
 void damon_pa_set_primitives(struct damon_ctx *ctx)
 {
 	ctx->primitive.init = NULL;
@@ -256,4 +269,5 @@ void damon_pa_set_primitives(struct damo
 	ctx->primitive.target_valid = damon_pa_target_valid;
 	ctx->primitive.cleanup = NULL;
 	ctx->primitive.apply_scheme = damon_pa_apply_scheme;
+	ctx->primitive.get_scheme_score = damon_pa_scheme_score;
 }
--- a/mm/damon/prmtv-common.c~mm-damon-vaddrpaddr-support-pageout-prioritization
+++ a/mm/damon/prmtv-common.c
@@ -85,3 +85,49 @@ void damon_pmdp_mkold(pmd_t *pmd, struct
 	put_page(page);
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 }
+
+#define DAMON_MAX_SUBSCORE	(100)
+#define DAMON_MAX_AGE_IN_LOG	(32)
+
+int damon_pageout_score(struct damon_ctx *c, struct damon_region *r,
+			struct damos *s)
+{
+	unsigned int max_nr_accesses;
+	int freq_subscore;
+	unsigned int age_in_sec;
+	int age_in_log, age_subscore;
+	unsigned int freq_weight = s->quota.weight_nr_accesses;
+	unsigned int age_weight = s->quota.weight_age;
+	int hotness;
+
+	max_nr_accesses = c->aggr_interval / c->sample_interval;
+	freq_subscore = r->nr_accesses * DAMON_MAX_SUBSCORE / max_nr_accesses;
+
+	age_in_sec = (unsigned long)r->age * c->aggr_interval / 1000000;
+	for (age_in_log = 0; age_in_log < DAMON_MAX_AGE_IN_LOG && age_in_sec;
+			age_in_log++, age_in_sec >>= 1)
+		;
+
+	/* If frequency is 0, higher age means it's colder */
+	if (freq_subscore == 0)
+		age_in_log *= -1;
+
+	/*
+	 * Now age_in_log is in [-DAMON_MAX_AGE_IN_LOG, DAMON_MAX_AGE_IN_LOG].
+	 * Scale it to be in [0, 100] and set it as age subscore.
+	 */
+	age_in_log += DAMON_MAX_AGE_IN_LOG;
+	age_subscore = age_in_log * DAMON_MAX_SUBSCORE /
+		DAMON_MAX_AGE_IN_LOG / 2;
+
+	hotness = (freq_weight * freq_subscore + age_weight * age_subscore);
+	if (freq_weight + age_weight)
+		hotness /= freq_weight + age_weight;
+	/*
+	 * Transform it to fit in [0, DAMOS_MAX_SCORE]
+	 */
+	hotness = hotness * DAMOS_MAX_SCORE / DAMON_MAX_SUBSCORE;
+
+	/* Return coldness of the region */
+	return DAMOS_MAX_SCORE - hotness;
+}
--- a/mm/damon/prmtv-common.h~mm-damon-vaddrpaddr-support-pageout-prioritization
+++ a/mm/damon/prmtv-common.h
@@ -15,3 +15,6 @@ struct page *damon_get_page(unsigned lon
 
 void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr);
 void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr);
+
+int damon_pageout_score(struct damon_ctx *c, struct damon_region *r,
+			struct damos *s);
--- a/mm/damon/vaddr.c~mm-damon-vaddrpaddr-support-pageout-prioritization
+++ a/mm/damon/vaddr.c
@@ -633,6 +633,20 @@ int damon_va_apply_scheme(struct damon_c
 	return damos_madvise(t, r, madv_action);
 }
 
+int damon_va_scheme_score(struct damon_ctx *context, struct damon_target *t,
+		struct damon_region *r, struct damos *scheme)
+{
+
+	switch (scheme->action) {
+	case DAMOS_PAGEOUT:
+		return damon_pageout_score(context, r, scheme);
+	default:
+		break;
+	}
+
+	return DAMOS_MAX_SCORE;
+}
+
 void damon_va_set_primitives(struct damon_ctx *ctx)
 {
 	ctx->primitive.init = damon_va_init;
@@ -643,6 +657,7 @@ void damon_va_set_primitives(struct damo
 	ctx->primitive.target_valid = damon_va_target_valid;
 	ctx->primitive.cleanup = NULL;
 	ctx->primitive.apply_scheme = damon_va_apply_scheme;
+	ctx->primitive.get_scheme_score = damon_va_scheme_score;
 }
 
 #include "vaddr-test.h"
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 247/262] mm/damon/dbgfs: support prioritization weights
  2021-11-05 20:34 incoming Andrew Morton
                   ` (245 preceding siblings ...)
  2021-11-05 20:47 ` [patch 246/262] mm/damon/vaddr,paddr: support pageout prioritization Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 248/262] tools/selftests/damon: update for regions prioritization of schemes Andrew Morton
                   ` (14 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/dbgfs: support prioritization weights

This commit allows DAMON debugfs interface users set the prioritization
weights by putting three more numbers to the 'schemes' file.

Link: https://lkml.kernel.org/r/20211019150731.16699-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/dbgfs.c |   14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-prioritization-weights
+++ a/mm/damon/dbgfs.c
@@ -105,13 +105,16 @@ static ssize_t sprint_schemes(struct dam
 
 	damon_for_each_scheme(s, c) {
 		rc = scnprintf(&buf[written], len - written,
-				"%lu %lu %u %u %u %u %d %lu %lu %lu %lu %lu\n",
+				"%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %lu %lu\n",
 				s->min_sz_region, s->max_sz_region,
 				s->min_nr_accesses, s->max_nr_accesses,
 				s->min_age_region, s->max_age_region,
 				s->action,
 				s->quota.ms, s->quota.sz,
 				s->quota.reset_interval,
+				s->quota.weight_sz,
+				s->quota.weight_nr_accesses,
+				s->quota.weight_age,
 				s->stat_count, s->stat_sz);
 		if (!rc)
 			return -ENOMEM;
@@ -193,11 +196,14 @@ static struct damos **str_to_schemes(con
 	while (pos < len && *nr_schemes < max_nr_schemes) {
 		struct damos_quota quota = {};
 
-		ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u %lu %lu %lu%n",
+		ret = sscanf(&str[pos],
+				"%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n",
 				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
 				&min_age, &max_age, &action, &quota.ms,
-				&quota.sz, &quota.reset_interval, &parsed);
-		if (ret != 10)
+				&quota.sz, &quota.reset_interval,
+				&quota.weight_sz, &quota.weight_nr_accesses,
+				&quota.weight_age, &parsed);
+		if (ret != 13)
 			break;
 		if (!damos_action_valid(action)) {
 			pr_err("wrong action %d\n", action);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 248/262] tools/selftests/damon: update for regions prioritization of schemes
  2021-11-05 20:34 incoming Andrew Morton
                   ` (246 preceding siblings ...)
  2021-11-05 20:47 ` [patch 247/262] mm/damon/dbgfs: support prioritization weights Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 249/262] mm/damon/schemes: activate schemes based on a watermarks mechanism Andrew Morton
                   ` (13 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: tools/selftests/damon: update for regions prioritization of schemes

This commit updates the DAMON selftests for 'schemes' debugfs file, as the
file format is updated.

Link: https://lkml.kernel.org/r/20211019150731.16699-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/damon/debugfs_attrs.sh |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/tools/testing/selftests/damon/debugfs_attrs.sh~tools-selftests-damon-update-for-regions-prioritization-of-schemes
+++ a/tools/testing/selftests/damon/debugfs_attrs.sh
@@ -63,10 +63,10 @@ echo "$orig_content" > "$file"
 file="$DBGFS/schemes"
 orig_content=$(cat "$file")
 
-test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0" \
+test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0 1 2 3" \
 	"$orig_content" "valid input"
 test_write_fail "$file" "1 2
-3 4 5 6 3 0 0 0" "$orig_content" "multi lines"
+3 4 5 6 3 0 0 0 1 2 3" "$orig_content" "multi lines"
 test_write_succ "$file" "" "$orig_content" "disabling"
 echo "$orig_content" > "$file"
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 249/262] mm/damon/schemes: activate schemes based on a watermarks mechanism
  2021-11-05 20:34 incoming Andrew Morton
                   ` (247 preceding siblings ...)
  2021-11-05 20:47 ` [patch 248/262] tools/selftests/damon: update for regions prioritization of schemes Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 250/262] mm/damon/dbgfs: support watermarks Andrew Morton
                   ` (12 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/schemes: activate schemes based on a watermarks mechanism

DAMON-based operation schemes need to be manually turned on and off.  In
some use cases, however, the condition for turning a scheme on and off
would depend on the system's situation.  For example, schemes for
proactive pages reclamation would need to be turned on when some memory
pressure is detected, and turned off when the system has enough free
memory.

For easier control of schemes activation based on the system situation,
this commit introduces a watermarks-based mechanism.  The client can
describe the watermark metric (e.g., amount of free memory in the system),
watermark check interval, and three watermarks, namely high, mid, and low.
If the scheme is deactivated, it only gets the metric and compare that to
the three watermarks for every check interval.  If the metric is higher
than the high watermark, the scheme is deactivated.  If the metric is
between the mid watermark and the low watermark, the scheme is activated. 
If the metric is lower than the low watermark, the scheme is deactivated
again.  This is to allow users fall back to traditional page-granularity
mechanisms.

Link: https://lkml.kernel.org/r/20211019150731.16699-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |   52 +++++++++++++++++++++
 mm/damon/core.c       |   97 +++++++++++++++++++++++++++++++++++++++-
 mm/damon/dbgfs.c      |    5 +-
 3 files changed, 151 insertions(+), 3 deletions(-)

--- a/include/linux/damon.h~mm-damon-schemes-activate-schemes-based-on-a-watermarks-mechanism
+++ a/include/linux/damon.h
@@ -147,6 +147,45 @@ struct damos_quota {
 };
 
 /**
+ * enum damos_wmark_metric - Represents the watermark metric.
+ *
+ * @DAMOS_WMARK_NONE:		Ignore the watermarks of the given scheme.
+ * @DAMOS_WMARK_FREE_MEM_RATE:	Free memory rate of the system in [0,1000].
+ */
+enum damos_wmark_metric {
+	DAMOS_WMARK_NONE,
+	DAMOS_WMARK_FREE_MEM_RATE,
+};
+
+/**
+ * struct damos_watermarks - Controls when a given scheme should be activated.
+ * @metric:	Metric for the watermarks.
+ * @interval:	Watermarks check time interval in microseconds.
+ * @high:	High watermark.
+ * @mid:	Middle watermark.
+ * @low:	Low watermark.
+ *
+ * If &metric is &DAMOS_WMARK_NONE, the scheme is always active.  Being active
+ * means DAMON does monitoring and applying the action of the scheme to
+ * appropriate memory regions.  Else, DAMON checks &metric of the system for at
+ * least every &interval microseconds and works as below.
+ *
+ * If &metric is higher than &high, the scheme is inactivated.  If &metric is
+ * between &mid and &low, the scheme is activated.  If &metric is lower than
+ * &low, the scheme is inactivated.
+ */
+struct damos_watermarks {
+	enum damos_wmark_metric metric;
+	unsigned long interval;
+	unsigned long high;
+	unsigned long mid;
+	unsigned long low;
+
+/* private: */
+	bool activated;
+};
+
+/**
  * struct damos - Represents a Data Access Monitoring-based Operation Scheme.
  * @min_sz_region:	Minimum size of target regions.
  * @max_sz_region:	Maximum size of target regions.
@@ -156,6 +195,7 @@ struct damos_quota {
  * @max_age_region:	Maximum age of target regions.
  * @action:		&damo_action to be applied to the target regions.
  * @quota:		Control the aggressiveness of this scheme.
+ * @wmarks:		Watermarks for automated (in)activation of this scheme.
  * @stat_count:		Total number of regions that this scheme is applied.
  * @stat_sz:		Total size of regions that this scheme is applied.
  * @list:		List head for siblings.
@@ -166,6 +206,14 @@ struct damos_quota {
  * those.  To avoid consuming too much CPU time or IO resources for the
  * &action, &quota is used.
  *
+ * To do the work only when needed, schemes can be activated for specific
+ * system situations using &wmarks.  If all schemes that registered to the
+ * monitoring context are inactive, DAMON stops monitoring either, and just
+ * repeatedly checks the watermarks.
+ *
+ * If all schemes that registered to a &struct damon_ctx are inactive, DAMON
+ * stops monitoring and just repeatedly checks the watermarks.
+ *
  * After applying the &action to each region, &stat_count and &stat_sz is
  * updated to reflect the number of regions and total size of regions that the
  * &action is applied.
@@ -179,6 +227,7 @@ struct damos {
 	unsigned int max_age_region;
 	enum damos_action action;
 	struct damos_quota quota;
+	struct damos_watermarks wmarks;
 	unsigned long stat_count;
 	unsigned long stat_sz;
 	struct list_head list;
@@ -384,7 +433,8 @@ struct damos *damon_new_scheme(
 		unsigned long min_sz_region, unsigned long max_sz_region,
 		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
 		unsigned int min_age_region, unsigned int max_age_region,
-		enum damos_action action, struct damos_quota *quota);
+		enum damos_action action, struct damos_quota *quota,
+		struct damos_watermarks *wmarks);
 void damon_add_scheme(struct damon_ctx *ctx, struct damos *s);
 void damon_destroy_scheme(struct damos *s);
 
--- a/mm/damon/core.c~mm-damon-schemes-activate-schemes-based-on-a-watermarks-mechanism
+++ a/mm/damon/core.c
@@ -10,6 +10,7 @@
 #include <linux/damon.h>
 #include <linux/delay.h>
 #include <linux/kthread.h>
+#include <linux/mm.h>
 #include <linux/random.h>
 #include <linux/slab.h>
 #include <linux/string.h>
@@ -90,7 +91,8 @@ struct damos *damon_new_scheme(
 		unsigned long min_sz_region, unsigned long max_sz_region,
 		unsigned int min_nr_accesses, unsigned int max_nr_accesses,
 		unsigned int min_age_region, unsigned int max_age_region,
-		enum damos_action action, struct damos_quota *quota)
+		enum damos_action action, struct damos_quota *quota,
+		struct damos_watermarks *wmarks)
 {
 	struct damos *scheme;
 
@@ -122,6 +124,13 @@ struct damos *damon_new_scheme(
 	scheme->quota.charge_target_from = NULL;
 	scheme->quota.charge_addr_from = 0;
 
+	scheme->wmarks.metric = wmarks->metric;
+	scheme->wmarks.interval = wmarks->interval;
+	scheme->wmarks.high = wmarks->high;
+	scheme->wmarks.mid = wmarks->mid;
+	scheme->wmarks.low = wmarks->low;
+	scheme->wmarks.activated = true;
+
 	return scheme;
 }
 
@@ -582,6 +591,9 @@ static void damon_do_apply_schemes(struc
 		unsigned long sz = r->ar.end - r->ar.start;
 		struct timespec64 begin, end;
 
+		if (!s->wmarks.activated)
+			continue;
+
 		/* Check the quota */
 		if (quota->esz && quota->charged_sz >= quota->esz)
 			continue;
@@ -684,6 +696,9 @@ static void kdamond_apply_schemes(struct
 		unsigned long cumulated_sz;
 		unsigned int score, max_score = 0;
 
+		if (!s->wmarks.activated)
+			continue;
+
 		if (!quota->ms && !quota->sz)
 			continue;
 
@@ -924,6 +939,83 @@ static bool kdamond_need_stop(struct dam
 	return true;
 }
 
+static unsigned long damos_wmark_metric_value(enum damos_wmark_metric metric)
+{
+	struct sysinfo i;
+
+	switch (metric) {
+	case DAMOS_WMARK_FREE_MEM_RATE:
+		si_meminfo(&i);
+		return i.freeram * 1000 / i.totalram;
+	default:
+		break;
+	}
+	return -EINVAL;
+}
+
+/*
+ * Returns zero if the scheme is active.  Else, returns time to wait for next
+ * watermark check in micro-seconds.
+ */
+static unsigned long damos_wmark_wait_us(struct damos *scheme)
+{
+	unsigned long metric;
+
+	if (scheme->wmarks.metric == DAMOS_WMARK_NONE)
+		return 0;
+
+	metric = damos_wmark_metric_value(scheme->wmarks.metric);
+	/* higher than high watermark or lower than low watermark */
+	if (metric > scheme->wmarks.high || scheme->wmarks.low > metric) {
+		if (scheme->wmarks.activated)
+			pr_debug("inactivate a scheme (%d) for %s wmark\n",
+					scheme->action,
+					metric > scheme->wmarks.high ?
+					"high" : "low");
+		scheme->wmarks.activated = false;
+		return scheme->wmarks.interval;
+	}
+
+	/* inactive and higher than middle watermark */
+	if ((scheme->wmarks.high >= metric && metric >= scheme->wmarks.mid) &&
+			!scheme->wmarks.activated)
+		return scheme->wmarks.interval;
+
+	if (!scheme->wmarks.activated)
+		pr_debug("activate a scheme (%d)\n", scheme->action);
+	scheme->wmarks.activated = true;
+	return 0;
+}
+
+static void kdamond_usleep(unsigned long usecs)
+{
+	if (usecs > 100 * 1000)
+		schedule_timeout_interruptible(usecs_to_jiffies(usecs));
+	else
+		usleep_range(usecs, usecs + 1);
+}
+
+/* Returns negative error code if it's not activated but should return */
+static int kdamond_wait_activation(struct damon_ctx *ctx)
+{
+	struct damos *s;
+	unsigned long wait_time;
+	unsigned long min_wait_time = 0;
+
+	while (!kdamond_need_stop(ctx)) {
+		damon_for_each_scheme(s, ctx) {
+			wait_time = damos_wmark_wait_us(s);
+			if (!min_wait_time || wait_time < min_wait_time)
+				min_wait_time = wait_time;
+		}
+		if (!min_wait_time)
+			return 0;
+
+		kdamond_usleep(min_wait_time);
+	}
+	return -EBUSY;
+}
+
 static void set_kdamond_stop(struct damon_ctx *ctx)
 {
 	mutex_lock(&ctx->kdamond_lock);
@@ -952,6 +1044,9 @@ static int kdamond_fn(void *data)
 	sz_limit = damon_region_sz_limit(ctx);
 
 	while (!kdamond_need_stop(ctx)) {
+		if (kdamond_wait_activation(ctx))
+			continue;
+
 		if (ctx->primitive.prepare_access_checks)
 			ctx->primitive.prepare_access_checks(ctx);
 		if (ctx->callback.after_sampling &&
--- a/mm/damon/dbgfs.c~mm-damon-schemes-activate-schemes-based-on-a-watermarks-mechanism
+++ a/mm/damon/dbgfs.c
@@ -195,6 +195,9 @@ static struct damos **str_to_schemes(con
 	*nr_schemes = 0;
 	while (pos < len && *nr_schemes < max_nr_schemes) {
 		struct damos_quota quota = {};
+		struct damos_watermarks wmarks = {
+			.metric = DAMOS_WMARK_NONE,
+		};
 
 		ret = sscanf(&str[pos],
 				"%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n",
@@ -212,7 +215,7 @@ static struct damos **str_to_schemes(con
 
 		pos += parsed;
 		scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a,
-				min_age, max_age, action, &quota);
+				min_age, max_age, action, &quota, &wmarks);
 		if (!scheme)
 			goto fail;
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 250/262] mm/damon/dbgfs: support watermarks
  2021-11-05 20:34 incoming Andrew Morton
                   ` (248 preceding siblings ...)
  2021-11-05 20:47 ` [patch 249/262] mm/damon/schemes: activate schemes based on a watermarks mechanism Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 251/262] selftests/damon: " Andrew Morton
                   ` (11 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon/dbgfs: support watermarks

This commit updates DAMON debugfs interface to support the watermarks
based schemes activation.  For this, now 'schemes' file receives five more
values.

Link: https://lkml.kernel.org/r/20211019150731.16699-13-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/dbgfs.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-support-watermarks
+++ a/mm/damon/dbgfs.c
@@ -105,7 +105,7 @@ static ssize_t sprint_schemes(struct dam
 
 	damon_for_each_scheme(s, c) {
 		rc = scnprintf(&buf[written], len - written,
-				"%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %lu %lu\n",
+				"%lu %lu %u %u %u %u %d %lu %lu %lu %u %u %u %d %lu %lu %lu %lu %lu %lu\n",
 				s->min_sz_region, s->max_sz_region,
 				s->min_nr_accesses, s->max_nr_accesses,
 				s->min_age_region, s->max_age_region,
@@ -115,6 +115,8 @@ static ssize_t sprint_schemes(struct dam
 				s->quota.weight_sz,
 				s->quota.weight_nr_accesses,
 				s->quota.weight_age,
+				s->wmarks.metric, s->wmarks.interval,
+				s->wmarks.high, s->wmarks.mid, s->wmarks.low,
 				s->stat_count, s->stat_sz);
 		if (!rc)
 			return -ENOMEM;
@@ -195,18 +197,18 @@ static struct damos **str_to_schemes(con
 	*nr_schemes = 0;
 	while (pos < len && *nr_schemes < max_nr_schemes) {
 		struct damos_quota quota = {};
-		struct damos_watermarks wmarks = {
-			.metric = DAMOS_WMARK_NONE,
-		};
+		struct damos_watermarks wmarks;
 
 		ret = sscanf(&str[pos],
-				"%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u%n",
+				"%lu %lu %u %u %u %u %u %lu %lu %lu %u %u %u %u %lu %lu %lu %lu%n",
 				&min_sz, &max_sz, &min_nr_a, &max_nr_a,
 				&min_age, &max_age, &action, &quota.ms,
 				&quota.sz, &quota.reset_interval,
 				&quota.weight_sz, &quota.weight_nr_accesses,
-				&quota.weight_age, &parsed);
-		if (ret != 13)
+				&quota.weight_age, &wmarks.metric,
+				&wmarks.interval, &wmarks.high, &wmarks.mid,
+				&wmarks.low, &parsed);
+		if (ret != 18)
 			break;
 		if (!damos_action_valid(action)) {
 			pr_err("wrong action %d\n", action);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 251/262] selftests/damon: support watermarks
  2021-11-05 20:34 incoming Andrew Morton
                   ` (249 preceding siblings ...)
  2021-11-05 20:47 ` [patch 250/262] mm/damon/dbgfs: support watermarks Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:47 ` [patch 252/262] mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM) Andrew Morton
                   ` (10 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: selftests/damon: support watermarks

This commit updates DAMON selftests for 'schemes' debugfs file to reflect
the changes in the format.

Link: https://lkml.kernel.org/r/20211019150731.16699-14-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/damon/debugfs_attrs.sh |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/tools/testing/selftests/damon/debugfs_attrs.sh~selftests-damon-support-watermarks
+++ a/tools/testing/selftests/damon/debugfs_attrs.sh
@@ -63,10 +63,10 @@ echo "$orig_content" > "$file"
 file="$DBGFS/schemes"
 orig_content=$(cat "$file")
 
-test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0 1 2 3" \
+test_write_succ "$file" "1 2 3 4 5 6 4 0 0 0 1 2 3 1 100 3 2 1" \
 	"$orig_content" "valid input"
 test_write_fail "$file" "1 2
-3 4 5 6 3 0 0 0 1 2 3" "$orig_content" "multi lines"
+3 4 5 6 3 0 0 0 1 2 3 1 100 3 2 1" "$orig_content" "multi lines"
 test_write_succ "$file" "" "$orig_content" "disabling"
 echo "$orig_content" > "$file"
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 252/262] mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  2021-11-05 20:34 incoming Andrew Morton
                   ` (250 preceding siblings ...)
  2021-11-05 20:47 ` [patch 251/262] selftests/damon: " Andrew Morton
@ 2021-11-05 20:47 ` Andrew Morton
  2021-11-05 20:48 ` [patch 253/262] Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM Andrew Morton
                   ` (9 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:47 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds, yangyingliang

From: SeongJae Park <sj@kernel.org>
Subject: mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)

This commit implements a new kernel subsystem that finds cold memory
regions using DAMON and reclaims those immediately.  It is intended to be
used as proactive lightweigh reclamation logic for light memory pressure. 
For heavy memory pressure, it could be inactivated and fall back to the
traditional page-scanning based reclamation.

It's implemented on top of DAMON framework to use the DAMON-based
Operation Schemes (DAMOS) feature.  It utilizes all the DAMOS features
including speed limit, prioritization, and watermarks.

It could be enabled and tuned in boot time via the kernel boot parameter,
and in run time via its module parameters
('/sys/module/damon_reclaim/parameters/') interface.

[yangyingliang@huawei.com: fix error return code in damon_reclaim_turn()]
  Link: https://lkml.kernel.org/r/20211025124500.2758060-1-yangyingliang@huawei.com
Link: https://lkml.kernel.org/r/20211019150731.16699-15-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/Kconfig   |   12 +
 mm/damon/Makefile  |    1 
 mm/damon/reclaim.c |  356 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 369 insertions(+)

--- a/mm/damon/Kconfig~mm-damon-introduce-damon-based-reclamation-damon_reclaim
+++ a/mm/damon/Kconfig
@@ -73,4 +73,16 @@ config DAMON_DBGFS_KUNIT_TEST
 
 	  If unsure, say N.
 
+config DAMON_RECLAIM
+	bool "Build DAMON-based reclaim (DAMON_RECLAIM)"
+	depends on DAMON_PADDR
+	help
+	  This builds the DAMON-based reclamation subsystem.  It finds pages
+	  that not accessed for a long time (cold) using DAMON and reclaim
+	  those.
+
+	  This is suggested to be used as a proactive and lightweight
+	  reclamation under light memory pressure, while the traditional page
+	  scanning-based reclamation is used for heavy pressure.
+
 endmenu
--- a/mm/damon/Makefile~mm-damon-introduce-damon-based-reclamation-damon_reclaim
+++ a/mm/damon/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_DAMON)		:= core.o
 obj-$(CONFIG_DAMON_VADDR)	+= prmtv-common.o vaddr.o
 obj-$(CONFIG_DAMON_PADDR)	+= prmtv-common.o paddr.o
 obj-$(CONFIG_DAMON_DBGFS)	+= dbgfs.o
+obj-$(CONFIG_DAMON_RECLAIM)	+= reclaim.o
--- /dev/null
+++ a/mm/damon/reclaim.c
@@ -0,0 +1,356 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DAMON-based page reclamation
+ *
+ * Author: SeongJae Park <sj@kernel.org>
+ */
+
+#define pr_fmt(fmt) "damon-reclaim: " fmt
+
+#include <linux/damon.h>
+#include <linux/ioport.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/workqueue.h>
+
+#ifdef MODULE_PARAM_PREFIX
+#undef MODULE_PARAM_PREFIX
+#endif
+#define MODULE_PARAM_PREFIX "damon_reclaim."
+
+/*
+ * Enable or disable DAMON_RECLAIM.
+ *
+ * You can enable DAMON_RCLAIM by setting the value of this parameter as ``Y``.
+ * Setting it as ``N`` disables DAMON_RECLAIM.  Note that DAMON_RECLAIM could
+ * do no real monitoring and reclamation due to the watermarks-based activation
+ * condition.  Refer to below descriptions for the watermarks parameter for
+ * this.
+ */
+static bool enabled __read_mostly;
+module_param(enabled, bool, 0600);
+
+/*
+ * Time threshold for cold memory regions identification in microseconds.
+ *
+ * If a memory region is not accessed for this or longer time, DAMON_RECLAIM
+ * identifies the region as cold, and reclaims.  120 seconds by default.
+ */
+static unsigned long min_age __read_mostly = 120000000;
+module_param(min_age, ulong, 0600);
+
+/*
+ * Limit of time for trying the reclamation in milliseconds.
+ *
+ * DAMON_RECLAIM tries to use only up to this time within a time window
+ * (quota_reset_interval_ms) for trying reclamation of cold pages.  This can be
+ * used for limiting CPU consumption of DAMON_RECLAIM.  If the value is zero,
+ * the limit is disabled.
+ *
+ * 10 ms by default.
+ */
+static unsigned long quota_ms __read_mostly = 10;
+module_param(quota_ms, ulong, 0600);
+
+/*
+ * Limit of size of memory for the reclamation in bytes.
+ *
+ * DAMON_RECLAIM charges amount of memory which it tried to reclaim within a
+ * time window (quota_reset_interval_ms) and makes no more than this limit is
+ * tried.  This can be used for limiting consumption of CPU and IO.  If this
+ * value is zero, the limit is disabled.
+ *
+ * 128 MiB by default.
+ */
+static unsigned long quota_sz __read_mostly = 128 * 1024 * 1024;
+module_param(quota_sz, ulong, 0600);
+
+/*
+ * The time/size quota charge reset interval in milliseconds.
+ *
+ * The charge reset interval for the quota of time (quota_ms) and size
+ * (quota_sz).  That is, DAMON_RECLAIM does not try reclamation for more than
+ * quota_ms milliseconds or quota_sz bytes within quota_reset_interval_ms
+ * milliseconds.
+ *
+ * 1 second by default.
+ */
+static unsigned long quota_reset_interval_ms __read_mostly = 1000;
+module_param(quota_reset_interval_ms, ulong, 0600);
+
+/*
+ * The watermarks check time interval in microseconds.
+ *
+ * Minimal time to wait before checking the watermarks, when DAMON_RECLAIM is
+ * enabled but inactive due to its watermarks rule.  5 seconds by default.
+ */
+static unsigned long wmarks_interval __read_mostly = 5000000;
+module_param(wmarks_interval, ulong, 0600);
+
+/*
+ * Free memory rate (per thousand) for the high watermark.
+ *
+ * If free memory of the system in bytes per thousand bytes is higher than
+ * this, DAMON_RECLAIM becomes inactive, so it does nothing but periodically
+ * checks the watermarks.  500 (50%) by default.
+ */
+static unsigned long wmarks_high __read_mostly = 500;
+module_param(wmarks_high, ulong, 0600);
+
+/*
+ * Free memory rate (per thousand) for the middle watermark.
+ *
+ * If free memory of the system in bytes per thousand bytes is between this and
+ * the low watermark, DAMON_RECLAIM becomes active, so starts the monitoring
+ * and the reclaiming.  400 (40%) by default.
+ */
+static unsigned long wmarks_mid __read_mostly = 400;
+module_param(wmarks_mid, ulong, 0600);
+
+/*
+ * Free memory rate (per thousand) for the low watermark.
+ *
+ * If free memory of the system in bytes per thousand bytes is lower than this,
+ * DAMON_RECLAIM becomes inactive, so it does nothing but periodically checks
+ * the watermarks.  In the case, the system falls back to the LRU-based page
+ * granularity reclamation logic.  200 (20%) by default.
+ */
+static unsigned long wmarks_low __read_mostly = 200;
+module_param(wmarks_low, ulong, 0600);
+
+/*
+ * Sampling interval for the monitoring in microseconds.
+ *
+ * The sampling interval of DAMON for the cold memory monitoring.  Please refer
+ * to the DAMON documentation for more detail.  5 ms by default.
+ */
+static unsigned long sample_interval __read_mostly = 5000;
+module_param(sample_interval, ulong, 0600);
+
+/*
+ * Aggregation interval for the monitoring in microseconds.
+ *
+ * The aggregation interval of DAMON for the cold memory monitoring.  Please
+ * refer to the DAMON documentation for more detail.  100 ms by default.
+ */
+static unsigned long aggr_interval __read_mostly = 100000;
+module_param(aggr_interval, ulong, 0600);
+
+/*
+ * Minimum number of monitoring regions.
+ *
+ * The minimal number of monitoring regions of DAMON for the cold memory
+ * monitoring.  This can be used to set lower-bound of the monitoring quality.
+ * But, setting this too high could result in increased monitoring overhead.
+ * Please refer to the DAMON documentation for more detail.  10 by default.
+ */
+static unsigned long min_nr_regions __read_mostly = 10;
+module_param(min_nr_regions, ulong, 0600);
+
+/*
+ * Maximum number of monitoring regions.
+ *
+ * The maximum number of monitoring regions of DAMON for the cold memory
+ * monitoring.  This can be used to set upper-bound of the monitoring overhead.
+ * However, setting this too low could result in bad monitoring quality.
+ * Please refer to the DAMON documentation for more detail.  1000 by default.
+ */
+static unsigned long max_nr_regions __read_mostly = 1000;
+module_param(max_nr_regions, ulong, 0600);
+
+/*
+ * Start of the target memory region in physical address.
+ *
+ * The start physical address of memory region that DAMON_RECLAIM will do work
+ * against.  By default, biggest System RAM is used as the region.
+ */
+static unsigned long monitor_region_start __read_mostly;
+module_param(monitor_region_start, ulong, 0600);
+
+/*
+ * End of the target memory region in physical address.
+ *
+ * The end physical address of memory region that DAMON_RECLAIM will do work
+ * against.  By default, biggest System RAM is used as the region.
+ */
+static unsigned long monitor_region_end __read_mostly;
+module_param(monitor_region_end, ulong, 0600);
+
+/*
+ * PID of the DAMON thread
+ *
+ * If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread.
+ * Else, -1.
+ */
+static int kdamond_pid __read_mostly = -1;
+module_param(kdamond_pid, int, 0400);
+
+static struct damon_ctx *ctx;
+static struct damon_target *target;
+
+struct damon_reclaim_ram_walk_arg {
+	unsigned long start;
+	unsigned long end;
+};
+
+static int walk_system_ram(struct resource *res, void *arg)
+{
+	struct damon_reclaim_ram_walk_arg *a = arg;
+
+	if (a->end - a->start < res->end - res->start) {
+		a->start = res->start;
+		a->end = res->end;
+	}
+	return 0;
+}
+
+/*
+ * Find biggest 'System RAM' resource and store its start and end address in
+ * @start and @end, respectively.  If no System RAM is found, returns false.
+ */
+static bool get_monitoring_region(unsigned long *start, unsigned long *end)
+{
+	struct damon_reclaim_ram_walk_arg arg = {};
+
+	walk_system_ram_res(0, ULONG_MAX, &arg, walk_system_ram);
+	if (arg.end <= arg.start)
+		return false;
+
+	*start = arg.start;
+	*end = arg.end;
+	return true;
+}
+
+static struct damos *damon_reclaim_new_scheme(void)
+{
+	struct damos_watermarks wmarks = {
+		.metric = DAMOS_WMARK_FREE_MEM_RATE,
+		.interval = wmarks_interval,
+		.high = wmarks_high,
+		.mid = wmarks_mid,
+		.low = wmarks_low,
+	};
+	struct damos_quota quota = {
+		/*
+		 * Do not try reclamation for more than quota_ms milliseconds
+		 * or quota_sz bytes within quota_reset_interval_ms.
+		 */
+		.ms = quota_ms,
+		.sz = quota_sz,
+		.reset_interval = quota_reset_interval_ms,
+		/* Within the quota, page out older regions first. */
+		.weight_sz = 0,
+		.weight_nr_accesses = 0,
+		.weight_age = 1
+	};
+	struct damos *scheme = damon_new_scheme(
+			/* Find regions having PAGE_SIZE or larger size */
+			PAGE_SIZE, ULONG_MAX,
+			/* and not accessed at all */
+			0, 0,
+			/* for min_age or more micro-seconds, and */
+			min_age / aggr_interval, UINT_MAX,
+			/* page out those, as soon as found */
+			DAMOS_PAGEOUT,
+			/* under the quota. */
+			&quota,
+			/* (De)activate this according to the watermarks. */
+			&wmarks);
+
+	return scheme;
+}
+
+static int damon_reclaim_turn(bool on)
+{
+	struct damon_region *region;
+	struct damos *scheme;
+	int err;
+
+	if (!on) {
+		err = damon_stop(&ctx, 1);
+		if (!err)
+			kdamond_pid = -1;
+		return err;
+	}
+
+	err = damon_set_attrs(ctx, sample_interval, aggr_interval, 0,
+			min_nr_regions, max_nr_regions);
+	if (err)
+		return err;
+
+	if (monitor_region_start > monitor_region_end)
+		return -EINVAL;
+	if (!monitor_region_start && !monitor_region_end &&
+			!get_monitoring_region(&monitor_region_start,
+				&monitor_region_end))
+		return -EINVAL;
+	/* DAMON will free this on its own when finish monitoring */
+	region = damon_new_region(monitor_region_start, monitor_region_end);
+	if (!region)
+		return -ENOMEM;
+	damon_add_region(region, target);
+
+	/* Will be freed by 'damon_set_schemes()' below */
+	scheme = damon_reclaim_new_scheme();
+	if (!scheme) {
+		err = -ENOMEM;
+		goto free_region_out;
+	}
+	err = damon_set_schemes(ctx, &scheme, 1);
+	if (err)
+		goto free_scheme_out;
+
+	err = damon_start(&ctx, 1);
+	if (!err) {
+		kdamond_pid = ctx->kdamond->pid;
+		return 0;
+	}
+
+free_scheme_out:
+	damon_destroy_scheme(scheme);
+free_region_out:
+	damon_destroy_region(region, target);
+	return err;
+}
+
+#define ENABLE_CHECK_INTERVAL_MS	1000
+static struct delayed_work damon_reclaim_timer;
+static void damon_reclaim_timer_fn(struct work_struct *work)
+{
+	static bool last_enabled;
+	bool now_enabled;
+
+	now_enabled = enabled;
+	if (last_enabled != now_enabled) {
+		if (!damon_reclaim_turn(now_enabled))
+			last_enabled = now_enabled;
+		else
+			enabled = last_enabled;
+	}
+
+	schedule_delayed_work(&damon_reclaim_timer,
+			msecs_to_jiffies(ENABLE_CHECK_INTERVAL_MS));
+}
+static DECLARE_DELAYED_WORK(damon_reclaim_timer, damon_reclaim_timer_fn);
+
+static int __init damon_reclaim_init(void)
+{
+	ctx = damon_new_ctx();
+	if (!ctx)
+		return -ENOMEM;
+
+	damon_pa_set_primitives(ctx);
+
+	/* 4242 means nothing but fun */
+	target = damon_new_target(4242);
+	if (!target) {
+		damon_destroy_ctx(ctx);
+		return -ENOMEM;
+	}
+	damon_add_target(ctx, target);
+
+	schedule_delayed_work(&damon_reclaim_timer, 0);
+	return 0;
+}
+
+module_init(damon_reclaim_init);
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 253/262] Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  2021-11-05 20:34 incoming Andrew Morton
                   ` (251 preceding siblings ...)
  2021-11-05 20:47 ` [patch 252/262] mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM) Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  2021-11-05 20:48 ` [patch 254/262] mm/damon: remove unnecessary variable initialization Andrew Morton
                   ` (8 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, amit, benh, corbet, david, dwmw, elver, foersleo, gthelen,
	Jonathan.Cameron, linux-mm, markubo, mm-commits, rientjes,
	shakeelb, shuah, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM

This commit adds an admin-guide document for DAMON-based Reclamation.

Link: https://lkml.kernel.org/r/20211019150731.16699-16-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Amit Shah <amit@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David Woodhouse <dwmw@amazon.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leonard Foerster <foersleo@amazon.de>
Cc: Marco Elver <elver@google.com>
Cc: Markus Boehme <markubo@amazon.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/damon/index.rst   |    1 
 Documentation/admin-guide/mm/damon/reclaim.rst |  235 +++++++++++++++
 2 files changed, 236 insertions(+)

--- a/Documentation/admin-guide/mm/damon/index.rst~documentation-admin-guide-mm-damon-add-a-document-for-damon_reclaim
+++ a/Documentation/admin-guide/mm/damon/index.rst
@@ -13,3 +13,4 @@ optimize those.
 
    start
    usage
+   reclaim
--- /dev/null
+++ a/Documentation/admin-guide/mm/damon/reclaim.rst
@@ -0,0 +1,235 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+DAMON-based Reclamation
+=======================
+
+DAMON-based Reclamation (DAMON_RECLAIM) is a static kernel module that aimed to
+be used for proactive and lightweight reclamation under light memory pressure.
+It doesn't aim to replace the LRU-list based page_granularity reclamation, but
+to be selectively used for different level of memory pressure and requirements.
+
+Where Proactive Reclamation is Required?
+========================================
+
+On general memory over-committed systems, proactively reclaiming cold pages
+helps saving memory and reducing latency spikes that incurred by the direct
+reclaim of the process or CPU consumption of kswapd, while incurring only
+minimal performance degradation [1]_ [2]_ .
+
+Free Pages Reporting [3]_ based memory over-commit virtualization systems are
+good example of the cases.  In such systems, the guest VMs reports their free
+memory to host, and the host reallocates the reported memory to other guests.
+As a result, the memory of the systems are fully utilized.  However, the
+guests could be not so memory-frugal, mainly because some kernel subsystems and
+user-space applications are designed to use as much memory as available.  Then,
+guests could report only small amount of memory as free to host, results in
+memory utilization drop of the systems.  Running the proactive reclamation in
+guests could mitigate this problem.
+
+How It Works?
+=============
+
+DAMON_RECLAIM finds memory regions that didn't accessed for specific time
+duration and page out.  To avoid it consuming too much CPU for the paging out
+operation, a speed limit can be configured.  Under the speed limit, it pages
+out memory regions that didn't accessed longer time first.  System
+administrators can also configure under what situation this scheme should
+automatically activated and deactivated with three memory pressure watermarks.
+
+Interface: Module Parameters
+============================
+
+To use this feature, you should first ensure your system is running on a kernel
+that is built with ``CONFIG_DAMON_RECLAIM=y``.
+
+To let sysadmins enable or disable it and tune for the given system,
+DAMON_RECLAIM utilizes module parameters.  That is, you can put
+``damon_reclaim.<parameter>=<value>`` on the kernel boot command line or write
+proper values to ``/sys/modules/damon_reclaim/parameters/<parameter>`` files.
+
+Note that the parameter values except ``enabled`` are applied only when
+DAMON_RECLAIM starts.  Therefore, if you want to apply new parameter values in
+runtime and DAMON_RECLAIM is already enabled, you should disable and re-enable
+it via ``enabled`` parameter file.  Writing of the new values to proper
+parameter values should be done before the re-enablement.
+
+Below are the description of each parameter.
+
+enabled
+-------
+
+Enable or disable DAMON_RECLAIM.
+
+You can enable DAMON_RCLAIM by setting the value of this parameter as ``Y``.
+Setting it as ``N`` disables DAMON_RECLAIM.  Note that DAMON_RECLAIM could do
+no real monitoring and reclamation due to the watermarks-based activation
+condition.  Refer to below descriptions for the watermarks parameter for this.
+
+min_age
+-------
+
+Time threshold for cold memory regions identification in microseconds.
+
+If a memory region is not accessed for this or longer time, DAMON_RECLAIM
+identifies the region as cold, and reclaims it.
+
+120 seconds by default.
+
+quota_ms
+--------
+
+Limit of time for the reclamation in milliseconds.
+
+DAMON_RECLAIM tries to use only up to this time within a time window
+(quota_reset_interval_ms) for trying reclamation of cold pages.  This can be
+used for limiting CPU consumption of DAMON_RECLAIM.  If the value is zero, the
+limit is disabled.
+
+10 ms by default.
+
+quota_sz
+--------
+
+Limit of size of memory for the reclamation in bytes.
+
+DAMON_RECLAIM charges amount of memory which it tried to reclaim within a time
+window (quota_reset_interval_ms) and makes no more than this limit is tried.
+This can be used for limiting consumption of CPU and IO.  If this value is
+zero, the limit is disabled.
+
+128 MiB by default.
+
+quota_reset_interval_ms
+-----------------------
+
+The time/size quota charge reset interval in milliseconds.
+
+The charget reset interval for the quota of time (quota_ms) and size
+(quota_sz).  That is, DAMON_RECLAIM does not try reclamation for more than
+quota_ms milliseconds or quota_sz bytes within quota_reset_interval_ms
+milliseconds.
+
+1 second by default.
+
+wmarks_interval
+---------------
+
+Minimal time to wait before checking the watermarks, when DAMON_RECLAIM is
+enabled but inactive due to its watermarks rule.
+
+wmarks_high
+-----------
+
+Free memory rate (per thousand) for the high watermark.
+
+If free memory of the system in bytes per thousand bytes is higher than this,
+DAMON_RECLAIM becomes inactive, so it does nothing but only periodically checks
+the watermarks.
+
+wmarks_mid
+----------
+
+Free memory rate (per thousand) for the middle watermark.
+
+If free memory of the system in bytes per thousand bytes is between this and
+the low watermark, DAMON_RECLAIM becomes active, so starts the monitoring and
+the reclaiming.
+
+wmarks_low
+----------
+
+Free memory rate (per thousand) for the low watermark.
+
+If free memory of the system in bytes per thousand bytes is lower than this,
+DAMON_RECLAIM becomes inactive, so it does nothing but periodically checks the
+watermarks.  In the case, the system falls back to the LRU-list based page
+granularity reclamation logic.
+
+sample_interval
+---------------
+
+Sampling interval for the monitoring in microseconds.
+
+The sampling interval of DAMON for the cold memory monitoring.  Please refer to
+the DAMON documentation (:doc:`usage`) for more detail.
+
+aggr_interval
+-------------
+
+Aggregation interval for the monitoring in microseconds.
+
+The aggregation interval of DAMON for the cold memory monitoring.  Please
+refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+min_nr_regions
+--------------
+
+Minimum number of monitoring regions.
+
+The minimal number of monitoring regions of DAMON for the cold memory
+monitoring.  This can be used to set lower-bound of the monitoring quality.
+But, setting this too high could result in increased monitoring overhead.
+Please refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+max_nr_regions
+--------------
+
+Maximum number of monitoring regions.
+
+The maximum number of monitoring regions of DAMON for the cold memory
+monitoring.  This can be used to set upper-bound of the monitoring overhead.
+However, setting this too low could result in bad monitoring quality.  Please
+refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+monitor_region_start
+--------------------
+
+Start of target memory region in physical address.
+
+The start physical address of memory region that DAMON_RECLAIM will do work
+against.  That is, DAMON_RECLAIM will find cold memory regions in this region
+and reclaims.  By default, biggest System RAM is used as the region.
+
+monitor_region_end
+------------------
+
+End of target memory region in physical address.
+
+The end physical address of memory region that DAMON_RECLAIM will do work
+against.  That is, DAMON_RECLAIM will find cold memory regions in this region
+and reclaims.  By default, biggest System RAM is used as the region.
+
+kdamond_pid
+-----------
+
+PID of the DAMON thread.
+
+If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread.  Else,
+-1.
+
+Example
+=======
+
+Below runtime example commands make DAMON_RECLAIM to find memory regions that
+not accessed for 30 seconds or more and pages out.  The reclamation is limited
+to be done only up to 1 GiB per second to avoid DAMON_RECLAIM consuming too
+much CPU time for the paging out operation.  It also asks DAMON_RECLAIM to do
+nothing if the system's free memory rate is more than 50%, but start the real
+works if it becomes lower than 40%.  If DAMON_RECLAIM doesn't make progress and
+therefore the free memory rate becomes lower than 20%, it asks DAMON_RECLAIM to
+do nothing again, so that we can fall back to the LRU-list based page
+granularity reclamation. ::
+
+    # cd /sys/modules/damon_reclaim/parameters
+    # echo 30000000 > min_age
+    # echo $((1 * 1024 * 1024 * 1024)) > quota_sz
+    # echo 1000 > quota_reset_interval_ms
+    # echo 500 > wmarks_high
+    # echo 400 > wmarks_mid
+    # echo 200 > wmarks_low
+    # echo Y > enabled
+
+.. [1] https://research.google/pubs/pub48551/
+.. [2] https://lwn.net/Articles/787611/
+.. [3] https://www.kernel.org/doc/html/latest/vm/free_page_reporting.html
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 254/262] mm/damon: remove unnecessary variable initialization
  2021-11-05 20:34 incoming Andrew Morton
                   ` (252 preceding siblings ...)
  2021-11-05 20:48 ` [patch 253/262] Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  2021-11-05 20:48 ` [patch 255/262] mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on Andrew Morton
                   ` (7 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, sj, torvalds, xhao

From: Xin Hao <xhao@linux.alibaba.com>
Subject: mm/damon: remove unnecessary variable initialization

Patch series "mm/damon: Fix some small bugs", v4.


This patch (of 2):

In 'damon_va_apply_three_regions', There is no need to set variable 'i' as
0

Link: https://lkml.kernel.org/r/b7df8d3dad0943a37e01f60c441b1968b2b20354.1634720326.git.xhao@linux.alibaba.com
Link: https://lkml.kernel.org/r/cover.1634720326.git.xhao@linux.alibaba.com
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/vaddr.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/damon/vaddr.c~mm-damon-remove-unnecessary-variable-initialization
+++ a/mm/damon/vaddr.c
@@ -306,7 +306,7 @@ static void damon_va_apply_three_regions
 		struct damon_addr_range bregions[3])
 {
 	struct damon_region *r, *next;
-	unsigned int i = 0;
+	unsigned int i;
 
 	/* Remove regions which are not in the three big regions now */
 	damon_for_each_region_safe(r, next, t) {
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 255/262] mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  2021-11-05 20:34 incoming Andrew Morton
                   ` (253 preceding siblings ...)
  2021-11-05 20:48 ` [patch 254/262] mm/damon: remove unnecessary variable initialization Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  2021-11-05 20:48 ` [patch 256/262] Docs/admin-guide/mm/damon/start: fix wrong example commands Andrew Morton
                   ` (6 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, linux-mm, mm-commits, sj, torvalds, xhao

From: Xin Hao <xhao@linux.alibaba.com>
Subject: mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on

When the ctx->adaptive_targets list is empty, I did some test on
monitor_on interface like this.

    # cat /sys/kernel/debug/damon/target_ids
    #
    # echo on > /sys/kernel/debug/damon/monitor_on
    # damon: kdamond (5390) starts

Though the ctx->adaptive_targets list is empty, but the kthread_run still
be called, and the kdamond.x thread still be created, this is meaningless.

So there adds a judgment in 'dbgfs_monitor_on_write', if the
ctx->adaptive_targets list is empty, return -EINVAL.

Link: https://lkml.kernel.org/r/0a60a6e8ec9d71989e0848a4dc3311996ca3b5d4.1634720326.git.xhao@linux.alibaba.com
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |    1 +
 mm/damon/core.c       |    5 +++++
 mm/damon/dbgfs.c      |   15 ++++++++++++---
 3 files changed, 18 insertions(+), 3 deletions(-)

--- a/include/linux/damon.h~mm-damon-dbgfs-add-adaptive_targets-list-check-before-enable-monitor_on
+++ a/include/linux/damon.h
@@ -440,6 +440,7 @@ void damon_destroy_scheme(struct damos *
 
 struct damon_target *damon_new_target(unsigned long id);
 void damon_add_target(struct damon_ctx *ctx, struct damon_target *t);
+bool damon_targets_empty(struct damon_ctx *ctx);
 void damon_free_target(struct damon_target *t);
 void damon_destroy_target(struct damon_target *t);
 unsigned int damon_nr_regions(struct damon_target *t);
--- a/mm/damon/core.c~mm-damon-dbgfs-add-adaptive_targets-list-check-before-enable-monitor_on
+++ a/mm/damon/core.c
@@ -180,6 +180,11 @@ void damon_add_target(struct damon_ctx *
 	list_add_tail(&t->list, &ctx->adaptive_targets);
 }
 
+bool damon_targets_empty(struct damon_ctx *ctx)
+{
+	return list_empty(&ctx->adaptive_targets);
+}
+
 static void damon_del_target(struct damon_target *t)
 {
 	list_del(&t->list);
--- a/mm/damon/dbgfs.c~mm-damon-dbgfs-add-adaptive_targets-list-check-before-enable-monitor_on
+++ a/mm/damon/dbgfs.c
@@ -878,12 +878,21 @@ static ssize_t dbgfs_monitor_on_write(st
 		return -EINVAL;
 	}
 
-	if (!strncmp(kbuf, "on", count))
+	if (!strncmp(kbuf, "on", count)) {
+		int i;
+
+		for (i = 0; i < dbgfs_nr_ctxs; i++) {
+			if (damon_targets_empty(dbgfs_ctxs[i])) {
+				kfree(kbuf);
+				return -EINVAL;
+			}
+		}
 		ret = damon_start(dbgfs_ctxs, dbgfs_nr_ctxs);
-	else if (!strncmp(kbuf, "off", count))
+	} else if (!strncmp(kbuf, "off", count)) {
 		ret = damon_stop(dbgfs_ctxs, dbgfs_nr_ctxs);
-	else
+	} else {
 		ret = -EINVAL;
+	}
 
 	if (!ret)
 		ret = count;
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 256/262] Docs/admin-guide/mm/damon/start: fix wrong example commands
  2021-11-05 20:34 incoming Andrew Morton
                   ` (254 preceding siblings ...)
  2021-11-05 20:48 ` [patch 255/262] mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  2021-11-05 20:48 ` [patch 257/262] Docs/admin-guide/mm/damon/start: fix a wrong link Andrew Morton
                   ` (5 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, corbet, linux-mm, mm-commits, peterx, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: Docs/admin-guide/mm/damon/start: fix wrong example commands

Patch series "Fix trivial nits in Documentation/admin-guide/mm".

This patchset fixes trivial nits in admin guide documents for DAMON and
pagemap.


This patch (of 4):

Some of the example commands in DAMON getting started guide are
outdated, missing sudo, or just wrong.  This commit fixes those.

Link: https://lkml.kernel.org/r/20211022090311.3856-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/damon/start.rst |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/Documentation/admin-guide/mm/damon/start.rst~docs-admin-guide-mm-damon-start-fix-wrong-example-commands
+++ a/Documentation/admin-guide/mm/damon/start.rst
@@ -19,7 +19,7 @@ your workload. ::
     # mount -t debugfs none /sys/kernel/debug/
     # git clone https://github.com/awslabs/damo
     # ./damo/damo record $(pidof <your workload>)
-    # ./damo/damo report heat --plot_ascii
+    # ./damo/damo report heats --heatmap stdout
 
 The final command draws the access heatmap of ``<your workload>``.  The heatmap
 shows which memory region (x-axis) is accessed when (y-axis) and how frequently
@@ -94,9 +94,9 @@ Visualizing Recorded Patterns
 The following three commands visualize the recorded access patterns and save
 the results as separate image files. ::
 
-    $ damo report heats --heatmap access_pattern_heatmap.png
-    $ damo report wss --range 0 101 1 --plot wss_dist.png
-    $ damo report wss --range 0 101 1 --sortby time --plot wss_chron_change.png
+    $ sudo damo report heats --heatmap access_pattern_heatmap.png
+    $ sudo damo report wss --range 0 101 1 --plot wss_dist.png
+    $ sudo damo report wss --range 0 101 1 --sortby time --plot wss_chron_change.png
 
 - ``access_pattern_heatmap.png`` will visualize the data access pattern in a
   heatmap, showing which memory region (y-axis) got accessed when (x-axis)
@@ -115,9 +115,9 @@ Data Access Pattern Aware Memory Managem
 Below three commands make every memory region of size >=4K that doesn't
 accessed for >=60 seconds in your workload to be swapped out. ::
 
-    $ echo "#min-size max-size min-acc max-acc min-age max-age action" > scheme
-    $ echo "4K        max      0       0       60s     max     pageout" >> scheme
-    $ damo schemes -c my_thp_scheme <pid of your workload>
+    $ echo "#min-size max-size min-acc max-acc min-age max-age action" > test_scheme
+    $ echo "4K        max      0       0       60s     max     pageout" >> test_scheme
+    $ damo schemes -c test_scheme <pid of your workload>
 
 .. [1] https://damonitor.github.io/doc/html/v17/admin-guide/mm/damon/start.html#visualizing-recorded-patterns
 .. [2] https://damonitor.github.io/test/result/visual/latest/rec.heatmap.1.png.html
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 257/262] Docs/admin-guide/mm/damon/start: fix a wrong link
  2021-11-05 20:34 incoming Andrew Morton
                   ` (255 preceding siblings ...)
  2021-11-05 20:48 ` [patch 256/262] Docs/admin-guide/mm/damon/start: fix wrong example commands Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  2021-11-05 20:48 ` [patch 258/262] Docs/admin-guide/mm/damon/start: simplify the content Andrew Morton
                   ` (4 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, corbet, linux-mm, mm-commits, peterx, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: Docs/admin-guide/mm/damon/start: fix a wrong link

The 'Getting Started' of DAMON is providing a link to DAMON's user
interface document while saying about its user space tool's detailed
usages.  This commit fixes the link.

Link: https://lkml.kernel.org/r/20211022090311.3856-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/damon/start.rst |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/Documentation/admin-guide/mm/damon/start.rst~docs-admin-guide-mm-damon-start-fix-a-wrong-link
+++ a/Documentation/admin-guide/mm/damon/start.rst
@@ -6,7 +6,9 @@ Getting Started
 
 This document briefly describes how you can use DAMON by demonstrating its
 default user space tool.  Please note that this document describes only a part
-of its features for brevity.  Please refer to :doc:`usage` for more details.
+of its features for brevity.  Please refer to the usage `doc
+<https://github.com/awslabs/damo/blob/next/USAGE.md>`_ of the tool for more
+details.
 
 
 TL; DR
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 258/262] Docs/admin-guide/mm/damon/start: simplify the content
  2021-11-05 20:34 incoming Andrew Morton
                   ` (256 preceding siblings ...)
  2021-11-05 20:48 ` [patch 257/262] Docs/admin-guide/mm/damon/start: fix a wrong link Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  2021-11-05 20:48 ` [patch 259/262] Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions Andrew Morton
                   ` (3 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, corbet, linux-mm, mm-commits, peterx, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: Docs/admin-guide/mm/damon/start: simplify the content

Information in 'TL; DR' section of 'Getting Started' is duplicated in
other parts of the doc.  It is also asking readers to visit the access
pattern visualizations gallery web site to show the results of example
visualization commands, while the users of the commands can use terminal
output.

To make the doc simple, this commit removes the duplicated 'TL; DR'
section and replaces the visualization example commands with versions
using terminal outputs.

Link: https://lkml.kernel.org/r/20211022090311.3856-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/damon/start.rst |  111 +++++++++--------
 1 file changed, 59 insertions(+), 52 deletions(-)

--- a/Documentation/admin-guide/mm/damon/start.rst~docs-admin-guide-mm-damon-start-simplify-the-content
+++ a/Documentation/admin-guide/mm/damon/start.rst
@@ -11,38 +11,6 @@ of its features for brevity.  Please ref
 details.
 
 
-TL; DR
-======
-
-Follow the commands below to monitor and visualize the memory access pattern of
-your workload. ::
-
-    # # build the kernel with CONFIG_DAMON_*=y, install it, and reboot
-    # mount -t debugfs none /sys/kernel/debug/
-    # git clone https://github.com/awslabs/damo
-    # ./damo/damo record $(pidof <your workload>)
-    # ./damo/damo report heats --heatmap stdout
-
-The final command draws the access heatmap of ``<your workload>``.  The heatmap
-shows which memory region (x-axis) is accessed when (y-axis) and how frequently
-(number; the higher the more accesses have been observed). ::
-
-    111111111111111111111111111111111111111111111111111111110000
-    111121111111111111111111111111211111111111111111111111110000
-    000000000000000000000000000000000000000000000000001555552000
-    000000000000000000000000000000000000000000000222223555552000
-    000000000000000000000000000000000000000011111677775000000000
-    000000000000000000000000000000000000000488888000000000000000
-    000000000000000000000000000000000177888400000000000000000000
-    000000000000000000000000000046666522222100000000000000000000
-    000000000000000000000014444344444300000000000000000000000000
-    000000000000000002222245555510000000000000000000000000000000
-    # access_frequency:  0  1  2  3  4  5  6  7  8  9
-    # x-axis: space (140286319947776-140286426374096: 101.496 MiB)
-    # y-axis: time (605442256436361-605479951866441: 37.695430s)
-    # resolution: 60x10 (1.692 MiB and 3.770s for each character)
-
-
 Prerequisites
 =============
 
@@ -93,22 +61,66 @@ pattern in the ``damon.data`` file.
 Visualizing Recorded Patterns
 =============================
 
-The following three commands visualize the recorded access patterns and save
-the results as separate image files. ::
-
-    $ sudo damo report heats --heatmap access_pattern_heatmap.png
-    $ sudo damo report wss --range 0 101 1 --plot wss_dist.png
-    $ sudo damo report wss --range 0 101 1 --sortby time --plot wss_chron_change.png
-
-- ``access_pattern_heatmap.png`` will visualize the data access pattern in a
-  heatmap, showing which memory region (y-axis) got accessed when (x-axis)
-  and how frequently (color).
-- ``wss_dist.png`` will show the distribution of the working set size.
-- ``wss_chron_change.png`` will show how the working set size has
-  chronologically changed.
+You can visualize the pattern in a heatmap, showing which memory region
+(x-axis) got accessed when (y-axis) and how frequently (number).::
 
-You can view the visualizations of this example workload at [1]_.
-Visualizations of other realistic workloads are available at [2]_ [3]_ [4]_.
+    $ sudo damo report heats --heatmap stdout
+    22222222222222222222222222222222222222211111111111111111111111111111111111111100
+    44444444444444444444444444444444444444434444444444444444444444444444444444443200
+    44444444444444444444444444444444444444433444444444444444444444444444444444444200
+    33333333333333333333333333333333333333344555555555555555555555555555555555555200
+    33333333333333333333333333333333333344444444444444444444444444444444444444444200
+    22222222222222222222222222222222222223355555555555555555555555555555555555555200
+    00000000000000000000000000000000000000288888888888888888888888888888888888888400
+    00000000000000000000000000000000000000288888888888888888888888888888888888888400
+    33333333333333333333333333333333333333355555555555555555555555555555555555555200
+    88888888888888888888888888888888888888600000000000000000000000000000000000000000
+    88888888888888888888888888888888888888600000000000000000000000000000000000000000
+    33333333333333333333333333333333333333444444444444444444444444444444444444443200
+    00000000000000000000000000000000000000288888888888888888888888888888888888888400
+    [...]
+    # access_frequency:  0  1  2  3  4  5  6  7  8  9
+    # x-axis: space (139728247021568-139728453431248: 196.848 MiB)
+    # y-axis: time (15256597248362-15326899978162: 1 m 10.303 s)
+    # resolution: 80x40 (2.461 MiB and 1.758 s for each character)
+
+You can also visualize the distribution of the working set size, sorted by the
+size.::
+
+    $ sudo damo report wss --range 0 101 10
+    # <percentile> <wss>
+    # target_id     18446632103789443072
+    # avr:  107.708 MiB
+      0             0 B |                                                           |
+     10      95.328 MiB |****************************                               |
+     20      95.332 MiB |****************************                               |
+     30      95.340 MiB |****************************                               |
+     40      95.387 MiB |****************************                               |
+     50      95.387 MiB |****************************                               |
+     60      95.398 MiB |****************************                               |
+     70      95.398 MiB |****************************                               |
+     80      95.504 MiB |****************************                               |
+     90     190.703 MiB |*********************************************************  |
+    100     196.875 MiB |***********************************************************|
+
+Using ``--sortby`` option with the above command, you can show how the working
+set size has chronologically changed.::
+
+    $ sudo damo report wss --range 0 101 10 --sortby time
+    # <percentile> <wss>
+    # target_id     18446632103789443072
+    # avr:  107.708 MiB
+      0       3.051 MiB |                                                           |
+     10     190.703 MiB |***********************************************************|
+     20      95.336 MiB |*****************************                              |
+     30      95.328 MiB |*****************************                              |
+     40      95.387 MiB |*****************************                              |
+     50      95.332 MiB |*****************************                              |
+     60      95.320 MiB |*****************************                              |
+     70      95.398 MiB |*****************************                              |
+     80      95.398 MiB |*****************************                              |
+     90      95.340 MiB |*****************************                              |
+    100      95.398 MiB |*****************************                              |
 
 
 Data Access Pattern Aware Memory Management
@@ -120,8 +132,3 @@ accessed for >=60 seconds in your worklo
     $ echo "#min-size max-size min-acc max-acc min-age max-age action" > test_scheme
     $ echo "4K        max      0       0       60s     max     pageout" >> test_scheme
     $ damo schemes -c test_scheme <pid of your workload>
-
-.. [1] https://damonitor.github.io/doc/html/v17/admin-guide/mm/damon/start.html#visualizing-recorded-patterns
-.. [2] https://damonitor.github.io/test/result/visual/latest/rec.heatmap.1.png.html
-.. [3] https://damonitor.github.io/test/result/visual/latest/rec.wss_sz.png.html
-.. [4] https://damonitor.github.io/test/result/visual/latest/rec.wss_time.png.html
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 259/262] Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  2021-11-05 20:34 incoming Andrew Morton
                   ` (257 preceding siblings ...)
  2021-11-05 20:48 ` [patch 258/262] Docs/admin-guide/mm/damon/start: simplify the content Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  2021-11-05 20:48 ` [patch 260/262] mm/damon: simplify stop mechanism Andrew Morton
                   ` (2 subsequent siblings)
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, corbet, linux-mm, mm-commits, peterx, sj, torvalds

From: SeongJae Park <sj@kernel.org>
Subject: Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions

Some descriptions of page flags in 'pagemap.rst' are written in
assumption of none-rst, which respects every new line, as below:

    7 - SLAB
       page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator
       When compound page is used, SLUB/SLQB will only set this flag on the head

Because rst ignores the new line between the first sentence and second
sentence, resulting html looks a little bit weird, as below.

    7 - SLAB
    page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator When
                                                                       ^
    compound page is used, SLUB/SLQB will only set this flag on the head
    page; SLOB will not flag it at all.

This commit makes it more natural and consistent with other parts in the
rendered version.

Link: https://lkml.kernel.org/r/20211022090311.3856-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/mm/pagemap.rst |   53 ++++++++++-----------
 1 file changed, 27 insertions(+), 26 deletions(-)

--- a/Documentation/admin-guide/mm/pagemap.rst~docs-admin-guide-mm-pagemap-wordsmith-page-flags-descriptions
+++ a/Documentation/admin-guide/mm/pagemap.rst
@@ -90,13 +90,14 @@ Short descriptions to the page flags
 ====================================
 
 0 - LOCKED
-   page is being locked for exclusive access, e.g. by undergoing read/write IO
+   The page is being locked for exclusive access, e.g. by undergoing read/write
+   IO.
 7 - SLAB
-   page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator
+   The page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator.
    When compound page is used, SLUB/SLQB will only set this flag on the head
    page; SLOB will not flag it at all.
 10 - BUDDY
-    a free memory block managed by the buddy system allocator
+    A free memory block managed by the buddy system allocator.
     The buddy system organizes free memory in blocks of various orders.
     An order N block has 2^N physically contiguous pages, with the BUDDY flag
     set for and _only_ for the first page.
@@ -112,65 +113,65 @@ Short descriptions to the page flags
 16 - COMPOUND_TAIL
     A compound page tail (see description above).
 17 - HUGE
-    this is an integral part of a HugeTLB page
+    This is an integral part of a HugeTLB page.
 19 - HWPOISON
-    hardware detected memory corruption on this page: don't touch the data!
+    Hardware detected memory corruption on this page: don't touch the data!
 20 - NOPAGE
-    no page frame exists at the requested address
+    No page frame exists at the requested address.
 21 - KSM
-    identical memory pages dynamically shared between one or more processes
+    Identical memory pages dynamically shared between one or more processes.
 22 - THP
-    contiguous pages which construct transparent hugepages
+    Contiguous pages which construct transparent hugepages.
 23 - OFFLINE
-    page is logically offline
+    The page is logically offline.
 24 - ZERO_PAGE
-    zero page for pfn_zero or huge_zero page
+    Zero page for pfn_zero or huge_zero page.
 25 - IDLE
-    page has not been accessed since it was marked idle (see
+    The page has not been accessed since it was marked idle (see
     :ref:`Documentation/admin-guide/mm/idle_page_tracking.rst <idle_page_tracking>`).
     Note that this flag may be stale in case the page was accessed via
     a PTE. To make sure the flag is up-to-date one has to read
     ``/sys/kernel/mm/page_idle/bitmap`` first.
 26 - PGTABLE
-    page is in use as a page table
+    The page is in use as a page table.
 
 IO related page flags
 ---------------------
 
 1 - ERROR
-   IO error occurred
+   IO error occurred.
 3 - UPTODATE
-   page has up-to-date data
+   The page has up-to-date data.
    ie. for file backed page: (in-memory data revision >= on-disk one)
 4 - DIRTY
-   page has been written to, hence contains new data
+   The page has been written to, hence contains new data.
    i.e. for file backed page: (in-memory data revision >  on-disk one)
 8 - WRITEBACK
-   page is being synced to disk
+   The page is being synced to disk.
 
 LRU related page flags
 ----------------------
 
 5 - LRU
-   page is in one of the LRU lists
+   The page is in one of the LRU lists.
 6 - ACTIVE
-   page is in the active LRU list
+   The page is in the active LRU list.
 18 - UNEVICTABLE
-   page is in the unevictable (non-)LRU list It is somehow pinned and
+   The page is in the unevictable (non-)LRU list It is somehow pinned and
    not a candidate for LRU page reclaims, e.g. ramfs pages,
-   shmctl(SHM_LOCK) and mlock() memory segments
+   shmctl(SHM_LOCK) and mlock() memory segments.
 2 - REFERENCED
-   page has been referenced since last LRU list enqueue/requeue
+   The page has been referenced since last LRU list enqueue/requeue.
 9 - RECLAIM
-   page will be reclaimed soon after its pageout IO completed
+   The page will be reclaimed soon after its pageout IO completed.
 11 - MMAP
-   a memory mapped page
+   A memory mapped page.
 12 - ANON
-   a memory mapped page that is not part of a file
+   A memory mapped page that is not part of a file.
 13 - SWAPCACHE
-   page is mapped to swap space, i.e. has an associated swap entry
+   The page is mapped to swap space, i.e. has an associated swap entry.
 14 - SWAPBACKED
-   page is backed by swap/RAM
+   The page is backed by swap/RAM.
 
 The page-types tool in the tools/vm directory can be used to query the
 above flags.
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 260/262] mm/damon: simplify stop mechanism
  2021-11-05 20:34 incoming Andrew Morton
                   ` (258 preceding siblings ...)
  2021-11-05 20:48 ` [patch 259/262] Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  2021-11-05 20:48 ` [patch 261/262] mm/damon: fix a few spelling mistakes in comments and a pr_debug message Andrew Morton
  2021-11-05 20:48 ` [patch 262/262] mm/damon: remove return value from before_terminate callback Andrew Morton
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, changbin.du, linux-mm, mm-commits, sj, torvalds

From: Changbin Du <changbin.du@gmail.com>
Subject: mm/damon: simplify stop mechanism

A kernel thread can exit gracefully with kthread_stop().  So we don't need
a new flag 'kdamond_stop'.  And to make sure the task struct is not freed
when accessing it, get reference to it before termination.

Link: https://lkml.kernel.org/r/20211027130517.4404-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |    1 
 mm/damon/core.c       |   51 +++++++++++-----------------------------
 2 files changed, 15 insertions(+), 37 deletions(-)

--- a/include/linux/damon.h~mm-damon-simplify-stop-mechanism
+++ a/include/linux/damon.h
@@ -381,7 +381,6 @@ struct damon_ctx {
 
 /* public: */
 	struct task_struct *kdamond;
-	bool kdamond_stop;
 	struct mutex kdamond_lock;
 
 	struct damon_primitive primitive;
--- a/mm/damon/core.c~mm-damon-simplify-stop-mechanism
+++ a/mm/damon/core.c
@@ -390,17 +390,6 @@ static unsigned long damon_region_sz_lim
 	return sz;
 }
 
-static bool damon_kdamond_running(struct damon_ctx *ctx)
-{
-	bool running;
-
-	mutex_lock(&ctx->kdamond_lock);
-	running = ctx->kdamond != NULL;
-	mutex_unlock(&ctx->kdamond_lock);
-
-	return running;
-}
-
 static int kdamond_fn(void *data);
 
 /*
@@ -418,7 +407,6 @@ static int __damon_start(struct damon_ct
 	mutex_lock(&ctx->kdamond_lock);
 	if (!ctx->kdamond) {
 		err = 0;
-		ctx->kdamond_stop = false;
 		ctx->kdamond = kthread_run(kdamond_fn, ctx, "kdamond.%d",
 				nr_running_ctxs);
 		if (IS_ERR(ctx->kdamond)) {
@@ -474,13 +462,15 @@ int damon_start(struct damon_ctx **ctxs,
  */
 static int __damon_stop(struct damon_ctx *ctx)
 {
+	struct task_struct *tsk;
+
 	mutex_lock(&ctx->kdamond_lock);
-	if (ctx->kdamond) {
-		ctx->kdamond_stop = true;
+	tsk = ctx->kdamond;
+	if (tsk) {
+		get_task_struct(tsk);
 		mutex_unlock(&ctx->kdamond_lock);
-		while (damon_kdamond_running(ctx))
-			usleep_range(ctx->sample_interval,
-					ctx->sample_interval * 2);
+		kthread_stop(tsk);
+		put_task_struct(tsk);
 		return 0;
 	}
 	mutex_unlock(&ctx->kdamond_lock);
@@ -925,12 +915,8 @@ static bool kdamond_need_update_primitiv
 static bool kdamond_need_stop(struct damon_ctx *ctx)
 {
 	struct damon_target *t;
-	bool stop;
 
-	mutex_lock(&ctx->kdamond_lock);
-	stop = ctx->kdamond_stop;
-	mutex_unlock(&ctx->kdamond_lock);
-	if (stop)
+	if (kthread_should_stop())
 		return true;
 
 	if (!ctx->primitive.target_valid)
@@ -1021,13 +1007,6 @@ static int kdamond_wait_activation(struc
 	return -EBUSY;
 }
 
-static void set_kdamond_stop(struct damon_ctx *ctx)
-{
-	mutex_lock(&ctx->kdamond_lock);
-	ctx->kdamond_stop = true;
-	mutex_unlock(&ctx->kdamond_lock);
-}
-
 /*
  * The monitoring daemon that runs as a kernel thread
  */
@@ -1038,17 +1017,18 @@ static int kdamond_fn(void *data)
 	struct damon_region *r, *next;
 	unsigned int max_nr_accesses = 0;
 	unsigned long sz_limit = 0;
+	bool done = false;
 
 	pr_debug("kdamond (%d) starts\n", current->pid);
 
 	if (ctx->primitive.init)
 		ctx->primitive.init(ctx);
 	if (ctx->callback.before_start && ctx->callback.before_start(ctx))
-		set_kdamond_stop(ctx);
+		done = true;
 
 	sz_limit = damon_region_sz_limit(ctx);
 
-	while (!kdamond_need_stop(ctx)) {
+	while (!kdamond_need_stop(ctx) && !done) {
 		if (kdamond_wait_activation(ctx))
 			continue;
 
@@ -1056,7 +1036,7 @@ static int kdamond_fn(void *data)
 			ctx->primitive.prepare_access_checks(ctx);
 		if (ctx->callback.after_sampling &&
 				ctx->callback.after_sampling(ctx))
-			set_kdamond_stop(ctx);
+			done = true;
 
 		usleep_range(ctx->sample_interval, ctx->sample_interval + 1);
 
@@ -1069,7 +1049,7 @@ static int kdamond_fn(void *data)
 					sz_limit);
 			if (ctx->callback.after_aggregation &&
 					ctx->callback.after_aggregation(ctx))
-				set_kdamond_stop(ctx);
+				done = true;
 			kdamond_apply_schemes(ctx);
 			kdamond_reset_aggregated(ctx);
 			kdamond_split_regions(ctx);
@@ -1088,9 +1068,8 @@ static int kdamond_fn(void *data)
 			damon_destroy_region(r, t);
 	}
 
-	if (ctx->callback.before_terminate &&
-			ctx->callback.before_terminate(ctx))
-		set_kdamond_stop(ctx);
+	if (ctx->callback.before_terminate)
+		ctx->callback.before_terminate(ctx);
 	if (ctx->primitive.cleanup)
 		ctx->primitive.cleanup(ctx);
 
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 261/262] mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  2021-11-05 20:34 incoming Andrew Morton
                   ` (259 preceding siblings ...)
  2021-11-05 20:48 ` [patch 260/262] mm/damon: simplify stop mechanism Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  2021-11-05 20:48 ` [patch 262/262] mm/damon: remove return value from before_terminate callback Andrew Morton
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, colin.i.king, colin.i.king, linux-mm, mm-commits, sj, torvalds

From: Colin Ian King <colin.i.king@googlemail.com>
Subject: mm/damon: fix a few spelling mistakes in comments and a pr_debug message

There are a few spelling mistakes in the code. Fix these.

Link: https://lkml.kernel.org/r/20211028184157.614544-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/damon/core.c       |    2 +-
 mm/damon/dbgfs-test.h |    2 +-
 mm/damon/vaddr-test.h |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

--- a/mm/damon/core.c~mm-damon-fix-a-few-spelling-mistakes-in-comments-and-a-pr_debug-message
+++ a/mm/damon/core.c
@@ -959,7 +959,7 @@ static unsigned long damos_wmark_wait_us
 	/* higher than high watermark or lower than low watermark */
 	if (metric > scheme->wmarks.high || scheme->wmarks.low > metric) {
 		if (scheme->wmarks.activated)
-			pr_debug("inactivate a scheme (%d) for %s wmark\n",
+			pr_debug("deactivate a scheme (%d) for %s wmark\n",
 					scheme->action,
 					metric > scheme->wmarks.high ?
 					"high" : "low");
--- a/mm/damon/dbgfs-test.h~mm-damon-fix-a-few-spelling-mistakes-in-comments-and-a-pr_debug-message
+++ a/mm/damon/dbgfs-test.h
@@ -145,7 +145,7 @@ static void damon_dbgfs_test_set_init_re
 
 		KUNIT_EXPECT_STREQ(test, (char *)buf, expect);
 	}
-	/* Put invlid inputs and check the return error code */
+	/* Put invalid inputs and check the return error code */
 	for (i = 0; i < ARRAY_SIZE(invalid_inputs); i++) {
 		input = invalid_inputs[i];
 		pr_info("input: %s\n", input);
--- a/mm/damon/vaddr-test.h~mm-damon-fix-a-few-spelling-mistakes-in-comments-and-a-pr_debug-message
+++ a/mm/damon/vaddr-test.h
@@ -233,7 +233,7 @@ static void damon_test_apply_three_regio
  * and 70-100) has totally freed and mapped to different area (30-32 and
  * 65-68).  The target regions which were in the old second and third big
  * regions should now be removed and new target regions covering the new second
- * and third big regions should be crated.
+ * and third big regions should be created.
  */
 static void damon_test_apply_three_regions4(struct kunit *test)
 {
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* [patch 262/262] mm/damon: remove return value from before_terminate callback
  2021-11-05 20:34 incoming Andrew Morton
                   ` (260 preceding siblings ...)
  2021-11-05 20:48 ` [patch 261/262] mm/damon: fix a few spelling mistakes in comments and a pr_debug message Andrew Morton
@ 2021-11-05 20:48 ` Andrew Morton
  261 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-05 20:48 UTC (permalink / raw)
  To: akpm, changbin.du, linux-mm, mm-commits, sj, torvalds

From: Changbin Du <changbin.du@gmail.com>
Subject: mm/damon: remove return value from before_terminate callback

Since the return value of 'before_terminate' callback is never used,
we make it have no return value.

Link: https://lkml.kernel.org/r/20211029005023.8895-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/damon.h |    2 +-
 mm/damon/dbgfs.c      |    5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)

--- a/include/linux/damon.h~mm-damon-remove-return-value-from-before_terminate-callback
+++ a/include/linux/damon.h
@@ -322,7 +322,7 @@ struct damon_callback {
 	int (*before_start)(struct damon_ctx *context);
 	int (*after_sampling)(struct damon_ctx *context);
 	int (*after_aggregation)(struct damon_ctx *context);
-	int (*before_terminate)(struct damon_ctx *context);
+	void (*before_terminate)(struct damon_ctx *context);
 };
 
 /**
--- a/mm/damon/dbgfs.c~mm-damon-remove-return-value-from-before_terminate-callback
+++ a/mm/damon/dbgfs.c
@@ -645,18 +645,17 @@ static void dbgfs_fill_ctx_dir(struct de
 		debugfs_create_file(file_names[i], 0600, dir, ctx, fops[i]);
 }
 
-static int dbgfs_before_terminate(struct damon_ctx *ctx)
+static void dbgfs_before_terminate(struct damon_ctx *ctx)
 {
 	struct damon_target *t, *next;
 
 	if (!targetid_is_pid(ctx))
-		return 0;
+		return;
 
 	damon_for_each_target_safe(t, next, ctx) {
 		put_pid((struct pid *)t->id);
 		damon_destroy_target(t);
 	}
-	return 0;
 }
 
 static struct damon_ctx *dbgfs_new_ctx(void)
_


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 071/262] mm/memory.c: use correct VMA flags when freeing page-tables
  2021-11-05 20:38 ` [patch 071/262] mm/memory.c: use correct VMA flags when freeing page-tables Andrew Morton
@ 2021-11-05 20:57   ` Nadav Amit
  2021-11-06 18:54     ` Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Nadav Amit @ 2021-11-05 20:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andrea Arcangeli, Andrew Cooper, Dave Hansen, Linux-MM,
	Andy Lutomirski, mm-commits, Nick Piggin, Peter Zijlstra,
	Thomas Gleixner, Linus Torvalds, Will Deacon, Yu Zhao,
	Hugh Dickins


> On Nov 5, 2021, at 1:38 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> From: Nadav Amit <namit@vmware.com>
> Subject: mm/memory.c: use correct VMA flags when freeing page-tables
> 
> Consistent use of the mmu_gather interface requires a call to
> tlb_start_vma() and tlb_end_vma() for each VMA.  free_pgtables() does not
> follow this pattern.
> 
> Certain architectures need tlb_start_vma() to be called in order for
> tlb_update_vma_flags() to update the VMA flags (tlb->vma_exec and
> tlb->vma_huge), which are later used for the proper TLB flush to be
> issued.  Since tlb_start_vma() is not called, this can lead to the wrong
> VMA flags being used when the flush is performed.
> 
> Specifically, the munmap syscall would call unmap_region(), which unmaps
> the VMAs and then frees the page-tables.  A flush is needed after the
> page-tables are removed to prevent page-walk caches from holding stale
> entries, but this flush would use the flags of the VMA flags of the last
> VMA that was flushed.  This does not appear to be right.
> 
> Use tlb_start_vma() and tlb_end_vma() to prevent this from happening. 
> This might lead to unnecessary calls to flush_cache_range() on certain
> arch's.  If needed, a new flag can be added to mmu_gather to indicate that
> the flush is not needed.

Hugh correctly indicated that I made a silly bug, and this patch is not
healping.

Nothing would explode, but I assumed the patch would be dropped for me
to submit v2.

I’ll send a fix to this fix instead unless it is dropped.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested
  2021-11-05 20:42 ` [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested Andrew Morton
@ 2021-11-05 21:02   ` Matthew Wilcox
  2021-11-06 20:49     ` Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Matthew Wilcox @ 2021-11-05 21:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: adilger.kernel, corbet, david, djwong, hannes, linux-mm, mgorman,
	mhocko, mm-commits, neilb, riel, torvalds, tytso, vbabka

On Fri, Nov 05, 2021 at 01:42:25PM -0700, Andrew Morton wrote:
> --- a/mm/filemap.c~mm-vmscan-throttle-reclaim-until-some-writeback-completes-if-congested
> +++ a/mm/filemap.c
> @@ -1612,6 +1612,7 @@ void end_page_writeback(struct page *pag
>  
>  	smp_mb__after_atomic();
>  	wake_up_page(page, PG_writeback);
> +	acct_reclaim_writeback(page);
>  	put_page(page);
>  }
>  EXPORT_SYMBOL(end_page_writeback);

hmm?  I think you based on some older version of Linus' tree that didn't
have folios.  This fixup patch was against an older fixup patch that you did, 
but maybe it's enough for Linus to apply ...

diff --git a/mm/filemap.c b/mm/filemap.c
index 6844c9816a86..daa0e23a6ee6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1607,7 +1607,7 @@ void folio_end_writeback(struct folio *folio)
 
 	smp_mb__after_atomic();
 	folio_wake(folio, PG_writeback);
-	acct_reclaim_writeback(folio_page(folio, 0));
+	acct_reclaim_writeback(folio);
 	folio_put(folio);
 }
 EXPORT_SYMBOL(folio_end_writeback);
diff --git a/mm/internal.h b/mm/internal.h
index 632c55c5a075..3b79a5c9427a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -41,15 +41,15 @@ static inline void *folio_raw_mapping(struct folio *folio)
 	return (void *)(mapping & ~PAGE_MAPPING_FLAGS);
 }
 
-void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page,
+void __acct_reclaim_writeback(pg_data_t *pgdat, struct folio *folio,
 						int nr_throttled);
-static inline void acct_reclaim_writeback(struct page *page)
+static inline void acct_reclaim_writeback(struct folio *folio)
 {
-	pg_data_t *pgdat = page_pgdat(page);
+	pg_data_t *pgdat = folio_pgdat(folio);
 	int nr_throttled = atomic_read(&pgdat->nr_writeback_throttled);
 
 	if (nr_throttled)
-		__acct_reclaim_writeback(pgdat, page, nr_throttled);
+		__acct_reclaim_writeback(pgdat, folio, nr_throttled);
 }
 
 static inline void wake_throttle_isolated(pg_data_t *pgdat)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 59c07ee4220d..fb9584641ac7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1085,12 +1085,12 @@ void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
  * pages to clean. If enough pages have been cleaned since throttling
  * started then wakeup the throttled tasks.
  */
-void __acct_reclaim_writeback(pg_data_t *pgdat, struct page *page,
+void __acct_reclaim_writeback(pg_data_t *pgdat, struct folio *folio,
 							int nr_throttled)
 {
 	unsigned long nr_written;
 
-	inc_node_page_state(page, NR_THROTTLED_WRITTEN);
+	node_stat_add_folio(folio, NR_THROTTLED_WRITTEN);
 
 	/*
 	 * This is an inaccurate read as the per-cpu deltas may not



^ permalink raw reply related	[flat|nested] 485+ messages in thread

* Re: [patch 080/262] lazy tlb: allow lazy tlb mm refcounting to be configurable
  2021-11-05 20:38 ` [patch 080/262] lazy tlb: allow lazy tlb mm refcounting to be configurable Andrew Morton
@ 2021-11-06  4:29   ` Andy Lutomirski
  2021-11-06 19:10     ` Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andy Lutomirski @ 2021-11-06  4:29 UTC (permalink / raw)
  To: Andrew Morton, anton, Benjamin Herrenschmidt, linux-mm,
	mm-commits, Nicholas Piggin, paulus, Randy Dunlap,
	Linus Torvalds, Peter Zijlstra (Intel)

On Fri, Nov 5, 2021, at 1:38 PM, Andrew Morton wrote:
> From: Nicholas Piggin <npiggin@gmail.com>
> Subject: lazy tlb: allow lazy tlb mm refcounting to be configurable
>
> Add CONFIG_MMU_TLB_REFCOUNT which enables refcounting of the lazy tlb mm
> when it is context switched.  This can be disabled by architectures that
> don't require this refcounting if they clean up lazy tlb mms when the last
> refcount is dropped.  Currently this is always enabled, which is what
> existing code does, so the patch is effectively a no-op.
>
> Rename rq->prev_mm to rq->prev_lazy_mm, because that's what it is.

Still nacked by me.  Since I seem to have been doing a poor job of explaining my issues with this patch, I'll explain with code:

commit 54b675d9b28d9a56289d06a813250472bc621f40
Author: Andy Lutomirski <luto@kernel.org>
Date:   Fri Nov 5 21:20:47 2021 -0700

    [HACK] demonstrate lazy tlb issues

diff --git a/arch/Kconfig b/arch/Kconfig
index cca27f1b5d0e..19f273642d8f 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -442,6 +442,7 @@ config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
 config MMU_LAZY_TLB_REFCOUNT
 	def_bool y
 	depends on !MMU_LAZY_TLB_SHOOTDOWN
+	depends on !X86
 
 # This option allows MMU_LAZY_TLB_REFCOUNT=n. It ensures no CPUs are using an
 # mm as a lazy tlb beyond its last reference count, by shooting down these
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 25dd795497e8..c5a0c1e92524 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4902,6 +4902,13 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	 */
 	arch_start_context_switch(prev);
 
+	/*
+	 * Sanity check: if something went wrong and the previous mm was
+	 * freed while we were still using it, KASAN might not notice
+	 * without help.
+	 */
+	kasan_check_byte(prev->active_mm);
+
 	/*
 	 * kernel -> kernel   lazy + transfer active
 	 *   user -> kernel   lazy + mmgrab_lazy_tlb() active

Build this with KASAN for x86 and try to boot it.  It splats left and right.  The issue is that the !MMU_LAZY_TLB_REFCOUNT mode, while safe under certain select circumstances (maybe -- still not quite convinced) cheats and ignores the fact that the scheduler itself maintains a pointer to the old mm.  On x86, on bare metal, we *already* don't access lazy mms after the process is gone because the pagetable freeing process shoots down the lazy mm, so we are compliant with all the supposed preconditions of this new mode.  But the scheduler itself still has this nonsense active_mm pointer, and, if anyone ever tries to do anything with it (e.g. the above hack to force kasan to validate it), it all blows up.

On top of this, the whole refcount-me-maybe mode seems incredibly fragile, and I don't think the kernel really benefits from having a set of refcount helpers that may or may not keep the supposedly refcounted object alive depending on config.  And the mere fact that my patch appears to work as long as kasan isn't in play should be a pretty good indicator that this whole thing is not terribly robust.

So I think there are a few credible choices:

1. Find an alternative solution that gets the performance we want without dangling references.

2. Make the MMU_LAZY_TLB_REFCOUNT mode genuinely safe.  This means literally ifdeffing out active_mm so it can't dangle.  Doing that cleanly will be a lot of nasty arch work.

I again apologize that my series is taking so long, although I think it's finally getting into decent shape.  I still need to deal with the scs mess (that's new), finish tidying up kthread, and make sure hotplug is good.  But all this is because this is really hairy code and I'm trying to do it right.

If anyone wants to help, help is welcome.  Otherwise, I really do intend to get it all the way done soon.

--Andy


^ permalink raw reply related	[flat|nested] 485+ messages in thread

* Re: [patch 071/262] mm/memory.c: use correct VMA flags when freeing page-tables
  2021-11-05 20:57   ` Nadav Amit
@ 2021-11-06 18:54     ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2021-11-06 18:54 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Andrew Morton, Andrea Arcangeli, Andrew Cooper, Dave Hansen,
	Linux-MM, Andy Lutomirski, mm-commits, Nick Piggin,
	Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao,
	Hugh Dickins

On Fri, Nov 5, 2021 at 1:57 PM Nadav Amit <namit@vmware.com> wrote:
>
> Hugh correctly indicated that I made a silly bug, and this patch is not
> healping.
>
> Nothing would explode, but I assumed the patch would be dropped for me
> to submit v2.
>
> I’ll send a fix to this fix instead unless it is dropped.

I've dropped it.

                Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 080/262] lazy tlb: allow lazy tlb mm refcounting to be configurable
  2021-11-06  4:29   ` Andy Lutomirski
@ 2021-11-06 19:10     ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2021-11-06 19:10 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew Morton, Anton Blanchard, Benjamin Herrenschmidt, Linux-MM,
	mm-commits, Nicholas Piggin, Paul Mackerras, Randy Dunlap,
	Peter Zijlstra (Intel)

Dropped once again until people can agree on this all..

               Linus

On Fri, Nov 5, 2021 at 9:29 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> So I think there are a few credible choices:
>
> 1. Find an alternative solution that gets the performance we want without dangling references.
>
> 2. Make the MMU_LAZY_TLB_REFCOUNT mode genuinely safe.  This means literally ifdeffing out active_mm so it can't dangle.  Doing that cleanly will be a lot of nasty arch work.
>
> I again apologize that my series is taking so long, although I think it's finally getting into decent shape.  I still need to deal with the scs mess (that's new), finish tidying up kthread, and make sure hotplug is good.  But all this is because this is really hairy code and I'm trying to do it right.
>
> If anyone wants to help, help is welcome.  Otherwise, I really do intend to get it all the way done soon.
>
> --Andy


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested
  2021-11-05 21:02   ` Matthew Wilcox
@ 2021-11-06 20:49     ` Linus Torvalds
  2021-11-06 21:12       ` Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2021-11-06 20:49 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Andreas Dilger, Jonathan Corbet, Dave Chinner,
	Darrick J. Wong, Johannes Weiner, Linux-MM, Mel Gorman,
	Michal Hocko, mm-commits, Neil Brown, Rik van Riel,
	Theodore Ts'o, Vlastimil Babka

On Fri, Nov 5, 2021 at 2:05 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> hmm?  I think you based on some older version of Linus' tree that didn't
> have folios.

Andrew these days actually maintains a base commit model exactly so
that he doesn't end up rebasing during development.

So the whole series is based on plain 5.15, and I'll take care of the
conflict resolution.

This workflow can result in more conflicts for me than what Andrew
used to do ("send against current linus tip"), but it means that when
conflicts happen, they get all the merge resolution help that git
gives you, and hopefully what gets tested (over the months that it can
be in -mm) is closer to what gets sent to me.

              Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested
  2021-11-06 20:49     ` Linus Torvalds
@ 2021-11-06 21:12       ` Linus Torvalds
  2021-11-06 21:13         ` Vlastimil Babka
  2021-11-06 22:45         ` Matthew Wilcox
  0 siblings, 2 replies; 485+ messages in thread
From: Linus Torvalds @ 2021-11-06 21:12 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Andreas Dilger, Jonathan Corbet, Dave Chinner,
	Darrick J. Wong, Johannes Weiner, Linux-MM, Mel Gorman,
	Michal Hocko, mm-commits, Neil Brown, Rik van Riel,
	Theodore Ts'o, Vlastimil Babka

On Sat, Nov 6, 2021 at 1:49 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> This workflow can result in more conflicts for me than what Andrew
> used to do ("send against current linus tip"), but it means that when
> conflicts happen, they get all the merge resolution help that git
> gives you, and hopefully what gets tested (over the months that it can
> be in -mm) is closer to what gets sent to me.

.. and resolving the conflicts (none of which looked bad), I think
that part of the resolution ends up doing very similar things to your
fixup patch.

So it looks all good.

Famous last words.

                Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested
  2021-11-06 21:12       ` Linus Torvalds
@ 2021-11-06 21:13         ` Vlastimil Babka
  2021-11-06 21:20           ` Andrew Morton
  2021-11-06 21:20           ` Linus Torvalds
  2021-11-06 22:45         ` Matthew Wilcox
  1 sibling, 2 replies; 485+ messages in thread
From: Vlastimil Babka @ 2021-11-06 21:13 UTC (permalink / raw)
  To: Linus Torvalds, Matthew Wilcox
  Cc: Andrew Morton, Andreas Dilger, Jonathan Corbet, Dave Chinner,
	Darrick J. Wong, Johannes Weiner, Linux-MM, Mel Gorman,
	Michal Hocko, mm-commits, Neil Brown, Rik van Riel,
	Theodore Ts'o

On 11/6/21 22:12, Linus Torvalds wrote:
> On Sat, Nov 6, 2021 at 1:49 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> This workflow can result in more conflicts for me than what Andrew
>> used to do ("send against current linus tip"), but it means that when
>> conflicts happen, they get all the merge resolution help that git
>> gives you, and hopefully what gets tested (over the months that it can
>> be in -mm) is closer to what gets sent to me.
> 
> .. and resolving the conflicts (none of which looked bad), I think
> that part of the resolution ends up doing very similar things to your
> fixup patch.

If this needed resolution, didn't the resolution exist in -next already?

> So it looks all good.
> 
> Famous last words.
> 
>                 Linus
> 



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested
  2021-11-06 21:13         ` Vlastimil Babka
@ 2021-11-06 21:20           ` Andrew Morton
  2021-11-06 21:20           ` Linus Torvalds
  1 sibling, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-06 21:20 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Linus Torvalds, Matthew Wilcox, Andreas Dilger, Jonathan Corbet,
	Dave Chinner, Darrick J. Wong, Johannes Weiner, Linux-MM,
	Mel Gorman, Michal Hocko, mm-commits, Neil Brown, Rik van Riel,
	Theodore Ts'o

On Sat, 6 Nov 2021 22:13:34 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:

> On 11/6/21 22:12, Linus Torvalds wrote:
> > On Sat, Nov 6, 2021 at 1:49 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >>
> >> This workflow can result in more conflicts for me than what Andrew
> >> used to do ("send against current linus tip"), but it means that when
> >> conflicts happen, they get all the merge resolution help that git
> >> gives you, and hopefully what gets tested (over the months that it can
> >> be in -mm) is closer to what gets sent to me.
> > 
> > .. and resolving the conflicts (none of which looked bad), I think
> > that part of the resolution ends up doing very similar things to your
> > fixup patch.
> 
> If this needed resolution, didn't the resolution exist in -next already?

Yes, but I had it queued after linux-next.patch so it got lost in the
unholy mess that linux-next becomes during the merge window.

I'm still figuring this out.  In retrospect I should have moved this
patch "mm/vmscan: throttle reclaim until some writeback completes if
congested" to the post-linux-next section weeks ago, then waited for
the prerequisites to be merged into mainline.  That way the unaltered,
tested patch would have smoothly slotted in late in the merge window.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested
  2021-11-06 21:13         ` Vlastimil Babka
  2021-11-06 21:20           ` Andrew Morton
@ 2021-11-06 21:20           ` Linus Torvalds
  1 sibling, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2021-11-06 21:20 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Matthew Wilcox, Andrew Morton, Andreas Dilger, Jonathan Corbet,
	Dave Chinner, Darrick J. Wong, Johannes Weiner, Linux-MM,
	Mel Gorman, Michal Hocko, mm-commits, Neil Brown, Rik van Riel,
	Theodore Ts'o

On Sat, Nov 6, 2021 at 2:13 PM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> If this needed resolution, didn't the resolution exist in -next already?

Oh, I'm sure it was there in -next.

But I just always do my own merge resolution anyway because I want to
see what's going on.

I don't look at other peoples resolutions, and I much prefer to
actually look at the history itself in order to actually understand
what the history and cause for the conflicts is (and what the proper
resolution was).

Of course, in many cases it's so trivial that there's not a lot to
"understand", and most merge conflicts by far are not the kind that
need a lot of thought.

But just to clarify: I do actually like seeing people send their
resolutions to me (possibly as an addendum to the pull request email,
or possibly as a separate "resolved" branch).

I don't use those to guide my resolution, but if there is any subtle
issues at all, I will then compare the end results to verify that they
agreed. Often any differences tend to be just whitespace or similar,
but it can be interesting to see when there are meaningful semantic
differences.

          Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested
  2021-11-06 21:12       ` Linus Torvalds
  2021-11-06 21:13         ` Vlastimil Babka
@ 2021-11-06 22:45         ` Matthew Wilcox
  2021-11-06 23:26           ` Linus Torvalds
  1 sibling, 1 reply; 485+ messages in thread
From: Matthew Wilcox @ 2021-11-06 22:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Andreas Dilger, Jonathan Corbet, Dave Chinner,
	Darrick J. Wong, Johannes Weiner, Linux-MM, Mel Gorman,
	Michal Hocko, mm-commits, Neil Brown, Rik van Riel,
	Theodore Ts'o, Vlastimil Babka

On Sat, Nov 06, 2021 at 02:12:02PM -0700, Linus Torvalds wrote:
> On Sat, Nov 6, 2021 at 1:49 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > This workflow can result in more conflicts for me than what Andrew
> > used to do ("send against current linus tip"), but it means that when
> > conflicts happen, they get all the merge resolution help that git
> > gives you, and hopefully what gets tested (over the months that it can
> > be in -mm) is closer to what gets sent to me.
> 
> .. and resolving the conflicts (none of which looked bad), I think
> that part of the resolution ends up doing very similar things to your
> fixup patch.

Reviewed what you did in the merge commit, looks good to me.  And I've
learned I need to run git log --cc instead of -p in order to see all
changes to a file.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested
  2021-11-06 22:45         ` Matthew Wilcox
@ 2021-11-06 23:26           ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2021-11-06 23:26 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Andreas Dilger, Jonathan Corbet, Dave Chinner,
	Darrick J. Wong, Johannes Weiner, Linux-MM, Mel Gorman,
	Michal Hocko, mm-commits, Neil Brown, Rik van Riel,
	Theodore Ts'o, Vlastimil Babka

On Sat, Nov 6, 2021 at 3:46 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> Reviewed what you did in the merge commit, looks good to me.  And I've
> learned I need to run git log --cc instead of -p in order to see all
> changes to a file.

Heh.

If this is your first time using "--cc" (although it's the default for
"git show", so you may have used it without being aware of it), it's
very useful and powerful, but it's worth keeping in mind that it's
also a lot more limited than the merge-time "git diff" output.

At merge time, git has computed the shared state parenthood, and "git
diff" knows about not only the current state, but also the state of
both parents and the base state of the file (in a three-way merge kind
of sense, although with recursive merges the "base state" may be much
more complex than just a shared parent state).

But "git log --cc" (and related "show commit" kind of things, like
"git show" and friends) only sees the final result and the parent
information. The full common parent and base state isn't there
after-the-fact.

That means that "git log --cc" doesn't have quite as much information
to go by, and the "--cc" output can sometimes be a bit misleading.

In particular, if there was a conflict, and the resolution ended up
basically being "take one side where the conflict was", then "git log
--cc" will not show the conflict resolution as a conflict at all - it
will just think "ok, development was done on that branch, the other
side was irrelevant".

So "--cc" is very useful, and often shows that interesting sub-part of
the merge where there were conflicts. But it's definitely somewhat
limited, and can end up looking like there was no conflict at all even
when there was something.

           Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 099/262] mm/vmalloc: be more explicit about supported gfp flags
  2021-11-05 20:39 ` [patch 099/262] mm/vmalloc: be more explicit about supported gfp flags Andrew Morton
@ 2021-11-08  9:25   ` Michal Hocko
  2021-11-08 17:15     ` Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Michal Hocko @ 2021-11-08  9:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: akpm, david, hch, idryomov, jlayton, linux-mm, mm-commits, neilb,
	torvalds, urezki

On Fri 05-11-21 13:39:50, Andrew Morton wrote:
> From: Michal Hocko <mhocko@suse.com>
> Subject: mm/vmalloc: be more explicit about supported gfp flags
> 
> The core of the vmalloc allocator __vmalloc_area_node doesn't say anything
> about gfp mask argument.  Not all gfp flags are supported though.  Be more
> explicit about constraints.
> 
> Link: https://lkml.kernel.org/r/20211020082545.4830-1-mhocko@kernel.org
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Neil Brown <neilb@suse.de>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Ilya Dryomov <idryomov@gmail.com>
> Cc: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

As already pointed out
http://lkml.kernel.org/r/YXE+hcodJ7zxeYA7@dhcp22.suse.cz this patch
cannot be applied without other patches from the same series.

> ---
> 
>  mm/vmalloc.c |   12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> --- a/mm/vmalloc.c~mm-vmalloc-be-more-explicit-about-supported-gfp-flags
> +++ a/mm/vmalloc.c
> @@ -2983,8 +2983,16 @@ fail:
>   * @caller:		  caller's return address
>   *
>   * Allocate enough pages to cover @size from the page level
> - * allocator with @gfp_mask flags.  Map them into contiguous
> - * kernel virtual space, using a pagetable protection of @prot.
> + * allocator with @gfp_mask flags. Please note that the full set of gfp
> + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
> + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
> + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
> + * __GFP_NOWARN can be used to suppress error messages about failures.
> + *
> + * Map them into contiguous kernel virtual space, using a pagetable
> + * protection of @prot.
>   *
>   * Return: the address of the area or %NULL on failure
>   */
> _

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 099/262] mm/vmalloc: be more explicit about supported gfp flags
  2021-11-08  9:25   ` Michal Hocko
@ 2021-11-08 17:15     ` Linus Torvalds
  2021-11-08 17:30       ` Michal Hocko
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2021-11-08 17:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Linux Kernel Mailing List, Andrew Morton, Dave Chinner,
	Christoph Hellwig, Ilya Dryomov, Jeff Layton, Linux-MM,
	mm-commits, Neil Brown, Uladzislau Rezki

On Mon, Nov 8, 2021 at 1:25 AM Michal Hocko <mhocko@suse.com> wrote:
>
> As already pointed out
> http://lkml.kernel.org/r/YXE+hcodJ7zxeYA7@dhcp22.suse.cz this patch
> cannot be applied without other patches from the same series.

Hmm. I've taken it already.

Not a huge deal, since it's a comment change - and the code will
presumably eventually match the updated comment.

I guess it's a new thing that instead of stale comments, we have
future-proof ones ;)

              Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: [patch 099/262] mm/vmalloc: be more explicit about supported gfp flags
  2021-11-08 17:15     ` Linus Torvalds
@ 2021-11-08 17:30       ` Michal Hocko
  0 siblings, 0 replies; 485+ messages in thread
From: Michal Hocko @ 2021-11-08 17:30 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Dave Chinner, Christoph Hellwig, Ilya Dryomov,
	Jeff Layton, Linux-MM, mm-commits, Neil Brown, Uladzislau Rezki

On Mon 08-11-21 09:15:04, Linus Torvalds wrote:
> On Mon, Nov 8, 2021 at 1:25 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > As already pointed out
> > http://lkml.kernel.org/r/YXE+hcodJ7zxeYA7@dhcp22.suse.cz this patch
> > cannot be applied without other patches from the same series.
> 
> Hmm. I've taken it already.
> 
> Not a huge deal, since it's a comment change - and the code will
> presumably eventually match the updated comment.

I plan to send the rest after the merge window.
 
> I guess it's a new thing that instead of stale comments, we have
> future-proof ones ;)

I just hope nobody gets confused about which are not supported yet. E.g.
GFP_NOFAIL, GFP_NO{FS,IO}. In both cases the direct use could lead to
bugs.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-04-27 19:41 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-04-27 19:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits, patches

2 patches, based on d615b5416f8a1afeb82d13b238f8152c572d59c0.

Subsystems affected by this patch series:

  mm/kasan
  mm/debug

Subsystem: mm/kasan

    Zqiang <qiang1.zhang@intel.com>:
      kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same time

Subsystem: mm/debug

    Akira Yokosawa <akiyks@gmail.com>:
      docs: vm/page_owner: use literal blocks for param description

 Documentation/vm/page_owner.rst |    5 +++--
 mm/kasan/quarantine.c           |    7 +++++++
 2 files changed, 10 insertions(+), 2 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-04-21 23:35 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-04-21 23:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm, patches

13 patches, based on b253435746d9a4a701b5f09211b9c14d3370d0da.

Subsystems affected by this patch series:

  mm/memory-failure
  mm/memcg
  mm/userfaultfd
  mm/hugetlbfs
  mm/mremap
  mm/oom-kill
  mm/kasan
  kcov
  mm/hmm

Subsystem: mm/memory-failure

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()

    Xu Yu <xuyu@linux.alibaba.com>:
      mm/memory-failure.c: skip huge_zero_page in memory_failure()

Subsystem: mm/memcg

    Shakeel Butt <shakeelb@google.com>:
      memcg: sync flush only if periodic flush is delayed

Subsystem: mm/userfaultfd

    Nadav Amit <namit@vmware.com>:
      userfaultfd: mark uffd_wp regardless of VM_WRITE flag

Subsystem: mm/hugetlbfs

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      mm, hugetlb: allow for "high" userspace addresses

Subsystem: mm/mremap

    Sidhartha Kumar <sidhartha.kumar@oracle.com>:
      selftest/vm: verify mmap addr in mremap_test
      selftest/vm: verify remap destination address in mremap_test
      selftest/vm: support xfail in mremap_test
      selftest/vm: add skip support to mremap_test

Subsystem: mm/oom-kill

    Nico Pache <npache@redhat.com>:
      oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup

Subsystem: mm/kasan

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
      MAINTAINERS: add Vincenzo Frascino to KASAN reviewers

Subsystem: kcov

    Aleksandr Nogikh <nogikh@google.com>:
      kcov: don't generate a warning on vm_insert_page()'s failure

Subsystem: mm/hmm

    Alistair Popple <apopple@nvidia.com>:
      mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove()

 MAINTAINERS                               |    1 
 fs/hugetlbfs/inode.c                      |    9 -
 include/linux/hugetlb.h                   |    6 +
 include/linux/memcontrol.h                |    5 
 include/linux/mm.h                        |    8 +
 include/linux/sched.h                     |    1 
 include/linux/sched/mm.h                  |    8 +
 kernel/kcov.c                             |    7 -
 mm/hugetlb.c                              |   10 +
 mm/memcontrol.c                           |   12 ++
 mm/memory-failure.c                       |  158 ++++++++++++++++++++++--------
 mm/mmap.c                                 |    8 -
 mm/mmu_notifier.c                         |   14 ++
 mm/oom_kill.c                             |   54 +++++++---
 mm/userfaultfd.c                          |   15 +-
 mm/workingset.c                           |    2 
 tools/testing/selftests/vm/mremap_test.c  |   85 +++++++++++++++-
 tools/testing/selftests/vm/run_vmtests.sh |   11 +-
 18 files changed, 327 insertions(+), 87 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-04-15  2:12 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-04-15  2:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits, patches

14 patches, based on 115acbb56978941bb7537a97dfc303da286106c1.

Subsystems affected by this patch series:

  MAINTAINERS
  mm/tmpfs
  m/secretmem
  mm/kasan
  mm/kfence
  mm/pagealloc
  mm/zram
  mm/compaction
  mm/hugetlb
  binfmt
  mm/vmalloc
  mm/kmemleak

Subsystem: MAINTAINERS

    Joe Perches <joe@perches.com>:
      MAINTAINERS: Broadcom internal lists aren't maintainers

Subsystem: mm/tmpfs

    Hugh Dickins <hughd@google.com>:
      tmpfs: fix regressions from wider use of ZERO_PAGE

Subsystem: m/secretmem

    Axel Rasmussen <axelrasmussen@google.com>:
      mm/secretmem: fix panic when growing a memfd_secret

Subsystem: mm/kasan

    Zqiang <qiang1.zhang@intel.com>:
      irq_work: use kasan_record_aux_stack_noalloc() record callstack

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
      kasan: fix hw tags enablement when KUNIT tests are disabled

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      mm, kfence: support kmem_dump_obj() for KFENCE objects

Subsystem: mm/pagealloc

    Juergen Gross <jgross@suse.com>:
      mm, page_alloc: fix build_zonerefs_node()

Subsystem: mm/zram

    Minchan Kim <minchan@kernel.org>:
      mm: fix unexpected zeroed page mapping with zram swap

Subsystem: mm/compaction

    Charan Teja Kalla <quic_charante@quicinc.com>:
      mm: compaction: fix compiler warning when CONFIG_COMPACTION=n

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlb: do not demote poisoned hugetlb pages

Subsystem: binfmt

    Andrew Morton <akpm@linux-foundation.org>:
      revert "fs/binfmt_elf: fix PT_LOAD p_align values for loaders"
      revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE"

Subsystem: mm/vmalloc

    Omar Sandoval <osandov@fb.com>:
      mm/vmalloc: fix spinning drain_vmap_work after reading from /proc/vmcore

Subsystem: mm/kmemleak

    Patrick Wang <patrick.wang.shcn@gmail.com>:
      mm: kmemleak: take a full lowmem check in kmemleak_*_phys()

 MAINTAINERS                     |   64 ++++++++++++++++++++--------------------
 arch/x86/include/asm/io.h       |    2 -
 arch/x86/kernel/crash_dump_64.c |    1 
 fs/binfmt_elf.c                 |    6 +--
 include/linux/kfence.h          |   24 +++++++++++++++
 kernel/irq_work.c               |    2 -
 mm/compaction.c                 |   10 +++---
 mm/filemap.c                    |    6 ---
 mm/hugetlb.c                    |   17 ++++++----
 mm/kasan/hw_tags.c              |    5 +--
 mm/kasan/kasan.h                |   10 +++---
 mm/kfence/core.c                |   21 -------------
 mm/kfence/kfence.h              |   21 +++++++++++++
 mm/kfence/report.c              |   47 +++++++++++++++++++++++++++++
 mm/kmemleak.c                   |    8 ++---
 mm/page_alloc.c                 |    2 -
 mm/page_io.c                    |   54 ---------------------------------
 mm/secretmem.c                  |   17 ++++++++++
 mm/shmem.c                      |   31 ++++++++++++-------
 mm/slab.c                       |    2 -
 mm/slab.h                       |    2 -
 mm/slab_common.c                |    9 +++++
 mm/slob.c                       |    2 -
 mm/slub.c                       |    2 -
 mm/vmalloc.c                    |   11 ------
 25 files changed, 207 insertions(+), 169 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-04-08 20:08 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-04-08 20:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits, patches

9 patches, based on d00c50b35101b862c3db270ffeba53a63a1063d9.

Subsystems affected by this patch series:

  mm/migration
  mm/highmem
  lz4
  mm/sparsemem
  mm/mremap
  mm/mempolicy
  mailmap
  mm/memcg
  MAINTAINERS

Subsystem: mm/migration

    Zi Yan <ziy@nvidia.com>:
      mm: migrate: use thp_order instead of HPAGE_PMD_ORDER for new page allocation.

Subsystem: mm/highmem

    Max Filippov <jcmvbkbc@gmail.com>:
      highmem: fix checks in __kmap_local_sched_{in,out}

Subsystem: lz4

    Guo Xuenan <guoxuenan@huawei.com>:
      lz4: fix LZ4_decompress_safe_partial read out of bound

Subsystem: mm/sparsemem

    Waiman Long <longman@redhat.com>:
      mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning

Subsystem: mm/mremap

    Paolo Bonzini <pbonzini@redhat.com>:
      mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0)

Subsystem: mm/mempolicy

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/mempolicy: fix mpol_new leak in shared_policy_replace

Subsystem: mailmap

    Vasily Averin <vasily.averin@linux.dev>:
      mailmap: update Vasily Averin's email address

Subsystem: mm/memcg

    Andrew Morton <akpm@linux-foundation.org>:
      mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()"

Subsystem: MAINTAINERS

    Tom Rix <trix@redhat.com>:
      MAINTAINERS: add Tom as clang reviewer

 .mailmap                 |    4 ++++
 MAINTAINERS              |    1 +
 include/linux/mmzone.h   |   11 +++++++----
 lib/lz4/lz4_decompress.c |    8 ++++++--
 mm/highmem.c             |    4 ++--
 mm/list_lru.c            |    6 ------
 mm/mempolicy.c           |    3 ++-
 mm/migrate.c             |    2 +-
 mm/mremap.c              |    3 +++
 9 files changed, 26 insertions(+), 16 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-04-01 18:27 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-04-01 18:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits, patches

16 patches, based on e8b767f5e04097aaedcd6e06e2270f9fe5282696.

Subsystems affected by this patch series:

  mm/madvise
  ofs2
  nilfs2
  mm/mlock
  mm/mfence
  mailmap
  mm/memory-failure
  mm/kasan
  mm/debug
  mm/kmemleak
  mm/damon

Subsystem: mm/madvise

    Charan Teja Kalla <quic_charante@quicinc.com>:
      Revert "mm: madvise: skip unmapped vma holes passed to process_madvise"

Subsystem: ofs2

    Joseph Qi <joseph.qi@linux.alibaba.com>:
      ocfs2: fix crash when mount with quota enabled

Subsystem: nilfs2

    Ryusuke Konishi <konishi.ryusuke@gmail.com>:
    Patch series "nilfs2 lockdep warning fixes":
      nilfs2: fix lockdep warnings in page operations for btree nodes
      nilfs2: fix lockdep warnings during disk space reclamation
      nilfs2: get rid of nilfs_mapping_init()

Subsystem: mm/mlock

    Hugh Dickins <hughd@google.com>:
      mm/munlock: add lru_add_drain() to fix memcg_stat_test
      mm/munlock: update Documentation/vm/unevictable-lru.rst

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm/munlock: protect the per-CPU pagevec by a local_lock_t

Subsystem: mm/kfence

    Muchun Song <songmuchun@bytedance.com>:
      mm: kfence: fix objcgs vector allocation

Subsystem: mailmap

    Kirill Tkhai <kirill.tkhai@openvz.org>:
      mailmap: update Kirill's email

Subsystem: mm/memory-failure

    Rik van Riel <riel@surriel.com>:
      mm,hwpoison: unmap poisoned page before invalidation

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      mm, kasan: fix __GFP_BITS_SHIFT definition breaking LOCKDEP

Subsystem: mm/debug

    Yinan Zhang <zhangyinan2019@email.szu.edu.cn>:
      tools/vm/page_owner_sort.c: remove -c option
      doc/vm/page_owner.rst: remove content related to -c option

Subsystem: mm/kmemleak

    Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>:
      mm/kmemleak: reset tag when compare object pointer

Subsystem: mm/damon

    Jonghyeon Kim <tome01@ajou.ac.kr>:
      mm/damon: prevent activated scheme from sleeping by deactivated schemes

 .mailmap                             |    1 
 Documentation/vm/page_owner.rst      |    1 
 Documentation/vm/unevictable-lru.rst |  473 +++++++++++++++--------------------
 fs/nilfs2/btnode.c                   |   23 +
 fs/nilfs2/btnode.h                   |    1 
 fs/nilfs2/btree.c                    |   27 +
 fs/nilfs2/dat.c                      |    4 
 fs/nilfs2/gcinode.c                  |    7 
 fs/nilfs2/inode.c                    |  167 +++++++++++-
 fs/nilfs2/mdt.c                      |   45 ++-
 fs/nilfs2/mdt.h                      |    6 
 fs/nilfs2/nilfs.h                    |   16 -
 fs/nilfs2/page.c                     |   16 -
 fs/nilfs2/page.h                     |    1 
 fs/nilfs2/segment.c                  |    9 
 fs/nilfs2/super.c                    |    5 
 fs/ocfs2/quota_global.c              |   23 -
 fs/ocfs2/quota_local.c               |    2 
 include/linux/gfp.h                  |    4 
 mm/damon/core.c                      |    5 
 mm/gup.c                             |   10 
 mm/internal.h                        |    6 
 mm/kfence/core.c                     |   11 
 mm/kfence/kfence.h                   |    3 
 mm/kmemleak.c                        |    9 
 mm/madvise.c                         |    9 
 mm/memory.c                          |   12 
 mm/migrate.c                         |    2 
 mm/mlock.c                           |   46 ++-
 mm/page_alloc.c                      |    1 
 mm/rmap.c                            |    4 
 mm/swap.c                            |    4 
 tools/vm/page_owner_sort.c           |    6 
 33 files changed, 560 insertions(+), 399 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2022-04-01 18:20 incoming Andrew Morton
@ 2022-04-01 18:27 ` Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-04-01 18:27 UTC (permalink / raw)
  To: Linus Torvalds, linux-mm, mm-commits, patches

Argh, messed up in-reply-to.  Let me redo...


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-04-01 18:20 Andrew Morton
  2022-04-01 18:27 ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2022-04-01 18:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits, patches

16 patches, based on e8b767f5e04097aaedcd6e06e2270f9fe5282696.

Subsystems affected by this patch series:

  mm/madvise
  ofs2
  nilfs2
  mm/mlock
  mm/mfence
  mailmap
  mm/memory-failure
  mm/kasan
  mm/debug
  mm/kmemleak
  mm/damon

Subsystem: mm/madvise

    Charan Teja Kalla <quic_charante@quicinc.com>:
      Revert "mm: madvise: skip unmapped vma holes passed to process_madvise"

Subsystem: ofs2

    Joseph Qi <joseph.qi@linux.alibaba.com>:
      ocfs2: fix crash when mount with quota enabled

Subsystem: nilfs2

    Ryusuke Konishi <konishi.ryusuke@gmail.com>:
    Patch series "nilfs2 lockdep warning fixes":
      nilfs2: fix lockdep warnings in page operations for btree nodes
      nilfs2: fix lockdep warnings during disk space reclamation
      nilfs2: get rid of nilfs_mapping_init()

Subsystem: mm/mlock

    Hugh Dickins <hughd@google.com>:
      mm/munlock: add lru_add_drain() to fix memcg_stat_test
      mm/munlock: update Documentation/vm/unevictable-lru.rst

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm/munlock: protect the per-CPU pagevec by a local_lock_t

Subsystem: mm/kfence

    Muchun Song <songmuchun@bytedance.com>:
      mm: kfence: fix objcgs vector allocation

Subsystem: mailmap

    Kirill Tkhai <kirill.tkhai@openvz.org>:
      mailmap: update Kirill's email

Subsystem: mm/memory-failure

    Rik van Riel <riel@surriel.com>:
      mm,hwpoison: unmap poisoned page before invalidation

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      mm, kasan: fix __GFP_BITS_SHIFT definition breaking LOCKDEP

Subsystem: mm/debug

    Yinan Zhang <zhangyinan2019@email.szu.edu.cn>:
      tools/vm/page_owner_sort.c: remove -c option
      doc/vm/page_owner.rst: remove content related to -c option

Subsystem: mm/kmemleak

    Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>:
      mm/kmemleak: reset tag when compare object pointer

Subsystem: mm/damon

    Jonghyeon Kim <tome01@ajou.ac.kr>:
      mm/damon: prevent activated scheme from sleeping by deactivated schemes

 .mailmap                             |    1 
 Documentation/vm/page_owner.rst      |    1 
 Documentation/vm/unevictable-lru.rst |  473 +++++++++++++++--------------------
 fs/nilfs2/btnode.c                   |   23 +
 fs/nilfs2/btnode.h                   |    1 
 fs/nilfs2/btree.c                    |   27 +
 fs/nilfs2/dat.c                      |    4 
 fs/nilfs2/gcinode.c                  |    7 
 fs/nilfs2/inode.c                    |  167 +++++++++++-
 fs/nilfs2/mdt.c                      |   45 ++-
 fs/nilfs2/mdt.h                      |    6 
 fs/nilfs2/nilfs.h                    |   16 -
 fs/nilfs2/page.c                     |   16 -
 fs/nilfs2/page.h                     |    1 
 fs/nilfs2/segment.c                  |    9 
 fs/nilfs2/super.c                    |    5 
 fs/ocfs2/quota_global.c              |   23 -
 fs/ocfs2/quota_local.c               |    2 
 include/linux/gfp.h                  |    4 
 mm/damon/core.c                      |    5 
 mm/gup.c                             |   10 
 mm/internal.h                        |    6 
 mm/kfence/core.c                     |   11 
 mm/kfence/kfence.h                   |    3 
 mm/kmemleak.c                        |    9 
 mm/madvise.c                         |    9 
 mm/memory.c                          |   12 
 mm/migrate.c                         |    2 
 mm/mlock.c                           |   46 ++-
 mm/page_alloc.c                      |    1 
 mm/rmap.c                            |    4 
 mm/swap.c                            |    4 
 tools/vm/page_owner_sort.c           |    6 
 33 files changed, 560 insertions(+), 399 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-03-25  1:07 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-03-25  1:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm, patches


This is the material which was staged after willystuff in linux-next. 
Everything applied seamlessly on your latest, all looks well.



114 patches, based on 52deda9551a01879b3562e7b41748e85c591f14c.

Subsystems affected by this patch series:

  mm/debug
  mm/selftests
  mm/pagecache
  mm/thp
  mm/rmap
  mm/migration
  mm/kasan
  mm/hugetlb
  mm/pagemap
  mm/madvise
  selftests

Subsystem: mm/debug

    Sean Anderson <seanga2@gmail.com>:
      tools/vm/page_owner_sort.c: sort by stacktrace before culling
      tools/vm/page_owner_sort.c: support sorting by stack trace

    Yinan Zhang <zhangyinan2019@email.szu.edu.cn>:
      tools/vm/page_owner_sort.c: add switch between culling by stacktrace and txt

    Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>:
      tools/vm/page_owner_sort.c: support sorting pid and time

    Shenghong Han <hanshenghong2019@email.szu.edu.cn>:
      tools/vm/page_owner_sort.c: two trivial fixes

    Yixuan Cao <caoyixuan2019@email.szu.edu.cn>:
      tools/vm/page_owner_sort.c: delete invalid duplicate code

    Shenghong Han <hanshenghong2019@email.szu.edu.cn>:
      Documentation/vm/page_owner.rst: update the documentation

    Shuah Khan <skhan@linuxfoundation.org>:
      Documentation/vm/page_owner.rst: fix unexpected indentation warns

    Waiman Long <longman@redhat.com>:
    Patch series "mm/page_owner: Extend page_owner to show memcg information", v4:
      lib/vsprintf: avoid redundant work with 0 size
      mm/page_owner: use scnprintf() to avoid excessive buffer overrun check
      mm/page_owner: print memcg information
      mm/page_owner: record task command name

    Yixuan Cao <caoyixuan2019@email.szu.edu.cn>:
      mm/page_owner.c: record tgid
      tools/vm/page_owner_sort.c: fix the instructions for use

    Jiajian Ye <yejiajian2018@email.szu.edu.cn>:
      tools/vm/page_owner_sort.c: fix comments
      tools/vm/page_owner_sort.c: add a security check
      tools/vm/page_owner_sort.c: support sorting by tgid and update documentation
      tools/vm/page_owner_sort: fix three trivival places
      tools/vm/page_owner_sort: support for sorting by task command name
      tools/vm/page_owner_sort.c: support for selecting by PID, TGID or task command name
      tools/vm/page_owner_sort.c: support for user-defined culling rules

    Christoph Hellwig <hch@lst.de>:
      mm: unexport page_init_poison

Subsystem: mm/selftests

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      selftest/vm: add util.h and and move helper functions there

    Mike Rapoport <rppt@kernel.org>:
      selftest/vm: add helpers to detect PAGE_SIZE and PAGE_SHIFT

Subsystem: mm/pagecache

    Hugh Dickins <hughd@google.com>:
      mm: delete __ClearPageWaiters()
      mm: filemap_unaccount_folio() large skip mapcount fixup

Subsystem: mm/thp

    Hugh Dickins <hughd@google.com>:
      mm/thp: fix NR_FILE_MAPPED accounting in page_*_file_rmap()

Subsystem: mm/rmap

Subsystem: mm/migration

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm/migration: Add trace events", v3:
      mm/migration: add trace events for THP migrations
      mm/migration: add trace events for base page and HugeTLB migrations

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kasan, vmalloc, arm64: add vmalloc tagging support for SW/HW_TAGS", v6:
      kasan, page_alloc: deduplicate should_skip_kasan_poison
      kasan, page_alloc: move tag_clear_highpage out of kernel_init_free_pages
      kasan, page_alloc: merge kasan_free_pages into free_pages_prepare
      kasan, page_alloc: simplify kasan_poison_pages call site
      kasan, page_alloc: init memory of skipped pages on free
      kasan: drop skip_kasan_poison variable in free_pages_prepare
      mm: clarify __GFP_ZEROTAGS comment
      kasan: only apply __GFP_ZEROTAGS when memory is zeroed
      kasan, page_alloc: refactor init checks in post_alloc_hook
      kasan, page_alloc: merge kasan_alloc_pages into post_alloc_hook
      kasan, page_alloc: combine tag_clear_highpage calls in post_alloc_hook
      kasan, page_alloc: move SetPageSkipKASanPoison in post_alloc_hook
      kasan, page_alloc: move kernel_init_free_pages in post_alloc_hook
      kasan, page_alloc: rework kasan_unpoison_pages call site
      kasan: clean up metadata byte definitions
      kasan: define KASAN_VMALLOC_INVALID for SW_TAGS
      kasan, x86, arm64, s390: rename functions for modules shadow
      kasan, vmalloc: drop outdated VM_KASAN comment
      kasan: reorder vmalloc hooks
      kasan: add wrappers for vmalloc hooks
      kasan, vmalloc: reset tags in vmalloc functions
      kasan, fork: reset pointer tags of vmapped stacks
      kasan, arm64: reset pointer tags of vmapped stacks
      kasan, vmalloc: add vmalloc tagging for SW_TAGS
      kasan, vmalloc, arm64: mark vmalloc mappings as pgprot_tagged
      kasan, vmalloc: unpoison VM_ALLOC pages after mapping
      kasan, mm: only define ___GFP_SKIP_KASAN_POISON with HW_TAGS
      kasan, page_alloc: allow skipping unpoisoning for HW_TAGS
      kasan, page_alloc: allow skipping memory init for HW_TAGS
      kasan, vmalloc: add vmalloc tagging for HW_TAGS
      kasan, vmalloc: only tag normal vmalloc allocations
      kasan, arm64: don't tag executable vmalloc allocations
      kasan: mark kasan_arg_stacktrace as __initdata
      kasan: clean up feature flags for HW_TAGS mode
      kasan: add kasan.vmalloc command line flag
      kasan: allow enabling KASAN_VMALLOC and SW/HW_TAGS
      arm64: select KASAN_VMALLOC for SW/HW_TAGS modes
      kasan: documentation updates
      kasan: improve vmalloc tests
      kasan: test: support async (again) and asymm modes for HW_TAGS

    tangmeng <tangmeng@uniontech.com>:
      mm/kasan: remove unnecessary CONFIG_KASAN option

    Peter Collingbourne <pcc@google.com>:
      kasan: update function name in comments

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: print virtual mapping info in reports
    Patch series "kasan: report clean-ups and improvements":
      kasan: drop addr check from describe_object_addr
      kasan: more line breaks in reports
      kasan: rearrange stack frame info in reports
      kasan: improve stack frame info in reports
      kasan: print basic stack frame info for SW_TAGS
      kasan: simplify async check in end_report()
      kasan: simplify kasan_update_kunit_status() and call sites
      kasan: check CONFIG_KASAN_KUNIT_TEST instead of CONFIG_KUNIT
      kasan: move update_kunit_status to start_report
      kasan: move disable_trace_on_warning to start_report
      kasan: split out print_report from __kasan_report
      kasan: simplify kasan_find_first_bad_addr call sites
      kasan: restructure kasan_report
      kasan: merge __kasan_report into kasan_report
      kasan: call print_report from kasan_report_invalid_free
      kasan: move and simplify kasan_report_async
      kasan: rename kasan_access_info to kasan_report_info
      kasan: add comment about UACCESS regions to kasan_report
      kasan: respect KASAN_BIT_REPORTED in all reporting routines
      kasan: reorder reporting functions
      kasan: move and hide kasan_save_enable/restore_multi_shot
      kasan: disable LOCKDEP when printing reports

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
    Patch series "Add hugetlb MADV_DONTNEED support", v3:
      mm: enable MADV_DONTNEED for hugetlb mappings
      selftests/vm: add hugetlb madvise MADV_DONTNEED MADV_REMOVE test
      userfaultfd/selftests: enable hugetlb remap and remove event testing

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/huge_memory: make is_transparent_hugepage() static

Subsystem: mm/pagemap

    David Hildenbrand <david@redhat.com>:
    Patch series "mm: COW fixes part 1: fix the COW security issue for THP and swap", v3:
      mm: optimize do_wp_page() for exclusive pages in the swapcache
      mm: optimize do_wp_page() for fresh pages in local LRU pagevecs
      mm: slightly clarify KSM logic in do_swap_page()
      mm: streamline COW logic in do_swap_page()
      mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page()
      mm/khugepaged: remove reuse_swap_page() usage
      mm/swapfile: remove stale reuse_swap_page()
      mm/huge_memory: remove stale page_trans_huge_mapcount()
      mm/huge_memory: remove stale locking logic from __split_huge_pmd()

    Hugh Dickins <hughd@google.com>:
      mm: warn on deleting redirtied only if accounted
      mm: unmap_mapping_range_tree() with i_mmap_rwsem shared

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm: generalize ARCH_HAS_FILTER_PGPROT

Subsystem: mm/madvise

    Mauricio Faria de Oliveira <mfo@canonical.com>:
      mm: fix race between MADV_FREE reclaim and blkdev direct IO read

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: madvise: MADV_DONTNEED_LOCKED

Subsystem: selftests

    Muhammad Usama Anjum <usama.anjum@collabora.com>:
      selftests: vm: remove dependecy from internal kernel macros

    Kees Cook <keescook@chromium.org>:
      selftests: kselftest framework: provide "finished" helper

 Documentation/dev-tools/kasan.rst             |   17 
 Documentation/vm/page_owner.rst               |   72 ++
 arch/alpha/include/uapi/asm/mman.h            |    2 
 arch/arm64/Kconfig                            |    2 
 arch/arm64/include/asm/vmalloc.h              |    6 
 arch/arm64/include/asm/vmap_stack.h           |    5 
 arch/arm64/kernel/module.c                    |    5 
 arch/arm64/mm/pageattr.c                      |    2 
 arch/arm64/net/bpf_jit_comp.c                 |    3 
 arch/mips/include/uapi/asm/mman.h             |    2 
 arch/parisc/include/uapi/asm/mman.h           |    2 
 arch/powerpc/mm/book3s64/trace.c              |    1 
 arch/s390/kernel/module.c                     |    2 
 arch/x86/Kconfig                              |    3 
 arch/x86/kernel/module.c                      |    2 
 arch/x86/mm/init.c                            |    1 
 arch/xtensa/include/uapi/asm/mman.h           |    2 
 include/linux/gfp.h                           |   53 +-
 include/linux/huge_mm.h                       |    6 
 include/linux/kasan.h                         |  136 +++--
 include/linux/mm.h                            |    5 
 include/linux/page-flags.h                    |    2 
 include/linux/pagemap.h                       |    3 
 include/linux/swap.h                          |    4 
 include/linux/vmalloc.h                       |   18 
 include/trace/events/huge_memory.h            |    1 
 include/trace/events/migrate.h                |   31 +
 include/trace/events/mmflags.h                |   18 
 include/trace/events/thp.h                    |   27 +
 include/uapi/asm-generic/mman-common.h        |    2 
 kernel/fork.c                                 |   13 
 kernel/scs.c                                  |   16 
 lib/Kconfig.kasan                             |   18 
 lib/test_kasan.c                              |  239 ++++++++-
 lib/vsprintf.c                                |    8 
 mm/Kconfig                                    |    3 
 mm/debug.c                                    |    1 
 mm/filemap.c                                  |   63 +-
 mm/huge_memory.c                              |  109 ----
 mm/kasan/Makefile                             |    2 
 mm/kasan/common.c                             |    4 
 mm/kasan/hw_tags.c                            |  243 +++++++---
 mm/kasan/kasan.h                              |   76 ++-
 mm/kasan/report.c                             |  516 +++++++++++----------
 mm/kasan/report_generic.c                     |   34 -
 mm/kasan/report_hw_tags.c                     |    1 
 mm/kasan/report_sw_tags.c                     |   16 
 mm/kasan/report_tags.c                        |    2 
 mm/kasan/shadow.c                             |   76 +--
 mm/khugepaged.c                               |   11 
 mm/madvise.c                                  |   57 +-
 mm/memory.c                                   |  129 +++--
 mm/memremap.c                                 |    2 
 mm/migrate.c                                  |    4 
 mm/page-writeback.c                           |   18 
 mm/page_alloc.c                               |  270 ++++++-----
 mm/page_owner.c                               |   86 ++-
 mm/rmap.c                                     |   62 +-
 mm/swap.c                                     |    4 
 mm/swapfile.c                                 |  104 ----
 mm/vmalloc.c                                  |  167 ++++--
 tools/testing/selftests/kselftest.h           |   10 
 tools/testing/selftests/vm/.gitignore         |    1 
 tools/testing/selftests/vm/Makefile           |    1 
 tools/testing/selftests/vm/gup_test.c         |    3 
 tools/testing/selftests/vm/hugetlb-madvise.c  |  410 ++++++++++++++++
 tools/testing/selftests/vm/ksm_tests.c        |   38 -
 tools/testing/selftests/vm/memfd_secret.c     |    2 
 tools/testing/selftests/vm/run_vmtests.sh     |   15 
 tools/testing/selftests/vm/transhuge-stress.c |   41 -
 tools/testing/selftests/vm/userfaultfd.c      |   72 +-
 tools/testing/selftests/vm/util.h             |   75 ++-
 tools/vm/page_owner_sort.c                    |  628 +++++++++++++++++++++-----
 73 files changed, 2797 insertions(+), 1288 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-03-23 23:04 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-03-23 23:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm, patches


Various misc subsystems, before getting into the post-linux-next material.

This is all based on v5.17.  I tested applying and compiling against
today's 1bc191051dca28fa6.  One patch required an extra whack, all
looks good.


41 patches, based on f443e374ae131c168a065ea1748feac6b2e76613.

Subsystems affected by this patch series:

  procfs
  misc
  core-kernel
  lib
  checkpatch
  init
  pipe
  minix
  fat
  cgroups
  kexec
  kdump
  taskstats
  panic
  kcov
  resource
  ubsan

Subsystem: procfs

    Hao Lee <haolee.swjtu@gmail.com>:
      proc: alloc PATH_MAX bytes for /proc/${pid}/fd/ symlinks

    David Hildenbrand <david@redhat.com>:
      proc/vmcore: fix possible deadlock on concurrent mmap and read

    Yang Li <yang.lee@linux.alibaba.com>:
      proc/vmcore: fix vmcore_alloc_buf() kernel-doc comment

Subsystem: misc

    Bjorn Helgaas <bhelgaas@google.com>:
      linux/types.h: remove unnecessary __bitwise__
      Documentation/sparse: add hints about __CHECKER__

Subsystem: core-kernel

    Miaohe Lin <linmiaohe@huawei.com>:
      kernel/ksysfs.c: use helper macro __ATTR_RW

Subsystem: lib

    Kees Cook <keescook@chromium.org>:
      Kconfig.debug: make DEBUG_INFO selectable from a choice

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      include: drop pointless __compiler_offsetof indirection

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      ilog2: force inlining of __ilog2_u32() and __ilog2_u64()

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      bitfield: add explicit inclusions to the example

    Feng Tang <feng.tang@intel.com>:
      lib/Kconfig.debug: add ARCH dependency for FUNCTION_ALIGN option

    Randy Dunlap <rdunlap@infradead.org>:
      lib: bitmap: fix many kernel-doc warnings

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: prefer MODULE_LICENSE("GPL") over MODULE_LICENSE("GPL v2")
      checkpatch: add --fix option for some TRAILING_STATEMENTS
      checkpatch: add early_param exception to blank line after struct/function test

    Sagar Patel <sagarmp@cs.unc.edu>:
      checkpatch: use python3 to find codespell dictionary

Subsystem: init

    Mark-PK Tsai <mark-pk.tsai@mediatek.com>:
      init: use ktime_us_delta() to make initcall_debug log more precise

    Randy Dunlap <rdunlap@infradead.org>:
      init.h: improve __setup and early_param documentation
      init/main.c: return 1 from handled __setup() functions

Subsystem: pipe

    Andrei Vagin <avagin@gmail.com>:
      fs/pipe: use kvcalloc to allocate a pipe_buffer array
      fs/pipe.c: local vars have to match types of proper pipe_inode_info fields

Subsystem: minix

    Qinghua Jin <qhjin.dev@gmail.com>:
      minix: fix bug when opening a file with O_DIRECT

Subsystem: fat

    Helge Deller <deller@gmx.de>:
      fat: use pointer to simple type in put_user()

Subsystem: cgroups

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      cgroup: use irqsave in cgroup_rstat_flush_locked().
cgroup: add a comment to cgroup_rstat_flush_locked().

Subsystem: kexec

    Jisheng Zhang <jszhang@kernel.org>:
    Patch series "kexec: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef", v2:
      kexec: make crashk_res, crashk_low_res and crash_notes symbols always visible
      riscv: mm: init: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
      x86/setup: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
      arm64: mm: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef

Subsystem: kdump

    Tiezhu Yang <yangtiezhu@loongson.cn>:
    Patch series "Update doc and fix some issues about kdump", v2:
      docs: kdump: update description about sysfs file system support
      docs: kdump: add scp example to write out the dump file
      panic: unset panic_on_warn inside panic()
      ubsan: no need to unset panic_on_warn in ubsan_epilogue()
      kasan: no need to unset panic_on_warn in end_report()

Subsystem: taskstats

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
      taskstats: remove unneeded dead assignment

Subsystem: panic

    "Guilherme G. Piccoli" <gpiccoli@igalia.com>:
    Patch series "Some improvements on panic_print":
      docs: sysctl/kernel: add missing bit to panic_print
      panic: add option to dump all CPUs backtraces in panic_print
      panic: move panic_print before kmsg dumpers

Subsystem: kcov

    Aleksandr Nogikh <nogikh@google.com>:
    Patch series "kcov: improve mmap processing", v3:
      kcov: split ioctl handling into locked and unlocked parts
      kcov: properly handle subsequent mmap calls

Subsystem: resource

    Miaohe Lin <linmiaohe@huawei.com>:
      kernel/resource: fix kfree() of bootmem memory again

Subsystem: ubsan

    Marco Elver <elver@google.com>:
      Revert "ubsan, kcsan: Don't combine sanitizer with kcov on clang"

 Documentation/admin-guide/kdump/kdump.rst       |   10 +
 Documentation/admin-guide/kernel-parameters.txt |    5 
 Documentation/admin-guide/sysctl/kernel.rst     |    2 
 Documentation/dev-tools/sparse.rst              |    2 
 arch/arm64/mm/init.c                            |    9 -
 arch/riscv/mm/init.c                            |    6 -
 arch/x86/kernel/setup.c                         |   10 -
 fs/fat/dir.c                                    |    2 
 fs/minix/inode.c                                |    3 
 fs/pipe.c                                       |   13 +-
 fs/proc/base.c                                  |    8 -
 fs/proc/vmcore.c                                |   43 +++----
 include/linux/bitfield.h                        |    3 
 include/linux/compiler_types.h                  |    3 
 include/linux/init.h                            |   11 +
 include/linux/kexec.h                           |   12 +-
 include/linux/log2.h                            |    4 
 include/linux/stddef.h                          |    6 -
 include/uapi/linux/types.h                      |    6 -
 init/main.c                                     |   14 +-
 kernel/cgroup/rstat.c                           |   13 +-
 kernel/kcov.c                                   |  102 ++++++++---------
 kernel/ksysfs.c                                 |    3 
 kernel/panic.c                                  |   37 ++++--
 kernel/resource.c                               |   41 +-----
 kernel/taskstats.c                              |    5 
 lib/Kconfig.debug                               |  142 ++++++++++++------------
 lib/Kconfig.kcsan                               |   11 -
 lib/Kconfig.ubsan                               |   12 --
 lib/bitmap.c                                    |   24 ++--
 lib/ubsan.c                                     |   10 -
 mm/kasan/report.c                               |   10 -
 scripts/checkpatch.pl                           |   31 ++++-
 tools/include/linux/types.h                     |    5 
 34 files changed, 313 insertions(+), 305 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-03-22 21:38 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-03-22 21:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits, patches


- A few misc subsystems

- There is a lot of MM material in Willy's tree.  Folio work and
  non-folio patches which depended on that work.

  Here I send almost all the MM patches which precede the patches in
  Willy's tree.  The remaining ~100 MM patches are staged on Willy's
  tree and I'll send those along once Willy is merged up.

  I tried this batch against your current tree (as of
  51912904076680281) and a couple need some extra persuasion to apply,
  but all looks OK otherwise.


227 patches, based on f443e374ae131c168a065ea1748feac6b2e76613

Subsystems affected by this patch series:

  kthread
  scripts
  ntfs
  ocfs2
  block
  vfs
  mm/kasan
  mm/pagecache
  mm/gup
  mm/swap
  mm/shmem
  mm/memcg
  mm/selftests
  mm/pagemap
  mm/mremap
  mm/sparsemem
  mm/vmalloc
  mm/pagealloc
  mm/memory-failure
  mm/mlock
  mm/hugetlb
  mm/userfaultfd
  mm/vmscan
  mm/compaction
  mm/mempolicy
  mm/oom-kill
  mm/migration
  mm/thp
  mm/cma
  mm/autonuma
  mm/psi
  mm/ksm
  mm/page-poison
  mm/madvise
  mm/memory-hotplug
  mm/rmap
  mm/zswap
  mm/uaccess
  mm/ioremap
  mm/highmem
  mm/cleanups
  mm/kfence
  mm/hmm
  mm/damon

Subsystem: kthread

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      linux/kthread.h: remove unused macros

Subsystem: scripts

    Colin Ian King <colin.i.king@gmail.com>:
      scripts/spelling.txt: add more spellings to spelling.txt

Subsystem: ntfs

    Dongliang Mu <mudongliangabcd@gmail.com>:
      ntfs: add sanity check on allocation size

Subsystem: ocfs2

    Joseph Qi <joseph.qi@linux.alibaba.com>:
      ocfs2: cleanup some return variables

    hongnanli <hongnan.li@linux.alibaba.com>:
      fs/ocfs2: fix comments mentioning i_mutex

Subsystem: block

    NeilBrown <neilb@suse.de>:
    Patch series "Remove remaining parts of congestion tracking code", v2:
      doc: convert 'subsection' to 'section' in gfp.h
      mm: document and polish read-ahead code
      mm: improve cleanup when ->readpages doesn't process all pages
      fuse: remove reliance on bdi congestion
      nfs: remove reliance on bdi congestion
      ceph: remove reliance on bdi congestion
      remove inode_congested()
      remove bdi_congested() and wb_congested() and related functions
      f2fs: replace congestion_wait() calls with io_schedule_timeout()
      block/bfq-iosched.c: use "false" rather than "BLK_RW_ASYNC"
      remove congestion tracking framework

Subsystem: vfs

    Anthony Iliopoulos <ailiop@suse.com>:
      mount: warn only once about timestamp range expiration

Subsystem: mm/kasan

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/memremap: avoid calling kasan_remove_zero_shadow() for device private memory

Subsystem: mm/pagecache

    Miaohe Lin <linmiaohe@huawei.com>:
      filemap: remove find_get_pages()
      mm/writeback: minor clean up for highmem_dirtyable_memory

    Minchan Kim <minchan@kernel.org>:
      mm: fs: fix lru_cache_disabled race in bh_lru

Subsystem: mm/gup

    Peter Xu <peterx@redhat.com>:
    Patch series "mm/gup: some cleanups", v5:
      mm: fix invalid page pointer returned with FOLL_PIN gups

    John Hubbard <jhubbard@nvidia.com>:
      mm/gup: follow_pfn_pte(): -EEXIST cleanup
      mm/gup: remove unused pin_user_pages_locked()
      mm: change lookup_node() to use get_user_pages_fast()
      mm/gup: remove unused get_user_pages_locked()

Subsystem: mm/swap

    Bang Li <libang.linuxer@gmail.com>:
      mm/swap: fix confusing comment in folio_mark_accessed

Subsystem: mm/shmem

    Xavier Roche <xavier.roche@algolia.com>:
      tmpfs: support for file creation time

    Hugh Dickins <hughd@google.com>:
      shmem: mapping_set_exiting() to help mapped resilience
      tmpfs: do not allocate pages on read

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: shmem: use helper macro __ATTR_RW

Subsystem: mm/memcg

    Shakeel Butt <shakeelb@google.com>:
      memcg: replace in_interrupt() with !in_task()

    Yosry Ahmed <yosryahmed@google.com>:
      memcg: add per-memcg total kernel memory stat

    Wei Yang <richard.weiyang@gmail.com>:
      mm/memcg: mem_cgroup_per_node is already set to 0 on allocation
      mm/memcg: retrieve parent memcg from css.parent

    Shakeel Butt <shakeelb@google.com>:
    Patch series "memcg: robust enforcement of memory.high", v2:
      memcg: refactor mem_cgroup_oom
      memcg: unify force charging conditions
      selftests: memcg: test high limit for single entry allocation
      memcg: synchronously enforce memory.high for large overcharges

    Randy Dunlap <rdunlap@infradead.org>:
      mm/memcontrol: return 1 from cgroup.memory __setup() handler

    Michal Hocko <mhocko@suse.com>:
    Patch series "mm/memcg: Address PREEMPT_RT problems instead of disabling it", v5:
      mm/memcg: revert ("mm/memcg: optimize user context object stock access")

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm/memcg: disable threshold event handlers on PREEMPT_RT
      mm/memcg: protect per-CPU counter by disabling preemption on PREEMPT_RT where needed.

    Johannes Weiner <hannes@cmpxchg.org>:
      mm/memcg: opencode the inner part of obj_cgroup_uncharge_pages() in drain_obj_stock()

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm/memcg: protect memcg_stock with a local_lock_t
      mm/memcg: disable migration instead of preemption in drain_all_stock().

    Muchun Song <songmuchun@bytedance.com>:
    Patch series "Optimize list lru memory consumption", v6:
      mm: list_lru: transpose the array of per-node per-memcg lru lists
      mm: introduce kmem_cache_alloc_lru
      fs: introduce alloc_inode_sb() to allocate filesystems specific inode
      fs: allocate inode by using alloc_inode_sb()
      f2fs: allocate inode by using alloc_inode_sb()
      mm: dcache: use kmem_cache_alloc_lru() to allocate dentry
      xarray: use kmem_cache_alloc_lru to allocate xa_node
      mm: memcontrol: move memcg_online_kmem() to mem_cgroup_css_online()
      mm: list_lru: allocate list_lru_one only when needed
      mm: list_lru: rename memcg_drain_all_list_lrus to memcg_reparent_list_lrus
      mm: list_lru: replace linear array with xarray
      mm: memcontrol: reuse memory cgroup ID for kmem ID
      mm: memcontrol: fix cannot alloc the maximum memcg ID
      mm: list_lru: rename list_lru_per_memcg to list_lru_memcg
      mm: memcontrol: rename memcg_cache_id to memcg_kmem_id

    Vasily Averin <vvs@virtuozzo.com>:
      memcg: enable accounting for tty-related objects

Subsystem: mm/selftests

    Guillaume Tucker <guillaume.tucker@collabora.com>:
      selftests, x86: fix how check_cc.sh is being invoked

Subsystem: mm/pagemap

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm: merge pte_mkhuge() call into arch_make_huge_pte()

    Stafford Horne <shorne@gmail.com>:
      mm: remove mmu_gathers storage from remaining architectures

    Muchun Song <songmuchun@bytedance.com>:
    Patch series "Fix some cache flush bugs", v5:
      mm: thp: fix wrong cache flush in remove_migration_pmd()
      mm: fix missing cache flush for all tail pages of compound page
      mm: hugetlb: fix missing cache flush in copy_huge_page_from_user()
      mm: hugetlb: fix missing cache flush in hugetlb_mcopy_atomic_pte()
      mm: shmem: fix missing cache flush in shmem_mfill_atomic_pte()
      mm: userfaultfd: fix missing cache flush in mcopy_atomic_pte() and __mcopy_atomic()
      mm: replace multiple dcache flush with flush_dcache_folio()

    Peter Xu <peterx@redhat.com>:
    Patch series "mm: Rework zap ptes on swap entries", v5:
      mm: don't skip swap entry even if zap_details specified
      mm: rename zap_skip_check_mapping() to should_zap_page()
      mm: change zap_details.zap_mapping into even_cows
      mm: rework swap handling of zap_pte_range

    Randy Dunlap <rdunlap@infradead.org>:
      mm/mmap: return 1 from stack_guard_gap __setup() handler

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/memory.c: use helper function range_in_vma()
      mm/memory.c: use helper macro min and max in unmap_mapping_range_tree()

    Hugh Dickins <hughd@google.com>:
      mm: _install_special_mapping() apply VM_LOCKED_CLEAR_MASK

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/mmap: remove obsolete comment in ksys_mmap_pgoff

Subsystem: mm/mremap

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/mremap:: use vma_lookup() instead of find_vma()

Subsystem: mm/sparsemem

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/sparse: make mminit_validate_memmodel_limits() static

Subsystem: mm/vmalloc

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/vmalloc: remove unneeded function forward declaration

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc: Move draining areas out of caller context

    Uladzislau Rezki <uladzislau.rezki@sony.com>:
      mm/vmalloc: add adjust_search_size parameter

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc: eliminate an extra orig_gfp_mask

    Jiapeng Chong <jiapeng.chong@linux.alibaba.com>:
      mm/vmalloc.c: fix "unused function" warning

    Bang Li <libang.linuxer@gmail.com>:
      mm/vmalloc: fix comments about vmap_area struct

Subsystem: mm/pagealloc

    Zi Yan <ziy@nvidia.com>:
      mm: page_alloc: avoid merging non-fallbackable pageblocks with others

    Peter Collingbourne <pcc@google.com>:
      mm/mmzone.c: use try_cmpxchg() in page_cpupid_xchg_last()

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/mmzone.h: remove unused macros

    Nicolas Saenz Julienne <nsaenzju@redhat.com>:
      mm/page_alloc: don't pass pfn to free_unref_page_commit()

    David Hildenbrand <david@redhat.com>:
    Patch series "mm: enforce pageblock_order < MAX_ORDER":
      cma: factor out minimum alignment requirement
      mm: enforce pageblock_order < MAX_ORDER

    Nathan Chancellor <nathan@kernel.org>:
      mm/page_alloc: mark pagesets as __maybe_unused

    Alistair Popple <apopple@nvidia.com>:
      mm/pages_alloc.c: don't create ZONE_MOVABLE beyond the end of a node

    Mel Gorman <mgorman@techsingularity.net>:
    Patch series "Follow-up on high-order PCP caching", v2:
      mm/page_alloc: fetch the correct pcp buddy during bulk free
      mm/page_alloc: track range of active PCP lists during bulk free
      mm/page_alloc: simplify how many pages are selected per pcp list during bulk free
      mm/page_alloc: drain the requested list first during bulk free
      mm/page_alloc: free pages in a single pass during bulk free
      mm/page_alloc: limit number of high-order pages on PCP during bulk free
      mm/page_alloc: do not prefetch buddies during bulk free

    Oscar Salvador <osalvador@suse.de>:
      arch/x86/mm/numa: Do not initialize nodes twice

    Suren Baghdasaryan <surenb@google.com>:
      mm: count time in drain_all_pages during direct reclaim as memory pressure

    Eric Dumazet <edumazet@google.com>:
      mm/page_alloc: call check_new_pages() while zone spinlock is not held

    Mel Gorman <mgorman@techsingularity.net>:
      mm/page_alloc: check high-order pages for corruption during PCP operations

Subsystem: mm/memory-failure

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm/memory-failure.c: remove obsolete comment
      mm/hwpoison: fix error page recovered but reported "not recovered"

    Rik van Riel <riel@surriel.com>:
      mm: invalidate hwpoison page cache page in fault path

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "A few cleanup and fixup patches for memory failure", v3:
      mm/memory-failure.c: minor clean up for memory_failure_dev_pagemap
      mm/memory-failure.c: catch unexpected -EFAULT from vma_address()
      mm/memory-failure.c: rework the signaling logic in kill_proc
      mm/memory-failure.c: fix race with changing page more robustly
      mm/memory-failure.c: remove PageSlab check in hwpoison_filter_dev
      mm/memory-failure.c: rework the try_to_unmap logic in hwpoison_user_mappings()
      mm/memory-failure.c: remove obsolete comment in __soft_offline_page
      mm/memory-failure.c: remove unnecessary PageTransTail check
      mm/hwpoison-inject: support injecting hwpoison to free page

    luofei <luofei@unicloud.com>:
      mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler
      mm/hwpoison: add in-use hugepage hwpoison filter judgement

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "A few fixup patches for memory failure", v2:
      mm/memory-failure.c: fix race with changing page compound again
      mm/memory-failure.c: avoid calling invalidate_inode_page() with unexpected pages
      mm/memory-failure.c: make non-LRU movable pages unhandlable

    Vlastimil Babka <vbabka@suse.cz>:
      mm, fault-injection: declare should_fail_alloc_page()

Subsystem: mm/mlock

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/mlock: fix potential imbalanced rlimit ucounts adjustment

Subsystem: mm/hugetlb

    Muchun Song <songmuchun@bytedance.com>:
    Patch series "Free the 2nd vmemmap page associated with each HugeTLB page", v7:
      mm: hugetlb: free the 2nd vmemmap page associated with each HugeTLB page
      mm: hugetlb: replace hugetlb_free_vmemmap_enabled with a static_key
      mm: sparsemem: use page table lock to protect kernel pmd operations
      selftests: vm: add a hugetlb test case
      mm: sparsemem: move vmemmap related to HugeTLB to CONFIG_HUGETLB_PAGE_FREE_VMEMMAP

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/hugetlb: generalize ARCH_WANT_GENERAL_HUGETLB

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlb: clean up potential spectre issue warnings

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/hugetlb: use helper macro __ATTR_RW

    David Howells <dhowells@redhat.com>:
      mm/hugetlb.c: export PageHeadHuge()

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: remove unneeded local variable follflags

Subsystem: mm/userfaultfd

    Nadav Amit <namit@vmware.com>:
      userfaultfd: provide unmasked address on page-fault

    Guo Zhengkui <guozhengkui@vivo.com>:
      userfaultfd/selftests: fix uninitialized_var.cocci warning

Subsystem: mm/vmscan

    Hugh Dickins <hughd@google.com>:
      mm/fs: delete PF_SWAPWRITE
      mm: __isolate_lru_page_prepare() in isolate_migratepages_block()

    Waiman Long <longman@redhat.com>:
      mm/list_lru: optimize memcg_reparent_list_lru_node()

    Marcelo Tosatti <mtosatti@redhat.com>:
      mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm: workingset: replace IRQ-off check with a lockdep assert.

    Charan Teja Kalla <quic_charante@quicinc.com>:
      mm: vmscan: fix documentation for page_check_references()

Subsystem: mm/compaction

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm: compaction: cleanup the compaction trace events

Subsystem: mm/mempolicy

    Hugh Dickins <hughd@google.com>:
      mempolicy: mbind_range() set_policy() after vma_merge()

Subsystem: mm/oom-kill

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/oom_kill: remove unneeded is_memcg_oom check

Subsystem: mm/migration

    Huang Ying <ying.huang@intel.com>:
      mm,migrate: fix establishing demotion target

    "andrew.yang" <andrew.yang@mediatek.com>:
      mm/migrate: fix race between lock page and clear PG_Isolated

Subsystem: mm/thp

    Hugh Dickins <hughd@google.com>:
      mm/thp: refix __split_huge_pmd_locked() for migration PMD

Subsystem: mm/cma

    Hari Bathini <hbathini@linux.ibm.com>:
    Patch series "powerpc/fadump: handle CMA activation failure appropriately", v3:
      mm/cma: provide option to opt out from exposing pages on activation failure
      powerpc/fadump: opt out from freeing pages on cma activation failure

Subsystem: mm/autonuma

    Huang Ying <ying.huang@intel.com>:
    Patch series "NUMA balancing: optimize memory placement for memory tiering system", v13:
      NUMA Balancing: add page promotion counter
      NUMA balancing: optimize page placement for memory tiering system
      memory tiering: skip to scan fast memory

Subsystem: mm/psi

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: page_io: fix psi memory pressure error on cold swapins

Subsystem: mm/ksm

    Yang Yang <yang.yang29@zte.com.cn>:
      mm/vmstat: add event for ksm swapping in copy

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/ksm: use helper macro __ATTR_RW

Subsystem: mm/page-poison

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/hwpoison: check the subpage, not the head page

Subsystem: mm/madvise

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/madvise: use vma_lookup() instead of find_vma()

    Charan Teja Kalla <quic_charante@quicinc.com>:
    Patch series "mm: madvise: return correct bytes processed with:
      mm: madvise: return correct bytes advised with process_madvise
      mm: madvise: skip unmapped vma holes passed to process_madvise

Subsystem: mm/memory-hotplug

    Michal Hocko <mhocko@suse.com>:
    Patch series "mm, memory_hotplug: handle unitialized numa node gracefully":
      mm, memory_hotplug: make arch_alloc_nodedata independent on CONFIG_MEMORY_HOTPLUG
      mm: handle uninitialized numa nodes gracefully
      mm, memory_hotplug: drop arch_free_nodedata
      mm, memory_hotplug: reorganize new pgdat initialization
      mm: make free_area_init_node aware of memory less nodes

    Wei Yang <richard.weiyang@gmail.com>:
      memcg: do not tweak node in alloc_mem_cgroup_per_node_info

    David Hildenbrand <david@redhat.com>:
      drivers/base/memory: add memory block to memory group after registration succeeded
      drivers/base/node: consolidate node device subsystem initialization in node_dev_init()

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "A few cleanup patches around memory_hotplug":
      mm/memory_hotplug: remove obsolete comment of __add_pages
      mm/memory_hotplug: avoid calling zone_intersects() for ZONE_NORMAL
      mm/memory_hotplug: clean up try_offline_node
      mm/memory_hotplug: fix misplaced comment in offline_pages

    David Hildenbrand <david@redhat.com>:
    Patch series "drivers/base/memory: determine and store zone for single-zone memory blocks", v2:
      drivers/base/node: rename link_mem_sections() to register_memory_block_under_node()
      drivers/base/memory: determine and store zone for single-zone memory blocks
      drivers/base/memory: clarify adding and removing of memory blocks

    Oscar Salvador <osalvador@suse.de>:
      mm: only re-generate demotion targets when a numa node changes its N_CPU state

Subsystem: mm/rmap

    Hugh Dickins <hughd@google.com>:
      mm/thp: ClearPageDoubleMap in first page_add_file_rmap()

Subsystem: mm/zswap

    "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>:
      mm/zswap.c: allow handling just same-value filled pages

Subsystem: mm/uaccess

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      mm: remove usercopy_warn()
      mm: uninline copy_overflow()

    Randy Dunlap <rdunlap@infradead.org>:
      mm/usercopy: return 1 from hardened_usercopy __setup() handler

Subsystem: mm/ioremap

    Vlastimil Babka <vbabka@suse.cz>:
      mm/early_ioremap: declare early_memremap_pgprot_adjust()

Subsystem: mm/highmem

    Ira Weiny <ira.weiny@intel.com>:
      highmem: document kunmap_local()

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/highmem: remove unnecessary done label

Subsystem: mm/cleanups

    "Dr. David Alan Gilbert" <linux@treblig.org>:
      mm/page_table_check.c: use strtobool for param parsing

Subsystem: mm/kfence

    tangmeng <tangmeng@uniontech.com>:
      mm/kfence: remove unnecessary CONFIG_KFENCE option

    Tianchen Ding <dtcccc@linux.alibaba.com>:
    Patch series "provide the flexibility to enable KFENCE", v3:
      kfence: allow re-enabling KFENCE after system startup
      kfence: alloc kfence_pool after system startup

    Peng Liu <liupeng256@huawei.com>:
    Patch series "kunit: fix a UAF bug and do some optimization", v2:
      kunit: fix UAF when run kfence test case test_gfpzero
      kunit: make kunit_test_timeout compatible with comment
      kfence: test: try to avoid test_gfpzero trigger rcu_stall

    Marco Elver <elver@google.com>:
      kfence: allow use of a deferrable timer

Subsystem: mm/hmm

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/hmm.c: remove unneeded local variable ret

Subsystem: mm/damon

    SeongJae Park <sj@kernel.org>:
    Patch series "Remove the type-unclear target id concept":
      mm/damon/dbgfs/init_regions: use target index instead of target id
      Docs/admin-guide/mm/damon/usage: update for changed initail_regions file input
      mm/damon/core: move damon_set_targets() into dbgfs
      mm/damon: remove the target id concept

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm/damon: remove redundant page validation

    SeongJae Park <sj@kernel.org>:
    Patch series "Allow DAMON user code independent of monitoring primitives":
      mm/damon: rename damon_primitives to damon_operations
      mm/damon: let monitoring operations can be registered and selected
      mm/damon/paddr,vaddr: register themselves to DAMON in subsys_initcall
      mm/damon/reclaim: use damon_select_ops() instead of damon_{v,p}a_set_operations()
      mm/damon/dbgfs: use damon_select_ops() instead of damon_{v,p}a_set_operations()
      mm/damon/dbgfs: use operations id for knowing if the target has pid
      mm/damon/dbgfs-test: fix is_target_id() change
      mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}()

    tangmeng <tangmeng@uniontech.com>:
      mm/damon: remove unnecessary CONFIG_DAMON option

    SeongJae Park <sj@kernel.org>:
    Patch series "Docs/damon: Update documents for better consistency":
      Docs/vm/damon: call low level monitoring primitives the operations
      Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling
      Docs/damon: update outdated term 'regions update interval'
    Patch series "Introduce DAMON sysfs interface", v3:
      mm/damon/core: allow non-exclusive DAMON start/stop
      mm/damon/core: add number of each enum type values
      mm/damon: implement a minimal stub for sysfs-based DAMON interface
      mm/damon/sysfs: link DAMON for virtual address spaces monitoring
      mm/damon/sysfs: support the physical address space monitoring
      mm/damon/sysfs: support DAMON-based Operation Schemes
      mm/damon/sysfs: support DAMOS quotas
      mm/damon/sysfs: support schemes prioritization
      mm/damon/sysfs: support DAMOS watermarks
      mm/damon/sysfs: support DAMOS stats
      selftests/damon: add a test for DAMON sysfs interface
      Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface
      Docs/ABI/testing: add DAMON sysfs interface ABI document

    Xin Hao <xhao@linux.alibaba.com>:
      mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release()

 Documentation/ABI/testing/sysfs-kernel-mm-damon  |  274 ++
 Documentation/admin-guide/cgroup-v1/memory.rst   |    2 
 Documentation/admin-guide/cgroup-v2.rst          |    5 
 Documentation/admin-guide/kernel-parameters.txt  |    2 
 Documentation/admin-guide/mm/damon/usage.rst     |  380 +++
 Documentation/admin-guide/mm/zswap.rst           |   22 
 Documentation/admin-guide/sysctl/kernel.rst      |   31 
 Documentation/core-api/mm-api.rst                |   19 
 Documentation/dev-tools/kfence.rst               |   12 
 Documentation/filesystems/porting.rst            |    6 
 Documentation/filesystems/vfs.rst                |   16 
 Documentation/vm/damon/design.rst                |   43 
 Documentation/vm/damon/faq.rst                   |    2 
 MAINTAINERS                                      |    1 
 arch/arm/Kconfig                                 |    4 
 arch/arm64/kernel/setup.c                        |    3 
 arch/arm64/mm/hugetlbpage.c                      |    1 
 arch/hexagon/mm/init.c                           |    2 
 arch/ia64/kernel/topology.c                      |   10 
 arch/ia64/mm/discontig.c                         |   11 
 arch/mips/kernel/topology.c                      |    5 
 arch/nds32/mm/init.c                             |    1 
 arch/openrisc/mm/init.c                          |    2 
 arch/powerpc/include/asm/fadump-internal.h       |    5 
 arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h |    4 
 arch/powerpc/kernel/fadump.c                     |    8 
 arch/powerpc/kernel/sysfs.c                      |   17 
 arch/riscv/Kconfig                               |    4 
 arch/riscv/kernel/setup.c                        |    3 
 arch/s390/kernel/numa.c                          |    7 
 arch/sh/kernel/topology.c                        |    5 
 arch/sparc/kernel/sysfs.c                        |   12 
 arch/sparc/mm/hugetlbpage.c                      |    1 
 arch/x86/Kconfig                                 |    4 
 arch/x86/kernel/cpu/mce/core.c                   |    8 
 arch/x86/kernel/topology.c                       |    5 
 arch/x86/mm/numa.c                               |   33 
 block/bdev.c                                     |    2 
 block/bfq-iosched.c                              |    2 
 drivers/base/init.c                              |    1 
 drivers/base/memory.c                            |  149 +
 drivers/base/node.c                              |   48 
 drivers/block/drbd/drbd_int.h                    |    3 
 drivers/block/drbd/drbd_req.c                    |    3 
 drivers/dax/super.c                              |    2 
 drivers/of/of_reserved_mem.c                     |    9 
 drivers/tty/tty_io.c                             |    2 
 drivers/virtio/virtio_mem.c                      |    9 
 fs/9p/vfs_inode.c                                |    2 
 fs/adfs/super.c                                  |    2 
 fs/affs/super.c                                  |    2 
 fs/afs/super.c                                   |    2 
 fs/befs/linuxvfs.c                               |    2 
 fs/bfs/inode.c                                   |    2 
 fs/btrfs/inode.c                                 |    2 
 fs/buffer.c                                      |    8 
 fs/ceph/addr.c                                   |   22 
 fs/ceph/inode.c                                  |    2 
 fs/ceph/super.c                                  |    1 
 fs/ceph/super.h                                  |    1 
 fs/cifs/cifsfs.c                                 |    2 
 fs/coda/inode.c                                  |    2 
 fs/dcache.c                                      |    3 
 fs/ecryptfs/super.c                              |    2 
 fs/efs/super.c                                   |    2 
 fs/erofs/super.c                                 |    2 
 fs/exfat/super.c                                 |    2 
 fs/ext2/ialloc.c                                 |    5 
 fs/ext2/super.c                                  |    2 
 fs/ext4/super.c                                  |    2 
 fs/f2fs/compress.c                               |    4 
 fs/f2fs/data.c                                   |    3 
 fs/f2fs/f2fs.h                                   |    6 
 fs/f2fs/segment.c                                |    8 
 fs/f2fs/super.c                                  |   14 
 fs/fat/inode.c                                   |    2 
 fs/freevxfs/vxfs_super.c                         |    2 
 fs/fs-writeback.c                                |   40 
 fs/fuse/control.c                                |   17 
 fs/fuse/dev.c                                    |    8 
 fs/fuse/file.c                                   |   17 
 fs/fuse/inode.c                                  |    2 
 fs/gfs2/super.c                                  |    2 
 fs/hfs/super.c                                   |    2 
 fs/hfsplus/super.c                               |    2 
 fs/hostfs/hostfs_kern.c                          |    2 
 fs/hpfs/super.c                                  |    2 
 fs/hugetlbfs/inode.c                             |    2 
 fs/inode.c                                       |    2 
 fs/isofs/inode.c                                 |    2 
 fs/jffs2/super.c                                 |    2 
 fs/jfs/super.c                                   |    2 
 fs/minix/inode.c                                 |    2 
 fs/namespace.c                                   |    2 
 fs/nfs/inode.c                                   |    2 
 fs/nfs/write.c                                   |   14 
 fs/nilfs2/segbuf.c                               |   16 
 fs/nilfs2/super.c                                |    2 
 fs/ntfs/inode.c                                  |    6 
 fs/ntfs3/super.c                                 |    2 
 fs/ocfs2/alloc.c                                 |    2 
 fs/ocfs2/aops.c                                  |    2 
 fs/ocfs2/cluster/nodemanager.c                   |    2 
 fs/ocfs2/dir.c                                   |    4 
 fs/ocfs2/dlmfs/dlmfs.c                           |    2 
 fs/ocfs2/file.c                                  |   13 
 fs/ocfs2/inode.c                                 |    2 
 fs/ocfs2/localalloc.c                            |    6 
 fs/ocfs2/namei.c                                 |    2 
 fs/ocfs2/ocfs2.h                                 |    4 
 fs/ocfs2/quota_global.c                          |    2 
 fs/ocfs2/stack_user.c                            |   18 
 fs/ocfs2/super.c                                 |    2 
 fs/ocfs2/xattr.c                                 |    2 
 fs/openpromfs/inode.c                            |    2 
 fs/orangefs/super.c                              |    2 
 fs/overlayfs/super.c                             |    2 
 fs/proc/inode.c                                  |    2 
 fs/qnx4/inode.c                                  |    2 
 fs/qnx6/inode.c                                  |    2 
 fs/reiserfs/super.c                              |    2 
 fs/romfs/super.c                                 |    2 
 fs/squashfs/super.c                              |    2 
 fs/sysv/inode.c                                  |    2 
 fs/ubifs/super.c                                 |    2 
 fs/udf/super.c                                   |    2 
 fs/ufs/super.c                                   |    2 
 fs/userfaultfd.c                                 |    5 
 fs/vboxsf/super.c                                |    2 
 fs/xfs/libxfs/xfs_btree.c                        |    2 
 fs/xfs/xfs_buf.c                                 |    3 
 fs/xfs/xfs_icache.c                              |    2 
 fs/zonefs/super.c                                |    2 
 include/linux/backing-dev-defs.h                 |    8 
 include/linux/backing-dev.h                      |   50 
 include/linux/cma.h                              |   14 
 include/linux/damon.h                            |   95 
 include/linux/fault-inject.h                     |    2 
 include/linux/fs.h                               |   21 
 include/linux/gfp.h                              |   10 
 include/linux/highmem-internal.h                 |   10 
 include/linux/hugetlb.h                          |    8 
 include/linux/kthread.h                          |   22 
 include/linux/list_lru.h                         |   45 
 include/linux/memcontrol.h                       |   46 
 include/linux/memory.h                           |   12 
 include/linux/memory_hotplug.h                   |  132 -
 include/linux/migrate.h                          |    8 
 include/linux/mm.h                               |   11 
 include/linux/mmzone.h                           |   22 
 include/linux/nfs_fs_sb.h                        |    1 
 include/linux/node.h                             |   25 
 include/linux/page-flags.h                       |   96 
 include/linux/pageblock-flags.h                  |    7 
 include/linux/pagemap.h                          |    7 
 include/linux/sched.h                            |    1 
 include/linux/sched/sysctl.h                     |   10 
 include/linux/shmem_fs.h                         |    1 
 include/linux/slab.h                             |    3 
 include/linux/swap.h                             |    6 
 include/linux/thread_info.h                      |    5 
 include/linux/uaccess.h                          |    2 
 include/linux/vm_event_item.h                    |    3 
 include/linux/vmalloc.h                          |    4 
 include/linux/xarray.h                           |    9 
 include/ras/ras_event.h                          |    1 
 include/trace/events/compaction.h                |   26 
 include/trace/events/writeback.h                 |   28 
 include/uapi/linux/userfaultfd.h                 |    8 
 ipc/mqueue.c                                     |    2 
 kernel/dma/contiguous.c                          |    4 
 kernel/sched/core.c                              |   21 
 kernel/sysctl.c                                  |    2 
 lib/Kconfig.kfence                               |   12 
 lib/kunit/try-catch.c                            |    3 
 lib/xarray.c                                     |   10 
 mm/Kconfig                                       |    6 
 mm/backing-dev.c                                 |   57 
 mm/cma.c                                         |   31 
 mm/cma.h                                         |    1 
 mm/compaction.c                                  |   60 
 mm/damon/Kconfig                                 |   19 
 mm/damon/Makefile                                |    7 
 mm/damon/core-test.h                             |   23 
 mm/damon/core.c                                  |  190 +
 mm/damon/dbgfs-test.h                            |  103 
 mm/damon/dbgfs.c                                 |  264 +-
 mm/damon/ops-common.c                            |  133 +
 mm/damon/ops-common.h                            |   16 
 mm/damon/paddr.c                                 |   62 
 mm/damon/prmtv-common.c                          |  133 -
 mm/damon/prmtv-common.h                          |   16 
 mm/damon/reclaim.c                               |   11 
 mm/damon/sysfs.c                                 | 2632 ++++++++++++++++++++++-
 mm/damon/vaddr-test.h                            |    8 
 mm/damon/vaddr.c                                 |   67 
 mm/early_ioremap.c                               |    1 
 mm/fadvise.c                                     |    5 
 mm/filemap.c                                     |   17 
 mm/gup.c                                         |  103 
 mm/highmem.c                                     |    9 
 mm/hmm.c                                         |    3 
 mm/huge_memory.c                                 |   41 
 mm/hugetlb.c                                     |   23 
 mm/hugetlb_vmemmap.c                             |   74 
 mm/hwpoison-inject.c                             |    7 
 mm/internal.h                                    |   19 
 mm/kfence/Makefile                               |    2 
 mm/kfence/core.c                                 |  147 +
 mm/kfence/kfence_test.c                          |    3 
 mm/ksm.c                                         |    6 
 mm/list_lru.c                                    |  690 ++----
 mm/maccess.c                                     |    6 
 mm/madvise.c                                     |   18 
 mm/memcontrol.c                                  |  549 ++--
 mm/memory-failure.c                              |  148 -
 mm/memory.c                                      |  116 -
 mm/memory_hotplug.c                              |  136 -
 mm/mempolicy.c                                   |   29 
 mm/memremap.c                                    |    3 
 mm/migrate.c                                     |  128 -
 mm/mlock.c                                       |    1 
 mm/mmap.c                                        |    5 
 mm/mmzone.c                                      |    7 
 mm/mprotect.c                                    |   13 
 mm/mremap.c                                      |    4 
 mm/oom_kill.c                                    |    3 
 mm/page-writeback.c                              |   12 
 mm/page_alloc.c                                  |  429 +--
 mm/page_io.c                                     |    7 
 mm/page_table_check.c                            |   10 
 mm/ptdump.c                                      |   16 
 mm/readahead.c                                   |  124 +
 mm/rmap.c                                        |   15 
 mm/shmem.c                                       |   46 
 mm/slab.c                                        |   39 
 mm/slab.h                                        |   25 
 mm/slob.c                                        |    6 
 mm/slub.c                                        |   42 
 mm/sparse-vmemmap.c                              |   70 
 mm/sparse.c                                      |    2 
 mm/swap.c                                        |   25 
 mm/swapfile.c                                    |    1 
 mm/usercopy.c                                    |   16 
 mm/userfaultfd.c                                 |    3 
 mm/vmalloc.c                                     |  102 
 mm/vmscan.c                                      |  138 -
 mm/vmstat.c                                      |   19 
 mm/workingset.c                                  |    7 
 mm/zswap.c                                       |   15 
 net/socket.c                                     |    2 
 net/sunrpc/rpc_pipe.c                            |    2 
 scripts/spelling.txt                             |   16 
 tools/testing/selftests/cgroup/cgroup_util.c     |   15 
 tools/testing/selftests/cgroup/cgroup_util.h     |    1 
 tools/testing/selftests/cgroup/test_memcontrol.c |   78 
 tools/testing/selftests/damon/Makefile           |    1 
 tools/testing/selftests/damon/sysfs.sh           |  306 ++
 tools/testing/selftests/vm/.gitignore            |    1 
 tools/testing/selftests/vm/Makefile              |    7 
 tools/testing/selftests/vm/hugepage-vmemmap.c    |  144 +
 tools/testing/selftests/vm/run_vmtests.sh        |   11 
 tools/testing/selftests/vm/userfaultfd.c         |    2 
 tools/testing/selftests/x86/Makefile             |    6 
 264 files changed, 7205 insertions(+), 3090 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-03-16 23:14 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-03-16 23:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm, patches

4 patches, based on 56e337f2cf1326323844927a04e9dbce9a244835.

Subsystems affected by this patch series:

  mm/swap
  kconfig
  ocfs2
  selftests

Subsystem: mm/swap

    Guo Ziliang <guo.ziliang@zte.com.cn>:
      mm: swap: get rid of deadloop in swapin readahead

Subsystem: kconfig

    Qian Cai <quic_qiancai@quicinc.com>:
      configs/debug: restore DEBUG_INFO=y for overriding

Subsystem: ocfs2

    Joseph Qi <joseph.qi@linux.alibaba.com>:
      ocfs2: fix crash when initialize filecheck kobj fails

Subsystem: selftests

    Yosry Ahmed <yosryahmed@google.com>:
      selftests: vm: fix clang build error multiple output files

 fs/ocfs2/super.c                    |   22 +++++++++++-----------
 kernel/configs/debug.config         |    1 +
 mm/swap_state.c                     |    2 +-
 tools/testing/selftests/vm/Makefile |    6 ++----
 4 files changed, 15 insertions(+), 16 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-03-05  4:28 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-03-05  4:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm, patches

8 patches, based on 07ebd38a0da24d2534da57b4841346379db9f354.

Subsystems affected by this patch series:

  mm/hugetlb
  mm/pagemap
  memfd
  selftests
  mm/userfaultfd
  kconfig

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
      selftests/vm: cleanup hugetlb file after mremap test

Subsystem: mm/pagemap

    Suren Baghdasaryan <surenb@google.com>:
      mm: refactor vm_area_struct::anon_vma_name usage code
      mm: prevent vm_area_struct::anon_name refcount saturation
      mm: fix use-after-free when anon vma name is used after vma is freed

Subsystem: memfd

    Hugh Dickins <hughd@google.com>:
      memfd: fix F_SEAL_WRITE after shmem huge page allocated

Subsystem: selftests

    Chengming Zhou <zhouchengming@bytedance.com>:
      kselftest/vm: fix tests build with old libc

Subsystem: mm/userfaultfd

    Yun Zhou <yun.zhou@windriver.com>:
      proc: fix documentation and description of pagemap

Subsystem: kconfig

    Qian Cai <quic_qiancai@quicinc.com>:
      configs/debug: set CONFIG_DEBUG_INFO=y properly

 Documentation/admin-guide/mm/pagemap.rst     |    2 
 fs/proc/task_mmu.c                           |    9 +-
 fs/userfaultfd.c                             |    6 -
 include/linux/mm.h                           |    7 +
 include/linux/mm_inline.h                    |  105 ++++++++++++++++++---------
 include/linux/mm_types.h                     |    5 +
 kernel/configs/debug.config                  |    2 
 kernel/fork.c                                |    4 -
 kernel/sys.c                                 |   19 +++-
 mm/madvise.c                                 |   98 +++++++++----------------
 mm/memfd.c                                   |   40 +++++++---
 mm/mempolicy.c                               |    2 
 mm/mlock.c                                   |    2 
 mm/mmap.c                                    |   12 +--
 mm/mprotect.c                                |    2 
 tools/testing/selftests/vm/hugepage-mremap.c |   26 ++++--
 tools/testing/selftests/vm/run_vmtests.sh    |    3 
 tools/testing/selftests/vm/userfaultfd.c     |    1 
 18 files changed, 201 insertions(+), 144 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-02-26  3:10 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-02-26  3:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm, patches

12 patches, based on c47658311d60be064b839f329c0e4d34f5f0735b.

Subsystems affected by this patch series:

  MAINTAINERS
  mm/hugetlb
  mm/kasan
  mm/hugetlbfs
  mm/pagemap
  mm/selftests
  mm/memcg
  m/slab
  mailmap
  memfd

Subsystem: MAINTAINERS

    Luis Chamberlain <mcgrof@kernel.org>:
      MAINTAINERS: add sysctl-next git tree

Subsystem: mm/hugetlb

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      mm/hugetlb: fix kernel crash with hugetlb mremap

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: test: prevent cache merging in kmem_cache_double_destroy

Subsystem: mm/hugetlbfs

    Liu Yuntao <liuyuntao10@huawei.com>:
      hugetlbfs: fix a truncation issue in hugepages parameter

Subsystem: mm/pagemap

    Suren Baghdasaryan <surenb@google.com>:
      mm: fix use-after-free bug when mm->mmap is reused after being freed

Subsystem: mm/selftests

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      selftest/vm: fix map_fixed_noreplace test failure

Subsystem: mm/memcg

    Roman Gushchin <roman.gushchin@linux.dev>:
      MAINTAINERS: add Roman as a memcg co-maintainer

    Vladimir Davydov <vdavydov.dev@gmail.com>:
      MAINTAINERS: remove Vladimir from memcg maintainers

    Shakeel Butt <shakeelb@google.com>:
      MAINTAINERS: add Shakeel as a memcg co-maintainer

Subsystem: m/slab

    Vlastimil Babka <vbabka@suse.cz>:
      MAINTAINERS, SLAB: add Roman as reviewer, git tree

Subsystem: mailmap

    Roman Gushchin <roman.gushchin@linux.dev>:
      mailmap: update Roman Gushchin's email

Subsystem: memfd

    Mike Kravetz <mike.kravetz@oracle.com>:
      selftests/memfd: clean up mapping in mfd_fail_write

 .mailmap                                         |    3 +
 MAINTAINERS                                      |    6 ++
 lib/test_kasan.c                                 |    5 +-
 mm/hugetlb.c                                     |   11 ++---
 mm/mmap.c                                        |    1 
 tools/testing/selftests/memfd/memfd_test.c       |    1 
 tools/testing/selftests/vm/map_fixed_noreplace.c |   49 +++++++++++++++++------
 7 files changed, 56 insertions(+), 20 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2022-02-12  2:02 ` incoming Linus Torvalds
@ 2022-02-12  5:24   ` Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-02-12  5:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux-MM, mm-commits, patches

On Fri, 11 Feb 2022 18:02:53 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, Feb 11, 2022 at 4:27 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > 5 patches, based on f1baf68e1383f6ed93eb9cff2866d46562607a43.
> 
> So this *completely* flummoxed 'b4', because you first sent the wrong
> series, and then sent the right one in the same thread.
> 
> I fetched the emails  manually, but honestly, this was confusing even
> then, with two "[PATCH x/5]" series where the only way to tell the
> right one was basically by date of email. They did arrive in the same
> order in my mailbox, but even that wouldn't have been guaranteed if
> there had been some mailer delays somewhere..

Yes, I wondered.  Sorry bout that.

> So next time when you mess up, resend it all as a completely new
> series and completely new threading - so with a new header email too.
> Please?

Wilco.

> And since I'm here, let me just verify that yes, the series you
> actually want me to apply is this one (as described by the head
> email):
> 
>   Subject: [patch 1/5] fs/binfmt_elf: fix PT_LOAD p_align values ..
>   Subject: [patch 2/5] fs/proc: task_mmu.c: don't read mapcount f..
>   Subject: [patch 3/5] mm: vmscan: remove deadlock due to throttl..
>   Subject: [patch 4/5] mm: memcg: synchronize objcg lists with a ..
>   Subject: [patch 5/5] kfence: make test case compatible with run..
> 
> and not the other one with GUP patches?

Those are the ones.  Five fixes, three with cc:stable.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2022-02-12  0:27 incoming Andrew Morton
@ 2022-02-12  2:02 ` Linus Torvalds
  2022-02-12  5:24   ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2022-02-12  2:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits, patches

On Fri, Feb 11, 2022 at 4:27 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> 5 patches, based on f1baf68e1383f6ed93eb9cff2866d46562607a43.

So this *completely* flummoxed 'b4', because you first sent the wrong
series, and then sent the right one in the same thread.

I fetched the emails  manually, but honestly, this was confusing even
then, with two "[PATCH x/5]" series where the only way to tell the
right one was basically by date of email. They did arrive in the same
order in my mailbox, but even that wouldn't have been guaranteed if
there had been some mailer delays somewhere..

So next time when you mess up, resend it all as a completely new
series and completely new threading - so with a new header email too.
Please?

And since I'm here, let me just verify that yes, the series you
actually want me to apply is this one (as described by the head
email):

  Subject: [patch 1/5] fs/binfmt_elf: fix PT_LOAD p_align values ..
  Subject: [patch 2/5] fs/proc: task_mmu.c: don't read mapcount f..
  Subject: [patch 3/5] mm: vmscan: remove deadlock due to throttl..
  Subject: [patch 4/5] mm: memcg: synchronize objcg lists with a ..
  Subject: [patch 5/5] kfence: make test case compatible with run..

and not the other one with GUP patches?

             Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-02-12  0:27 Andrew Morton
  2022-02-12  2:02 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2022-02-12  0:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits, patches

5 patches, based on f1baf68e1383f6ed93eb9cff2866d46562607a43.

Subsystems affected by this patch series:

  binfmt
  procfs
  mm/vmscan
  mm/memcg
  mm/kfence

Subsystem: binfmt

    Mike Rapoport <rppt@linux.ibm.com>:
      fs/binfmt_elf: fix PT_LOAD p_align values for loaders

Subsystem: procfs

    Yang Shi <shy828301@gmail.com>:
      fs/proc: task_mmu.c: don't read mapcount for migration entry

Subsystem: mm/vmscan

    Mel Gorman <mgorman@suse.de>:
      mm: vmscan: remove deadlock due to throttling failing to make progress

Subsystem: mm/memcg

    Roman Gushchin <guro@fb.com>:
      mm: memcg: synchronize objcg lists with a dedicated spinlock

Subsystem: mm/kfence

    Peng Liu <liupeng256@huawei.com>:
      kfence: make test case compatible with run time set sample interval

 fs/binfmt_elf.c            |    2 +-
 fs/proc/task_mmu.c         |   40 +++++++++++++++++++++++++++++++---------
 include/linux/kfence.h     |    2 ++
 include/linux/memcontrol.h |    5 +++--
 mm/kfence/core.c           |    3 ++-
 mm/kfence/kfence_test.c    |    8 ++++----
 mm/memcontrol.c            |   10 +++++-----
 mm/vmscan.c                |    4 +++-
 8 files changed, 51 insertions(+), 23 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-02-04  4:48 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-02-04  4:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

10 patches, based on 1f2cfdd349b7647f438c1e552dc1b983da86d830.

Subsystems affected by this patch series:

  mm/vmscan
  mm/debug
  mm/pagemap
  ipc
  mm/kmemleak
  MAINTAINERS
  mm/selftests

Subsystem: mm/vmscan

    Chen Wandun <chenwandun@huawei.com>:
      Revert "mm/page_isolation: unset migratetype directly for non Buddy page"

Subsystem: mm/debug

    Pasha Tatashin <pasha.tatashin@soleen.com>:
    Patch series "page table check fixes and cleanups", v5:
      mm/debug_vm_pgtable: remove pte entry from the page table
      mm/page_table_check: use unsigned long for page counters and cleanup
      mm/khugepaged: unify collapse pmd clear, flush and free
      mm/page_table_check: check entries at pmd levels

Subsystem: mm/pagemap

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/pgtable: define pte_index so that preprocessor could recognize it

Subsystem: ipc

    Minghao Chi <chi.minghao@zte.com.cn>:
      ipc/sem: do not sleep with a spin lock held

Subsystem: mm/kmemleak

    Lang Yu <lang.yu@amd.com>:
      mm/kmemleak: avoid scanning potential huge holes

Subsystem: MAINTAINERS

    Mike Rapoport <rppt@linux.ibm.com>:
      MAINTAINERS: update rppt's email

Subsystem: mm/selftests

    Shuah Khan <skhan@linuxfoundation.org>:
      kselftest/vm: revert "tools/testing/selftests/vm/userfaultfd.c: use swap() to make code cleaner"

 MAINTAINERS                              |    2 -
 include/linux/page_table_check.h         |   19 ++++++++++
 include/linux/pgtable.h                  |    1 
 ipc/sem.c                                |    4 +-
 mm/debug_vm_pgtable.c                    |    2 +
 mm/khugepaged.c                          |   37 +++++++++++---------
 mm/kmemleak.c                            |   13 +++----
 mm/page_isolation.c                      |    2 -
 mm/page_table_check.c                    |   55 +++++++++++++++----------------
 tools/testing/selftests/vm/userfaultfd.c |   11 ++++--
 10 files changed, 89 insertions(+), 57 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-01-29 21:40 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-01-29 21:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

12 patches, based on f8c7e4ede46fe63ff10000669652648aab09d112.

Subsystems affected by this patch series:

  sysctl
  binfmt
  ia64
  mm/memory-failure
  mm/folios
  selftests
  mm/kasan
  mm/psi
  ocfs2

Subsystem: sysctl

    Andrew Morton <akpm@linux-foundation.org>:
      include/linux/sysctl.h: fix register_sysctl_mount_point() return type

Subsystem: binfmt

    Tong Zhang <ztong0001@gmail.com>:
      binfmt_misc: fix crash when load/unload module

Subsystem: ia64

    Randy Dunlap <rdunlap@infradead.org>:
      ia64: make IA64_MCA_RECOVERY bool instead of tristate

Subsystem: mm/memory-failure

    Joao Martins <joao.m.martins@oracle.com>:
      memory-failure: fetch compound_head after pgmap_pfn_valid()

Subsystem: mm/folios

    Wei Yang <richard.weiyang@gmail.com>:
      mm: page->mapping folio->mapping should have the same offset

Subsystem: selftests

    Maor Gottlieb <maorg@nvidia.com>:
      tools/testing/scatterlist: add missing defines

Subsystem: mm/kasan

    Marco Elver <elver@google.com>:
      kasan: test: fix compatibility with FORTIFY_SOURCE

    Peter Collingbourne <pcc@google.com>:
      mm, kasan: use compare-exchange operation to set KASAN page tag

Subsystem: mm/psi

    Suren Baghdasaryan <surenb@google.com>:
      psi: fix "no previous prototype" warnings when CONFIG_CGROUPS=n
      psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n

Subsystem: ocfs2

    Joseph Qi <joseph.qi@linux.alibaba.com>:
    Patch series "ocfs2: fix a deadlock case":
      jbd2: export jbd2_journal_[grab|put]_journal_head
      ocfs2: fix a deadlock when commit trans

 arch/ia64/Kconfig                    |    2 
 fs/binfmt_misc.c                     |    8 +--
 fs/jbd2/journal.c                    |    2 
 fs/ocfs2/suballoc.c                  |   25 ++++-------
 include/linux/mm.h                   |   17 +++++--
 include/linux/mm_types.h             |    1 
 include/linux/psi.h                  |   11 ++--
 include/linux/sysctl.h               |    2 
 kernel/sched/psi.c                   |   79 ++++++++++++++++++-----------------
 lib/test_kasan.c                     |    5 ++
 mm/memory-failure.c                  |    6 ++
 tools/testing/scatterlist/linux/mm.h |    3 -
 12 files changed, 91 insertions(+), 70 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2022-01-29  4:25 ` incoming Matthew Wilcox
@ 2022-01-29  6:23   ` Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-01-29  6:23 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Linus Torvalds, mm-commits, linux-mm

On Sat, 29 Jan 2022 04:25:33 +0000 Matthew Wilcox <willy@infradead.org> wrote:

> On Fri, Jan 28, 2022 at 06:13:41PM -0800, Andrew Morton wrote:
> > 12 patches, based on 169387e2aa291a4e3cb856053730fe99d6cec06f.
>   ^^
> 
> I see 7?

Crap, sorry, ignore all this, shall redo tomorrow.

(It wasn't a good day over here.  The thing with disk drives is that
the bigger they are, the harder they fall).



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2022-01-29  2:13 incoming Andrew Morton
@ 2022-01-29  4:25 ` Matthew Wilcox
  2022-01-29  6:23   ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Matthew Wilcox @ 2022-01-29  4:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linus Torvalds, mm-commits, linux-mm

On Fri, Jan 28, 2022 at 06:13:41PM -0800, Andrew Morton wrote:
> 12 patches, based on 169387e2aa291a4e3cb856053730fe99d6cec06f.
  ^^

I see 7?


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-01-29  2:13 Andrew Morton
  2022-01-29  4:25 ` incoming Matthew Wilcox
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2022-01-29  2:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

12 patches, based on 169387e2aa291a4e3cb856053730fe99d6cec06f.

Subsystems affected by this patch series:

  sysctl
  binfmt
  ia64
  mm/memory-failure
  mm/folios
  selftests
  mm/kasan
  mm/psi
  ocfs2

Subsystem: sysctl

    Andrew Morton <akpm@linux-foundation.org>:
      include/linux/sysctl.h: fix register_sysctl_mount_point() return type

Subsystem: binfmt

    Tong Zhang <ztong0001@gmail.com>:
      binfmt_misc: fix crash when load/unload module

Subsystem: ia64

    Randy Dunlap <rdunlap@infradead.org>:
      ia64: make IA64_MCA_RECOVERY bool instead of tristate

Subsystem: mm/memory-failure

    Joao Martins <joao.m.martins@oracle.com>:
      memory-failure: fetch compound_head after pgmap_pfn_valid()

Subsystem: mm/folios

    Wei Yang <richard.weiyang@gmail.com>:
      mm: page->mapping folio->mapping should have the same offset

Subsystem: selftests

    Maor Gottlieb <maorg@nvidia.com>:
      tools/testing/scatterlist: add missing defines

Subsystem: mm/kasan

    Marco Elver <elver@google.com>:
      kasan: test: fix compatibility with FORTIFY_SOURCE

    Peter Collingbourne <pcc@google.com>:
      mm, kasan: use compare-exchange operation to set KASAN page tag

Subsystem: mm/psi

    Suren Baghdasaryan <surenb@google.com>:
      psi: fix "no previous prototype" warnings when CONFIG_CGROUPS=n
      psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n

Subsystem: ocfs2

    Joseph Qi <joseph.qi@linux.alibaba.com>:
    Patch series "ocfs2: fix a deadlock case":
      jbd2: export jbd2_journal_[grab|put]_journal_head
      ocfs2: fix a deadlock when commit trans

 arch/ia64/Kconfig                    |    2 
 fs/binfmt_misc.c                     |    8 +--
 fs/jbd2/journal.c                    |    2 
 fs/ocfs2/suballoc.c                  |   25 ++++-------
 include/linux/mm.h                   |   17 +++++--
 include/linux/mm_types.h             |    1 
 include/linux/psi.h                  |   11 ++--
 include/linux/sysctl.h               |    2 
 kernel/sched/psi.c                   |   79 ++++++++++++++++++-----------------
 lib/test_kasan.c                     |    5 ++
 mm/memory-failure.c                  |    6 ++
 tools/testing/scatterlist/linux/mm.h |    3 -
 12 files changed, 91 insertions(+), 70 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-01-22  6:10 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-01-22  6:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


This is the post-linux-next queue.  Material which was based on or
dependent upon material which was in -next.

69 patches, based on 9b57f458985742bd1c585f4c7f36d04634ce1143.

Subsystems affected by this patch series:

  mm/migration
  sysctl
  mm/zsmalloc
  proc
  lib

Subsystem: mm/migration

    Alistair Popple <apopple@nvidia.com>:
      mm/migrate.c: rework migration_entry_wait() to not take a pageref

Subsystem: sysctl

    Xiaoming Ni <nixiaoming@huawei.com>:
    Patch series "sysctl: first set of kernel/sysctl cleanups", v2:
      sysctl: add a new register_sysctl_init() interface
      sysctl: move some boundary constants from sysctl.c to sysctl_vals
      hung_task: move hung_task sysctl interface to hung_task.c
      watchdog: move watchdog sysctl interface to watchdog.c

    Stephen Kitt <steve@sk2.org>:
      sysctl: make ngroups_max const

    Xiaoming Ni <nixiaoming@huawei.com>:
      sysctl: use const for typically used max/min proc sysctls
      sysctl: use SYSCTL_ZERO to replace some static int zero uses
      aio: move aio sysctl to aio.c
      dnotify: move dnotify sysctl to dnotify.c

    Luis Chamberlain <mcgrof@kernel.org>:
    Patch series "sysctl: second set of kernel/sysctl cleanups", v2:
      hpet: simplify subdirectory registration with register_sysctl()
      i915: simplify subdirectory registration with register_sysctl()
      macintosh/mac_hid.c: simplify subdirectory registration with register_sysctl()
      ocfs2: simplify subdirectory registration with register_sysctl()
      test_sysctl: simplify subdirectory registration with register_sysctl()

    Xiaoming Ni <nixiaoming@huawei.com>:
      inotify: simplify subdirectory registration with register_sysctl()

    Luis Chamberlain <mcgrof@kernel.org>:
      cdrom: simplify subdirectory registration with register_sysctl()

    Xiaoming Ni <nixiaoming@huawei.com>:
      eventpoll: simplify sysctl declaration with register_sysctl()
    Patch series "sysctl: 3rd set of kernel/sysctl cleanups", v2:
      firmware_loader: move firmware sysctl to its own files
      random: move the random sysctl declarations to its own file

    Luis Chamberlain <mcgrof@kernel.org>:
      sysctl: add helper to register a sysctl mount point
      fs: move binfmt_misc sysctl to its own file

    Xiaoming Ni <nixiaoming@huawei.com>:
      printk: move printk sysctl to printk/sysctl.c
      scsi/sg: move sg-big-buff sysctl to scsi/sg.c
      stackleak: move stack_erasing sysctl to stackleak.c

    Luis Chamberlain <mcgrof@kernel.org>:
      sysctl: share unsigned long const values
    Patch series "sysctl: 4th set of kernel/sysctl cleanups":
      fs: move inode sysctls to its own file
      fs: move fs stat sysctls to file_table.c
      fs: move dcache sysctls to its own file
      sysctl: move maxolduid as a sysctl specific const
      fs: move shared sysctls to fs/sysctls.c
      fs: move locking sysctls where they are used
      fs: move namei sysctls to its own file
      fs: move fs/exec.c sysctls into its own file
      fs: move pipe sysctls to is own file
    Patch series "sysctl: add and use base directory declarer and registration helper":
      sysctl: add and use base directory declarer and registration helper
      fs: move namespace sysctls and declare fs base directory
      kernel/sysctl.c: rename sysctl_init() to sysctl_init_bases()

    Xiaoming Ni <nixiaoming@huawei.com>:
      printk: fix build warning when CONFIG_PRINTK=n
      fs/coredump: move coredump sysctls into its own file
      kprobe: move sysctl_kprobes_optimization to kprobes.c

    Colin Ian King <colin.i.king@gmail.com>:
      kernel/sysctl.c: remove unused variable ten_thousand

    Baokun Li <libaokun1@huawei.com>:
      sysctl: returns -EINVAL when a negative value is passed to proc_doulongvec_minmax

Subsystem: mm/zsmalloc

    Minchan Kim <minchan@kernel.org>:
    Patch series "zsmalloc: remove bit_spin_lock", v2:
      zsmalloc: introduce some helper functions
      zsmalloc: rename zs_stat_type to class_stat_type
      zsmalloc: decouple class actions from zspage works
      zsmalloc: introduce obj_allocated
      zsmalloc: move huge compressed obj from page to zspage
      zsmalloc: remove zspage isolation for migration
      locking/rwlocks: introduce write_lock_nested
      zsmalloc: replace per zpage lock with pool->migrate_lock

    Mike Galbraith <umgwanakikbuti@gmail.com>:
      zsmalloc: replace get_cpu_var with local_lock

Subsystem: proc

    Muchun Song <songmuchun@bytedance.com>:
      fs: proc: store PDE()->data into inode->i_private
      proc: remove PDE_DATA() completely

Subsystem: lib

    Vlastimil Babka <vbabka@suse.cz>:
      lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
lib/stackdepot: fix spelling mistake and grammar in pr_err message
lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup
lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup3
lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() - fixup4

    Marco Elver <elver@google.com>:
      lib/stackdepot: always do filter_irq_stacks() in stack_depot_save()

    Christoph Hellwig <hch@lst.de>:
    Patch series "remove Xen tmem leftovers":
      mm: remove cleancache
      frontswap: remove frontswap_writethrough
      frontswap: remove frontswap_tmem_exclusive_gets
      frontswap: remove frontswap_shrink
      frontswap: remove frontswap_curr_pages
      frontswap: simplify frontswap_init
      frontswap: remove the frontswap exports
      mm: simplify try_to_unuse
      frontswap: remove frontswap_test
      frontswap: simplify frontswap_register_ops
      mm: mark swap_lock and swap_active_head static
      frontswap: remove support for multiple ops
      mm: hide the FRONTSWAP Kconfig symbol

 Documentation/vm/cleancache.rst                        |  296 ------
 Documentation/vm/frontswap.rst                         |   31 
 Documentation/vm/index.rst                             |    1 
 MAINTAINERS                                            |    7 
 arch/alpha/kernel/srm_env.c                            |    4 
 arch/arm/configs/bcm2835_defconfig                     |    1 
 arch/arm/configs/qcom_defconfig                        |    1 
 arch/arm/kernel/atags_proc.c                           |    2 
 arch/arm/mm/alignment.c                                |    2 
 arch/ia64/kernel/salinfo.c                             |   10 
 arch/m68k/configs/amiga_defconfig                      |    1 
 arch/m68k/configs/apollo_defconfig                     |    1 
 arch/m68k/configs/atari_defconfig                      |    1 
 arch/m68k/configs/bvme6000_defconfig                   |    1 
 arch/m68k/configs/hp300_defconfig                      |    1 
 arch/m68k/configs/mac_defconfig                        |    1 
 arch/m68k/configs/multi_defconfig                      |    1 
 arch/m68k/configs/mvme147_defconfig                    |    1 
 arch/m68k/configs/mvme16x_defconfig                    |    1 
 arch/m68k/configs/q40_defconfig                        |    1 
 arch/m68k/configs/sun3_defconfig                       |    1 
 arch/m68k/configs/sun3x_defconfig                      |    1 
 arch/powerpc/kernel/proc_powerpc.c                     |    4 
 arch/s390/configs/debug_defconfig                      |    1 
 arch/s390/configs/defconfig                            |    1 
 arch/sh/mm/alignment.c                                 |    4 
 arch/xtensa/platforms/iss/simdisk.c                    |    4 
 block/bdev.c                                           |    5 
 drivers/acpi/proc.c                                    |    2 
 drivers/base/firmware_loader/fallback.c                |    7 
 drivers/base/firmware_loader/fallback.h                |   11 
 drivers/base/firmware_loader/fallback_table.c          |   25 
 drivers/cdrom/cdrom.c                                  |   23 
 drivers/char/hpet.c                                    |   22 
 drivers/char/random.c                                  |   14 
 drivers/gpu/drm/drm_dp_mst_topology.c                  |    1 
 drivers/gpu/drm/drm_mm.c                               |    4 
 drivers/gpu/drm/drm_modeset_lock.c                     |    9 
 drivers/gpu/drm/i915/i915_perf.c                       |   22 
 drivers/gpu/drm/i915/intel_runtime_pm.c                |    3 
 drivers/hwmon/dell-smm-hwmon.c                         |    4 
 drivers/macintosh/mac_hid.c                            |   24 
 drivers/net/bonding/bond_procfs.c                      |    8 
 drivers/net/wireless/cisco/airo.c                      |   22 
 drivers/net/wireless/intersil/hostap/hostap_ap.c       |   16 
 drivers/net/wireless/intersil/hostap/hostap_download.c |    2 
 drivers/net/wireless/intersil/hostap/hostap_proc.c     |   24 
 drivers/net/wireless/ray_cs.c                          |    2 
 drivers/nubus/proc.c                                   |   36 
 drivers/parisc/led.c                                   |    4 
 drivers/pci/proc.c                                     |   10 
 drivers/platform/x86/thinkpad_acpi.c                   |    4 
 drivers/platform/x86/toshiba_acpi.c                    |   16 
 drivers/pnp/isapnp/proc.c                              |    2 
 drivers/pnp/pnpbios/proc.c                             |    4 
 drivers/scsi/scsi_proc.c                               |    4 
 drivers/scsi/sg.c                                      |   35 
 drivers/usb/gadget/function/rndis.c                    |    4 
 drivers/zorro/proc.c                                   |    2 
 fs/Makefile                                            |    4 
 fs/afs/proc.c                                          |    6 
 fs/aio.c                                               |   31 
 fs/binfmt_misc.c                                       |    6 
 fs/btrfs/extent_io.c                                   |   10 
 fs/btrfs/super.c                                       |    2 
 fs/coredump.c                                          |   66 +
 fs/dcache.c                                            |   37 
 fs/eventpoll.c                                         |   10 
 fs/exec.c                                              |  145 +--
 fs/ext4/mballoc.c                                      |   14 
 fs/ext4/readpage.c                                     |    6 
 fs/ext4/super.c                                        |    3 
 fs/f2fs/data.c                                         |   13 
 fs/file_table.c                                        |   47 -
 fs/inode.c                                             |   39 
 fs/jbd2/journal.c                                      |    2 
 fs/locks.c                                             |   34 
 fs/mpage.c                                             |    7 
 fs/namei.c                                             |   58 +
 fs/namespace.c                                         |   24 
 fs/notify/dnotify/dnotify.c                            |   21 
 fs/notify/fanotify/fanotify_user.c                     |   10 
 fs/notify/inotify/inotify_user.c                       |   11 
 fs/ntfs3/ntfs_fs.h                                     |    1 
 fs/ocfs2/stackglue.c                                   |   25 
 fs/ocfs2/super.c                                       |    2 
 fs/pipe.c                                              |   64 +
 fs/proc/generic.c                                      |    6 
 fs/proc/inode.c                                        |    1 
 fs/proc/internal.h                                     |    5 
 fs/proc/proc_net.c                                     |    8 
 fs/proc/proc_sysctl.c                                  |   67 +
 fs/super.c                                             |    3 
 fs/sysctls.c                                           |   47 -
 include/linux/aio.h                                    |    4 
 include/linux/cleancache.h                             |  124 --
 include/linux/coredump.h                               |   10 
 include/linux/dcache.h                                 |   10 
 include/linux/dnotify.h                                |    1 
 include/linux/fanotify.h                               |    2 
 include/linux/frontswap.h                              |   35 
 include/linux/fs.h                                     |   18 
 include/linux/inotify.h                                |    3 
 include/linux/kprobes.h                                |    6 
 include/linux/migrate.h                                |    2 
 include/linux/mount.h                                  |    3 
 include/linux/pipe_fs_i.h                              |    4 
 include/linux/poll.h                                   |    2 
 include/linux/printk.h                                 |    4 
 include/linux/proc_fs.h                                |   17 
 include/linux/ref_tracker.h                            |    2 
 include/linux/rwlock.h                                 |    6 
 include/linux/rwlock_api_smp.h                         |    8 
 include/linux/rwlock_rt.h                              |   10 
 include/linux/sched/sysctl.h                           |   14 
 include/linux/seq_file.h                               |    2 
 include/linux/shmem_fs.h                               |    3 
 include/linux/spinlock_api_up.h                        |    1 
 include/linux/stackdepot.h                             |   25 
 include/linux/stackleak.h                              |    5 
 include/linux/swapfile.h                               |    3 
 include/linux/sysctl.h                                 |   67 +
 include/scsi/sg.h                                      |    4 
 init/main.c                                            |    9 
 ipc/util.c                                             |    2 
 kernel/hung_task.c                                     |   81 +
 kernel/irq/proc.c                                      |    8 
 kernel/kprobes.c                                       |   30 
 kernel/locking/spinlock.c                              |   10 
 kernel/locking/spinlock_rt.c                           |   12 
 kernel/printk/Makefile                                 |    5 
 kernel/printk/internal.h                               |    8 
 kernel/printk/printk.c                                 |    4 
 kernel/printk/sysctl.c                                 |   85 +
 kernel/resource.c                                      |    4 
 kernel/stackleak.c                                     |   26 
 kernel/sysctl.c                                        |  790 +----------------
 kernel/watchdog.c                                      |  101 ++
 lib/Kconfig                                            |    4 
 lib/Kconfig.kasan                                      |    2 
 lib/stackdepot.c                                       |   46 
 lib/test_sysctl.c                                      |   22 
 mm/Kconfig                                             |   40 
 mm/Makefile                                            |    1 
 mm/cleancache.c                                        |  315 ------
 mm/filemap.c                                           |  102 +-
 mm/frontswap.c                                         |  259 -----
 mm/kasan/common.c                                      |    1 
 mm/migrate.c                                           |   38 
 mm/page_owner.c                                        |    2 
 mm/shmem.c                                             |   33 
 mm/swapfile.c                                          |   90 -
 mm/truncate.c                                          |   15 
 mm/zsmalloc.c                                          |  557 ++++-------
 mm/zswap.c                                             |    8 
 net/atm/proc.c                                         |    4 
 net/bluetooth/af_bluetooth.c                           |    8 
 net/can/bcm.c                                          |    2 
 net/can/proc.c                                         |    2 
 net/core/neighbour.c                                   |    6 
 net/core/pktgen.c                                      |    6 
 net/ipv4/netfilter/ipt_CLUSTERIP.c                     |    6 
 net/ipv4/raw.c                                         |    8 
 net/ipv4/tcp_ipv4.c                                    |    2 
 net/ipv4/udp.c                                         |    6 
 net/netfilter/x_tables.c                               |   10 
 net/netfilter/xt_hashlimit.c                           |   18 
 net/netfilter/xt_recent.c                              |    4 
 net/sunrpc/auth_gss/svcauth_gss.c                      |    4 
 net/sunrpc/cache.c                                     |   24 
 net/sunrpc/stats.c                                     |    2 
 sound/core/info.c                                      |    4 
 172 files changed, 1877 insertions(+), 2931 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-01-20  2:07 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-01-20  2:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

55 patches, based on df0cc57e057f18e44dac8e6c18aba47ab53202f9 ("Linux 5.16")

Subsystems affected by this patch series:

  percpu
  procfs
  sysctl
  misc
  core-kernel
  get_maintainer
  lib
  checkpatch
  binfmt
  nilfs2
  hfs
  fat
  adfs
  panic
  delayacct
  kconfig
  kcov
  ubsan

Subsystem: percpu

    Kefeng Wang <wangkefeng.wang@huawei.com>:
    Patch series "mm: percpu: Cleanup percpu first chunk function":
      mm: percpu: generalize percpu related config
      mm: percpu: add pcpu_fc_cpu_to_node_fn_t typedef
      mm: percpu: add generic pcpu_fc_alloc/free funciton
      mm: percpu: add generic pcpu_populate_pte() function

Subsystem: procfs

    David Hildenbrand <david@redhat.com>:
      proc/vmcore: don't fake reading zeroes on surprise vmcore_cb unregistration

    Hans de Goede <hdegoede@redhat.com>:
      proc: make the proc_create[_data]() stubs static inlines

    Qi Zheng <zhengqi.arch@bytedance.com>:
      proc: convert the return type of proc_fd_access_allowed() to be boolean

Subsystem: sysctl

    Geert Uytterhoeven <geert+renesas@glider.be>:
      sysctl: fix duplicate path separator in printed entries

    luo penghao <luo.penghao@zte.com.cn>:
      sysctl: remove redundant ret assignment

Subsystem: misc

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      include/linux/unaligned: replace kernel.h with the necessary inclusions
      kernel.h: include a note to discourage people from including it in headers

Subsystem: core-kernel

    Yafang Shao <laoar.shao@gmail.com>:
    Patch series "task comm cleanups", v2:
      fs/exec: replace strlcpy with strscpy_pad in __set_task_comm
      fs/exec: replace strncpy with strscpy_pad in __get_task_comm
      drivers/infiniband: replace open-coded string copy with get_task_comm
      fs/binfmt_elf: replace open-coded string copy with get_task_comm
      samples/bpf/test_overhead_kprobe_kern: replace bpf_probe_read_kernel with bpf_probe_read_kernel_str to get task comm
      tools/bpf/bpftool/skeleton: replace bpf_probe_read_kernel with bpf_probe_read_kernel_str to get task comm
      tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN
      kthread: dynamically allocate memory to store kthread's full name

    Davidlohr Bueso <dave@stgolabs.net>:
      kernel/sys.c: only take tasklist_lock for get/setpriority(PRIO_PGRP)

Subsystem: get_maintainer

    Randy Dunlap <rdunlap@infradead.org>:
      get_maintainer: don't remind about no git repo when --nogit is used

Subsystem: lib

    Alexey Dobriyan <adobriyan@gmail.com>:
      kstrtox: uninline everything

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      list: introduce list_is_head() helper and re-use it in list.h

    Zhen Lei <thunder.leizhen@huawei.com>:
      lib/list_debug.c: print more list debugging context in __list_del_entry_valid()

    Isabella Basso <isabbasso@riseup.net>:
    Patch series "test_hash.c: refactor into KUnit", v3:
      hash.h: remove unused define directive
      test_hash.c: split test_int_hash into arch-specific functions
      test_hash.c: split test_hash_init
      lib/Kconfig.debug: properly split hash test kernel entries
      test_hash.c: refactor into kunit

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      kunit: replace kernel.h with the necessary inclusions
      uuid: discourage people from using UAPI header in new code
      uuid: remove licence boilerplate text from the header

    Andrey Konovalov <andreyknvl@google.com>:
      lib/test_meminit: destroy cache in kmem_cache_alloc_bulk() test

Subsystem: checkpatch

    Jerome Forissier <jerome@forissier.org>:
      checkpatch: relax regexp for COMMIT_LOG_LONG_LINE

    Joe Perches <joe@perches.com>:
      checkpatch: improve Kconfig help test

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      const_structs.checkpatch: add frequently used ops structs

Subsystem: binfmt

    "H.J. Lu" <hjl.tools@gmail.com>:
      fs/binfmt_elf: use PT_LOAD p_align values for static PIE

Subsystem: nilfs2

    Colin Ian King <colin.i.king@gmail.com>:
      nilfs2: remove redundant pointer sbufs

Subsystem: hfs

    Kees Cook <keescook@chromium.org>:
      hfsplus: use struct_group_attr() for memcpy() region

Subsystem: fat

    "NeilBrown" <neilb@suse.de>:
       FAT: use io_schedule_timeout() instead of congestion_wait()

Subsystem: adfs

    Minghao Chi <chi.minghao@zte.com.cn>:
      fs/adfs: remove unneeded variable make code cleaner

Subsystem: panic

    Marco Elver <elver@google.com>:
      panic: use error_report_end tracepoint on warnings

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      panic: remove oops_id

Subsystem: delayacct

    Yang Yang <yang.yang29@zte.com.cn>:
      delayacct: support swapin delay accounting for swapping without blkio
      delayacct: fix incomplete disable operation when switch enable to disable
      delayacct: cleanup flags in struct task_delay_info and functions use it

    wangyong <wang.yong12@zte.com.cn>:
      Documentation/accounting/delay-accounting.rst: add thrashing page cache and direct compact
      delayacct: track delays from memory compact

Subsystem: kconfig

    Qian Cai <quic_qiancai@quicinc.com>:
      configs: introduce debug.config for CI-like setup

    Nathan Chancellor <nathan@kernel.org>:
    Patch series "Fix CONFIG_TEST_KMOD with 256kB page size":
      arch/Kconfig: split PAGE_SIZE_LESS_THAN_256KB from PAGE_SIZE_LESS_THAN_64KB
      btrfs: use generic Kconfig option for 256kB page size limit
      lib/Kconfig.debug: make TEST_KMOD depend on PAGE_SIZE_LESS_THAN_256KB

Subsystem: kcov

    Marco Elver <elver@google.com>:
      kcov: fix generic Kconfig dependencies if ARCH_WANTS_NO_INSTR

Subsystem: ubsan

    Kees Cook <keescook@chromium.org>:
      ubsan: remove CONFIG_UBSAN_OBJECT_SIZE

    Colin Ian King <colin.i.king@gmail.com>:
      lib: remove redundant assignment to variable ret

 Documentation/accounting/delay-accounting.rst           |   63 +-
 arch/Kconfig                                            |    4 
 arch/arm64/Kconfig                                      |   20 
 arch/ia64/Kconfig                                       |    9 
 arch/mips/Kconfig                                       |   10 
 arch/mips/mm/init.c                                     |   28 -
 arch/powerpc/Kconfig                                    |   17 
 arch/powerpc/kernel/setup_64.c                          |  113 ----
 arch/riscv/Kconfig                                      |   10 
 arch/sparc/Kconfig                                      |   12 
 arch/sparc/kernel/led.c                                 |    8 
 arch/sparc/kernel/smp_64.c                              |  119 -----
 arch/x86/Kconfig                                        |   19 
 arch/x86/kernel/setup_percpu.c                          |   82 ---
 drivers/base/arch_numa.c                                |   78 ---
 drivers/infiniband/hw/qib/qib.h                         |    2 
 drivers/infiniband/hw/qib/qib_file_ops.c                |    2 
 drivers/infiniband/sw/rxe/rxe_qp.c                      |    3 
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/xtlv.c |    2 
 fs/adfs/inode.c                                         |    4 
 fs/binfmt_elf.c                                         |    6 
 fs/btrfs/Kconfig                                        |    3 
 fs/exec.c                                               |    5 
 fs/fat/file.c                                           |    5 
 fs/hfsplus/hfsplus_raw.h                                |   12 
 fs/hfsplus/xattr.c                                      |    4 
 fs/nilfs2/page.c                                        |    4 
 fs/proc/array.c                                         |    3 
 fs/proc/base.c                                          |    4 
 fs/proc/proc_sysctl.c                                   |    9 
 fs/proc/vmcore.c                                        |   10 
 include/kunit/assert.h                                  |    2 
 include/linux/delayacct.h                               |  107 ++--
 include/linux/elfcore-compat.h                          |    5 
 include/linux/elfcore.h                                 |    5 
 include/linux/hash.h                                    |    5 
 include/linux/kernel.h                                  |    9 
 include/linux/kthread.h                                 |    1 
 include/linux/list.h                                    |   36 -
 include/linux/percpu.h                                  |   21 
 include/linux/proc_fs.h                                 |   12 
 include/linux/sched.h                                   |    9 
 include/linux/unaligned/packed_struct.h                 |    2 
 include/trace/events/error_report.h                     |    8 
 include/uapi/linux/taskstats.h                          |    6 
 include/uapi/linux/uuid.h                               |   10 
 kernel/configs/debug.config                             |  105 ++++
 kernel/delayacct.c                                      |   49 +-
 kernel/kthread.c                                        |   32 +
 kernel/panic.c                                          |   21 
 kernel/sys.c                                            |   16 
 lib/Kconfig.debug                                       |   45 +
 lib/Kconfig.ubsan                                       |   13 
 lib/Makefile                                            |    5 
 lib/asn1_encoder.c                                      |    2 
 lib/kstrtox.c                                           |   12 
 lib/list_debug.c                                        |    8 
 lib/lz4/lz4defs.h                                       |    2 
 lib/test_hash.c                                         |  375 +++++++---------
 lib/test_meminit.c                                      |    1 
 lib/test_ubsan.c                                        |   22 
 mm/Kconfig                                              |   12 
 mm/memory.c                                             |    4 
 mm/page_alloc.c                                         |    3 
 mm/page_io.c                                            |    3 
 mm/percpu.c                                             |  168 +++++--
 samples/bpf/offwaketime_kern.c                          |    4 
 samples/bpf/test_overhead_kprobe_kern.c                 |   11 
 samples/bpf/test_overhead_tp_kern.c                     |    5 
 scripts/Makefile.ubsan                                  |    1 
 scripts/checkpatch.pl                                   |   54 +-
 scripts/const_structs.checkpatch                        |   23 
 scripts/get_maintainer.pl                               |    2 
 tools/accounting/getdelays.c                            |    8 
 tools/bpf/bpftool/skeleton/pid_iter.bpf.c               |    4 
 tools/include/linux/hash.h                              |    5 
 tools/testing/selftests/bpf/progs/test_stacktrace_map.c |    6 
 tools/testing/selftests/bpf/progs/test_tracepoint.c     |    6 
 78 files changed, 943 insertions(+), 992 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2022-01-14 22:02 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2022-01-14 22:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

146 patches, based on df0cc57e057f18e44dac8e6c18aba47ab53202f9 ("Linux 5.16")

Subsystems affected by this patch series:

  kthread
  ia64
  scripts
  ntfs
  squashfs
  ocfs2
  vfs
  mm/slab-generic
  mm/slab
  mm/kmemleak
  mm/dax
  mm/kasan
  mm/debug
  mm/pagecache
  mm/gup
  mm/shmem
  mm/frontswap
  mm/memremap
  mm/memcg
  mm/selftests
  mm/pagemap
  mm/dma
  mm/vmalloc
  mm/memory-failure
  mm/hugetlb
  mm/userfaultfd
  mm/vmscan
  mm/mempolicy
  mm/oom-kill
  mm/hugetlbfs
  mm/migration
  mm/thp
  mm/ksm
  mm/page-poison
  mm/percpu
  mm/rmap
  mm/zswap
  mm/zram
  mm/cleanups
  mm/hmm
  mm/damon

Subsystem: kthread

    Cai Huoqing <caihuoqing@baidu.com>:
      kthread: add the helper function kthread_run_on_cpu()
      RDMA/siw: make use of the helper function kthread_run_on_cpu()
      ring-buffer: make use of the helper function kthread_run_on_cpu()
      rcutorture: make use of the helper function kthread_run_on_cpu()
      trace/osnoise: make use of the helper function kthread_run_on_cpu()
      trace/hwlat: make use of the helper function kthread_run_on_cpu()

Subsystem: ia64

    Yang Guang <yang.guang5@zte.com.cn>:
      ia64: module: use swap() to make code cleaner
      arch/ia64/kernel/setup.c: use swap() to make code cleaner

    Jason Wang <wangborong@cdjrlc.com>:
      ia64: fix typo in a comment

    Greg Kroah-Hartman <gregkh@linuxfoundation.org>:
      ia64: topology: use default_groups in kobj_type

Subsystem: scripts

    Drew Fustini <dfustini@baylibre.com>:
      scripts/spelling.txt: add "oveflow"

Subsystem: ntfs

    Yang Li <yang.lee@linux.alibaba.com>:
      fs/ntfs/attrib.c: fix one kernel-doc comment

Subsystem: squashfs

    Zheng Liang <zhengliang6@huawei.com>:
      squashfs: provide backing_dev_info in order to disable read-ahead

Subsystem: ocfs2

    Zhang Mingyu <zhang.mingyu@zte.com.cn>:
      ocfs2: use BUG_ON instead of if condition followed by BUG.

    Joseph Qi <joseph.qi@linux.alibaba.com>:
      ocfs2: clearly handle ocfs2_grab_pages_for_write() return value

    Greg Kroah-Hartman <gregkh@linuxfoundation.org>:
      ocfs2: use default_groups in kobj_type

    Colin Ian King <colin.i.king@gmail.com>:
      ocfs2: remove redundant assignment to pointer root_bh

    Greg Kroah-Hartman <gregkh@linuxfoundation.org>:
      ocfs2: cluster: use default_groups in kobj_type

    Colin Ian King <colin.i.king@gmail.com>:
      ocfs2: remove redundant assignment to variable free_space

Subsystem: vfs

    Amit Daniel Kachhap <amit.kachhap@arm.com>:
      fs/ioctl: remove unnecessary __user annotation

Subsystem: mm/slab-generic

    Marco Elver <elver@google.com>:
      mm/slab_common: use WARN() if cache still has objects on destroy

Subsystem: mm/slab

    Muchun Song <songmuchun@bytedance.com>:
      mm: slab: make slab iterator functions static

Subsystem: mm/kmemleak

    Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>:
      kmemleak: fix kmemleak false positive report with HW tag-based kasan enable

    Calvin Zhang <calvinzhang.cool@gmail.com>:
      mm: kmemleak: alloc gray object for reserved region with direct map

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      mm: defer kmemleak object creation of module_alloc()

Subsystem: mm/dax

    Joao Martins <joao.m.martins@oracle.com>:
    Patch series "mm, device-dax: Introduce compound pages in devmap", v7:
      mm/page_alloc: split prep_compound_page into head and tail subparts
      mm/page_alloc: refactor memmap_init_zone_device() page init
      mm/memremap: add ZONE_DEVICE support for compound pages
      device-dax: use ALIGN() for determining pgoff
      device-dax: use struct_size()
      device-dax: ensure dev_dax->pgmap is valid for dynamic devices
      device-dax: factor out page mapping initialization
      device-dax: set mapping prior to vmf_insert_pfn{,_pmd,pud}()
      device-dax: remove pfn from __dev_dax_{pte,pmd,pud}_fault()
      device-dax: compound devmap support

Subsystem: mm/kasan

    Marco Elver <elver@google.com>:
      kasan: test: add globals left-out-of-bounds test
      kasan: add ability to detect double-kmem_cache_destroy()
      kasan: test: add test case for double-kmem_cache_destroy()

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: fix quarantine conflicting with init_on_free

Subsystem: mm/debug

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm,fs: split dump_mapping() out from dump_page()

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/debug_vm_pgtable: update comments regarding migration swap entries

Subsystem: mm/pagecache

    chiminghao <chi.minghao@zte.com.cn>:
      mm/truncate.c: remove unneeded variable

Subsystem: mm/gup

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      gup: avoid multiple user access locking/unlocking in fault_in_{read/write}able

    Li Xinhai <lixinhai.lxh@gmail.com>:
      mm/gup.c: stricter check on THP migration entry during follow_pmd_mask

Subsystem: mm/shmem

    Yang Shi <shy828301@gmail.com>:
      mm: shmem: don't truncate page if memory failure happens

    Gang Li <ligang.bdlg@bytedance.com>:
      shmem: fix a race between shmem_unused_huge_shrink and shmem_evict_inode

Subsystem: mm/frontswap

    Christophe JAILLET <christophe.jaillet@wanadoo.fr>:
      mm/frontswap.c: use non-atomic '__set_bit()' when possible

Subsystem: mm/memremap

Subsystem: mm/memcg

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: make cgroup_memory_nokmem static

    Donghai Qiao <dqiao@redhat.com>:
      mm/page_counter: remove an incorrect call to propagate_protected_usage()

    Dan Schatzberg <schatzberg.dan@gmail.com>:
      mm/memcg: add oom_group_kill memory event

    Shakeel Butt <shakeelb@google.com>:
      memcg: better bounds on the memcg stats updates

    Wang Weiyang <wangweiyang2@huawei.com>:
      mm/memcg: use struct_size() helper in kzalloc()

    Shakeel Butt <shakeelb@google.com>:
      memcg: add per-memcg vmalloc stat

Subsystem: mm/selftests

    chiminghao <chi.minghao@zte.com.cn>:
      tools/testing/selftests/vm/userfaultfd.c: use swap() to make code cleaner

Subsystem: mm/pagemap

    Qi Zheng <zhengqi.arch@bytedance.com>:
      mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit

    Colin Cross <ccross@google.com>:
    Patch series "mm: rearrange madvise code to allow for reuse", v11:
      mm: rearrange madvise code to allow for reuse
      mm: add a field to store names for private anonymous memory

    Suren Baghdasaryan <surenb@google.com>:
      mm: add anonymous vma name refcounting

    Arnd Bergmann <arnd@arndb.de>:
      mm: move anon_vma declarations to linux/mm_inline.h
      mm: move tlb_flush_pending inline helpers to mm_inline.h

    Suren Baghdasaryan <surenb@google.com>:
      mm: protect free_pgtables with mmap_lock write lock in exit_mmap
      mm: document locking restrictions for vm_operations_struct::close
      mm/oom_kill: allow process_mrelease to run under mmap_lock protection

    Shuah Khan <skhan@linuxfoundation.org>:
      docs/vm: add vmalloced-kernel-stacks document

    Pasha Tatashin <pasha.tatashin@soleen.com>:
    Patch series "page table check", v3:
      mm: change page type prior to adding page table entry
      mm: ptep_clear() page table helper
      mm: page table check
      x86: mm: add x86_64 support for page table check

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: remove last argument of reuse_swap_page()
      mm: remove the total_mapcount argument from page_trans_huge_map_swapcount()
      mm: remove the total_mapcount argument from page_trans_huge_mapcount()

Subsystem: mm/dma

    Christian König <christian.koenig@amd.com>:
      mm/dmapool.c: revert "make dma pool to use kmalloc_node"

Subsystem: mm/vmalloc

    Michal Hocko <mhocko@suse.com>:
    Patch series "extend vmalloc support for constrained allocations", v2:
      mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc
      mm/vmalloc: add support for __GFP_NOFAIL
      mm/vmalloc: be more explicit about supported gfp flags.
      mm: allow !GFP_KERNEL allocations for kvmalloc
      mm: make slab and vmalloc allocators __GFP_NOLOCKDEP aware

    "NeilBrown" <neilb@suse.de>:
      mm: introduce memalloc_retry_wait()

    Suren Baghdasaryan <surenb@google.com>:
      mm/pagealloc: sysctl: change watermark_scale_factor max limit to 30%

    Changcheng Deng <deng.changcheng@zte.com.cn>:
      mm: fix boolreturn.cocci warning

    Xiongwei Song <sxwjean@gmail.com>:
      mm: page_alloc: fix building error on -Werror=array-compare

    Michal Hocko <mhocko@suse.com>:
      mm: drop node from alloc_pages_vma

    Miles Chen <miles.chen@mediatek.com>:
      include/linux/gfp.h: further document GFP_DMA32

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/page_alloc.c: modify the comment section for alloc_contig_pages()

    Baoquan He <bhe@redhat.com>:
    Patch series "Handle warning of allocation failure on DMA zone w/o managed pages", v4:
      mm_zone: add function to check if managed dma zone exists
      dma/pool: create dma atomic pool only if dma zone has managed pages
      mm/page_alloc.c: do not warn allocation failure on zone DMA if no managed pages

Subsystem: mm/memory-failure

Subsystem: mm/hugetlb

    Mina Almasry <almasrymina@google.com>:
      hugetlb: add hugetlb.*.numa_stat file

    Yosry Ahmed <yosryahmed@google.com>:
      mm, hugepages: make memory size variable in hugepage-mremap selftest

    Yang Yang <yang.yang29@zte.com.cn>:
      mm/vmstat: add events for THP max_ptes_* exceeds

    Waiman Long <longman@redhat.com>:
      selftests/vm: make charge_reserved_hugetlb.sh work with existing cgroup setting

Subsystem: mm/userfaultfd

    Peter Xu <peterx@redhat.com>:
      selftests/uffd: allow EINTR/EAGAIN

    Mike Kravetz <mike.kravetz@oracle.com>:
      userfaultfd/selftests: clean up hugetlb allocation code

Subsystem: mm/vmscan

    Gang Li <ligang.bdlg@bytedance.com>:
      vmscan: make drop_slab_node static

    Chen Wandun <chenwandun@huawei.com>:
      mm/page_isolation: unset migratetype directly for non Buddy page

Subsystem: mm/mempolicy

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
    Patch series "mm: add new syscall set_mempolicy_home_node", v6:
      mm/mempolicy: use policy_node helper with MPOL_PREFERRED_MANY
      mm/mempolicy: add set_mempolicy_home_node syscall
      mm/mempolicy: wire up syscall set_mempolicy_home_node

    Randy Dunlap <rdunlap@infradead.org>:
      mm/mempolicy: fix all kernel-doc warnings

Subsystem: mm/oom-kill

    Jann Horn <jannh@google.com>:
      mm, oom: OOM sysrq should always kill a process

Subsystem: mm/hugetlbfs

    Sean Christopherson <seanjc@google.com>:
      hugetlbfs: fix off-by-one error in hugetlb_vmdelete_list()

Subsystem: mm/migration

    Baolin Wang <baolin.wang@linux.alibaba.com>:
    Patch series "Improve the migration stats":
      mm: migrate: fix the return value of migrate_pages()
      mm: migrate: correct the hugetlb migration stats
      mm: compaction: fix the migration stats in trace_mm_compaction_migratepages()
      mm: migrate: support multiple target nodes demotion
      mm: migrate: add more comments for selecting target node randomly

    Huang Ying <ying.huang@intel.com>:
      mm/migrate: move node demotion code to near its user

    Colin Ian King <colin.i.king@gmail.com>:
      mm/migrate: remove redundant variables used in a for-loop

Subsystem: mm/thp

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/thp: drop unused trace events hugepage_[invalidate|splitting]

Subsystem: mm/ksm

    Nanyong Sun <sunnanyong@huawei.com>:
      mm: ksm: fix use-after-free kasan report in ksm_might_need_to_copy

Subsystem: mm/page-poison

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
    Patch series "mm/hwpoison: fix unpoison_memory()", v4:
      mm/hwpoison: mf_mutex for soft offline and unpoison
      mm/hwpoison: remove MF_MSG_BUDDY_2ND and MF_MSG_POISONED_HUGE
      mm/hwpoison: fix unpoison_memory()

Subsystem: mm/percpu

    Qi Zheng <zhengqi.arch@bytedance.com>:
      mm: memcg/percpu: account extra objcg space to memory cgroups

Subsystem: mm/rmap

    Huang Ying <ying.huang@intel.com>:
      mm/rmap: fix potential batched TLB flush race

Subsystem: mm/zswap

    Zhaoyu Liu <zackary.liu.pro@gmail.com>:
      zpool: remove the list of pools_head

Subsystem: mm/zram

    Luis Chamberlain <mcgrof@kernel.org>:
      zram: use ATTRIBUTE_GROUPS

Subsystem: mm/cleanups

    Quanfa Fu <fuqf0919@gmail.com>:
      mm: fix some comment errors

    Ting Liu <liuting.0x7c00@bytedance.com>:
      mm: make some vars and functions static or __init

Subsystem: mm/hmm

    Alistair Popple <apopple@nvidia.com>:
      mm/hmm.c: allow VM_MIXEDMAP to work with hmm_range_fault

Subsystem: mm/damon

    Xin Hao <xhao@linux.alibaba.com>:
    Patch series "mm/damon: Do some small changes", v4:
      mm/damon: unified access_check function naming rules
      mm/damon: add 'age' of region tracepoint support
      mm/damon/core: use abs() instead of diff_of()
      mm/damon: remove some unneeded function definitions in damon.h

    Yihao Han <hanyihao@vivo.com>:
      mm/damon/vaddr: remove swap_ranges() and replace it with swap()

    Xin Hao <xhao@linux.alibaba.com>:
      mm/damon/schemes: add the validity judgment of thresholds
      mm/damon: move damon_rand() definition into damon.h
      mm/damon: modify damon_rand() macro to static inline function

    SeongJae Park <sj@kernel.org>:
    Patch series "mm/damon: Misc cleanups":
      mm/damon: convert macro functions to static inline functions
      Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks
      Docs/admin-guide/mm/damon/usage: remove redundant information
      Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning
      Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk|rm)_contexts
      mm/damon: remove a mistakenly added comment for a future feature
    Patch series "mm/damon/schemes: Extend stats for better online analysis and tuning":
      mm/damon/schemes: account scheme actions that successfully applied
      mm/damon/schemes: account how many times quota limit has exceeded
      mm/damon/reclaim: provide reclamation statistics
      Docs/admin-guide/mm/damon/reclaim: document statistics parameters
      mm/damon/dbgfs: support all DAMOS stats
      Docs/admin-guide/mm/damon/usage: update for schemes statistics

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm/damon: add access checking for hugetlb pages

    Guoqing Jiang <guoqing.jiang@linux.dev>:
      mm/damon: move the implementation of damon_insert_region to damon.h

    SeongJae Park <sj@kernel.org>:
    Patch series "mm/damon: Hide unnecessary information disclosures":
      mm/damon/dbgfs: remove an unnecessary variable
      mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging
      mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log
      mm/damon: hide kernel pointer from tracepoint event

 Documentation/admin-guide/cgroup-v1/hugetlb.rst        |    4 
 Documentation/admin-guide/cgroup-v2.rst                |   11 
 Documentation/admin-guide/mm/damon/reclaim.rst         |   25 
 Documentation/admin-guide/mm/damon/usage.rst           |  235 +++++--
 Documentation/admin-guide/mm/numa_memory_policy.rst    |   16 
 Documentation/admin-guide/sysctl/vm.rst                |    2 
 Documentation/filesystems/proc.rst                     |    6 
 Documentation/vm/arch_pgtable_helpers.rst              |   20 
 Documentation/vm/index.rst                             |    2 
 Documentation/vm/page_migration.rst                    |   12 
 Documentation/vm/page_table_check.rst                  |   56 +
 Documentation/vm/vmalloced-kernel-stacks.rst           |  153 ++++
 MAINTAINERS                                            |    9 
 arch/Kconfig                                           |    3 
 arch/alpha/kernel/syscalls/syscall.tbl                 |    1 
 arch/alpha/mm/fault.c                                  |   16 
 arch/arc/mm/fault.c                                    |    3 
 arch/arm/mm/fault.c                                    |    2 
 arch/arm/tools/syscall.tbl                             |    1 
 arch/arm64/include/asm/unistd.h                        |    2 
 arch/arm64/include/asm/unistd32.h                      |    2 
 arch/arm64/kernel/module.c                             |    4 
 arch/arm64/mm/fault.c                                  |    6 
 arch/hexagon/mm/vm_fault.c                             |    8 
 arch/ia64/kernel/module.c                              |    6 
 arch/ia64/kernel/setup.c                               |    5 
 arch/ia64/kernel/syscalls/syscall.tbl                  |    1 
 arch/ia64/kernel/topology.c                            |    3 
 arch/ia64/kernel/uncached.c                            |    2 
 arch/ia64/mm/fault.c                                   |   16 
 arch/m68k/kernel/syscalls/syscall.tbl                  |    1 
 arch/m68k/mm/fault.c                                   |   18 
 arch/microblaze/kernel/syscalls/syscall.tbl            |    1 
 arch/microblaze/mm/fault.c                             |   18 
 arch/mips/kernel/syscalls/syscall_n32.tbl              |    1 
 arch/mips/kernel/syscalls/syscall_n64.tbl              |    1 
 arch/mips/kernel/syscalls/syscall_o32.tbl              |    1 
 arch/mips/mm/fault.c                                   |   19 
 arch/nds32/mm/fault.c                                  |   16 
 arch/nios2/mm/fault.c                                  |   18 
 arch/openrisc/mm/fault.c                               |   18 
 arch/parisc/kernel/syscalls/syscall.tbl                |    1 
 arch/parisc/mm/fault.c                                 |   18 
 arch/powerpc/kernel/syscalls/syscall.tbl               |    1 
 arch/powerpc/mm/fault.c                                |    6 
 arch/riscv/mm/fault.c                                  |    2 
 arch/s390/kernel/module.c                              |    5 
 arch/s390/kernel/syscalls/syscall.tbl                  |    1 
 arch/s390/mm/fault.c                                   |   28 
 arch/sh/kernel/syscalls/syscall.tbl                    |    1 
 arch/sh/mm/fault.c                                     |   18 
 arch/sparc/kernel/syscalls/syscall.tbl                 |    1 
 arch/sparc/mm/fault_32.c                               |   16 
 arch/sparc/mm/fault_64.c                               |   16 
 arch/um/kernel/trap.c                                  |    8 
 arch/x86/Kconfig                                       |    1 
 arch/x86/entry/syscalls/syscall_32.tbl                 |    1 
 arch/x86/entry/syscalls/syscall_64.tbl                 |    1 
 arch/x86/include/asm/pgtable.h                         |   31 -
 arch/x86/kernel/module.c                               |    7 
 arch/x86/mm/fault.c                                    |    3 
 arch/xtensa/kernel/syscalls/syscall.tbl                |    1 
 arch/xtensa/mm/fault.c                                 |   17 
 drivers/block/zram/zram_drv.c                          |   11 
 drivers/dax/bus.c                                      |   32 +
 drivers/dax/bus.h                                      |    1 
 drivers/dax/device.c                                   |  140 ++--
 drivers/infiniband/sw/siw/siw_main.c                   |    7 
 drivers/of/fdt.c                                       |    6 
 fs/ext4/extents.c                                      |    8 
 fs/ext4/inline.c                                       |    5 
 fs/ext4/page-io.c                                      |    9 
 fs/f2fs/data.c                                         |    4 
 fs/f2fs/gc.c                                           |    5 
 fs/f2fs/inode.c                                        |    4 
 fs/f2fs/node.c                                         |    4 
 fs/f2fs/recovery.c                                     |    6 
 fs/f2fs/segment.c                                      |    9 
 fs/f2fs/super.c                                        |    5 
 fs/hugetlbfs/inode.c                                   |    7 
 fs/inode.c                                             |   49 +
 fs/ioctl.c                                             |    2 
 fs/ntfs/attrib.c                                       |    2 
 fs/ocfs2/alloc.c                                       |    2 
 fs/ocfs2/aops.c                                        |   26 
 fs/ocfs2/cluster/masklog.c                             |   11 
 fs/ocfs2/dir.c                                         |    2 
 fs/ocfs2/filecheck.c                                   |    3 
 fs/ocfs2/journal.c                                     |    6 
 fs/proc/task_mmu.c                                     |   13 
 fs/squashfs/super.c                                    |   33 +
 fs/userfaultfd.c                                       |    8 
 fs/xfs/kmem.c                                          |    3 
 fs/xfs/xfs_buf.c                                       |    2 
 include/linux/ceph/libceph.h                           |    1 
 include/linux/damon.h                                  |   93 +--
 include/linux/fs.h                                     |    1 
 include/linux/gfp.h                                    |   12 
 include/linux/hugetlb.h                                |    4 
 include/linux/hugetlb_cgroup.h                         |    7 
 include/linux/kasan.h                                  |    4 
 include/linux/kthread.h                                |   25 
 include/linux/memcontrol.h                             |   22 
 include/linux/mempolicy.h                              |    1 
 include/linux/memremap.h                               |   11 
 include/linux/mm.h                                     |   76 --
 include/linux/mm_inline.h                              |  136 ++++
 include/linux/mm_types.h                               |  252 +++-----
 include/linux/mmzone.h                                 |    9 
 include/linux/page-flags.h                             |    6 
 include/linux/page_idle.h                              |    1 
 include/linux/page_table_check.h                       |  147 ++++
 include/linux/pgtable.h                                |    8 
 include/linux/sched/mm.h                               |   26 
 include/linux/swap.h                                   |    8 
 include/linux/syscalls.h                               |    3 
 include/linux/vm_event_item.h                          |    3 
 include/linux/vmalloc.h                                |    7 
 include/ras/ras_event.h                                |    2 
 include/trace/events/compaction.h                      |   24 
 include/trace/events/damon.h                           |   15 
 include/trace/events/thp.h                             |   35 -
 include/uapi/asm-generic/unistd.h                      |    5 
 include/uapi/linux/prctl.h                             |    3 
 kernel/dma/pool.c                                      |    4 
 kernel/fork.c                                          |    3 
 kernel/kthread.c                                       |    1 
 kernel/rcu/rcutorture.c                                |    7 
 kernel/sys.c                                           |   63 ++
 kernel/sys_ni.c                                        |    1 
 kernel/sysctl.c                                        |    3 
 kernel/trace/ring_buffer.c                             |    7 
 kernel/trace/trace_hwlat.c                             |    6 
 kernel/trace/trace_osnoise.c                           |    3 
 lib/test_hmm.c                                         |   24 
 lib/test_kasan.c                                       |   30 
 mm/Kconfig                                             |   14 
 mm/Kconfig.debug                                       |   24 
 mm/Makefile                                            |    1 
 mm/compaction.c                                        |    7 
 mm/damon/core.c                                        |   45 -
 mm/damon/dbgfs.c                                       |   20 
 mm/damon/paddr.c                                       |   24 
 mm/damon/prmtv-common.h                                |    4 
 mm/damon/reclaim.c                                     |   46 +
 mm/damon/vaddr.c                                       |  186 ++++--
 mm/debug.c                                             |   52 -
 mm/debug_vm_pgtable.c                                  |    6 
 mm/dmapool.c                                           |    2 
 mm/frontswap.c                                         |    4 
 mm/gup.c                                               |   31 -
 mm/hmm.c                                               |    5 
 mm/huge_memory.c                                       |   32 -
 mm/hugetlb.c                                           |    6 
 mm/hugetlb_cgroup.c                                    |  133 +++-
 mm/internal.h                                          |    7 
 mm/kasan/quarantine.c                                  |   11 
 mm/kasan/shadow.c                                      |    9 
 mm/khugepaged.c                                        |   23 
 mm/kmemleak.c                                          |   21 
 mm/ksm.c                                               |    5 
 mm/madvise.c                                           |  510 ++++++++++------
 mm/mapping_dirty_helpers.c                             |    1 
 mm/memcontrol.c                                        |   44 -
 mm/memory-failure.c                                    |  189 +++---
 mm/memory.c                                            |   12 
 mm/mempolicy.c                                         |   95 ++-
 mm/memremap.c                                          |   18 
 mm/migrate.c                                           |  527 ++++++++++-------
 mm/mlock.c                                             |    2 
 mm/mmap.c                                              |   55 +
 mm/mmu_gather.c                                        |    1 
 mm/mprotect.c                                          |    2 
 mm/oom_kill.c                                          |   30 
 mm/page_alloc.c                                        |  198 ++++--
 mm/page_counter.c                                      |    1 
 mm/page_ext.c                                          |    8 
 mm/page_isolation.c                                    |    2 
 mm/page_owner.c                                        |    4 
 mm/page_table_check.c                                  |  270 ++++++++
 mm/percpu-internal.h                                   |   18 
 mm/percpu.c                                            |   10 
 mm/pgtable-generic.c                                   |    1 
 mm/rmap.c                                              |   43 +
 mm/shmem.c                                             |   91 ++
 mm/slab.h                                              |    5 
 mm/slab_common.c                                       |   34 -
 mm/swap.c                                              |    2 
 mm/swapfile.c                                          |   46 -
 mm/truncate.c                                          |    5 
 mm/userfaultfd.c                                       |    5 
 mm/util.c                                              |   15 
 mm/vmalloc.c                                           |   75 +-
 mm/vmscan.c                                            |    2 
 mm/vmstat.c                                            |    3 
 mm/zpool.c                                             |   12 
 net/ceph/buffer.c                                      |    4 
 net/ceph/ceph_common.c                                 |   27 
 net/ceph/crypto.c                                      |    2 
 net/ceph/messenger.c                                   |    2 
 net/ceph/messenger_v2.c                                |    2 
 net/ceph/osdmap.c                                      |   12 
 net/sunrpc/svc_xprt.c                                  |    3 
 scripts/spelling.txt                                   |    1 
 tools/testing/selftests/vm/charge_reserved_hugetlb.sh  |   34 -
 tools/testing/selftests/vm/hmm-tests.c                 |   42 +
 tools/testing/selftests/vm/hugepage-mremap.c           |   46 -
 tools/testing/selftests/vm/hugetlb_reparenting_test.sh |   21 
 tools/testing/selftests/vm/run_vmtests.sh              |    2 
 tools/testing/selftests/vm/userfaultfd.c               |   33 -
 tools/testing/selftests/vm/write_hugetlb_memory.sh     |    2 
 211 files changed, 3980 insertions(+), 1759 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-12-31  4:12 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-12-31  4:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

2 patches, based on 4f3d93c6eaff6b84e43b63e0d7a119c5920e1020.

Subsystems affected by this patch series:

  mm/userfaultfd
  mm/damon

Subsystem: mm/userfaultfd

    Mike Kravetz <mike.kravetz@oracle.com>:
      userfaultfd/selftests: fix hugetlb area allocations

Subsystem: mm/damon

    SeongJae Park <sj@kernel.org>:
      mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()'

 mm/damon/dbgfs.c                         |    9 +++++++--
 tools/testing/selftests/vm/userfaultfd.c |   16 ++++++++++------
 2 files changed, 17 insertions(+), 8 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-12-25  5:11 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-12-25  5:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

9 patches, based on bc491fb12513e79702c6f936c838f792b5389129.

Subsystems affected by this patch series:

  mm/kfence
  mm/mempolicy
  core-kernel
  MAINTAINERS
  mm/memory-failure
  mm/pagemap
  mm/pagealloc
  mm/damon
  mm/memory-failure

Subsystem: mm/kfence

    Baokun Li <libaokun1@huawei.com>:
      kfence: fix memory leak when cat kfence objects

Subsystem: mm/mempolicy

    Andrey Ryabinin <arbn@yandex-team.com>:
      mm: mempolicy: fix THP allocations escaping mempolicy restrictions

Subsystem: core-kernel

    Philipp Rudo <prudo@redhat.com>:
      kernel/crash_core: suppress unknown crashkernel parameter warning

Subsystem: MAINTAINERS

    Randy Dunlap <rdunlap@infradead.org>:
      MAINTAINERS: mark more list instances as moderated

Subsystem: mm/memory-failure

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm, hwpoison: fix condition in free hugetlb page path

Subsystem: mm/pagemap

    Hugh Dickins <hughd@google.com>:
      mm: delete unsafe BUG from page_cache_add_speculative()

Subsystem: mm/pagealloc

    Thibaut Sautereau <thibaut.sautereau@ssi.gouv.fr>:
      mm/page_alloc: fix __alloc_size attribute for alloc_pages_exact_nid

Subsystem: mm/damon

    SeongJae Park <sj@kernel.org>:
      mm/damon/dbgfs: protect targets destructions with kdamond_lock

Subsystem: mm/memory-failure

    Liu Shixin <liushixin2@huawei.com>:
      mm/hwpoison: clear MF_COUNT_INCREASED before retrying get_any_page()

 MAINTAINERS             |    4 ++--
 include/linux/gfp.h     |    2 +-
 include/linux/pagemap.h |    1 -
 kernel/crash_core.c     |   11 +++++++++++
 mm/damon/dbgfs.c        |    2 ++
 mm/kfence/core.c        |    1 +
 mm/memory-failure.c     |   14 +++++---------
 mm/mempolicy.c          |    3 +--
 8 files changed, 23 insertions(+), 15 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-12-10 22:45 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-12-10 22:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

21 patches, based on c741e49150dbb0c0aebe234389f4aa8b47958fa8.

Subsystems affected by this patch series:

  mm/mlock
  MAINTAINERS
  mailmap
  mm/pagecache
  mm/damon
  mm/slub
  mm/memcg
  mm/hugetlb
  mm/pagecache

Subsystem: mm/mlock

    Drew DeVault <sir@cmpwn.com>:
      Increase default MLOCK_LIMIT to 8 MiB

Subsystem: MAINTAINERS

    Dave Young <dyoung@redhat.com>:
      MAINTAINERS: update kdump maintainers

Subsystem: mailmap

    Guo Ren <guoren@linux.alibaba.com>:
      mailmap: update email address for Guo Ren

Subsystem: mm/pagecache

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      filemap: remove PageHWPoison check from next_uptodate_page()

Subsystem: mm/damon

    SeongJae Park <sj@kernel.org>:
    Patch series "mm/damon: Fix fake /proc/loadavg reports", v3:
      timers: implement usleep_idle_range()
      mm/damon/core: fix fake load reports due to uninterruptible sleeps
    Patch series "mm/damon: Trivial fixups and improvements":
      mm/damon/core: use better timer mechanisms selection threshold
      mm/damon/dbgfs: remove an unnecessary error message
      mm/damon/core: remove unnecessary error messages
      mm/damon/vaddr: remove an unnecessary warning message
      mm/damon/vaddr-test: split a test function having >1024 bytes frame size
      mm/damon/vaddr-test: remove unnecessary variables
      selftests/damon: skip test if DAMON is running
      selftests/damon: test DAMON enabling with empty target_ids case
      selftests/damon: test wrong DAMOS condition ranges input
      selftests/damon: test debugfs file reads/writes with huge count
      selftests/damon: split test cases

Subsystem: mm/slub

    Gerald Schaefer <gerald.schaefer@linux.ibm.com>:
      mm/slub: fix endianness bug for alloc/free_traces attributes

Subsystem: mm/memcg

    Waiman Long <longman@redhat.com>:
      mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()

Subsystem: mm/hugetlb

    Zhenguo Yao <yaozhenguo1@gmail.com>:
      hugetlbfs: fix issue of preallocation of gigantic pages can't work

Subsystem: mm/pagecache

    Manjong Lee <mj0123.lee@samsung.com>:
      mm: bdi: initialize bdi_min_ratio when bdi is unregistered

 .mailmap                                                       |    2 
 MAINTAINERS                                                    |    2 
 include/linux/delay.h                                          |   14 
 include/uapi/linux/resource.h                                  |   13 
 kernel/time/timer.c                                            |   16 -
 mm/backing-dev.c                                               |    7 
 mm/damon/core.c                                                |   20 -
 mm/damon/dbgfs.c                                               |    4 
 mm/damon/vaddr-test.h                                          |   85 ++---
 mm/damon/vaddr.c                                               |    1 
 mm/filemap.c                                                   |    2 
 mm/hugetlb.c                                                   |    2 
 mm/memcontrol.c                                                |  106 +++----
 mm/slub.c                                                      |   15 -
 tools/testing/selftests/damon/.gitignore                       |    2 
 tools/testing/selftests/damon/Makefile                         |    7 
 tools/testing/selftests/damon/_debugfs_common.sh               |   52 +++
 tools/testing/selftests/damon/debugfs_attrs.sh                 |  149 ++--------
 tools/testing/selftests/damon/debugfs_empty_targets.sh         |   13 
 tools/testing/selftests/damon/debugfs_huge_count_read_write.sh |   22 +
 tools/testing/selftests/damon/debugfs_schemes.sh               |   19 +
 tools/testing/selftests/damon/debugfs_target_ids.sh            |   19 +
 tools/testing/selftests/damon/huge_count_read_write.c          |   39 ++
 23 files changed, 363 insertions(+), 248 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-11-20  0:42 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-20  0:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

15 patches, based on a90af8f15bdc9449ee2d24e1d73fa3f7e8633f81.

Subsystems affected by this patch series:

  mm/swap
  ipc
  mm/slab-generic
  hexagon
  mm/kmemleak
  mm/hugetlb
  mm/kasan
  mm/damon
  mm/highmem
  proc

Subsystem: mm/swap

    Matthew Wilcox <willy@infradead.org>:
      mm/swap.c:put_pages_list(): reinitialise the page list

Subsystem: ipc

    Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>:
    Patch series "shm: shm_rmid_forced feature fixes":
      ipc: WARN if trying to remove ipc object which is absent
      shm: extend forced shm destroy to support objects from several IPC nses

Subsystem: mm/slab-generic

    Yunfeng Ye <yeyunfeng@huawei.com>:
      mm: emit the "free" trace report before freeing memory in kmem_cache_free()

Subsystem: hexagon

    Nathan Chancellor <nathan@kernel.org>:
    Patch series "Fixes for ARCH=hexagon allmodconfig", v2:
      hexagon: export raw I/O routines for modules
      hexagon: clean up timer-regs.h
      hexagon: ignore vmlinux.lds

Subsystem: mm/kmemleak

    Rustam Kovhaev <rkovhaev@gmail.com>:
      mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag

Subsystem: mm/hugetlb

    Bui Quang Minh <minhquangbui99@gmail.com>:
      hugetlb: fix hugetlb cgroup refcounting during mremap

    Mina Almasry <almasrymina@google.com>:
      hugetlb, userfaultfd: fix reservation restore on userfaultfd error

Subsystem: mm/kasan

    Kees Cook <keescook@chromium.org>:
      kasan: test: silence intentional read overflow warnings

Subsystem: mm/damon

    SeongJae Park <sj@kernel.org>:
    Patch series "DAMON fixes":
      mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation
      mm/damon/dbgfs: fix missed use of damon_dbgfs_lock

Subsystem: mm/highmem

    Ard Biesheuvel <ardb@kernel.org>:
      kmap_local: don't assume kmap PTEs are linear arrays in memory

Subsystem: proc

    David Hildenbrand <david@redhat.com>:
      proc/vmcore: fix clearing user buffer by properly using clear_user()

 arch/arm/Kconfig                      |    1 
 arch/hexagon/include/asm/timer-regs.h |   26 ----
 arch/hexagon/include/asm/timex.h      |    3 
 arch/hexagon/kernel/.gitignore        |    1 
 arch/hexagon/kernel/time.c            |   12 +-
 arch/hexagon/lib/io.c                 |    4 
 fs/proc/vmcore.c                      |   20 ++-
 include/linux/hugetlb_cgroup.h        |   12 ++
 include/linux/ipc_namespace.h         |   15 ++
 include/linux/sched/task.h            |    2 
 ipc/shm.c                             |  189 +++++++++++++++++++++++++---------
 ipc/util.c                            |    6 -
 lib/test_kasan.c                      |    2 
 mm/Kconfig                            |    3 
 mm/damon/dbgfs.c                      |   20 ++-
 mm/highmem.c                          |   32 +++--
 mm/hugetlb.c                          |   11 +
 mm/slab.c                             |    3 
 mm/slab.h                             |    2 
 mm/slob.c                             |    3 
 mm/slub.c                             |    2 
 mm/swap.c                             |    1 
 22 files changed, 254 insertions(+), 116 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-11-11  4:32 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-11  4:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

The post-linux-next material.

7 patches, based on debe436e77c72fcee804fb867f275e6d31aa999c.

Subsystems affected by this patch series:

  mm/debug
  mm/slab-generic
  mm/migration
  mm/memcg
  mm/kasan

Subsystem: mm/debug

    Yixuan Cao <caoyixuan2019@email.szu.edu.cn>:
      mm/page_owner.c: modify the type of argument "order" in some functions

Subsystem: mm/slab-generic

    Ingo Molnar <mingo@kernel.org>:
      mm: allow only SLUB on PREEMPT_RT

Subsystem: mm/migration

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm: migrate: simplify the file-backed pages validation when migrating its mapping

    Alistair Popple <apopple@nvidia.com>:
      mm/migrate.c: remove MIGRATE_PFN_LOCKED

Subsystem: mm/memcg

    Christoph Hellwig <hch@lst.de>:
    Patch series "unexport memcg locking helpers":
      mm: unexport folio_memcg_{,un}lock
      mm: unexport {,un}lock_page_memcg

Subsystem: mm/kasan

    Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>:
      kasan: add kasan mode messages when kasan init

 Documentation/vm/hmm.rst                 |    2 
 arch/arm64/mm/kasan_init.c               |    2 
 arch/powerpc/kvm/book3s_hv_uvmem.c       |    4 
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |    2 
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |    4 
 include/linux/migrate.h                  |    1 
 include/linux/page_owner.h               |   12 +-
 init/Kconfig                             |    2 
 lib/test_hmm.c                           |    5 -
 mm/kasan/hw_tags.c                       |   14 ++
 mm/kasan/sw_tags.c                       |    2 
 mm/memcontrol.c                          |    4 
 mm/migrate.c                             |  151 +++++--------------------------
 mm/page_owner.c                          |    6 -
 14 files changed, 61 insertions(+), 150 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-11-09  2:30 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-11-09  2:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

87 patches, based on 8bb7eca972ad531c9b149c0a51ab43a417385813, plus
previously sent material.

Subsystems affected by this patch series:

  mm/pagecache
  mm/hugetlb
  procfs
  misc
  MAINTAINERS
  lib
  checkpatch
  binfmt
  kallsyms
  ramfs
  init
  codafs
  nilfs2
  hfs
  crash_dump
  signals
  seq_file
  fork
  sysvfs
  kcov
  gdb
  resource
  selftests
  ipc

Subsystem: mm/pagecache

    Johannes Weiner <hannes@cmpxchg.org>:
      vfs: keep inodes with page cache off the inode shrinker LRU

Subsystem: mm/hugetlb

    zhangyiru <zhangyiru3@huawei.com>:
      mm,hugetlb: remove mlock ulimit for SHM_HUGETLB

Subsystem: procfs

    Florian Weimer <fweimer@redhat.com>:
      procfs: do not list TID 0 in /proc/<pid>/task

    David Hildenbrand <david@redhat.com>:
      x86/xen: update xen_oldmem_pfn_is_ram() documentation
      x86/xen: simplify xen_oldmem_pfn_is_ram()
      x86/xen: print a warning when HVMOP_get_mem_type fails
      proc/vmcore: let pfn_is_ram() return a bool
      proc/vmcore: convert oldmem_pfn_is_ram callback to more generic vmcore callbacks
      virtio-mem: factor out hotplug specifics from virtio_mem_init() into virtio_mem_init_hotplug()
      virtio-mem: factor out hotplug specifics from virtio_mem_probe() into virtio_mem_init_hotplug()
      virtio-mem: factor out hotplug specifics from virtio_mem_remove() into virtio_mem_deinit_hotplug()
      virtio-mem: kdump mode to sanitize /proc/vmcore access

    Stephen Brennan <stephen.s.brennan@oracle.com>:
      proc: allow pid_revalidate() during LOOKUP_RCU

Subsystem: misc

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
    Patch series "kernel.h further split", v5:
      kernel.h: drop unneeded <linux/kernel.h> inclusion from other headers
      kernel.h: split out container_of() and typeof_member() macros
      include/kunit/test.h: replace kernel.h with the necessary inclusions
      include/linux/list.h: replace kernel.h with the necessary inclusions
      include/linux/llist.h: replace kernel.h with the necessary inclusions
      include/linux/plist.h: replace kernel.h with the necessary inclusions
      include/media/media-entity.h: replace kernel.h with the necessary inclusions
      include/linux/delay.h: replace kernel.h with the necessary inclusions
      include/linux/sbitmap.h: replace kernel.h with the necessary inclusions
      include/linux/radix-tree.h: replace kernel.h with the necessary inclusions
      include/linux/generic-radix-tree.h: replace kernel.h with the necessary inclusions

    Stephen Rothwell <sfr@canb.auug.org.au>:
      kernel.h: split out instruction pointer accessors

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      linux/container_of.h: switch to static_assert

    Colin Ian King <colin.i.king@googlemail.com>:
      mailmap: update email address for Colin King

Subsystem: MAINTAINERS

    Kees Cook <keescook@chromium.org>:
      MAINTAINERS: add "exec & binfmt" section with myself and Eric

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
    Patch series "Rectify file references for dt-bindings in MAINTAINERS", v5:
      MAINTAINERS: rectify entry for ARM/TOSHIBA VISCONTI ARCHITECTURE
      MAINTAINERS: rectify entry for HIKEY960 ONBOARD USB GPIO HUB DRIVER
      MAINTAINERS: rectify entry for INTEL KEEM BAY DRM DRIVER
      MAINTAINERS: rectify entry for ALLWINNER HARDWARE SPINLOCK SUPPORT

Subsystem: lib

    Imran Khan <imran.f.khan@oracle.com>:
    Patch series "lib, stackdepot: check stackdepot handle before accessing slabs", v2:
      lib, stackdepot: check stackdepot handle before accessing slabs
      lib, stackdepot: add helper to print stack entries
      lib, stackdepot: add helper to print stack entries into buffer

    Lucas De Marchi <lucas.demarchi@intel.com>:
      include/linux/string_helpers.h: add linux/string.h for strlen()

    Alexey Dobriyan <adobriyan@gmail.com>:
      lib: uninline simple_strntoull() as well

    Thomas Gleixner <tglx@linutronix.de>:
      mm/scatterlist: replace the !preemptible warning in sg_miter_stop()

Subsystem: checkpatch

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      const_structs.checkpatch: add a few sound ops structs

    Joe Perches <joe@perches.com>:
      checkpatch: improve EXPORT_SYMBOL test for EXPORT_SYMBOL_NS uses

    Peter Ujfalusi <peter.ujfalusi@linux.intel.com>:
      checkpatch: get default codespell dictionary path from package location

Subsystem: binfmt

    Kees Cook <keescook@chromium.org>:
      binfmt_elf: reintroduce using MAP_FIXED_NOREPLACE

    Alexey Dobriyan <adobriyan@gmail.com>:
      ELF: simplify STACK_ALLOC macro

Subsystem: kallsyms

    Kefeng Wang <wangkefeng.wang@huawei.com>:
    Patch series "sections: Unify kernel sections range check and use", v4:
      kallsyms: remove arch specific text and data check
      kallsyms: fix address-checks for kernel related range
      sections: move and rename core_kernel_data() to is_kernel_core_data()
      sections: move is_kernel_inittext() into sections.h
      x86: mm: rename __is_kernel_text() to is_x86_32_kernel_text()
      sections: provide internal __is_kernel() and __is_kernel_text() helper
      mm: kasan: use is_kernel() helper
      extable: use is_kernel_text() helper
      powerpc/mm: use core_kernel_text() helper
      microblaze: use is_kernel_text() helper
      alpha: use is_kernel_text() helper

Subsystem: ramfs

    yangerkun <yangerkun@huawei.com>:
      ramfs: fix mount source show for ramfs

Subsystem: init

    Andrew Halaney <ahalaney@redhat.com>:
      init: make unknown command line param message clearer

Subsystem: codafs

    Jan Harkes <jaharkes@cs.cmu.edu>:
    Patch series "Coda updates for -next":
      coda: avoid NULL pointer dereference from a bad inode
      coda: check for async upcall request using local state

    Alex Shi <alex.shi@linux.alibaba.com>:
      coda: remove err which no one care

    Jan Harkes <jaharkes@cs.cmu.edu>:
      coda: avoid flagging NULL inodes
      coda: avoid hidden code duplication in rename
      coda: avoid doing bad things on inode type changes during revalidation

    Xiyu Yang <xiyuyang19@fudan.edu.cn>:
      coda: convert from atomic_t to refcount_t on coda_vm_ops->refcnt

    Jing Yangyang <jing.yangyang@zte.com.cn>:
      coda: use vmemdup_user to replace the open code

    Jan Harkes <jaharkes@cs.cmu.edu>:
      coda: bump module version to 7.2

Subsystem: nilfs2

    Qing Wang <wangqing@vivo.com>:
    Patch series "nilfs2 updates":
      nilfs2: replace snprintf in show functions with sysfs_emit

    Ryusuke Konishi <konishi.ryusuke@gmail.com>:
      nilfs2: remove filenames from file comments

Subsystem: hfs

    Arnd Bergmann <arnd@arndb.de>:
      hfs/hfsplus: use WARN_ON for sanity check

Subsystem: crash_dump

    Changcheng Deng <deng.changcheng@zte.com.cn>:
      crash_dump: fix boolreturn.cocci warning

    Ye Guojin <ye.guojin@zte.com.cn>:
      crash_dump: remove duplicate include in crash_dump.h

Subsystem: signals

    Ye Guojin <ye.guojin@zte.com.cn>:
      signal: remove duplicate include in signal.h

Subsystem: seq_file

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      seq_file: move seq_escape() to a header

    Muchun Song <songmuchun@bytedance.com>:
      seq_file: fix passing wrong private data

Subsystem: fork

    Ran Xiaokai <ran.xiaokai@zte.com.cn>:
      kernel/fork.c: unshare(): use swap() to make code cleaner

Subsystem: sysvfs

    Pavel Skripkin <paskripkin@gmail.com>:
      sysv: use BUILD_BUG_ON instead of runtime check

Subsystem: kcov

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
    Patch series "kcov: PREEMPT_RT fixup + misc", v2:
      Documentation/kcov: include types.h in the example
      Documentation/kcov: define `ip' in the example
      kcov: allocate per-CPU memory on the relevant node
      kcov: avoid enable+disable interrupts if !in_task()
      kcov: replace local_irq_save() with a local_lock_t

Subsystem: gdb

    Douglas Anderson <dianders@chromium.org>:
      scripts/gdb: handle split debug for vmlinux

Subsystem: resource

    David Hildenbrand <david@redhat.com>:
    Patch series "virtio-mem: disallow mapping virtio-mem memory via /dev/mem", v5:
      kernel/resource: clean up and optimize iomem_is_exclusive()
      kernel/resource: disallow access to exclusive system RAM regions
      virtio-mem: disallow mapping virtio-mem memory via /dev/mem

Subsystem: selftests

    SeongJae Park <sjpark@amazon.de>:
      selftests/kselftest/runner/run_one(): allow running non-executable files

Subsystem: ipc

    Michal Clapinski <mclapinski@google.com>:
      ipc: check checkpoint_restore_ns_capable() to modify C/R proc files

    Manfred Spraul <manfred@colorfullife.com>:
      ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL

 .mailmap                                             |    2 
 Documentation/dev-tools/kcov.rst                     |    5 
 MAINTAINERS                                          |   21 +
 arch/alpha/kernel/traps.c                            |    4 
 arch/microblaze/mm/pgtable.c                         |    3 
 arch/powerpc/mm/pgtable_32.c                         |    7 
 arch/riscv/lib/delay.c                               |    4 
 arch/s390/include/asm/facility.h                     |    4 
 arch/x86/kernel/aperture_64.c                        |   13 
 arch/x86/kernel/unwind_orc.c                         |    2 
 arch/x86/mm/init_32.c                                |   14 
 arch/x86/xen/mmu_hvm.c                               |   39 --
 drivers/gpu/drm/drm_dp_mst_topology.c                |    5 
 drivers/gpu/drm/drm_mm.c                             |    5 
 drivers/gpu/drm/i915/i915_vma.c                      |    5 
 drivers/gpu/drm/i915/intel_runtime_pm.c              |   20 -
 drivers/media/dvb-frontends/cxd2880/cxd2880_common.h |    1 
 drivers/virtio/Kconfig                               |    1 
 drivers/virtio/virtio_mem.c                          |  321 +++++++++++++------
 fs/binfmt_elf.c                                      |   33 +
 fs/coda/cnode.c                                      |   13 
 fs/coda/coda_linux.c                                 |   39 +-
 fs/coda/coda_linux.h                                 |    6 
 fs/coda/dir.c                                        |   20 -
 fs/coda/file.c                                       |   12 
 fs/coda/psdev.c                                      |   14 
 fs/coda/upcall.c                                     |    3 
 fs/hfs/inode.c                                       |    6 
 fs/hfsplus/inode.c                                   |   12 
 fs/hugetlbfs/inode.c                                 |   23 -
 fs/inode.c                                           |   46 +-
 fs/internal.h                                        |    1 
 fs/nilfs2/alloc.c                                    |    2 
 fs/nilfs2/alloc.h                                    |    2 
 fs/nilfs2/bmap.c                                     |    2 
 fs/nilfs2/bmap.h                                     |    2 
 fs/nilfs2/btnode.c                                   |    2 
 fs/nilfs2/btnode.h                                   |    2 
 fs/nilfs2/btree.c                                    |    2 
 fs/nilfs2/btree.h                                    |    2 
 fs/nilfs2/cpfile.c                                   |    2 
 fs/nilfs2/cpfile.h                                   |    2 
 fs/nilfs2/dat.c                                      |    2 
 fs/nilfs2/dat.h                                      |    2 
 fs/nilfs2/dir.c                                      |    2 
 fs/nilfs2/direct.c                                   |    2 
 fs/nilfs2/direct.h                                   |    2 
 fs/nilfs2/file.c                                     |    2 
 fs/nilfs2/gcinode.c                                  |    2 
 fs/nilfs2/ifile.c                                    |    2 
 fs/nilfs2/ifile.h                                    |    2 
 fs/nilfs2/inode.c                                    |    2 
 fs/nilfs2/ioctl.c                                    |    2 
 fs/nilfs2/mdt.c                                      |    2 
 fs/nilfs2/mdt.h                                      |    2 
 fs/nilfs2/namei.c                                    |    2 
 fs/nilfs2/nilfs.h                                    |    2 
 fs/nilfs2/page.c                                     |    2 
 fs/nilfs2/page.h                                     |    2 
 fs/nilfs2/recovery.c                                 |    2 
 fs/nilfs2/segbuf.c                                   |    2 
 fs/nilfs2/segbuf.h                                   |    2 
 fs/nilfs2/segment.c                                  |    2 
 fs/nilfs2/segment.h                                  |    2 
 fs/nilfs2/sufile.c                                   |    2 
 fs/nilfs2/sufile.h                                   |    2 
 fs/nilfs2/super.c                                    |    2 
 fs/nilfs2/sysfs.c                                    |   78 ++--
 fs/nilfs2/sysfs.h                                    |    2 
 fs/nilfs2/the_nilfs.c                                |    2 
 fs/nilfs2/the_nilfs.h                                |    2 
 fs/proc/base.c                                       |   21 -
 fs/proc/vmcore.c                                     |  109 ++++--
 fs/ramfs/inode.c                                     |   11 
 fs/seq_file.c                                        |   16 
 fs/sysv/super.c                                      |    6 
 include/asm-generic/sections.h                       |   75 +++-
 include/kunit/test.h                                 |   13 
 include/linux/bottom_half.h                          |    3 
 include/linux/container_of.h                         |   52 ++-
 include/linux/crash_dump.h                           |   30 +
 include/linux/delay.h                                |    2 
 include/linux/fs.h                                   |    1 
 include/linux/fwnode.h                               |    1 
 include/linux/generic-radix-tree.h                   |    3 
 include/linux/hugetlb.h                              |    6 
 include/linux/instruction_pointer.h                  |    8 
 include/linux/kallsyms.h                             |   21 -
 include/linux/kernel.h                               |   39 --
 include/linux/list.h                                 |    4 
 include/linux/llist.h                                |    4 
 include/linux/pagemap.h                              |   50 ++
 include/linux/plist.h                                |    5 
 include/linux/radix-tree.h                           |    4 
 include/linux/rwsem.h                                |    1 
 include/linux/sbitmap.h                              |   11 
 include/linux/seq_file.h                             |   19 +
 include/linux/signal.h                               |    1 
 include/linux/smp.h                                  |    1 
 include/linux/spinlock.h                             |    1 
 include/linux/stackdepot.h                           |    5 
 include/linux/string_helpers.h                       |    1 
 include/media/media-entity.h                         |    3 
 init/main.c                                          |    4 
 ipc/ipc_sysctl.c                                     |   42 +-
 ipc/shm.c                                            |    8 
 kernel/extable.c                                     |   33 -
 kernel/fork.c                                        |    9 
 kernel/kcov.c                                        |   40 +-
 kernel/locking/lockdep.c                             |    3 
 kernel/resource.c                                    |   54 ++-
 kernel/trace/ftrace.c                                |    2 
 lib/scatterlist.c                                    |   11 
 lib/stackdepot.c                                     |   46 ++
 lib/vsprintf.c                                       |    3 
 mm/Kconfig                                           |    7 
 mm/filemap.c                                         |    8 
 mm/kasan/report.c                                    |   17 -
 mm/memfd.c                                           |    4 
 mm/mmap.c                                            |    3 
 mm/page_owner.c                                      |   18 -
 mm/truncate.c                                        |   19 +
 mm/vmscan.c                                          |    7 
 mm/workingset.c                                      |   10 
 net/sysctl_net.c                                     |    2 
 scripts/checkpatch.pl                                |   33 +
 scripts/const_structs.checkpatch                     |    4 
 scripts/gdb/linux/symbols.py                         |    3 
 tools/testing/selftests/kselftest/runner.sh          |   28 +
 tools/testing/selftests/proc/.gitignore              |    1 
 tools/testing/selftests/proc/Makefile                |    2 
 tools/testing/selftests/proc/proc-tid0.c             |   81 ++++
 132 files changed, 1206 insertions(+), 681 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-10-28 21:35 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-10-28 21:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

11 patches, based on 411a44c24a561e449b592ff631b7ae321f1eb559.

Subsystems affected by this patch series:

  mm/memcg
  mm/memory-failure
  mm/oom-kill
  ocfs2
  mm/secretmem
  mm/vmalloc
  mm/hugetlb
  mm/damon
  mm/tools

Subsystem: mm/memcg

    Shakeel Butt <shakeelb@google.com>:
      memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT

Subsystem: mm/memory-failure

    Yang Shi <shy828301@gmail.com>:
      mm: hwpoison: remove the unnecessary THP check
      mm: filemap: check if THP has hwpoisoned subpage for PMD page fault

Subsystem: mm/oom-kill

    Suren Baghdasaryan <surenb@google.com>:
      mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap

Subsystem: ocfs2

    Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>:
      ocfs2: fix race between searching chunks and release journal_head from buffer_head

Subsystem: mm/secretmem

    Kees Cook <keescook@chromium.org>:
      mm/secretmem: avoid letting secretmem_users drop to zero

Subsystem: mm/vmalloc

    Chen Wandun <chenwandun@huawei.com>:
      mm/vmalloc: fix numa spreading for large hash tables

Subsystem: mm/hugetlb

    Rongwei Wang <rongwei.wang@linux.alibaba.com>:
      mm, thp: bail out early in collapse_file for writeback page

    Yang Shi <shy828301@gmail.com>:
      mm: khugepaged: skip huge page collapse for special files

Subsystem: mm/damon

    SeongJae Park <sj@kernel.org>:
      mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()'

Subsystem: mm/tools

    David Yang <davidcomponentone@gmail.com>:
      tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer

 fs/ocfs2/suballoc.c                               |   22 ++++++++++-------
 include/linux/page-flags.h                        |   23 ++++++++++++++++++
 mm/damon/core-test.h                              |    4 +--
 mm/huge_memory.c                                  |    2 +
 mm/khugepaged.c                                   |   26 +++++++++++++-------
 mm/memory-failure.c                               |   28 +++++++++++-----------
 mm/memory.c                                       |    9 +++++++
 mm/oom_kill.c                                     |   23 +++++++++---------
 mm/page_alloc.c                                   |    8 +++++-
 mm/secretmem.c                                    |    2 -
 mm/vmalloc.c                                      |   15 +++++++----
 tools/testing/selftests/vm/split_huge_page_test.c |    2 -
 12 files changed, 110 insertions(+), 54 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-10-18 22:14 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-10-18 22:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


19 patches, based on 519d81956ee277b4419c723adfb154603c2565ba.

Subsystems affected by this patch series:

  mm/userfaultfd
  mm/migration
  ocfs2
  mm/memblock
  mm/mempolicy
  mm/slub
  binfmt
  vfs
  mm/secretmem
  mm/thp
  misc

Subsystem: mm/userfaultfd

    Peter Xu <peterx@redhat.com>:
      mm/userfaultfd: selftests: fix memory corruption with thp enabled

    Nadav Amit <namit@vmware.com>:
      userfaultfd: fix a race between writeprotect and exit_mmap()

Subsystem: mm/migration

    Dave Hansen <dave.hansen@linux.intel.com>:
    Patch series "mm/migrate: 5.15 fixes for automatic demotion", v2:
      mm/migrate: optimize hotplug-time demotion order updates
      mm/migrate: add CPU hotplug to demotion #ifdef

    Huang Ying <ying.huang@intel.com>:
      mm/migrate: fix CPUHP state to update node demotion order

Subsystem: ocfs2

    Jan Kara <jack@suse.cz>:
      ocfs2: fix data corruption after conversion from inline format

    Valentin Vidic <vvidic@valentin-vidic.from.hr>:
      ocfs2: mount fails with buffer overflow in strlen

Subsystem: mm/memblock

    Peng Fan <peng.fan@nxp.com>:
      memblock: check memory total_size

Subsystem: mm/mempolicy

    Eric Dumazet <edumazet@google.com>:
      mm/mempolicy: do not allow illegal MPOL_F_NUMA_BALANCING | MPOL_LOCAL in mbind()

Subsystem: mm/slub

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Fixups for slub":
      mm, slub: fix two bugs in slab_debug_trace_open()
      mm, slub: fix mismatch between reconstructed freelist depth and cnt
      mm, slub: fix potential memoryleak in kmem_cache_open()
      mm, slub: fix potential use-after-free in slab_debugfs_fops
      mm, slub: fix incorrect memcg slab count for bulk free

Subsystem: binfmt

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
      elfcore: correct reference to CONFIG_UML

Subsystem: vfs

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      vfs: check fd has read access in kernel_read_file_from_fd()

Subsystem: mm/secretmem

    Sean Christopherson <seanjc@google.com>:
      mm/secretmem: fix NULL page->mapping dereference in page_is_secretmem()

Subsystem: mm/thp

    Marek Szyprowski <m.szyprowski@samsung.com>:
      mm/thp: decrease nr_thps in file's mapping on THP split

Subsystem: misc

    Andrej Shadura <andrew.shadura@collabora.co.uk>:
      mailmap: add Andrej Shadura

 .mailmap                                 |    2 +
 fs/kernel_read_file.c                    |    2 -
 fs/ocfs2/alloc.c                         |   46 ++++++-----------------
 fs/ocfs2/super.c                         |   14 +++++--
 fs/userfaultfd.c                         |   12 ++++--
 include/linux/cpuhotplug.h               |    4 ++
 include/linux/elfcore.h                  |    2 -
 include/linux/memory.h                   |    5 ++
 include/linux/secretmem.h                |    2 -
 mm/huge_memory.c                         |    6 ++-
 mm/memblock.c                            |    2 -
 mm/mempolicy.c                           |   16 ++------
 mm/migrate.c                             |   62 ++++++++++++++++++-------------
 mm/page_ext.c                            |    4 --
 mm/slab.c                                |    4 +-
 mm/slub.c                                |   31 ++++++++++++---
 tools/testing/selftests/vm/userfaultfd.c |   23 ++++++++++-
 17 files changed, 138 insertions(+), 99 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-09-24 22:42 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-09-24 22:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

16 patches, based on 7d42e98182586f57f376406d033f05fe135edb75.

Subsystems affected by this patch series:

  mm/memory-failure
  mm/kasan
  mm/damon
  xtensa
  mm/shmem
  ocfs2
  scripts
  mm/tools
  lib
  mm/pagecache
  mm/debug
  sh
  mm/kasan
  mm/memory-failure
  mm/pagemap

Subsystem: mm/memory-failure

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()

Subsystem: mm/kasan

    Marco Elver <elver@google.com>:
      kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS

Subsystem: mm/damon

    Adam Borowski <kilobyte@angband.pl>:
      mm/damon: don't use strnlen() with known-bogus source length

Subsystem: xtensa

    Guenter Roeck <linux@roeck-us.net>:
      xtensa: increase size of gcc stack frame check

Subsystem: mm/shmem

    Liu Yuntao <liuyuntao10@huawei.com>:
      mm/shmem.c: fix judgment error in shmem_is_huge()

Subsystem: ocfs2

    Wengang Wang <wen.gang.wang@oracle.com>:
      ocfs2: drop acl cache for directories too

Subsystem: scripts

    Miles Chen <miles.chen@mediatek.com>:
      scripts/sorttable: riscv: fix undeclared identifier 'EM_RISCV' error

Subsystem: mm/tools

    Changbin Du <changbin.du@gmail.com>:
      tools/vm/page-types: remove dependency on opt_file for idle page tracking

Subsystem: lib

    Paul Menzel <pmenzel@molgen.mpg.de>:
      lib/zlib_inflate/inffast: check config in C to avoid unused function warning

Subsystem: mm/pagecache

    Minchan Kim <minchan@kernel.org>:
      mm: fs: invalidate bh_lrus for only cold path

Subsystem: mm/debug

    Weizhao Ouyang <o451686892@gmail.com>:
      mm/debug: sync up MR_CONTIG_RANGE and MR_LONGTERM_PIN
      mm/debug: sync up latest migrate_reason to migrate_reason_names

Subsystem: sh

    Geert Uytterhoeven <geert+renesas@glider.be>:
      sh: pgtable-3level: fix cast to pointer from integer of different size

Subsystem: mm/kasan

    Nathan Chancellor <nathan@kernel.org>:
      kasan: always respect CONFIG_KASAN_STACK

Subsystem: mm/memory-failure

    Qi Zheng <zhengqi.arch@bytedance.com>:
      mm/memory_failure: fix the missing pte_unmap() call

Subsystem: mm/pagemap

    Chen Jun <chenjun102@huawei.com>:
      mm: fix uninitialized use in overcommit_policy_handler

 arch/sh/include/asm/pgtable-3level.h |    2 +-
 fs/buffer.c                          |    8 ++++++--
 fs/ocfs2/dlmglue.c                   |    3 ++-
 include/linux/buffer_head.h          |    4 ++--
 include/linux/migrate.h              |    6 +++++-
 lib/Kconfig.debug                    |    2 +-
 lib/Kconfig.kasan                    |    2 ++
 lib/zlib_inflate/inffast.c           |   13 ++++++-------
 mm/damon/dbgfs-test.h                |   16 ++++++++--------
 mm/debug.c                           |    4 +++-
 mm/memory-failure.c                  |   12 ++++++------
 mm/shmem.c                           |    4 ++--
 mm/swap.c                            |   19 ++++++++++++++++---
 mm/util.c                            |    4 ++--
 scripts/Makefile.kasan               |    3 ++-
 scripts/sorttable.c                  |    4 ++++
 tools/vm/page-types.c                |    2 +-
 17 files changed, 69 insertions(+), 39 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-09-10 17:11 ` incoming Kees Cook
@ 2021-09-10 20:13   ` Kees Cook
  0 siblings, 0 replies; 485+ messages in thread
From: Kees Cook @ 2021-09-10 20:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds, Andrew Morton, linux-mm, mm-commits

On Fri, Sep 10, 2021 at 10:11:53AM -0700, Kees Cook wrote:
> On Thu, Sep 09, 2021 at 08:09:48PM -0700, Andrew Morton wrote:
> > 
> > More post linux-next material.
> > 
> > 9 patches, based on f154c806676ad7153c6e161f30c53a44855329d6.
> > 
> > Subsystems affected by this patch series:
> > 
> >   mm/slab-generic
> >   rapidio
> >   mm/debug
> > 
> > Subsystem: mm/slab-generic
> > 
> >     "Matthew Wilcox (Oracle)" <willy@infradead.org>:
> >       mm: move kvmalloc-related functions to slab.h
> > 
> > Subsystem: rapidio
> > 
> >     Kees Cook <keescook@chromium.org>:
> >       rapidio: avoid bogus __alloc_size warning
> > 
> > Subsystem: mm/debug
> > 
> >     Kees Cook <keescook@chromium.org>:
> >     Patch series "Add __alloc_size() for better bounds checking", v2:
> >       Compiler Attributes: add __alloc_size() for better bounds checking
> >       checkpatch: add __alloc_size() to known $Attribute
> >       slab: clean up function declarations
> >       slab: add __alloc_size attributes for better bounds checking
> >       mm/page_alloc: add __alloc_size attributes for better bounds checking
> >       percpu: add __alloc_size attributes for better bounds checking
> >       mm/vmalloc: add __alloc_size attributes for better bounds checking
> 
> Hi,
> 
> FYI, in overnight build testing I found yet another corner case in
> GCC's handling of the __alloc_size attribute. It's the gift that keeps
> on giving. The fix is here:
> 
> https://lore.kernel.org/lkml/20210910165851.3296624-1-keescook@chromium.org/

I'm so glad it's Friday. Here's the v2 fix... *sigh*

https://lore.kernel.org/lkml/20210910201132.3809437-1-keescook@chromium.org/

-Kees

> 
> > 
> >  Makefile                                 |   15 +++
> >  drivers/of/kexec.c                       |    1 
> >  drivers/rapidio/devices/rio_mport_cdev.c |    9 +-
> >  include/linux/compiler_attributes.h      |    6 +
> >  include/linux/gfp.h                      |    2 
> >  include/linux/mm.h                       |   34 --------
> >  include/linux/percpu.h                   |    3 
> >  include/linux/slab.h                     |  122 ++++++++++++++++++++++---------
> >  include/linux/vmalloc.h                  |   11 ++
> >  scripts/checkpatch.pl                    |    3 
> >  10 files changed, 132 insertions(+), 74 deletions(-)
> > 
> 
> -- 
> Kees Cook

-- 
Kees Cook


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-09-10  3:09 incoming Andrew Morton
@ 2021-09-10 17:11 ` Kees Cook
  2021-09-10 20:13   ` incoming Kees Cook
  0 siblings, 1 reply; 485+ messages in thread
From: Kees Cook @ 2021-09-10 17:11 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton; +Cc: linux-mm, mm-commits

On Thu, Sep 09, 2021 at 08:09:48PM -0700, Andrew Morton wrote:
> 
> More post linux-next material.
> 
> 9 patches, based on f154c806676ad7153c6e161f30c53a44855329d6.
> 
> Subsystems affected by this patch series:
> 
>   mm/slab-generic
>   rapidio
>   mm/debug
> 
> Subsystem: mm/slab-generic
> 
>     "Matthew Wilcox (Oracle)" <willy@infradead.org>:
>       mm: move kvmalloc-related functions to slab.h
> 
> Subsystem: rapidio
> 
>     Kees Cook <keescook@chromium.org>:
>       rapidio: avoid bogus __alloc_size warning
> 
> Subsystem: mm/debug
> 
>     Kees Cook <keescook@chromium.org>:
>     Patch series "Add __alloc_size() for better bounds checking", v2:
>       Compiler Attributes: add __alloc_size() for better bounds checking
>       checkpatch: add __alloc_size() to known $Attribute
>       slab: clean up function declarations
>       slab: add __alloc_size attributes for better bounds checking
>       mm/page_alloc: add __alloc_size attributes for better bounds checking
>       percpu: add __alloc_size attributes for better bounds checking
>       mm/vmalloc: add __alloc_size attributes for better bounds checking

Hi,

FYI, in overnight build testing I found yet another corner case in
GCC's handling of the __alloc_size attribute. It's the gift that keeps
on giving. The fix is here:

https://lore.kernel.org/lkml/20210910165851.3296624-1-keescook@chromium.org/

> 
>  Makefile                                 |   15 +++
>  drivers/of/kexec.c                       |    1 
>  drivers/rapidio/devices/rio_mport_cdev.c |    9 +-
>  include/linux/compiler_attributes.h      |    6 +
>  include/linux/gfp.h                      |    2 
>  include/linux/mm.h                       |   34 --------
>  include/linux/percpu.h                   |    3 
>  include/linux/slab.h                     |  122 ++++++++++++++++++++++---------
>  include/linux/vmalloc.h                  |   11 ++
>  scripts/checkpatch.pl                    |    3 
>  10 files changed, 132 insertions(+), 74 deletions(-)
> 

-- 
Kees Cook


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-09-10  3:09 Andrew Morton
  2021-09-10 17:11 ` incoming Kees Cook
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-09-10  3:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


More post linux-next material.

9 patches, based on f154c806676ad7153c6e161f30c53a44855329d6.

Subsystems affected by this patch series:

  mm/slab-generic
  rapidio
  mm/debug

Subsystem: mm/slab-generic

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: move kvmalloc-related functions to slab.h

Subsystem: rapidio

    Kees Cook <keescook@chromium.org>:
      rapidio: avoid bogus __alloc_size warning

Subsystem: mm/debug

    Kees Cook <keescook@chromium.org>:
    Patch series "Add __alloc_size() for better bounds checking", v2:
      Compiler Attributes: add __alloc_size() for better bounds checking
      checkpatch: add __alloc_size() to known $Attribute
      slab: clean up function declarations
      slab: add __alloc_size attributes for better bounds checking
      mm/page_alloc: add __alloc_size attributes for better bounds checking
      percpu: add __alloc_size attributes for better bounds checking
      mm/vmalloc: add __alloc_size attributes for better bounds checking

 Makefile                                 |   15 +++
 drivers/of/kexec.c                       |    1 
 drivers/rapidio/devices/rio_mport_cdev.c |    9 +-
 include/linux/compiler_attributes.h      |    6 +
 include/linux/gfp.h                      |    2 
 include/linux/mm.h                       |   34 --------
 include/linux/percpu.h                   |    3 
 include/linux/slab.h                     |  122 ++++++++++++++++++++++---------
 include/linux/vmalloc.h                  |   11 ++
 scripts/checkpatch.pl                    |    3 
 10 files changed, 132 insertions(+), 74 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-09-09  1:08 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-09-09  1:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


A bunch of hotfixes, mostly cc:stable.


8 patches, based on 2d338201d5311bcd79d42f66df4cecbcbc5f4f2c.

Subsystems affected by this patch series:

  mm/hmm
  mm/hugetlb
  mm/vmscan
  mm/pagealloc
  mm/pagemap
  mm/kmemleak
  mm/mempolicy
  mm/memblock

Subsystem: mm/hmm

    Li Zhijian <lizhijian@cn.fujitsu.com>:
      mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled

Subsystem: mm/hugetlb

    Liu Zixian <liuzixian4@huawei.com>:
      mm/hugetlb: initialize hugetlb_usage in mm_init

Subsystem: mm/vmscan

    Rik van Riel <riel@surriel.com>:
      mm,vmscan: fix divide by zero in get_scan_count

Subsystem: mm/pagealloc

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype

Subsystem: mm/pagemap

    Liam Howlett <liam.howlett@oracle.com>:
      mmap_lock: change trace and locking order

Subsystem: mm/kmemleak

    Naohiro Aota <naohiro.aota@wdc.com>:
      mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp

Subsystem: mm/mempolicy

    yanghui <yanghui.def@bytedance.com>:
      mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task

Subsystem: mm/memblock

    Mike Rapoport <rppt@linux.ibm.com>:
      nds32/setup: remove unused memblock_region variable in setup_memory()

 arch/nds32/kernel/setup.c |    1 -
 include/linux/hugetlb.h   |    9 +++++++++
 include/linux/mmap_lock.h |    8 ++++----
 kernel/fork.c             |    1 +
 mm/hmm.c                  |    5 ++++-
 mm/kmemleak.c             |    3 ++-
 mm/mempolicy.c            |   17 +++++++++++++----
 mm/page_alloc.c           |    4 +++-
 mm/vmscan.c               |    2 +-
 9 files changed, 37 insertions(+), 13 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-09-08 22:17 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-09-08 22:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


This is the post-linux-next material, so it is based upon latest
upstream to catch the now-merged dependencies.

10 patches, based on 2d338201d5311bcd79d42f66df4cecbcbc5f4f2c.

Subsystems affected by this patch series:

  mm/vmstat
  mm/migration
  compat

Subsystem: mm/vmstat

    Ingo Molnar <mingo@elte.hu>:
      mm/vmstat: protect per cpu variables with preempt disable on RT

Subsystem: mm/migration

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm: migrate: introduce a local variable to get the number of pages
      mm: migrate: fix the incorrect function name in comments
      mm: migrate: change to use bool type for 'page_was_mapped'

Subsystem: compat

    Arnd Bergmann <arnd@arndb.de>:
    Patch series "compat: remove compat_alloc_user_space", v5:
      kexec: move locking into do_kexec_load
      kexec: avoid compat_alloc_user_space
      mm: simplify compat_sys_move_pages
      mm: simplify compat numa syscalls
      compat: remove some compat entry points
      arch: remove compat_alloc_user_space

 arch/arm64/include/asm/compat.h           |    5 
 arch/arm64/include/asm/uaccess.h          |   11 -
 arch/arm64/include/asm/unistd32.h         |   10 -
 arch/arm64/lib/Makefile                   |    2 
 arch/arm64/lib/copy_in_user.S             |   77 ----------
 arch/mips/cavium-octeon/octeon-memcpy.S   |    2 
 arch/mips/include/asm/compat.h            |    8 -
 arch/mips/include/asm/uaccess.h           |   26 ---
 arch/mips/kernel/syscalls/syscall_n32.tbl |   10 -
 arch/mips/kernel/syscalls/syscall_o32.tbl |   10 -
 arch/mips/lib/memcpy.S                    |   11 -
 arch/parisc/include/asm/compat.h          |    6 
 arch/parisc/include/asm/uaccess.h         |    2 
 arch/parisc/kernel/syscalls/syscall.tbl   |    8 -
 arch/parisc/lib/memcpy.c                  |    9 -
 arch/powerpc/include/asm/compat.h         |   16 --
 arch/powerpc/kernel/syscalls/syscall.tbl  |   10 -
 arch/s390/include/asm/compat.h            |   10 -
 arch/s390/include/asm/uaccess.h           |    3 
 arch/s390/kernel/syscalls/syscall.tbl     |   10 -
 arch/s390/lib/uaccess.c                   |   63 --------
 arch/sparc/include/asm/compat.h           |   19 --
 arch/sparc/kernel/process_64.c            |    2 
 arch/sparc/kernel/signal32.c              |   12 -
 arch/sparc/kernel/signal_64.c             |    8 -
 arch/sparc/kernel/syscalls/syscall.tbl    |   10 -
 arch/x86/entry/syscalls/syscall_32.tbl    |    4 
 arch/x86/entry/syscalls/syscall_64.tbl    |    2 
 arch/x86/include/asm/compat.h             |   13 -
 arch/x86/include/asm/uaccess_64.h         |    7 
 include/linux/compat.h                    |   39 +----
 include/linux/uaccess.h                   |   10 -
 include/uapi/asm-generic/unistd.h         |   10 -
 kernel/compat.c                           |   21 --
 kernel/kexec.c                            |  105 +++++---------
 kernel/sys_ni.c                           |    5 
 mm/mempolicy.c                            |  213 +++++++-----------------------
 mm/migrate.c                              |   69 +++++----
 mm/vmstat.c                               |   48 ++++++
 39 files changed, 243 insertions(+), 663 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-09-08  2:52 incoming Andrew Morton
@ 2021-09-08  8:57 ` Vlastimil Babka
  0 siblings, 0 replies; 485+ messages in thread
From: Vlastimil Babka @ 2021-09-08  8:57 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds
  Cc: linux-mm, mm-commits, Mike Galbraith, Mel Gorman

On 9/8/21 04:52, Andrew Morton wrote:
> Subsystem: mm/slub
> 
>     Vlastimil Babka <vbabka@suse.cz>:
>     Patch series "SLUB: reduce irq disabled scope and make it RT compatible", v6:
>       mm, slub: don't call flush_all() from slab_debug_trace_open()
>       mm, slub: allocate private object map for debugfs listings
>       mm, slub: allocate private object map for validate_slab_cache()
>       mm, slub: don't disable irq for debug_check_no_locks_freed()
>       mm, slub: remove redundant unfreeze_partials() from put_cpu_partial()
>       mm, slub: extract get_partial() from new_slab_objects()
>       mm, slub: dissolve new_slab_objects() into ___slab_alloc()
>       mm, slub: return slab page from get_partial() and set c->page afterwards
>       mm, slub: restructure new page checks in ___slab_alloc()
>       mm, slub: simplify kmem_cache_cpu and tid setup
>       mm, slub: move disabling/enabling irqs to ___slab_alloc()
>       mm, slub: do initial checks in ___slab_alloc() with irqs enabled
>       mm, slub: move disabling irqs closer to get_partial() in ___slab_alloc()
>       mm, slub: restore irqs around calling new_slab()
>       mm, slub: validate slab from partial list or page allocator before making it cpu slab
>       mm, slub: check new pages with restored irqs
>       mm, slub: stop disabling irqs around get_partial()
>       mm, slub: move reset of c->page and freelist out of deactivate_slab()
>       mm, slub: make locking in deactivate_slab() irq-safe
>       mm, slub: call deactivate_slab() without disabling irqs
>       mm, slub: move irq control into unfreeze_partials()
>       mm, slub: discard slabs in unfreeze_partials() without irqs disabled
>       mm, slub: detach whole partial list at once in unfreeze_partials()
>       mm, slub: separate detaching of partial list in unfreeze_partials() from unfreezing
>       mm, slub: only disable irq with spin_lock in __unfreeze_partials()
>       mm, slub: don't disable irqs in slub_cpu_dead()
>       mm, slab: split out the cpu offline variant of flush_slab()
> 
>     Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
>       mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context
>       mm: slub: make object_map_lock a raw_spinlock_t
> 
>     Vlastimil Babka <vbabka@suse.cz>:
>       mm, slub: make slab_lock() disable irqs with PREEMPT_RT
>       mm, slub: protect put_cpu_partial() with disabled irqs instead of cmpxchg
>       mm, slub: use migrate_disable() on PREEMPT_RT
>       mm, slub: convert kmem_cpu_slab protection to local_lock

For my own piece of mind, I've checked that this part (patches 1 to 33)
are identical to the v6 posting [1] and git version [2] that Mel and
Mike tested (replies to [1]).

[1] https://lore.kernel.org/all/20210904105003.11688-1-vbabka@suse.cz/
[2] git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git
tags/mm-slub-5.15-rc1


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-09-08  2:52 Andrew Morton
  2021-09-08  8:57 ` incoming Vlastimil Babka
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-09-08  2:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

147 patches, based on 7d2a07b769330c34b4deabeed939325c77a7ec2f.

Subsystems affected by this patch series:

  mm/slub
  mm/memory-hotplug
  mm/rmap
  mm/ioremap
  mm/highmem
  mm/cleanups
  mm/secretmem
  mm/kfence
  mm/damon
  alpha
  percpu
  procfs
  misc
  core-kernel
  MAINTAINERS
  lib
  bitops
  checkpatch
  epoll
  init
  nilfs2
  coredump
  fork
  pids
  criu
  kconfig
  selftests
  ipc
  mm/vmscan
  scripts

Subsystem: mm/slub

    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "SLUB: reduce irq disabled scope and make it RT compatible", v6:
      mm, slub: don't call flush_all() from slab_debug_trace_open()
      mm, slub: allocate private object map for debugfs listings
      mm, slub: allocate private object map for validate_slab_cache()
      mm, slub: don't disable irq for debug_check_no_locks_freed()
      mm, slub: remove redundant unfreeze_partials() from put_cpu_partial()
      mm, slub: extract get_partial() from new_slab_objects()
      mm, slub: dissolve new_slab_objects() into ___slab_alloc()
      mm, slub: return slab page from get_partial() and set c->page afterwards
      mm, slub: restructure new page checks in ___slab_alloc()
      mm, slub: simplify kmem_cache_cpu and tid setup
      mm, slub: move disabling/enabling irqs to ___slab_alloc()
      mm, slub: do initial checks in ___slab_alloc() with irqs enabled
      mm, slub: move disabling irqs closer to get_partial() in ___slab_alloc()
      mm, slub: restore irqs around calling new_slab()
      mm, slub: validate slab from partial list or page allocator before making it cpu slab
      mm, slub: check new pages with restored irqs
      mm, slub: stop disabling irqs around get_partial()
      mm, slub: move reset of c->page and freelist out of deactivate_slab()
      mm, slub: make locking in deactivate_slab() irq-safe
      mm, slub: call deactivate_slab() without disabling irqs
      mm, slub: move irq control into unfreeze_partials()
      mm, slub: discard slabs in unfreeze_partials() without irqs disabled
      mm, slub: detach whole partial list at once in unfreeze_partials()
      mm, slub: separate detaching of partial list in unfreeze_partials() from unfreezing
      mm, slub: only disable irq with spin_lock in __unfreeze_partials()
      mm, slub: don't disable irqs in slub_cpu_dead()
      mm, slab: split out the cpu offline variant of flush_slab()

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context
      mm: slub: make object_map_lock a raw_spinlock_t

    Vlastimil Babka <vbabka@suse.cz>:
      mm, slub: make slab_lock() disable irqs with PREEMPT_RT
      mm, slub: protect put_cpu_partial() with disabled irqs instead of cmpxchg
      mm, slub: use migrate_disable() on PREEMPT_RT
      mm, slub: convert kmem_cpu_slab protection to local_lock

Subsystem: mm/memory-hotplug

    David Hildenbrand <david@redhat.com>:
    Patch series "memory-hotplug.rst: complete admin-guide overhaul", v3:
      memory-hotplug.rst: remove locking details from admin-guide
      memory-hotplug.rst: complete admin-guide overhaul

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "mm: remove pfn_valid_within() and CONFIG_HOLES_IN_ZONE":
      mm: remove pfn_valid_within() and CONFIG_HOLES_IN_ZONE
      mm: memory_hotplug: cleanup after removal of pfn_valid_within()

    David Hildenbrand <david@redhat.com>:
    Patch series "mm/memory_hotplug: preparatory patches for new online policy and memory":
      mm/memory_hotplug: use "unsigned long" for PFN in zone_for_pfn_range()
      mm/memory_hotplug: remove nid parameter from arch_remove_memory()
      mm/memory_hotplug: remove nid parameter from remove_memory() and friends
      ACPI: memhotplug: memory resources cannot be enabled yet
    Patch series "mm/memory_hotplug: "auto-movable" online policy and memory groups", v3:
      mm: track present early pages per zone
      mm/memory_hotplug: introduce "auto-movable" online policy
      drivers/base/memory: introduce "memory groups" to logically group memory blocks
      mm/memory_hotplug: track present pages in memory groups
      ACPI: memhotplug: use a single static memory group for a single memory device
      dax/kmem: use a single static memory group for a single probed unit
      virtio-mem: use a single dynamic memory group for a single virtio-mem device
      mm/memory_hotplug: memory group aware "auto-movable" online policy
      mm/memory_hotplug: improved dynamic memory group aware "auto-movable" online policy

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup and fixups for memory hotplug":
      mm/memory_hotplug: use helper zone_is_zone_device() to simplify the code

Subsystem: mm/rmap

    Muchun Song <songmuchun@bytedance.com>:
      mm: remove redundant compound_head() calling

Subsystem: mm/ioremap

    Christoph Hellwig <hch@lst.de>:
      riscv: only select GENERIC_IOREMAP if MMU support is enabled
    Patch series "small ioremap cleanups":
      mm: move ioremap_page_range to vmalloc.c
      mm: don't allow executable ioremap mappings

    Weizhao Ouyang <o451686892@gmail.com>:
      mm/early_ioremap.c: remove redundant early_ioremap_shutdown()

Subsystem: mm/highmem

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      highmem: don't disable preemption on RT in kmap_atomic()

Subsystem: mm/cleanups

    Changbin Du <changbin.du@gmail.com>:
      mm: in_irq() cleanup

    Muchun Song <songmuchun@bytedance.com>:
      mm: introduce PAGEFLAGS_MASK to replace ((1UL << NR_PAGEFLAGS) - 1)

Subsystem: mm/secretmem

    Jordy Zomer <jordy@jordyzomer.github.io>:
      mm/secretmem: use refcount_t instead of atomic_t

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      kfence: show cpu and timestamp in alloc/free info
      kfence: test: fail fast if disabled at boot

Subsystem: mm/damon

    SeongJae Park <sjpark@amazon.de>:
    Patch series "Introduce Data Access MONitor (DAMON)", v34:
      mm: introduce Data Access MONitor (DAMON)
      mm/damon/core: implement region-based sampling
      mm/damon: adaptively adjust regions
      mm/idle_page_tracking: make PG_idle reusable
      mm/damon: implement primitives for the virtual memory address spaces
      mm/damon: add a tracepoint
      mm/damon: implement a debugfs-based user space interface
      mm/damon/dbgfs: export kdamond pid to the user space
      mm/damon/dbgfs: support multiple contexts
      Documentation: add documents for DAMON
      mm/damon: add kunit tests
      mm/damon: add user space selftests
      MAINTAINERS: update for DAMON

Subsystem: alpha

    Randy Dunlap <rdunlap@infradead.org>:
      alpha: agp: make empty macros use do-while-0 style
      alpha: pci-sysfs: fix all kernel-doc warnings

Subsystem: percpu

    Greg Kroah-Hartman <gregkh@linuxfoundation.org>:
      percpu: remove export of pcpu_base_addr

Subsystem: procfs

    Feng Zhou <zhoufeng.zf@bytedance.com>:
      fs/proc/kcore.c: add mmap interface

    Christoph Hellwig <hch@lst.de>:
      proc: stop using seq_get_buf in proc_task_name

    Ohhoon Kwon <ohoono.kwon@samsung.com>:
      connector: send event on write to /proc/[pid]/comm

Subsystem: misc

    Colin Ian King <colin.king@canonical.com>:
      arch: Kconfig: fix spelling mistake "seperate" -> "separate"

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      include/linux/once.h: fix trivia typo Not -> Note

    Daniel Lezcano <daniel.lezcano@linaro.org>:
    Patch series "Add Hz macros", v3:
      units: change from 'L' to 'UL'
      units: add the HZ macros
      thermal/drivers/devfreq_cooling: use HZ macros
      devfreq: use HZ macros
      iio/drivers/as73211: use HZ macros
      hwmon/drivers/mr75203: use HZ macros
      iio/drivers/hid-sensor: use HZ macros
      i2c/drivers/ov02q10: use HZ macros
      mtd/drivers/nand: use HZ macros
      phy/drivers/stm32: use HZ macros

Subsystem: core-kernel

    Yang Yang <yang.yang29@zte.com.cn>:
      kernel/acct.c: use dedicated helper to access rlimit values

    Pavel Skripkin <paskripkin@gmail.com>:
      profiling: fix shift-out-of-bounds bugs

Subsystem: MAINTAINERS

    Nathan Chancellor <nathan@kernel.org>:
      MAINTAINERS: update ClangBuiltLinux mailing list
      Documentation/llvm: update mailing list
      Documentation/llvm: update IRC location

Subsystem: lib

    Geert Uytterhoeven <geert@linux-m68k.org>:
    Patch series "math: RATIONAL and RATIONAL_KUNIT_TEST improvements":
      math: make RATIONAL tristate
      math: RATIONAL_KUNIT_TEST should depend on RATIONAL instead of selecting it

    Matteo Croce <mcroce@microsoft.com>:
    Patch series "lib/string: optimized mem* functions", v2:
      lib/string: optimized memcpy
      lib/string: optimized memmove
      lib/string: optimized memset

    Daniel Latypov <dlatypov@google.com>:
      lib/test: convert test_sort.c to use KUnit

    Randy Dunlap <rdunlap@infradead.org>:
      lib/dump_stack: correct kernel-doc notation
      lib/iov_iter.c: fix kernel-doc warnings

Subsystem: bitops

    Yury Norov <yury.norov@gmail.com>:
    Patch series "Resend bitmap patches":
      bitops: protect find_first_{,zero}_bit properly
      bitops: move find_bit_*_le functions from le.h to find.h
      include: move find.h from asm_generic to linux
      arch: remove GENERIC_FIND_FIRST_BIT entirely
      lib: add find_first_and_bit()
      cpumask: use find_first_and_bit()
      all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate
      tools: sync tools/bitmap with mother linux
      cpumask: replace cpumask_next_* with cpumask_first_* where appropriate
      include/linux: move for_each_bit() macros from bitops.h to find.h
      find: micro-optimize for_each_{set,clear}_bit()
      bitops: replace for_each_*_bit_from() with for_each_*_bit() where appropriate

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      tools: rename bitmap_alloc() to bitmap_zalloc()

    Yury Norov <yury.norov@gmail.com>:
      mm/percpu: micro-optimize pcpu_is_populated()
      bitmap: unify find_bit operations
      lib: bitmap: add performance test for bitmap_print_to_pagebuf
      vsprintf: rework bitmap_list_string

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: support wide strings

    Mimi Zohar <zohar@linux.ibm.com>:
      checkpatch: make email address check case insensitive

    Joe Perches <joe@perches.com>:
      checkpatch: improve GIT_COMMIT_ID test

Subsystem: epoll

    Nicholas Piggin <npiggin@gmail.com>:
      fs/epoll: use a per-cpu counter for user's watches count

Subsystem: init

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      init: move usermodehelper_enable() to populate_rootfs()

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      trap: cleanup trap_init()

Subsystem: nilfs2

    Nanyong Sun <sunnanyong@huawei.com>:
    Patch series "nilfs2: fix incorrect usage of kobject":
      nilfs2: fix memory leak in nilfs_sysfs_create_device_group
      nilfs2: fix NULL pointer in nilfs_##name##_attr_release
      nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
      nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
      nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
      nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group

    Zhen Lei <thunder.leizhen@huawei.com>:
      nilfs2: use refcount_dec_and_lock() to fix potential UAF

Subsystem: coredump

    David Oberhollenzer <david.oberhollenzer@sigma-star.at>:
      fs/coredump.c: log if a core dump is aborted due to changed file permissions

    QiuXi <qiuxi1@huawei.com>:
      coredump: fix memleak in dump_vma_snapshot()

Subsystem: fork

    Christoph Hellwig <hch@lst.de>:
      kernel/fork.c: unexport get_{mm,task}_exe_file

Subsystem: pids

    Takahiro Itazuri <itazur@amazon.com>:
      pid: cleanup the stale comment mentioning pidmap_init().

Subsystem: criu

    Cyrill Gorcunov <gorcunov@gmail.com>:
      prctl: allow to setup brk for et_dyn executables

Subsystem: kconfig

    Zenghui Yu <yuzenghui@huawei.com>:
      configs: remove the obsolete CONFIG_INPUT_POLLDEV

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
      Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH

Subsystem: selftests

    Greg Thelen <gthelen@google.com>:
      selftests/memfd: remove unused variable

Subsystem: ipc

    Rafael Aquini <aquini@redhat.com>:
      ipc: replace costly bailout check in sysvipc_find_ipc()

Subsystem: mm/vmscan

    Randy Dunlap <rdunlap@infradead.org>:
      mm/workingset: correct kernel-doc notations

Subsystem: scripts

    Randy Dunlap <rdunlap@infradead.org>:
      scripts: check_extable: fix typo in user error message

 a/Documentation/admin-guide/mm/damon/index.rst            |   15 
 a/Documentation/admin-guide/mm/damon/start.rst            |  114 +
 a/Documentation/admin-guide/mm/damon/usage.rst            |  112 +
 a/Documentation/admin-guide/mm/index.rst                  |    1 
 a/Documentation/admin-guide/mm/memory-hotplug.rst         |  842 ++++++-----
 a/Documentation/dev-tools/kfence.rst                      |   98 -
 a/Documentation/kbuild/llvm.rst                           |    5 
 a/Documentation/vm/damon/api.rst                          |   20 
 a/Documentation/vm/damon/design.rst                       |  166 ++
 a/Documentation/vm/damon/faq.rst                          |   51 
 a/Documentation/vm/damon/index.rst                        |   30 
 a/Documentation/vm/index.rst                              |    1 
 a/MAINTAINERS                                             |   17 
 a/arch/Kconfig                                            |    2 
 a/arch/alpha/include/asm/agp.h                            |    4 
 a/arch/alpha/include/asm/bitops.h                         |    2 
 a/arch/alpha/kernel/pci-sysfs.c                           |   12 
 a/arch/arc/Kconfig                                        |    1 
 a/arch/arc/include/asm/bitops.h                           |    1 
 a/arch/arc/kernel/traps.c                                 |    5 
 a/arch/arm/configs/dove_defconfig                         |    1 
 a/arch/arm/configs/pxa_defconfig                          |    1 
 a/arch/arm/include/asm/bitops.h                           |    1 
 a/arch/arm/kernel/traps.c                                 |    5 
 a/arch/arm64/Kconfig                                      |    1 
 a/arch/arm64/include/asm/bitops.h                         |    1 
 a/arch/arm64/mm/mmu.c                                     |    3 
 a/arch/csky/include/asm/bitops.h                          |    1 
 a/arch/h8300/include/asm/bitops.h                         |    1 
 a/arch/h8300/kernel/traps.c                               |    4 
 a/arch/hexagon/include/asm/bitops.h                       |    1 
 a/arch/hexagon/kernel/traps.c                             |    4 
 a/arch/ia64/include/asm/bitops.h                          |    2 
 a/arch/ia64/mm/init.c                                     |    3 
 a/arch/m68k/include/asm/bitops.h                          |    2 
 a/arch/mips/Kconfig                                       |    1 
 a/arch/mips/configs/lemote2f_defconfig                    |    1 
 a/arch/mips/configs/pic32mzda_defconfig                   |    1 
 a/arch/mips/configs/rt305x_defconfig                      |    1 
 a/arch/mips/configs/xway_defconfig                        |    1 
 a/arch/mips/include/asm/bitops.h                          |    1 
 a/arch/nds32/kernel/traps.c                               |    5 
 a/arch/nios2/kernel/traps.c                               |    5 
 a/arch/openrisc/include/asm/bitops.h                      |    1 
 a/arch/openrisc/kernel/traps.c                            |    5 
 a/arch/parisc/configs/generic-32bit_defconfig             |    1 
 a/arch/parisc/include/asm/bitops.h                        |    2 
 a/arch/parisc/kernel/traps.c                              |    4 
 a/arch/powerpc/include/asm/bitops.h                       |    2 
 a/arch/powerpc/include/asm/cputhreads.h                   |    2 
 a/arch/powerpc/kernel/traps.c                             |    5 
 a/arch/powerpc/mm/mem.c                                   |    3 
 a/arch/powerpc/platforms/pasemi/dma_lib.c                 |    4 
 a/arch/powerpc/platforms/pseries/hotplug-memory.c         |    9 
 a/arch/riscv/Kconfig                                      |    2 
 a/arch/riscv/include/asm/bitops.h                         |    1 
 a/arch/riscv/kernel/traps.c                               |    5 
 a/arch/s390/Kconfig                                       |    1 
 a/arch/s390/include/asm/bitops.h                          |    1 
 a/arch/s390/kvm/kvm-s390.c                                |    2 
 a/arch/s390/mm/init.c                                     |    3 
 a/arch/sh/include/asm/bitops.h                            |    1 
 a/arch/sh/mm/init.c                                       |    3 
 a/arch/sparc/include/asm/bitops_32.h                      |    1 
 a/arch/sparc/include/asm/bitops_64.h                      |    2 
 a/arch/um/kernel/trap.c                                   |    4 
 a/arch/x86/Kconfig                                        |    1 
 a/arch/x86/configs/i386_defconfig                         |    1 
 a/arch/x86/configs/x86_64_defconfig                       |    1 
 a/arch/x86/include/asm/bitops.h                           |    2 
 a/arch/x86/kernel/apic/vector.c                           |    4 
 a/arch/x86/mm/init_32.c                                   |    3 
 a/arch/x86/mm/init_64.c                                   |    3 
 a/arch/x86/um/Kconfig                                     |    1 
 a/arch/xtensa/include/asm/bitops.h                        |    1 
 a/block/blk-mq.c                                          |    2 
 a/drivers/acpi/acpi_memhotplug.c                          |   46 
 a/drivers/base/memory.c                                   |  231 ++-
 a/drivers/base/node.c                                     |    2 
 a/drivers/block/rnbd/rnbd-clt.c                           |    2 
 a/drivers/dax/kmem.c                                      |   43 
 a/drivers/devfreq/devfreq.c                               |    2 
 a/drivers/dma/ti/edma.c                                   |    2 
 a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c                   |    4 
 a/drivers/hwmon/ltc2992.c                                 |    3 
 a/drivers/hwmon/mr75203.c                                 |    2 
 a/drivers/iio/adc/ad7124.c                                |    2 
 a/drivers/iio/common/hid-sensors/hid-sensor-attributes.c  |    3 
 a/drivers/iio/light/as73211.c                             |    3 
 a/drivers/infiniband/hw/irdma/hw.c                        |   16 
 a/drivers/media/cec/core/cec-core.c                       |    2 
 a/drivers/media/i2c/ov02a10.c                             |    2 
 a/drivers/media/mc/mc-devnode.c                           |    2 
 a/drivers/mmc/host/renesas_sdhi_core.c                    |    2 
 a/drivers/mtd/nand/raw/intel-nand-controller.c            |    2 
 a/drivers/net/virtio_net.c                                |    2 
 a/drivers/pci/controller/dwc/pci-dra7xx.c                 |    2 
 a/drivers/phy/st/phy-stm32-usbphyc.c                      |    2 
 a/drivers/scsi/lpfc/lpfc_sli.c                            |   10 
 a/drivers/soc/fsl/qbman/bman_portal.c                     |    2 
 a/drivers/soc/fsl/qbman/qman_portal.c                     |    2 
 a/drivers/soc/ti/k3-ringacc.c                             |    4 
 a/drivers/thermal/devfreq_cooling.c                       |    2 
 a/drivers/tty/n_tty.c                                     |    2 
 a/drivers/virt/acrn/ioreq.c                               |    3 
 a/drivers/virtio/virtio_mem.c                             |   26 
 a/fs/coredump.c                                           |   15 
 a/fs/eventpoll.c                                          |   18 
 a/fs/f2fs/segment.c                                       |    8 
 a/fs/nilfs2/sysfs.c                                       |   26 
 a/fs/nilfs2/the_nilfs.c                                   |    9 
 a/fs/ocfs2/cluster/heartbeat.c                            |    2 
 a/fs/ocfs2/dlm/dlmdomain.c                                |    4 
 a/fs/ocfs2/dlm/dlmmaster.c                                |   18 
 a/fs/ocfs2/dlm/dlmrecovery.c                              |    2 
 a/fs/ocfs2/dlm/dlmthread.c                                |    2 
 a/fs/proc/array.c                                         |   18 
 a/fs/proc/base.c                                          |    5 
 a/fs/proc/kcore.c                                         |   73 
 a/include/asm-generic/bitops.h                            |    1 
 a/include/asm-generic/bitops/find.h                       |  198 --
 a/include/asm-generic/bitops/le.h                         |   64 
 a/include/asm-generic/early_ioremap.h                     |    6 
 a/include/linux/bitmap.h                                  |   34 
 a/include/linux/bitops.h                                  |   34 
 a/include/linux/cpumask.h                                 |   46 
 a/include/linux/damon.h                                   |  290 +++
 a/include/linux/find.h                                    |  134 +
 a/include/linux/highmem-internal.h                        |   27 
 a/include/linux/memory.h                                  |   55 
 a/include/linux/memory_hotplug.h                          |   40 
 a/include/linux/mmzone.h                                  |   19 
 a/include/linux/once.h                                    |    2 
 a/include/linux/page-flags.h                              |   17 
 a/include/linux/page_ext.h                                |    2 
 a/include/linux/page_idle.h                               |    6 
 a/include/linux/pagemap.h                                 |    7 
 a/include/linux/sched/user.h                              |    3 
 a/include/linux/slub_def.h                                |    6 
 a/include/linux/threads.h                                 |    2 
 a/include/linux/units.h                                   |   10 
 a/include/linux/vmalloc.h                                 |    3 
 a/include/trace/events/damon.h                            |   43 
 a/include/trace/events/mmflags.h                          |    2 
 a/include/trace/events/page_ref.h                         |    4 
 a/init/initramfs.c                                        |    2 
 a/init/main.c                                             |    3 
 a/init/noinitramfs.c                                      |    2 
 a/ipc/util.c                                              |   16 
 a/kernel/acct.c                                           |    2 
 a/kernel/fork.c                                           |    2 
 a/kernel/profile.c                                        |   21 
 a/kernel/sys.c                                            |    7 
 a/kernel/time/clocksource.c                               |    4 
 a/kernel/user.c                                           |   25 
 a/lib/Kconfig                                             |    3 
 a/lib/Kconfig.debug                                       |    9 
 a/lib/dump_stack.c                                        |    3 
 a/lib/find_bit.c                                          |   21 
 a/lib/find_bit_benchmark.c                                |   21 
 a/lib/genalloc.c                                          |    2 
 a/lib/iov_iter.c                                          |    8 
 a/lib/math/Kconfig                                        |    2 
 a/lib/math/rational.c                                     |    3 
 a/lib/string.c                                            |  130 +
 a/lib/test_bitmap.c                                       |   37 
 a/lib/test_printf.c                                       |    2 
 a/lib/test_sort.c                                         |   40 
 a/lib/vsprintf.c                                          |   26 
 a/mm/Kconfig                                              |   15 
 a/mm/Makefile                                             |    4 
 a/mm/compaction.c                                         |   20 
 a/mm/damon/Kconfig                                        |   68 
 a/mm/damon/Makefile                                       |    5 
 a/mm/damon/core-test.h                                    |  253 +++
 a/mm/damon/core.c                                         |  748 ++++++++++
 a/mm/damon/dbgfs-test.h                                   |  126 +
 a/mm/damon/dbgfs.c                                        |  631 ++++++++
 a/mm/damon/vaddr-test.h                                   |  329 ++++
 a/mm/damon/vaddr.c                                        |  672 +++++++++
 a/mm/early_ioremap.c                                      |    5 
 a/mm/highmem.c                                            |    2 
 a/mm/ioremap.c                                            |   25 
 a/mm/kfence/core.c                                        |    3 
 a/mm/kfence/kfence.h                                      |    2 
 a/mm/kfence/kfence_test.c                                 |    3 
 a/mm/kfence/report.c                                      |   19 
 a/mm/kmemleak.c                                           |    2 
 a/mm/memory_hotplug.c                                     |  396 ++++-
 a/mm/memremap.c                                           |    5 
 a/mm/page_alloc.c                                         |   27 
 a/mm/page_ext.c                                           |   12 
 a/mm/page_idle.c                                          |   10 
 a/mm/page_isolation.c                                     |    7 
 a/mm/page_owner.c                                         |   14 
 a/mm/percpu.c                                             |   36 
 a/mm/rmap.c                                               |    6 
 a/mm/secretmem.c                                          |    9 
 a/mm/slab_common.c                                        |    2 
 a/mm/slub.c                                               | 1023 +++++++++-----
 a/mm/vmalloc.c                                            |   24 
 a/mm/workingset.c                                         |    2 
 a/net/ncsi/ncsi-manage.c                                  |    4 
 a/scripts/check_extable.sh                                |    2 
 a/scripts/checkpatch.pl                                   |   93 -
 a/tools/include/linux/bitmap.h                            |    4 
 a/tools/perf/bench/find-bit-bench.c                       |    2 
 a/tools/perf/builtin-c2c.c                                |    6 
 a/tools/perf/builtin-record.c                             |    2 
 a/tools/perf/tests/bitmap.c                               |    2 
 a/tools/perf/tests/mem2node.c                             |    2 
 a/tools/perf/util/affinity.c                              |    4 
 a/tools/perf/util/header.c                                |    4 
 a/tools/perf/util/metricgroup.c                           |    2 
 a/tools/perf/util/mmap.c                                  |    4 
 a/tools/testing/selftests/damon/Makefile                  |    7 
 a/tools/testing/selftests/damon/_chk_dependency.sh        |   28 
 a/tools/testing/selftests/damon/debugfs_attrs.sh          |   75 +
 a/tools/testing/selftests/kvm/dirty_log_perf_test.c       |    2 
 a/tools/testing/selftests/kvm/dirty_log_test.c            |    4 
 a/tools/testing/selftests/kvm/x86_64/vmx_dirty_log_test.c |    2 
 a/tools/testing/selftests/memfd/memfd_test.c              |    2 
 b/MAINTAINERS                                             |    2 
 b/tools/include/asm-generic/bitops.h                      |    1 
 b/tools/include/linux/bitmap.h                            |    7 
 b/tools/include/linux/find.h                              |   81 +
 b/tools/lib/find_bit.c                                    |   20 
 227 files changed, 6695 insertions(+), 1875 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-09-02 21:48 incoming Andrew Morton
@ 2021-09-02 21:49 ` Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-09-02 21:49 UTC (permalink / raw)
  To: Linus Torvalds, linux-mm, mm-commits

On Thu, 2 Sep 2021 14:48:20 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:

> 212 patches, based on 4a3bb4200a5958d76cc26ebe4db4257efa56812b.

Make that "based on 7d2a07b769330c34b4deabeed939325c77a7ec2f".


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-09-02 21:48 Andrew Morton
  2021-09-02 21:49 ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-09-02 21:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

212 patches, based on 4a3bb4200a5958d76cc26ebe4db4257efa56812b.

Subsystems affected by this patch series:

  ia64
  ocfs2
  block
  mm/slub
  mm/debug
  mm/pagecache
  mm/gup
  mm/swap
  mm/shmem
  mm/memcg
  mm/selftests
  mm/pagemap
  mm/mremap
  mm/bootmem
  mm/sparsemem
  mm/vmalloc
  mm/kasan
  mm/pagealloc
  mm/memory-failure
  mm/hugetlb
  mm/userfaultfd
  mm/vmscan
  mm/compaction
  mm/mempolicy
  mm/memblock
  mm/oom-kill
  mm/migration
  mm/ksm
  mm/percpu
  mm/vmstat
  mm/madvise

Subsystem: ia64

    Jason Wang <wangborong@cdjrlc.com>:
      ia64: fix typo in a comment

    Geert Uytterhoeven <geert+renesas@glider.be>:
    Patch series "ia64: Miscellaneous fixes and cleanups":
      ia64: fix #endif comment for reserve_elfcorehdr()
      ia64: make reserve_elfcorehdr() static
      ia64: make num_rsvd_regions static

Subsystem: ocfs2

    Dan Carpenter <dan.carpenter@oracle.com>:
      ocfs2: remove an unnecessary condition

    Tuo Li <islituo@gmail.com>:
      ocfs2: quota_local: fix possible uninitialized-variable access in ocfs2_local_read_info()

    Gang He <ghe@suse.com>:
      ocfs2: ocfs2_downconvert_lock failure results in deadlock

Subsystem: block

    kernel test robot <lkp@intel.com>:
      arch/csky/kernel/probes/kprobes.c: fix bugon.cocci warnings

Subsystem: mm/slub

    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "SLUB: reduce irq disabled scope and make it RT compatible", v4:
      mm, slub: don't call flush_all() from slab_debug_trace_open()
      mm, slub: allocate private object map for debugfs listings
      mm, slub: allocate private object map for validate_slab_cache()
      mm, slub: don't disable irq for debug_check_no_locks_freed()
      mm, slub: remove redundant unfreeze_partials() from put_cpu_partial()
      mm, slub: unify cmpxchg_double_slab() and __cmpxchg_double_slab()
      mm, slub: extract get_partial() from new_slab_objects()
      mm, slub: dissolve new_slab_objects() into ___slab_alloc()
      mm, slub: return slab page from get_partial() and set c->page afterwards
      mm, slub: restructure new page checks in ___slab_alloc()
      mm, slub: simplify kmem_cache_cpu and tid setup
      mm, slub: move disabling/enabling irqs to ___slab_alloc()
      mm, slub: do initial checks in ___slab_alloc() with irqs enabled
      mm, slub: move disabling irqs closer to get_partial() in ___slab_alloc()
      mm, slub: restore irqs around calling new_slab()
      mm, slub: validate slab from partial list or page allocator before making it cpu slab
      mm, slub: check new pages with restored irqs
      mm, slub: stop disabling irqs around get_partial()
      mm, slub: move reset of c->page and freelist out of deactivate_slab()
      mm, slub: make locking in deactivate_slab() irq-safe
      mm, slub: call deactivate_slab() without disabling irqs
      mm, slub: move irq control into unfreeze_partials()
      mm, slub: discard slabs in unfreeze_partials() without irqs disabled
      mm, slub: detach whole partial list at once in unfreeze_partials()
      mm, slub: separate detaching of partial list in unfreeze_partials() from unfreezing
      mm, slub: only disable irq with spin_lock in __unfreeze_partials()
      mm, slub: don't disable irqs in slub_cpu_dead()
      mm, slab: make flush_slab() possible to call with irqs enabled

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context
      mm: slub: make object_map_lock a raw_spinlock_t

    Vlastimil Babka <vbabka@suse.cz>:
      mm, slub: optionally save/restore irqs in slab_[un]lock()/
      mm, slub: make slab_lock() disable irqs with PREEMPT_RT
      mm, slub: protect put_cpu_partial() with disabled irqs instead of cmpxchg
      mm, slub: use migrate_disable() on PREEMPT_RT
      mm, slub: convert kmem_cpu_slab protection to local_lock

Subsystem: mm/debug

    Gavin Shan <gshan@redhat.com>:
    Patch series "mm/debug_vm_pgtable: Enhancements", v6:
      mm/debug_vm_pgtable: introduce struct pgtable_debug_args
      mm/debug_vm_pgtable: use struct pgtable_debug_args in basic tests
      mm/debug_vm_pgtable: use struct pgtable_debug_args in leaf and savewrite tests
      mm/debug_vm_pgtable: use struct pgtable_debug_args in protnone and devmap tests
      mm/debug_vm_pgtable: use struct pgtable_debug_args in soft_dirty and swap tests
      mm/debug_vm_pgtable: use struct pgtable_debug_args in migration and thp tests
      mm/debug_vm_pgtable: use struct pgtable_debug_args in PTE modifying tests
      mm/debug_vm_pgtable: use struct pgtable_debug_args in PMD modifying tests
      mm/debug_vm_pgtable: use struct pgtable_debug_args in PUD modifying tests
      mm/debug_vm_pgtable: use struct pgtable_debug_args in PGD and P4D modifying tests
      mm/debug_vm_pgtable: remove unused code
      mm/debug_vm_pgtable: fix corrupted page flag

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: report a more useful address for reclaim acquisition

    liuhailong <liuhailong@oppo.com>:
      mm: add kernel_misc_reclaimable in show_free_areas

Subsystem: mm/pagecache

    Jan Kara <jack@suse.cz>:
    Patch series "writeback: Fix bandwidth estimates", v4:
      writeback: track number of inodes under writeback
      writeback: reliably update bandwidth estimation
      writeback: fix bandwidth estimate for spiky workload
      writeback: rename domain_update_bandwidth()
      writeback: use READ_ONCE for unlocked reads of writeback stats

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: remove irqsave/restore locking from contexts with irqs enabled
      fs: drop_caches: fix skipping over shadow cache inodes
      fs: inode: count invalidated shadow pages in pginodesteal

    Shakeel Butt <shakeelb@google.com>:
      writeback: memcg: simplify cgroup_writeback_by_id

    Jing Yangyang <jing.yangyang@zte.com.cn>:
      include/linux/buffer_head.h: fix boolreturn.cocci warnings

Subsystem: mm/gup

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanups and fixup for gup":
      mm: gup: remove set but unused local variable major
      mm: gup: remove unneed local variable orig_refs
      mm: gup: remove useless BUG_ON in __get_user_pages()
      mm: gup: fix potential pgmap refcnt leak in __gup_device_huge()
      mm: gup: use helper PAGE_ALIGNED in populate_vma_page_range()

    John Hubbard <jhubbard@nvidia.com>:
    Patch series "A few gup refactorings and documentation updates", v3:
      mm/gup: documentation corrections for gup/pup
      mm/gup: small refactoring: simplify try_grab_page()
      mm/gup: remove try_get_page(), call try_get_compound_head() directly

Subsystem: mm/swap

    Hugh Dickins <hughd@google.com>:
      fs, mm: fix race in unlinking swapfile

    John Hubbard <jhubbard@nvidia.com>:
      mm: delete unused get_kernel_page()

Subsystem: mm/shmem

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      shmem: use raw_spinlock_t for ->stat_lock

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanups for shmem":
      shmem: remove unneeded variable ret
      shmem: remove unneeded header file
      shmem: remove unneeded function forward declaration
      shmem: include header file to declare swap_info

    Hugh Dickins <hughd@google.com>:
    Patch series "huge tmpfs: shmem_is_huge() fixes and cleanups":
      huge tmpfs: fix fallocate(vanilla) advance over huge pages
      huge tmpfs: fix split_huge_page() after FALLOC_FL_KEEP_SIZE
      huge tmpfs: remove shrinklist addition from shmem_setattr()
      huge tmpfs: revert shmem's use of transhuge_vma_enabled()
      huge tmpfs: move shmem_huge_enabled() upwards
      huge tmpfs: SGP_NOALLOC to stop collapse_file() on race
      huge tmpfs: shmem_is_huge(vma, inode, index)
      huge tmpfs: decide stat.st_blksize by shmem_is_huge()
      shmem: shmem_writepage() split unlikely i915 THP

Subsystem: mm/memcg

    Suren Baghdasaryan <surenb@google.com>:
      mm, memcg: add mem_cgroup_disabled checks in vmpressure and swap-related functions
      mm, memcg: inline mem_cgroup_{charge/uncharge} to improve disabled memcg config
      mm, memcg: inline swap-related functions to improve disabled memcg config

    Vasily Averin <vvs@virtuozzo.com>:
      memcg: enable accounting for pids in nested pid namespaces

    Shakeel Butt <shakeelb@google.com>:
      memcg: switch lruvec stats to rstat
      memcg: infrastructure to flush memcg stats

    Yutian Yang <nglaive@gmail.com>:
      memcg: charge fs_context and legacy_fs_context

    Vasily Averin <vvs@virtuozzo.com>:
    Patch series "memcg accounting from OpenVZ", v7:
      memcg: enable accounting for mnt_cache entries
      memcg: enable accounting for pollfd and select bits arrays
      memcg: enable accounting for file lock caches
      memcg: enable accounting for fasync_cache
      memcg: enable accounting for new namesapces and struct nsproxy
      memcg: enable accounting of ipc resources
      memcg: enable accounting for signals
      memcg: enable accounting for posix_timers_cache slab
      memcg: enable accounting for ldt_struct objects

    Shakeel Butt <shakeelb@google.com>:
      memcg: cleanup racy sum avoidance code

    Vasily Averin <vvs@virtuozzo.com>:
      memcg: replace in_interrupt() by !in_task() in active_memcg()

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm: memcontrol: set the correct memcg swappiness restriction

    Miaohe Lin <linmiaohe@huawei.com>:
      mm, memcg: remove unused functions
      mm, memcg: save some atomic ops when flush is already true

    Michal Hocko <mhocko@suse.com>:
      memcg: fix up drain_local_stock comment

    Shakeel Butt <shakeelb@google.com>:
      memcg: make memcg->event_list_lock irqsafe

Subsystem: mm/selftests

    Po-Hsu Lin <po-hsu.lin@canonical.com>:
      selftests/vm: use kselftest skip code for skipped tests

    Colin Ian King <colin.king@canonical.com>:
      selftests: Fix spelling mistake "cann't" -> "cannot"

Subsystem: mm/pagemap

    Nicholas Piggin <npiggin@gmail.com>:
    Patch series "shoot lazy tlbs", v4:
      lazy tlb: introduce lazy mm refcount helper functions
      lazy tlb: allow lazy tlb mm refcounting to be configurable
      lazy tlb: shoot lazies, a non-refcounting lazy tlb option
      powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN

    Christoph Hellwig <hch@lst.de>:
    Patch series "_kernel_dcache_page fixes and removal":
      mmc: JZ4740: remove the flush_kernel_dcache_page call in jz4740_mmc_read_data
      mmc: mmc_spi: replace flush_kernel_dcache_page with flush_dcache_page
      scatterlist: replace flush_kernel_dcache_page with flush_dcache_page
      mm: remove flush_kernel_dcache_page

    Huang Ying <ying.huang@intel.com>:
      mm,do_huge_pmd_numa_page: remove unnecessary TLB flushing code

    Greg Kroah-Hartman <gregkh@linuxfoundation.org>:
      mm: change fault_in_pages_* to have an unsigned size parameter

    Luigi Rizzo <lrizzo@google.com>:
      mm/pagemap: add mmap_assert_locked() annotations to find_vma*()

    "Liam R. Howlett" <Liam.Howlett@Oracle.com>:
      remap_file_pages: Use vma_lookup() instead of find_vma()

Subsystem: mm/mremap

    Chen Wandun <chenwandun@huawei.com>:
      mm/mremap: fix memory account on do_munmap() failure

Subsystem: mm/bootmem

    Muchun Song <songmuchun@bytedance.com>:
      mm/bootmem_info.c: mark __init on register_page_bootmem_info_section

Subsystem: mm/sparsemem

    Ohhoon Kwon <ohoono.kwon@samsung.com>:
    Patch series "mm: sparse: remove __section_nr() function", v4:
      mm: sparse: pass section_nr to section_mark_present
      mm: sparse: pass section_nr to find_memory_block
      mm: sparse: remove __section_nr() function

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm/sparse: set SECTION_NID_SHIFT to 6

    Matthew Wilcox <willy@infradead.org>:
      include/linux/mmzone.h: avoid a warning in sparse memory support

    Miles Chen <miles.chen@mediatek.com>:
      mm/sparse: clarify pgdat_to_phys

Subsystem: mm/vmalloc

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc: use batched page requests in bulk-allocator
      mm/vmalloc: remove gfpflags_allow_blocking() check
      lib/test_vmalloc.c: add a new 'nr_pages' parameter

    Chen Wandun <chenwandun@huawei.com>:
      mm/vmalloc: fix wrong behavior in vread

Subsystem: mm/kasan

    Woody Lin <woodylin@google.com>:
      mm/kasan: move kasan.fault to mm/kasan/report.c

    Andrey Konovalov <andreyknvl@gmail.com>:
    Patch series "kasan: test: avoid crashing the kernel with HW_TAGS", v2:
      kasan: test: rework kmalloc_oob_right
      kasan: test: avoid writing invalid memory
      kasan: test: avoid corrupting memory via memset
      kasan: test: disable kmalloc_memmove_invalid_size for HW_TAGS
      kasan: test: only do kmalloc_uaf_memset for generic mode
      kasan: test: clean up ksize_uaf
      kasan: test: avoid corrupting memory in copy_user_test
      kasan: test: avoid corrupting memory in kasan_rcu_uaf

Subsystem: mm/pagealloc

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "mm: ensure consistency of memory map poisoning":
      mm/page_alloc: always initialize memory map for the holes
      microblaze: simplify pte_alloc_one_kernel()
      mm: introduce memmap_alloc() to unify memory map allocation
      memblock: stop poisoning raw allocations

    Nico Pache <npache@redhat.com>:
      mm/page_alloc.c: fix 'zone_id' may be used uninitialized in this function warning

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/page_alloc: make alloc_node_mem_map() __init rather than __ref

    Vasily Averin <vvs@virtuozzo.com>:
      mm/page_alloc.c: use in_task()

    "George G. Davis" <davis.george@siemens.com>:
      mm/page_isolation: tracing: trace all test_pages_isolated failures

Subsystem: mm/memory-failure

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanups and fixup for hwpoison":
      mm/hwpoison: remove unneeded variable unmap_success
      mm/hwpoison: fix potential pte_unmap_unlock pte error
      mm/hwpoison: change argument struct page **hpagep to *hpage
      mm/hwpoison: fix some obsolete comments

    Yang Shi <shy828301@gmail.com>:
      mm: hwpoison: don't drop slab caches for offlining non-LRU page
      doc: hwpoison: correct the support for hugepage
      mm: hwpoison: dump page for unhandlable page

    Michael Wang <yun.wang@linux.alibaba.com>:
      mm: fix panic caused by __page_handle_poison()

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlb: simplify prep_compound_gigantic_page ref count racing code
      hugetlb: drop ref count earlier after page allocation
      hugetlb: before freeing hugetlb page set dtor to appropriate value
      hugetlb: fix hugetlb cgroup refcounting during vma split

Subsystem: mm/userfaultfd

    Nadav Amit <namit@vmware.com>:
    Patch series "userfaultfd: minor bug fixes":
      userfaultfd: change mmap_changing to atomic
      userfaultfd: prevent concurrent API initialization
      selftests/vm/userfaultfd: wake after copy failure

Subsystem: mm/vmscan

    Dave Hansen <dave.hansen@linux.intel.com>:
    Patch series "Migrate Pages in lieu of discard", v11:
      mm/numa: automatically generate node migration order
      mm/migrate: update node demotion order on hotplug events

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm/migrate: enable returning precise migrate_pages() success count

    Dave Hansen <dave.hansen@linux.intel.com>:
      mm/migrate: demote pages during reclaim

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm/vmscan: add page demotion counter

    Dave Hansen <dave.hansen@linux.intel.com>:
      mm/vmscan: add helper for querying ability to age anonymous pages

    Keith Busch <kbusch@kernel.org>:
      mm/vmscan: Consider anonymous pages without swap

    Dave Hansen <dave.hansen@linux.intel.com>:
      mm/vmscan: never demote for memcg reclaim

    Huang Ying <ying.huang@intel.com>:
      mm/migrate: add sysfs interface to enable reclaim migration

    Hui Su <suhui@zeku.com>:
      mm/vmpressure: replace vmpressure_to_css() with vmpressure_to_memcg()

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanups for vmscan", v2:
      mm/vmscan: remove the PageDirty check after MADV_FREE pages are page_ref_freezed
      mm/vmscan: remove misleading setting to sc->priority
      mm/vmscan: remove unneeded return value of kswapd_run()
      mm/vmscan: add 'else' to remove check_pending label

    Vlastimil Babka <vbabka@suse.cz>:
      mm, vmscan: guarantee drop_slab_node() termination

Subsystem: mm/compaction

    Charan Teja Reddy <charante@codeaurora.org>:
      mm: compaction: optimize proactive compaction deferrals
      mm: compaction: support triggering of proactive compaction by user

Subsystem: mm/mempolicy

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm/mempolicy: use readable NUMA_NO_NODE macro instead of magic number

    Dave Hansen <dave.hansen@linux.intel.com>:
    Patch series "Introduce multi-preference mempolicy", v7:
      mm/mempolicy: add MPOL_PREFERRED_MANY for multiple preferred nodes

    Feng Tang <feng.tang@intel.com>:
      mm/memplicy: add page allocation function for MPOL_PREFERRED_MANY policy

    Ben Widawsky <ben.widawsky@intel.com>:
      mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
      mm/mempolicy: advertise new MPOL_PREFERRED_MANY

    Feng Tang <feng.tang@intel.com>:
      mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies

    Vasily Averin <vvs@virtuozzo.com>:
      mm/mempolicy.c: use in_task() in mempolicy_slab_node()

Subsystem: mm/memblock

    Mike Rapoport <rppt@linux.ibm.com>:
      memblock: make memblock_find_in_range method private

Subsystem: mm/oom-kill

    Suren Baghdasaryan <surenb@google.com>:
      mm: introduce process_mrelease system call
      mm: wire up syscall process_mrelease

Subsystem: mm/migration

    Randy Dunlap <rdunlap@infradead.org>:
      mm/migrate: correct kernel-doc notation

Subsystem: mm/ksm

    Zhansaya Bagdauletkyzy <zhansayabagdaulet@gmail.com>:
    Patch series "add KSM selftests":
      selftests: vm: add KSM merge test
      selftests: vm: add KSM unmerge test
      selftests: vm: add KSM zero page merging test
      selftests: vm: add KSM merging across nodes test
      mm: KSM: fix data type
    Patch series "add KSM performance tests", v3:
      selftests: vm: add KSM merging time test
      selftests: vm: add COW time test for KSM pages

Subsystem: mm/percpu

    Jing Xiangfeng <jingxiangfeng@huawei.com>:
      mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()

Subsystem: mm/vmstat

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup for vmstat":
      mm/vmstat: correct some wrong comments
      mm/vmstat: simplify the array size calculation
      mm/vmstat: remove unneeded return value

Subsystem: mm/madvise

    zhangkui <zhangkui@oppo.com>:
      mm/madvise: add MADV_WILLNEED to process_madvise()

 Documentation/ABI/testing/sysfs-kernel-mm-numa         |   24 
 Documentation/admin-guide/mm/numa_memory_policy.rst    |   15 
 Documentation/admin-guide/sysctl/vm.rst                |    3 
 Documentation/core-api/cachetlb.rst                    |   86 -
 Documentation/dev-tools/kasan.rst                      |   13 
 Documentation/translations/zh_CN/core-api/cachetlb.rst |    9 
 Documentation/vm/hwpoison.rst                          |    1 
 arch/Kconfig                                           |   28 
 arch/alpha/kernel/syscalls/syscall.tbl                 |    2 
 arch/arm/include/asm/cacheflush.h                      |    4 
 arch/arm/kernel/setup.c                                |   20 
 arch/arm/mach-rpc/ecard.c                              |    2 
 arch/arm/mm/flush.c                                    |   33 
 arch/arm/mm/nommu.c                                    |    6 
 arch/arm/tools/syscall.tbl                             |    2 
 arch/arm64/include/asm/unistd.h                        |    2 
 arch/arm64/include/asm/unistd32.h                      |    2 
 arch/arm64/kvm/hyp/reserved_mem.c                      |    9 
 arch/arm64/mm/init.c                                   |   38 
 arch/csky/abiv1/cacheflush.c                           |   11 
 arch/csky/abiv1/inc/abi/cacheflush.h                   |    4 
 arch/csky/kernel/probes/kprobes.c                      |    3 
 arch/ia64/include/asm/meminit.h                        |    2 
 arch/ia64/kernel/acpi.c                                |    2 
 arch/ia64/kernel/setup.c                               |   55 
 arch/ia64/kernel/syscalls/syscall.tbl                  |    2 
 arch/m68k/kernel/syscalls/syscall.tbl                  |    2 
 arch/microblaze/include/asm/page.h                     |    3 
 arch/microblaze/include/asm/pgtable.h                  |    2 
 arch/microblaze/kernel/syscalls/syscall.tbl            |    2 
 arch/microblaze/mm/init.c                              |   12 
 arch/microblaze/mm/pgtable.c                           |   17 
 arch/mips/include/asm/cacheflush.h                     |    8 
 arch/mips/kernel/setup.c                               |   14 
 arch/mips/kernel/syscalls/syscall_n32.tbl              |    2 
 arch/mips/kernel/syscalls/syscall_n64.tbl              |    2 
 arch/mips/kernel/syscalls/syscall_o32.tbl              |    2 
 arch/nds32/include/asm/cacheflush.h                    |    3 
 arch/nds32/mm/cacheflush.c                             |    9 
 arch/parisc/include/asm/cacheflush.h                   |    8 
 arch/parisc/kernel/cache.c                             |    3 
 arch/parisc/kernel/syscalls/syscall.tbl                |    2 
 arch/powerpc/Kconfig                                   |    1 
 arch/powerpc/kernel/smp.c                              |    2 
 arch/powerpc/kernel/syscalls/syscall.tbl               |    2 
 arch/powerpc/mm/book3s64/radix_tlb.c                   |    4 
 arch/powerpc/platforms/pseries/hotplug-memory.c        |    4 
 arch/riscv/mm/init.c                                   |   44 
 arch/s390/kernel/setup.c                               |    9 
 arch/s390/kernel/syscalls/syscall.tbl                  |    2 
 arch/s390/mm/fault.c                                   |    2 
 arch/sh/include/asm/cacheflush.h                       |    8 
 arch/sh/kernel/syscalls/syscall.tbl                    |    2 
 arch/sparc/kernel/syscalls/syscall.tbl                 |    2 
 arch/x86/entry/syscalls/syscall_32.tbl                 |    1 
 arch/x86/entry/syscalls/syscall_64.tbl                 |    1 
 arch/x86/kernel/aperture_64.c                          |    5 
 arch/x86/kernel/ldt.c                                  |    6 
 arch/x86/mm/init.c                                     |   23 
 arch/x86/mm/numa.c                                     |    5 
 arch/x86/mm/numa_emulation.c                           |    5 
 arch/x86/realmode/init.c                               |    2 
 arch/xtensa/kernel/syscalls/syscall.tbl                |    2 
 block/blk-map.c                                        |    2 
 drivers/acpi/tables.c                                  |    5 
 drivers/base/arch_numa.c                               |    5 
 drivers/base/memory.c                                  |    4 
 drivers/mmc/host/jz4740_mmc.c                          |    4 
 drivers/mmc/host/mmc_spi.c                             |    2 
 drivers/of/of_reserved_mem.c                           |   12 
 fs/drop_caches.c                                       |    3 
 fs/exec.c                                              |   12 
 fs/fcntl.c                                             |    3 
 fs/fs-writeback.c                                      |   28 
 fs/fs_context.c                                        |    4 
 fs/inode.c                                             |    2 
 fs/locks.c                                             |    6 
 fs/namei.c                                             |    8 
 fs/namespace.c                                         |    7 
 fs/ocfs2/dlmglue.c                                     |   14 
 fs/ocfs2/quota_global.c                                |    1 
 fs/ocfs2/quota_local.c                                 |    2 
 fs/pipe.c                                              |    2 
 fs/select.c                                            |    4 
 fs/userfaultfd.c                                       |  116 -
 include/linux/backing-dev-defs.h                       |    2 
 include/linux/backing-dev.h                            |   19 
 include/linux/buffer_head.h                            |    2 
 include/linux/compaction.h                             |    2 
 include/linux/highmem.h                                |    5 
 include/linux/hugetlb_cgroup.h                         |   12 
 include/linux/memblock.h                               |    2 
 include/linux/memcontrol.h                             |  118 +
 include/linux/memory.h                                 |    2 
 include/linux/mempolicy.h                              |   16 
 include/linux/migrate.h                                |   14 
 include/linux/mm.h                                     |   17 
 include/linux/mmzone.h                                 |    4 
 include/linux/page-flags.h                             |    9 
 include/linux/pagemap.h                                |    4 
 include/linux/sched/mm.h                               |   35 
 include/linux/shmem_fs.h                               |   25 
 include/linux/slub_def.h                               |    6 
 include/linux/swap.h                                   |   28 
 include/linux/syscalls.h                               |    1 
 include/linux/userfaultfd_k.h                          |    8 
 include/linux/vm_event_item.h                          |    2 
 include/linux/vmpressure.h                             |    2 
 include/linux/writeback.h                              |    4 
 include/trace/events/migrate.h                         |    3 
 include/uapi/asm-generic/unistd.h                      |    4 
 include/uapi/linux/mempolicy.h                         |    1 
 ipc/msg.c                                              |    2 
 ipc/namespace.c                                        |    2 
 ipc/sem.c                                              |    9 
 ipc/shm.c                                              |    2 
 kernel/cgroup/namespace.c                              |    2 
 kernel/cpu.c                                           |    2 
 kernel/exit.c                                          |    2 
 kernel/fork.c                                          |   51 
 kernel/kthread.c                                       |   21 
 kernel/nsproxy.c                                       |    2 
 kernel/pid_namespace.c                                 |    5 
 kernel/sched/core.c                                    |   37 
 kernel/sched/sched.h                                   |    4 
 kernel/signal.c                                        |    2 
 kernel/sys_ni.c                                        |    1 
 kernel/sysctl.c                                        |    2 
 kernel/time/namespace.c                                |    4 
 kernel/time/posix-timers.c                             |    4 
 kernel/user_namespace.c                                |    2 
 lib/scatterlist.c                                      |    5 
 lib/test_kasan.c                                       |   80 -
 lib/test_kasan_module.c                                |   20 
 lib/test_vmalloc.c                                     |    5 
 mm/backing-dev.c                                       |   11 
 mm/bootmem_info.c                                      |    4 
 mm/compaction.c                                        |   69 -
 mm/debug_vm_pgtable.c                                  |  982 +++++++++------
 mm/filemap.c                                           |   15 
 mm/gup.c                                               |  109 -
 mm/huge_memory.c                                       |   32 
 mm/hugetlb.c                                           |  173 ++
 mm/hwpoison-inject.c                                   |    2 
 mm/internal.h                                          |    9 
 mm/kasan/hw_tags.c                                     |   43 
 mm/kasan/kasan.h                                       |    1 
 mm/kasan/report.c                                      |   29 
 mm/khugepaged.c                                        |    2 
 mm/ksm.c                                               |    8 
 mm/madvise.c                                           |    1 
 mm/memblock.c                                          |   22 
 mm/memcontrol.c                                        |  234 +--
 mm/memory-failure.c                                    |   53 
 mm/memory_hotplug.c                                    |    2 
 mm/mempolicy.c                                         |  207 ++-
 mm/migrate.c                                           |  319 ++++
 mm/mmap.c                                              |    7 
 mm/mremap.c                                            |    2 
 mm/oom_kill.c                                          |   70 +
 mm/page-writeback.c                                    |  133 +-
 mm/page_alloc.c                                        |   62 
 mm/page_isolation.c                                    |   13 
 mm/percpu.c                                            |    3 
 mm/shmem.c                                             |  309 ++--
 mm/slab_common.c                                       |    2 
 mm/slub.c                                              | 1085 ++++++++++-------
 mm/sparse.c                                            |   46 
 mm/swap.c                                              |   22 
 mm/swapfile.c                                          |   14 
 mm/truncate.c                                          |   28 
 mm/userfaultfd.c                                       |   15 
 mm/vmalloc.c                                           |   79 -
 mm/vmpressure.c                                        |   10 
 mm/vmscan.c                                            |  220 ++-
 mm/vmstat.c                                            |   25 
 security/tomoyo/domain.c                               |   13 
 tools/testing/scatterlist/linux/mm.h                   |    1 
 tools/testing/selftests/vm/.gitignore                  |    1 
 tools/testing/selftests/vm/Makefile                    |    3 
 tools/testing/selftests/vm/charge_reserved_hugetlb.sh  |    5 
 tools/testing/selftests/vm/hugetlb_reparenting_test.sh |    5 
 tools/testing/selftests/vm/ksm_tests.c                 |  696 ++++++++++
 tools/testing/selftests/vm/mlock-random-test.c         |    2 
 tools/testing/selftests/vm/run_vmtests.sh              |   98 +
 tools/testing/selftests/vm/userfaultfd.c               |   13 
 186 files changed, 4488 insertions(+), 2281 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-08-25 19:17 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-08-25 19:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

2 patches, based on 6e764bcd1cf72a2846c0e53d3975a09b242c04c9.

Subsystems affected by this patch series:

  mm/memory-hotplug
  MAINTAINERS

Subsystem: mm/memory-hotplug

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/memory_hotplug: fix potential permanent lru cache disable

Subsystem: MAINTAINERS

    Namjae Jeon <namjae.jeon@samsung.com>:
      MAINTAINERS: exfat: update my email address

 MAINTAINERS         |    2 +-
 mm/memory_hotplug.c |    1 +
 2 files changed, 2 insertions(+), 1 deletion(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-08-20  2:03 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-08-20  2:03 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

10 patches, based on 614cb2751d3150850d459bee596c397f344a7936.

Subsystems affected by this patch series:

  mm/shmem
  mm/pagealloc
  mm/tracing
  MAINTAINERS
  mm/memcg
  mm/memory-failure
  mm/vmscan
  mm/kfence
  mm/hugetlb

Subsystem: mm/shmem

    Yang Shi <shy828301@gmail.com>:
      Revert "mm/shmem: fix shmem_swapin() race with swapoff"
      Revert "mm: swap: check if swap backing device is congested or not"

Subsystem: mm/pagealloc

    Doug Berger <opendmb@gmail.com>:
      mm/page_alloc: don't corrupt pcppage_migratetype

Subsystem: mm/tracing

    Mike Rapoport <rppt@linux.ibm.com>:
      mmflags.h: add missing __GFP_ZEROTAGS and __GFP_SKIP_KASAN_POISON names

Subsystem: MAINTAINERS

    Nathan Chancellor <nathan@kernel.org>:
      MAINTAINERS: update ClangBuiltLinux IRC chat

Subsystem: mm/memcg

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim

Subsystem: mm/memory-failure

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm/hwpoison: retry with shake_page() for unhandlable pages

Subsystem: mm/vmscan

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: vmscan: fix missing psi annotation for node_reclaim()

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlb: don't pass page cache pages to restore_reserve_on_error

 MAINTAINERS                    |    2 +-
 include/linux/kfence.h         |    7 ++++---
 include/linux/memcontrol.h     |   29 +++++++++++++++--------------
 include/trace/events/mmflags.h |    4 +++-
 mm/hugetlb.c                   |   19 ++++++++++++++-----
 mm/memory-failure.c            |   12 +++++++++---
 mm/page_alloc.c                |   25 ++++++++++++-------------
 mm/shmem.c                     |   14 +-------------
 mm/swap_state.c                |    7 -------
 mm/vmscan.c                    |   30 ++++++++++++++++++++++--------
 10 files changed, 81 insertions(+), 68 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-08-13 23:53 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-08-13 23:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

7 patches, based on f8e6dfc64f6135d1b6c5215c14cd30b9b60a0008.

Subsystems affected by this patch series:

  mm/kasan
  mm/slub
  mm/madvise
  mm/memcg
  lib

Subsystem: mm/kasan

    Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>:
    Patch series "kasan, slub: reset tag when printing address", v3:
      kasan, kmemleak: reset tags when scanning block
      kasan, slub: reset tag when printing address

Subsystem: mm/slub

    Shakeel Butt <shakeelb@google.com>:
      slub: fix kmalloc_pagealloc_invalid_free unit test

    Vlastimil Babka <vbabka@suse.cz>:
      mm: slub: fix slub_debug disabling for list of slabs

Subsystem: mm/madvise

    David Hildenbrand <david@redhat.com>:
      mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)

Subsystem: mm/memcg

    Waiman Long <longman@redhat.com>:
      mm/memcg: fix incorrect flushing of lruvec data in obj_stock

Subsystem: lib

    Liang Wang <wangliang101@huawei.com>:
      lib: use PFN_PHYS() in devmem_is_allowed()

 lib/devmem_is_allowed.c |    2 +-
 mm/gup.c                |    7 +++++--
 mm/kmemleak.c           |    6 +++---
 mm/madvise.c            |    4 +++-
 mm/memcontrol.c         |    6 ++++--
 mm/slub.c               |   25 ++++++++++++++-----------
 6 files changed, 30 insertions(+), 20 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-07-29 21:52 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-07-29 21:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

7 patches, based on 7e96bf476270aecea66740a083e51b38c1371cd2.

Subsystems affected by this patch series:

  lib
  ocfs2
  mm/memcg
  mm/migration
  mm/slub
  mm/memcg

Subsystem: lib

    Matteo Croce <mcroce@microsoft.com>:
      lib/test_string.c: move string selftest in the Runtime Testing menu

Subsystem: ocfs2

    Junxiao Bi <junxiao.bi@oracle.com>:
      ocfs2: fix zero out valid data
      ocfs2: issue zeroout to EOF blocks

Subsystem: mm/memcg

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code

Subsystem: mm/migration

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      mm/migrate: fix NR_ISOLATED corruption on 64-bit

Subsystem: mm/slub

    Shakeel Butt <shakeelb@google.com>:
      slub: fix unreclaimable slab stat for bulk free

Subsystem: mm/memcg

    Wang Hai <wanghai38@huawei.com>:
      mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()

 fs/ocfs2/file.c   |  103 ++++++++++++++++++++++++++++++++----------------------
 lib/Kconfig       |    3 -
 lib/Kconfig.debug |    3 +
 mm/memcontrol.c   |    3 +
 mm/migrate.c      |    2 -
 mm/slab.h         |    2 -
 mm/slub.c         |   22 ++++++-----
 7 files changed, 81 insertions(+), 57 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-07-23 22:49 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-07-23 22:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

15 patches, based on 704f4cba43d4ed31ef4beb422313f1263d87bc55.

Subsystems affected by this patch series:

  mm/userfaultfd
  mm/kfence
  mm/highmem
  mm/pagealloc
  mm/memblock
  mm/pagecache
  mm/secretmem
  mm/pagemap
  mm/hugetlbfs

Subsystem: mm/userfaultfd

    Peter Collingbourne <pcc@google.com>:
    Patch series "userfaultfd: do not untag user pointers", v5:
      userfaultfd: do not untag user pointers
      selftest: use mmap instead of posix_memalign to allocate memory

Subsystem: mm/kfence

    Weizhao Ouyang <o451686892@gmail.com>:
      kfence: defer kfence_test_init to ensure that kunit debugfs is created

    Alexander Potapenko <glider@google.com>:
      kfence: move the size check to the beginning of __kfence_alloc()
      kfence: skip all GFP_ZONEMASK allocations

Subsystem: mm/highmem

    Christoph Hellwig <hch@lst.de>:
      mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
      mm: use kmap_local_page in memzero_page

Subsystem: mm/pagealloc

    Sergei Trofimovich <slyfox@gentoo.org>:
      mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction

Subsystem: mm/memblock

    Mike Rapoport <rppt@linux.ibm.com>:
      memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions

Subsystem: mm/pagecache

    Roman Gushchin <guro@fb.com>:
      writeback, cgroup: remove wb from offline list before releasing refcnt
      writeback, cgroup: do not reparent dax inodes

Subsystem: mm/secretmem

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/secretmem: wire up ->set_page_dirty

Subsystem: mm/pagemap

    Muchun Song <songmuchun@bytedance.com>:
      mm: mmap_lock: fix disabling preemption directly

    Qi Zheng <zhengqi.arch@bytedance.com>:
      mm: fix the deadlock in finish_fault()

Subsystem: mm/hugetlbfs

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlbfs: fix mount mode command line processing

 Documentation/arm64/tagged-address-abi.rst |   26 ++++++++++++++++++--------
 fs/fs-writeback.c                          |    3 +++
 fs/hugetlbfs/inode.c                       |    2 +-
 fs/userfaultfd.c                           |   26 ++++++++++++--------------
 include/linux/highmem.h                    |    6 ++++--
 include/linux/memblock.h                   |    4 ++--
 mm/backing-dev.c                           |    2 +-
 mm/kfence/core.c                           |   19 ++++++++++++++++---
 mm/kfence/kfence_test.c                    |    2 +-
 mm/memblock.c                              |    3 ++-
 mm/memory.c                                |   11 ++++++++++-
 mm/mmap_lock.c                             |    4 ++--
 mm/page_alloc.c                            |   29 ++++++++++++++++-------------
 mm/secretmem.c                             |    1 +
 tools/testing/selftests/vm/userfaultfd.c   |    6 ++++--
 15 files changed, 93 insertions(+), 51 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-07-15  4:26 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-07-15  4:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

13 patches, based on 40226a3d96ef8ab8980f032681c8bfd46d63874e.

Subsystems affected by this patch series:

  mm/kasan
  mm/pagealloc
  mm/rmap
  mm/hmm
  hfs
  mm/hugetlb

Subsystem: mm/kasan

    Marco Elver <elver@google.com>:
      mm: move helper to check slub_debug_enabled

    Yee Lee <yee.lee@mediatek.com>:
      kasan: add memzero init for unaligned size at DEBUG

    Marco Elver <elver@google.com>:
      kasan: fix build by including kernel.h

Subsystem: mm/pagealloc

    Matteo Croce <mcroce@microsoft.com>:
      Revert "mm/page_alloc: make should_fail_alloc_page() static"

    Mel Gorman <mgorman@techsingularity.net>:
      mm/page_alloc: avoid page allocator recursion with pagesets.lock held

    Yanfei Xu <yanfei.xu@windriver.com>:
      mm/page_alloc: correct return value when failing at preparing

    Chuck Lever <chuck.lever@oracle.com>:
      mm/page_alloc: further fix __alloc_pages_bulk() return value

Subsystem: mm/rmap

    Christoph Hellwig <hch@lst.de>:
      mm: fix the try_to_unmap prototype for !CONFIG_MMU

Subsystem: mm/hmm

    Alistair Popple <apopple@nvidia.com>:
      lib/test_hmm: remove set but unused page variable

Subsystem: hfs

    Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>:
    Patch series "hfs: fix various errors", v2:
      hfs: add missing clean-up in hfs_fill_super
      hfs: fix high memory mapping in hfs_bnode_read
      hfs: add lock nesting notation to hfs_find_init

Subsystem: mm/hugetlb

    Joao Martins <joao.m.martins@oracle.com>:
      mm/hugetlb: fix refs calculation from unaligned @vaddr

 fs/hfs/bfind.c        |   14 +++++++++++++-
 fs/hfs/bnode.c        |   25 ++++++++++++++++++++-----
 fs/hfs/btree.h        |    7 +++++++
 fs/hfs/super.c        |   10 +++++-----
 include/linux/kasan.h |    1 +
 include/linux/rmap.h  |    4 +++-
 lib/test_hmm.c        |    2 --
 mm/hugetlb.c          |    5 +++--
 mm/kasan/kasan.h      |   12 ++++++++++++
 mm/page_alloc.c       |   30 ++++++++++++++++++++++--------
 mm/slab.h             |   15 +++++++++++----
 mm/slub.c             |   14 --------------
 12 files changed, 97 insertions(+), 42 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-07-08  0:59 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-07-08  0:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

54 patches, based on a931dd33d370896a683236bba67c0d6f3d01144d.

Subsystems affected by this patch series:

  lib
  mm/slub
  mm/secretmem
  mm/cleanups
  mm/init
  debug
  mm/pagemap
  mm/mremap

Subsystem: lib

    Zhen Lei <thunder.leizhen@huawei.com>:
      lib/test: fix spelling mistakes
      lib: fix spelling mistakes
      lib: fix spelling mistakes in header files

Subsystem: mm/slub

    Nathan Chancellor <nathan@kernel.org>:
    Patch series "hexagon: Fix build error with CONFIG_STACKDEPOT and select CONFIG_ARCH_WANT_LD_ORPHAN_WARN":
      hexagon: handle {,SOFT}IRQENTRY_TEXT in linker script
      hexagon: use common DISCARDS macro
      hexagon: select ARCH_WANT_LD_ORPHAN_WARN

    Oliver Glitta <glittao@gmail.com>:
      mm/slub: use stackdepot to save stack trace in objects

Subsystem: mm/secretmem

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "mm: introduce memfd_secret system call to create "secret" memory areas", v20:
      mmap: make mlock_future_check() global
      riscv/Kconfig: make direct map manipulation options depend on MMU
      set_memory: allow querying whether set_direct_map_*() is actually enabled
      mm: introduce memfd_secret system call to create "secret" memory areas
      PM: hibernate: disable when there are active secretmem users
      arch, mm: wire up memfd_secret system call where relevant
      secretmem: test: add basic selftest for memfd_secret(2)

Subsystem: mm/cleanups

    Zhen Lei <thunder.leizhen@huawei.com>:
      mm: fix spelling mistakes in header files

Subsystem: mm/init

    Kefeng Wang <wangkefeng.wang@huawei.com>:
    Patch series "init_mm: cleanup ARCH's text/data/brk setup code", v3:
      mm: add setup_initial_init_mm() helper
      arc: convert to setup_initial_init_mm()
      arm: convert to setup_initial_init_mm()
      arm64: convert to setup_initial_init_mm()
      csky: convert to setup_initial_init_mm()
      h8300: convert to setup_initial_init_mm()
      m68k: convert to setup_initial_init_mm()
      nds32: convert to setup_initial_init_mm()
      nios2: convert to setup_initial_init_mm()
      openrisc: convert to setup_initial_init_mm()
      powerpc: convert to setup_initial_init_mm()
      riscv: convert to setup_initial_init_mm()
      s390: convert to setup_initial_init_mm()
      sh: convert to setup_initial_init_mm()
      x86: convert to setup_initial_init_mm()

Subsystem: debug

    Stephen Boyd <swboyd@chromium.org>:
    Patch series "Add build ID to stacktraces", v6:
      buildid: only consider GNU notes for build ID parsing
      buildid: add API to parse build ID out of buffer
      buildid: stash away kernels build ID on init
      dump_stack: add vmlinux build ID to stack traces
      module: add printk formats to add module build ID to stacktraces
      arm64: stacktrace: use %pSb for backtrace printing
      x86/dumpstack: use %pSb/%pBb for backtrace printing
      scripts/decode_stacktrace.sh: support debuginfod
      scripts/decode_stacktrace.sh: silence stderr messages from addr2line/nm
      scripts/decode_stacktrace.sh: indicate 'auto' can be used for base path
      buildid: mark some arguments const
      buildid: fix kernel-doc notation
      kdump: use vmlinux_build_id to simplify

Subsystem: mm/pagemap

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *
      mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t *

Subsystem: mm/mremap

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
    Patch series "mrermap fixes", v2:
      selftest/mremap_test: update the test to handle pagesize other than 4K
      selftest/mremap_test: avoid crash with static build
      mm/mremap: convert huge PUD move to separate helper
      mm/mremap: don't enable optimized PUD move if page table levels is 2
      mm/mremap: use pmd/pud_poplulate to update page table entries
      mm/mremap: hold the rmap lock in write mode when moving page table entries.
    Patch series "Speedup mremap on ppc64", v8:
      mm/mremap: allow arch runtime override
      powerpc/book3s64/mm: update flush_tlb_range to flush page walk cache
      powerpc/mm: enable HAVE_MOVE_PMD support

 Documentation/core-api/printk-formats.rst           |   11 
 arch/alpha/include/asm/pgtable.h                    |    8 
 arch/arc/mm/init.c                                  |    5 
 arch/arm/include/asm/pgtable-3level.h               |    2 
 arch/arm/kernel/setup.c                             |    5 
 arch/arm64/include/asm/Kbuild                       |    1 
 arch/arm64/include/asm/cacheflush.h                 |    6 
 arch/arm64/include/asm/kfence.h                     |    2 
 arch/arm64/include/asm/pgtable.h                    |    8 
 arch/arm64/include/asm/set_memory.h                 |   17 +
 arch/arm64/include/uapi/asm/unistd.h                |    1 
 arch/arm64/kernel/machine_kexec.c                   |    1 
 arch/arm64/kernel/setup.c                           |    5 
 arch/arm64/kernel/stacktrace.c                      |    2 
 arch/arm64/mm/mmu.c                                 |    7 
 arch/arm64/mm/pageattr.c                            |   13 
 arch/csky/kernel/setup.c                            |    5 
 arch/h8300/kernel/setup.c                           |    5 
 arch/hexagon/Kconfig                                |    1 
 arch/hexagon/kernel/vmlinux.lds.S                   |    9 
 arch/ia64/include/asm/pgtable.h                     |    4 
 arch/m68k/include/asm/motorola_pgtable.h            |    2 
 arch/m68k/kernel/setup_mm.c                         |    5 
 arch/m68k/kernel/setup_no.c                         |    5 
 arch/mips/include/asm/pgtable-64.h                  |    8 
 arch/nds32/kernel/setup.c                           |    5 
 arch/nios2/kernel/setup.c                           |    5 
 arch/openrisc/kernel/setup.c                        |    5 
 arch/parisc/include/asm/pgtable.h                   |    4 
 arch/powerpc/include/asm/book3s/64/pgtable.h        |   11 
 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h |    2 
 arch/powerpc/include/asm/nohash/64/pgtable-4k.h     |    6 
 arch/powerpc/include/asm/nohash/64/pgtable.h        |    6 
 arch/powerpc/include/asm/tlb.h                      |    6 
 arch/powerpc/kernel/setup-common.c                  |    5 
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c        |    8 
 arch/powerpc/mm/book3s64/radix_pgtable.c            |    6 
 arch/powerpc/mm/book3s64/radix_tlb.c                |   44 +-
 arch/powerpc/mm/pgtable_64.c                        |    4 
 arch/powerpc/platforms/Kconfig.cputype              |    2 
 arch/riscv/Kconfig                                  |    4 
 arch/riscv/include/asm/pgtable-64.h                 |    4 
 arch/riscv/include/asm/unistd.h                     |    1 
 arch/riscv/kernel/setup.c                           |    5 
 arch/s390/kernel/setup.c                            |    5 
 arch/sh/include/asm/pgtable-3level.h                |    4 
 arch/sh/kernel/setup.c                              |    5 
 arch/sparc/include/asm/pgtable_32.h                 |    6 
 arch/sparc/include/asm/pgtable_64.h                 |   10 
 arch/um/include/asm/pgtable-3level.h                |    2 
 arch/x86/entry/syscalls/syscall_32.tbl              |    1 
 arch/x86/entry/syscalls/syscall_64.tbl              |    1 
 arch/x86/include/asm/pgtable.h                      |    8 
 arch/x86/kernel/dumpstack.c                         |    2 
 arch/x86/kernel/setup.c                             |    5 
 arch/x86/mm/init_64.c                               |    4 
 arch/x86/mm/pat/set_memory.c                        |    4 
 arch/x86/mm/pgtable.c                               |    2 
 include/asm-generic/pgtable-nop4d.h                 |    2 
 include/asm-generic/pgtable-nopmd.h                 |    2 
 include/asm-generic/pgtable-nopud.h                 |    4 
 include/linux/bootconfig.h                          |    4 
 include/linux/buildid.h                             |   10 
 include/linux/compaction.h                          |    4 
 include/linux/cpumask.h                             |    2 
 include/linux/crash_core.h                          |   12 
 include/linux/debugobjects.h                        |    2 
 include/linux/hmm.h                                 |    2 
 include/linux/hugetlb.h                             |    6 
 include/linux/kallsyms.h                            |   21 +
 include/linux/list_lru.h                            |    4 
 include/linux/lru_cache.h                           |    8 
 include/linux/mm.h                                  |    3 
 include/linux/mmu_notifier.h                        |    8 
 include/linux/module.h                              |    9 
 include/linux/nodemask.h                            |    6 
 include/linux/percpu-defs.h                         |    2 
 include/linux/percpu-refcount.h                     |    2 
 include/linux/pgtable.h                             |    4 
 include/linux/scatterlist.h                         |    2 
 include/linux/secretmem.h                           |   54 +++
 include/linux/set_memory.h                          |   12 
 include/linux/shrinker.h                            |    2 
 include/linux/syscalls.h                            |    1 
 include/linux/vmalloc.h                             |    4 
 include/uapi/asm-generic/unistd.h                   |    7 
 include/uapi/linux/magic.h                          |    1 
 init/Kconfig                                        |    1 
 init/main.c                                         |    2 
 kernel/crash_core.c                                 |   50 ---
 kernel/kallsyms.c                                   |  104 +++++--
 kernel/module.c                                     |   42 ++
 kernel/power/hibernate.c                            |    5 
 kernel/sys_ni.c                                     |    2 
 lib/Kconfig.debug                                   |   17 -
 lib/asn1_encoder.c                                  |    2 
 lib/buildid.c                                       |   80 ++++-
 lib/devres.c                                        |    2 
 lib/dump_stack.c                                    |   13 
 lib/dynamic_debug.c                                 |    2 
 lib/fonts/font_pearl_8x8.c                          |    2 
 lib/kfifo.c                                         |    2 
 lib/list_sort.c                                     |    2 
 lib/nlattr.c                                        |    4 
 lib/oid_registry.c                                  |    2 
 lib/pldmfw/pldmfw.c                                 |    2 
 lib/reed_solomon/test_rslib.c                       |    2 
 lib/refcount.c                                      |    2 
 lib/rhashtable.c                                    |    2 
 lib/sbitmap.c                                       |    2 
 lib/scatterlist.c                                   |    4 
 lib/seq_buf.c                                       |    2 
 lib/sort.c                                          |    2 
 lib/stackdepot.c                                    |    2 
 lib/test_bitops.c                                   |    2 
 lib/test_bpf.c                                      |    2 
 lib/test_kasan.c                                    |    2 
 lib/test_kmod.c                                     |    6 
 lib/test_scanf.c                                    |    2 
 lib/vsprintf.c                                      |   10 
 mm/Kconfig                                          |    4 
 mm/Makefile                                         |    1 
 mm/gup.c                                            |   12 
 mm/init-mm.c                                        |    9 
 mm/internal.h                                       |    3 
 mm/mlock.c                                          |    3 
 mm/mmap.c                                           |    5 
 mm/mremap.c                                         |  108 ++++++-
 mm/secretmem.c                                      |  254 +++++++++++++++++
 mm/slub.c                                           |   79 +++--
 scripts/checksyscalls.sh                            |    4 
 scripts/decode_stacktrace.sh                        |   89 +++++-
 tools/testing/selftests/vm/.gitignore               |    1 
 tools/testing/selftests/vm/Makefile                 |    3 
 tools/testing/selftests/vm/memfd_secret.c           |  296 ++++++++++++++++++++
 tools/testing/selftests/vm/mremap_test.c            |  116 ++++---
 tools/testing/selftests/vm/run_vmtests.sh           |   17 +
 137 files changed, 1470 insertions(+), 442 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-07-03  0:28 ` incoming Linus Torvalds
@ 2021-07-03  1:06   ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2021-07-03  1:06 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

On Fri, Jul 2, 2021 at 5:28 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Commit e058a84bfddc42ba356a2316f2cf1141974625c9 is good, and looking
> at the pulls and merges I've done since, this -mm series looks like
> the obvious culprit.

No, unless my bisection is wrong, the -mm branch is innocent, and was
discarded from the suspects on the very first bisection trial.

So never mind.

             Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-07-01  1:46 incoming Andrew Morton
@ 2021-07-03  0:28 ` Linus Torvalds
  2021-07-03  1:06   ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2021-07-03  0:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

On Wed, Jun 30, 2021 at 6:46 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> This is the rest of the -mm tree, less 66 patches which are dependent on
> things which are (or were recently) in linux-next.  I'll trickle that
> material over next week.

I haven't bisected this yet, but with the current -git I'm getting

   watchdog: BUG: soft lockup - CPU#41 stuck for 49s!

and the common call chain seems to be in flush_tlb_mm_range ->
on_each_cpu_cond_mask.

Commit e058a84bfddc42ba356a2316f2cf1141974625c9 is good, and looking
at the pulls and merges I've done since, this -mm series looks like
the obvious culprit.

I'll go start bisection, but I thought I'd give a heads-up in case
somebody else has seen TLB-flush-related lockups and already figured
out the guilty party..

                 Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-07-01  1:46 Andrew Morton
  2021-07-03  0:28 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-07-01  1:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


This is the rest of the -mm tree, less 66 patches which are dependent on
things which are (or were recently) in linux-next.  I'll trickle that
material over next week.


192 patches, based on 7cf3dead1ad70c72edb03e2d98e1f3dcd332cdb2 plus the
June 28 sendings.

Subsystems affected by this patch series:

  mm/hugetlb
  mm/userfaultfd
  mm/vmscan
  mm/kconfig
  mm/proc
  mm/z3fold
  mm/zbud
  mm/ras
  mm/mempolicy
  mm/memblock
  mm/migration
  mm/thp
  mm/nommu
  mm/kconfig
  mm/madvise
  mm/memory-hotplug
  mm/zswap
  mm/zsmalloc
  mm/zram
  mm/cleanups
  mm/kfence
  mm/hmm
  procfs
  sysctl
  misc
  core-kernel
  lib
  lz4
  checkpatch
  init
  kprobes
  nilfs2
  hfs
  signals
  exec
  kcov
  selftests
  compress/decompress
  ipc

Subsystem: mm/hugetlb

    Muchun Song <songmuchun@bytedance.com>:
    Patch series "Free some vmemmap pages of HugeTLB page", v23:
      mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c
      mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP
      mm: hugetlb: gather discrete indexes of tail page
      mm: hugetlb: free the vmemmap pages associated with each HugeTLB page
      mm: hugetlb: defer freeing of HugeTLB pages
      mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page
      mm: hugetlb: add a kernel parameter hugetlb_free_vmemmap
      mm: memory_hotplug: disable memmap_on_memory when hugetlb_free_vmemmap enabled
      mm: hugetlb: introduce nr_free_vmemmap_pages in the struct hstate

    Shixin Liu <liushixin2@huawei.com>:
      mm/debug_vm_pgtable: move {pmd/pud}_huge_tests out of CONFIG_TRANSPARENT_HUGEPAGE
      mm/debug_vm_pgtable: remove redundant pfn_{pmd/pte}() and fix one comment mistake

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup and fixup for huge_memory:, v3:
      mm/huge_memory.c: remove dedicated macro HPAGE_CACHE_INDEX_MASK
      mm/huge_memory.c: use page->deferred_list
      mm/huge_memory.c: add missing read-only THP checking in transparent_hugepage_enabled()
      mm/huge_memory.c: remove unnecessary tlb_remove_page_size() for huge zero pmd
      mm/huge_memory.c: don't discard hugepage if other processes are mapping it

    Christophe Leroy <christophe.leroy@csgroup.eu>:
    Patch series "Subject: [PATCH v2 0/5] Implement huge VMAP and VMALLOC on powerpc 8xx", v2:
      mm/hugetlb: change parameters of arch_make_huge_pte()
      mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge
      mm/vmalloc: enable mapping of huge pages at pte level in vmap
      mm/vmalloc: enable mapping of huge pages at pte level in vmalloc
      powerpc/8xx: add support for huge pages on VMAP and VMALLOC

    Nanyong Sun <sunnanyong@huawei.com>:
      khugepaged: selftests: remove debug_cow

    Mina Almasry <almasrymina@google.com>:
      mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY

    Muchun Song <songmuchun@bytedance.com>:
    Patch series "Split huge PMD mapping of vmemmap pages", v4:
      mm: sparsemem: split the huge PMD mapping of vmemmap pages
      mm: sparsemem: use huge PMD mapping for vmemmap pages
      mm: hugetlb: introduce CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON

    Mike Kravetz <mike.kravetz@oracle.com>:
    Patch series "Fix prep_compound_gigantic_page ref count adjustment":
      hugetlb: remove prep_compound_huge_page cleanup
      hugetlb: address ref count racing in prep_compound_gigantic_page

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm/hwpoison: disable pcp for page_handle_poison()

Subsystem: mm/userfaultfd

    Peter Xu <peterx@redhat.com>:
    Patch series "userfaultfd/selftests: A few cleanups", v2:
      userfaultfd/selftests: use user mode only
      userfaultfd/selftests: remove the time() check on delayed uffd
      userfaultfd/selftests: dropping VERIFY check in locking_thread
      userfaultfd/selftests: only dump counts if mode enabled
      userfaultfd/selftests: unify error handling
    Patch series "mm/uffd: Misc fix for uffd-wp and one more test":
      mm/thp: simplify copying of huge zero page pmd when fork
      mm/userfaultfd: fix uffd-wp special cases for fork()
      mm/userfaultfd: fail uffd-wp registration if not supported
      mm/pagemap: export uffd-wp protection information
      userfaultfd/selftests: add pagemap uffd-wp test

    Axel Rasmussen <axelrasmussen@google.com>:
    Patch series "userfaultfd: add minor fault handling for shmem", v6:
      userfaultfd/shmem: combine shmem_{mcopy_atomic,mfill_zeropage}_pte
      userfaultfd/shmem: support minor fault registration for shmem
      userfaultfd/shmem: support UFFDIO_CONTINUE for shmem
      userfaultfd/shmem: advertise shmem minor fault support
      userfaultfd/shmem: modify shmem_mfill_atomic_pte to use install_pte()
      userfaultfd/selftests: use memfd_create for shmem test type
      userfaultfd/selftests: create alias mappings in the shmem test
      userfaultfd/selftests: reinitialize test context in each test
      userfaultfd/selftests: exercise minor fault handling shmem support

Subsystem: mm/vmscan

    Yu Zhao <yuzhao@google.com>:
      mm/vmscan.c: fix potential deadlock in reclaim_pages()
      include/trace/events/vmscan.h: remove mm_vmscan_inactive_list_is_low

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: workingset: define macro WORKINGSET_SHIFT

Subsystem: mm/kconfig

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      mm/kconfig: move HOLES_IN_ZONE into mm

Subsystem: mm/proc

    Mike Rapoport <rppt@linux.ibm.com>:
      docs: proc.rst: meminfo: briefly describe gaps in memory accounting

    David Hildenbrand <david@redhat.com>:
    Patch series "fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages", v3:
      fs/proc/kcore: drop KCORE_REMAP and KCORE_OTHER
      fs/proc/kcore: pfn_is_ram check only applies to KCORE_RAM
      fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages
      mm: introduce page_offline_(begin|end|freeze|thaw) to synchronize setting PageOffline()
      virtio-mem: use page_offline_(start|end) when setting PageOffline()
      fs/proc/kcore: use page_offline_(freeze|thaw)

Subsystem: mm/z3fold

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup and fixup for z3fold":
      mm/z3fold: define macro NCHUNKS as TOTAL_CHUNKS - ZHDR_CHUNKS
      mm/z3fold: avoid possible underflow in z3fold_alloc()
      mm/z3fold: remove magic number in z3fold_create_pool()
      mm/z3fold: remove unused function handle_to_z3fold_header()
      mm/z3fold: fix potential memory leak in z3fold_destroy_pool()
      mm/z3fold: use release_z3fold_page_locked() to release locked z3fold page

Subsystem: mm/zbud

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanups for zbud", v2:
      mm/zbud: reuse unbuddied[0] as buddied in zbud_pool
      mm/zbud: don't export any zbud API

Subsystem: mm/ras

    YueHaibing <yuehaibing@huawei.com>:
      mm/compaction: use DEVICE_ATTR_WO macro

    Liu Xiang <liu.xiang@zlingsmart.com>:
      mm: compaction: remove duplicate !list_empty(&sublist) check

    Wonhyuk Yang <vvghjk1234@gmail.com>:
      mm/compaction: fix 'limit' in fast_isolate_freepages

Subsystem: mm/mempolicy

    Feng Tang <feng.tang@intel.com>:
    Patch series "mm/mempolicy: some fix and semantics cleanup", v4:
      mm/mempolicy: cleanup nodemask intersection check for oom
      mm/mempolicy: don't handle MPOL_LOCAL like a fake MPOL_PREFERRED policy
      mm/mempolicy: unify the parameter sanity check for mbind and set_mempolicy

    Yang Shi <shy828301@gmail.com>:
      mm: mempolicy: don't have to split pmd for huge zero page

    Ben Widawsky <ben.widawsky@intel.com>:
      mm/mempolicy: use unified 'nodes' for bind/interleave/prefer policies

Subsystem: mm/memblock

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "arm64: drop pfn_valid_within() and simplify pfn_valid()", v4:
      include/linux/mmzone.h: add documentation for pfn_valid()
      memblock: update initialization of reserved pages
      arm64: decouple check whether pfn is in linear map from pfn_valid()
      arm64: drop pfn_valid_within() and simplify pfn_valid()

    Anshuman Khandual <anshuman.khandual@arm.com>:
      arm64/mm: drop HAVE_ARCH_PFN_VALID

Subsystem: mm/migration

    Muchun Song <songmuchun@bytedance.com>:
      mm: migrate: fix missing update page_private to hugetlb_page_subpool

Subsystem: mm/thp

    Collin Fijalkovich <cfijalkovich@google.com>:
      mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs

    Yang Shi <shy828301@gmail.com>:
      mm: memory: add orig_pmd to struct vm_fault
      mm: memory: make numa_migrate_prep() non-static
      mm: thp: refactor NUMA fault handling
      mm: migrate: account THP NUMA migration counters correctly
      mm: migrate: don't split THP for misplaced NUMA page
      mm: migrate: check mapcount for THP instead of refcount
      mm: thp: skip make PMD PROT_NONE if THP migration is not supported

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/thp: make ARCH_ENABLE_SPLIT_PMD_PTLOCK dependent on PGTABLE_LEVELS > 2

    Yang Shi <shy828301@gmail.com>:
      mm: rmap: make try_to_unmap() void function

    Hugh Dickins <hughd@google.com>:
      mm/thp: remap_page() is only needed on anonymous THP
      mm: hwpoison_user_mappings() try_to_unmap() with TTU_SYNC

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/thp: fix strncpy warning

Subsystem: mm/nommu

    Chen Li <chenli@uniontech.com>:
      nommu: remove __GFP_HIGHMEM in vmalloc/vzalloc

    Liam Howlett <liam.howlett@oracle.com>:
      mm/nommu: unexport do_munmap()

Subsystem: mm/kconfig

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      mm: generalize ZONE_[DMA|DMA32]

Subsystem: mm/madvise

    David Hildenbrand <david@redhat.com>:
    Patch series "mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables", v2:
      mm: make variable names for populate_vma_page_range() consistent
      mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables
      MAINTAINERS: add tools/testing/selftests/vm/ to MEMORY MANAGEMENT
      selftests/vm: add protection_keys_32 / protection_keys_64 to gitignore
      selftests/vm: add test for MADV_POPULATE_(READ|WRITE)

Subsystem: mm/memory-hotplug

    Liam Mark <lmark@codeaurora.org>:
      mm/memory_hotplug: rate limit page migration warnings

    Oscar Salvador <osalvador@suse.de>:
      mm,memory_hotplug: drop unneeded locking

Subsystem: mm/zswap

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup and fixup for zswap":
      mm/zswap.c: remove unused function zswap_debugfs_exit()
      mm/zswap.c: avoid unnecessary copy-in at map time
      mm/zswap.c: fix two bugs in zswap_writeback_entry()

Subsystem: mm/zsmalloc

    Zhaoyang Huang <zhaoyang.huang@unisoc.com>:
      mm: zram: amend SLAB_RECLAIM_ACCOUNT on zspage_cachep

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup for zsmalloc":
      mm/zsmalloc.c: remove confusing code in obj_free()
      mm/zsmalloc.c: improve readability for async_free_zspage()

Subsystem: mm/zram

    Yue Hu <huyue2@yulong.com>:
      zram: move backing_dev under macro CONFIG_ZRAM_WRITEBACK

Subsystem: mm/cleanups

    Hyeonggon Yoo <42.hyeyoo@gmail.com>:
      mm: fix typos and grammar error in comments

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm: define default value for FIRST_USER_ADDRESS

    Zhen Lei <thunder.leizhen@huawei.com>:
      mm: fix spelling mistakes

    Mel Gorman <mgorman@techsingularity.net>:
    Patch series "Clean W=1 build warnings for mm/":
      mm/vmscan: remove kerneldoc-like comment from isolate_lru_pages
      mm/vmalloc: include header for prototype of set_iounmap_nonlazy
      mm/page_alloc: make should_fail_alloc_page() static
      mm/mapping_dirty_helpers: remove double Note in kerneldoc
      mm/memcontrol.c: fix kerneldoc comment for mem_cgroup_calculate_protection
      mm/memory_hotplug: fix kerneldoc comment for __try_online_node
      mm/memory_hotplug: fix kerneldoc comment for __remove_memory
      mm/zbud: add kerneldoc fields for zbud_pool
      mm/z3fold: add kerneldoc fields for z3fold_pool
      mm/swap: make swap_address_space an inline function
      mm/mmap_lock: remove dead code for !CONFIG_TRACING configurations
      mm/page_alloc: move prototype for find_suitable_fallback
      mm/swap: make NODE_DATA an inline function on CONFIG_FLATMEM

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/thp: define default pmd_pgtable()

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      kfence: unconditionally use unbound work queue

Subsystem: mm/hmm

    Alistair Popple <apopple@nvidia.com>:
    Patch series "Add support for SVM atomics in Nouveau", v11:
      mm: remove special swap entry functions
      mm/swapops: rework swap entry manipulation code
      mm/rmap: split try_to_munlock from try_to_unmap
      mm/rmap: split migration into its own function
      mm: rename migrate_pgmap_owner
      mm/memory.c: allow different return codes for copy_nonpresent_pte()
      mm: device exclusive memory access
      mm: selftests for exclusive device memory
      nouveau/svm: refactor nouveau_range_fault
      nouveau/svm: implement atomic SVM access

Subsystem: procfs

    Marcelo Henrique Cerri <marcelo.cerri@canonical.com>:
      proc: Avoid mixing integer types in mem_rw()

    ZHOUFENG <zhoufeng.zf@bytedance.com>:
      fs/proc/kcore.c: add mmap interface

    Kalesh Singh <kaleshsingh@google.com>:
      procfs: allow reading fdinfo with PTRACE_MODE_READ
      procfs/dmabuf: add inode number to /proc/*/fdinfo

Subsystem: sysctl

    Jiapeng Chong <jiapeng.chong@linux.alibaba.com>:
      sysctl: remove redundant assignment to first

Subsystem: misc

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      drm: include only needed headers in ascii85.h

Subsystem: core-kernel

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      kernel.h: split out panic and oops helpers

Subsystem: lib

    Zhen Lei <thunder.leizhen@huawei.com>:
      lib: decompress_bunzip2: remove an unneeded semicolon

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
    Patch series "lib/string_helpers: get rid of ugly *_escape_mem_ascii()", v3:
      lib/string_helpers: switch to use BIT() macro
      lib/string_helpers: move ESCAPE_NP check inside 'else' branch in a loop
      lib/string_helpers: drop indentation level in string_escape_mem()
      lib/string_helpers: introduce ESCAPE_NA for escaping non-ASCII
      lib/string_helpers: introduce ESCAPE_NAP to escape non-ASCII and non-printable
      lib/string_helpers: allow to append additional characters to be escaped
      lib/test-string_helpers: print flags in hexadecimal format
      lib/test-string_helpers: get rid of trailing comma in terminators
      lib/test-string_helpers: add test cases for new features
      MAINTAINERS: add myself as designated reviewer for generic string library
      seq_file: introduce seq_escape_mem()
      seq_file: add seq_escape_str() as replica of string_escape_str()
      seq_file: convert seq_escape() to use seq_escape_str()
      nfsd: avoid non-flexible API in seq_quote_mem()
      seq_file: drop unused *_escape_mem_ascii()

    Trent Piepho <tpiepho@gmail.com>:
      lib/math/rational.c: fix divide by zero
      lib/math/rational: add Kunit test cases

    Zhen Lei <thunder.leizhen@huawei.com>:
      lib/decompressors: fix spelling mistakes
      lib/mpi: fix spelling mistakes

    Alexey Dobriyan <adobriyan@gmail.com>:
      lib: memscan() fixlet
      lib: uninline simple_strtoull()

    Matteo Croce <mcroce@microsoft.com>:
      lib/test_string.c: allow module removal

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      kernel.h: split out kstrtox() and simple_strtox() to a separate header

Subsystem: lz4

    Rajat Asthana <thisisrast7@gmail.com>:
      lz4_decompress: declare LZ4_decompress_safe_withPrefix64k static

    Dimitri John Ledkov <dimitri.ledkov@canonical.com>:
      lib/decompress_unlz4.c: correctly handle zero-padding around initrds.

Subsystem: checkpatch

    Guenter Roeck <linux@roeck-us.net>:
      checkpatch: scripts/spdxcheck.py now requires python3

    Joe Perches <joe@perches.com>:
      checkpatch: improve the indented label test

    Guenter Roeck <linux@roeck-us.net>:
      checkpatch: do not complain about positive return values starting with EPOLL

Subsystem: init

    Andrew Halaney <ahalaney@redhat.com>:
      init: print out unknown kernel parameters

Subsystem: kprobes

    Barry Song <song.bao.hua@hisilicon.com>:
      kprobes: remove duplicated strong free_insn_page in x86 and s390

Subsystem: nilfs2

    Colin Ian King <colin.king@canonical.com>:
      nilfs2: remove redundant continue statement in a while-loop

Subsystem: hfs

    Zhen Lei <thunder.leizhen@huawei.com>:
      hfsplus: remove unnecessary oom message

    Chung-Chiang Cheng <shepjeng@gmail.com>:
      hfsplus: report create_date to kstat.btime

Subsystem: signals

    Al Viro <viro@zeniv.linux.org.uk>:
      x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned

Subsystem: exec

    Alexey Dobriyan <adobriyan@gmail.com>:
      exec: remove checks in __register_bimfmt()

Subsystem: kcov

    Marco Elver <elver@google.com>:
      kcov: add __no_sanitize_coverage to fix noinstr for all architectures

Subsystem: selftests

    Dave Hansen <dave.hansen@linux.intel.com>:
    Patch series "selftests/vm/pkeys: Bug fixes and a new test":
      selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
      selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
      selftests/vm/pkeys: refill shadow register after implicit kernel write
      selftests/vm/pkeys: exercise x86 XSAVE init state

Subsystem: compress/decompress

    Yu Kuai <yukuai3@huawei.com>:
      lib/decompressors: remove set but not used variabled 'level'

Subsystem: ipc

    Vasily Averin <vvs@virtuozzo.com>:
    Patch series "ipc: allocations cleanup", v2:
      ipc sem: use kvmalloc for sem_undo allocation
      ipc: use kmalloc for msg_queue and shmid_kernel

    Manfred Spraul <manfred@colorfullife.com>:
      ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
      ipc/util.c: use binary search for max_idx

 Documentation/admin-guide/kernel-parameters.txt    |   35 
 Documentation/admin-guide/mm/hugetlbpage.rst       |   11 
 Documentation/admin-guide/mm/memory-hotplug.rst    |   13 
 Documentation/admin-guide/mm/pagemap.rst           |    2 
 Documentation/admin-guide/mm/userfaultfd.rst       |    3 
 Documentation/core-api/kernel-api.rst              |    7 
 Documentation/filesystems/proc.rst                 |   48 
 Documentation/vm/hmm.rst                           |   19 
 Documentation/vm/unevictable-lru.rst               |   33 
 MAINTAINERS                                        |   10 
 arch/alpha/Kconfig                                 |    5 
 arch/alpha/include/asm/pgalloc.h                   |    1 
 arch/alpha/include/asm/pgtable.h                   |    1 
 arch/alpha/include/uapi/asm/mman.h                 |    3 
 arch/alpha/kernel/setup.c                          |    2 
 arch/arc/include/asm/pgalloc.h                     |    2 
 arch/arc/include/asm/pgtable.h                     |    8 
 arch/arm/Kconfig                                   |    3 
 arch/arm/include/asm/pgalloc.h                     |    1 
 arch/arm64/Kconfig                                 |   15 
 arch/arm64/include/asm/hugetlb.h                   |    3 
 arch/arm64/include/asm/memory.h                    |    2 
 arch/arm64/include/asm/page.h                      |    4 
 arch/arm64/include/asm/pgalloc.h                   |    1 
 arch/arm64/include/asm/pgtable.h                   |    2 
 arch/arm64/kernel/setup.c                          |    1 
 arch/arm64/kvm/mmu.c                               |    2 
 arch/arm64/mm/hugetlbpage.c                        |    5 
 arch/arm64/mm/init.c                               |   51 
 arch/arm64/mm/ioremap.c                            |    4 
 arch/arm64/mm/mmu.c                                |   22 
 arch/csky/include/asm/pgalloc.h                    |    2 
 arch/csky/include/asm/pgtable.h                    |    1 
 arch/hexagon/include/asm/pgtable.h                 |    4 
 arch/ia64/Kconfig                                  |    7 
 arch/ia64/include/asm/pal.h                        |    1 
 arch/ia64/include/asm/pgalloc.h                    |    1 
 arch/ia64/include/asm/pgtable.h                    |    1 
 arch/m68k/Kconfig                                  |    5 
 arch/m68k/include/asm/mcf_pgalloc.h                |    2 
 arch/m68k/include/asm/mcf_pgtable.h                |    2 
 arch/m68k/include/asm/motorola_pgalloc.h           |    1 
 arch/m68k/include/asm/motorola_pgtable.h           |    2 
 arch/m68k/include/asm/pgtable_mm.h                 |    1 
 arch/m68k/include/asm/sun3_pgalloc.h               |    1 
 arch/microblaze/Kconfig                            |    4 
 arch/microblaze/include/asm/pgalloc.h              |    2 
 arch/microblaze/include/asm/pgtable.h              |    2 
 arch/mips/Kconfig                                  |   10 
 arch/mips/include/asm/pgalloc.h                    |    1 
 arch/mips/include/asm/pgtable-32.h                 |    1 
 arch/mips/include/asm/pgtable-64.h                 |    1 
 arch/mips/include/uapi/asm/mman.h                  |    3 
 arch/mips/kernel/relocate.c                        |    1 
 arch/mips/sgi-ip22/ip22-reset.c                    |    1 
 arch/mips/sgi-ip32/ip32-reset.c                    |    1 
 arch/nds32/include/asm/pgalloc.h                   |    5 
 arch/nios2/include/asm/pgalloc.h                   |    1 
 arch/nios2/include/asm/pgtable.h                   |    2 
 arch/openrisc/include/asm/pgalloc.h                |    2 
 arch/openrisc/include/asm/pgtable.h                |    1 
 arch/parisc/include/asm/pgalloc.h                  |    1 
 arch/parisc/include/asm/pgtable.h                  |    2 
 arch/parisc/include/uapi/asm/mman.h                |    3 
 arch/parisc/kernel/pdc_chassis.c                   |    1 
 arch/powerpc/Kconfig                               |    6 
 arch/powerpc/include/asm/book3s/pgtable.h          |    1 
 arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h   |    5 
 arch/powerpc/include/asm/nohash/32/mmu-8xx.h       |   43 
 arch/powerpc/include/asm/nohash/32/pgtable.h       |    1 
 arch/powerpc/include/asm/nohash/64/pgtable.h       |    2 
 arch/powerpc/include/asm/pgalloc.h                 |    5 
 arch/powerpc/include/asm/pgtable.h                 |    6 
 arch/powerpc/kernel/setup-common.c                 |    1 
 arch/powerpc/platforms/Kconfig.cputype             |    1 
 arch/riscv/Kconfig                                 |    5 
 arch/riscv/include/asm/pgalloc.h                   |    2 
 arch/riscv/include/asm/pgtable.h                   |    2 
 arch/s390/Kconfig                                  |    6 
 arch/s390/include/asm/pgalloc.h                    |    3 
 arch/s390/include/asm/pgtable.h                    |    5 
 arch/s390/kernel/ipl.c                             |    1 
 arch/s390/kernel/kprobes.c                         |    5 
 arch/s390/mm/pgtable.c                             |    2 
 arch/sh/include/asm/pgalloc.h                      |    1 
 arch/sh/include/asm/pgtable.h                      |    2 
 arch/sparc/Kconfig                                 |    5 
 arch/sparc/include/asm/pgalloc_32.h                |    1 
 arch/sparc/include/asm/pgalloc_64.h                |    1 
 arch/sparc/include/asm/pgtable_32.h                |    3 
 arch/sparc/include/asm/pgtable_64.h                |    8 
 arch/sparc/kernel/sstate.c                         |    1 
 arch/sparc/mm/hugetlbpage.c                        |    6 
 arch/sparc/mm/init_64.c                            |    1 
 arch/um/drivers/mconsole_kern.c                    |    1 
 arch/um/include/asm/pgalloc.h                      |    1 
 arch/um/include/asm/pgtable-2level.h               |    1 
 arch/um/include/asm/pgtable-3level.h               |    1 
 arch/um/kernel/um_arch.c                           |    1 
 arch/x86/Kconfig                                   |   17 
 arch/x86/include/asm/desc.h                        |    1 
 arch/x86/include/asm/pgalloc.h                     |    2 
 arch/x86/include/asm/pgtable_types.h               |    2 
 arch/x86/kernel/cpu/mshyperv.c                     |    1 
 arch/x86/kernel/kprobes/core.c                     |    6 
 arch/x86/kernel/setup.c                            |    1 
 arch/x86/mm/init_64.c                              |   21 
 arch/x86/mm/pgtable.c                              |   34 
 arch/x86/purgatory/purgatory.c                     |    2 
 arch/x86/xen/enlighten.c                           |    1 
 arch/xtensa/include/asm/pgalloc.h                  |    2 
 arch/xtensa/include/asm/pgtable.h                  |    1 
 arch/xtensa/include/uapi/asm/mman.h                |    3 
 arch/xtensa/platforms/iss/setup.c                  |    1 
 drivers/block/zram/zram_drv.h                      |    2 
 drivers/bus/brcmstb_gisb.c                         |    1 
 drivers/char/ipmi/ipmi_msghandler.c                |    1 
 drivers/clk/analogbits/wrpll-cln28hpc.c            |    4 
 drivers/edac/altera_edac.c                         |    1 
 drivers/firmware/google/gsmi.c                     |    1 
 drivers/gpu/drm/nouveau/include/nvif/if000c.h      |    1 
 drivers/gpu/drm/nouveau/nouveau_svm.c              |  162 ++-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h      |    1 
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c |    6 
 drivers/hv/vmbus_drv.c                             |    1 
 drivers/hwtracing/coresight/coresight-cpu-debug.c  |    1 
 drivers/leds/trigger/ledtrig-activity.c            |    1 
 drivers/leds/trigger/ledtrig-heartbeat.c           |    1 
 drivers/leds/trigger/ledtrig-panic.c               |    1 
 drivers/misc/bcm-vk/bcm_vk_dev.c                   |    1 
 drivers/misc/ibmasm/heartbeat.c                    |    1 
 drivers/misc/pvpanic/pvpanic.c                     |    1 
 drivers/net/ipa/ipa_smp2p.c                        |    1 
 drivers/parisc/power.c                             |    1 
 drivers/power/reset/ltc2952-poweroff.c             |    1 
 drivers/remoteproc/remoteproc_core.c               |    1 
 drivers/s390/char/con3215.c                        |    1 
 drivers/s390/char/con3270.c                        |    1 
 drivers/s390/char/sclp.c                           |    1 
 drivers/s390/char/sclp_con.c                       |    1 
 drivers/s390/char/sclp_vt220.c                     |    1 
 drivers/s390/char/zcore.c                          |    1 
 drivers/soc/bcm/brcmstb/pm/pm-arm.c                |    1 
 drivers/staging/olpc_dcon/olpc_dcon.c              |    1 
 drivers/video/fbdev/hyperv_fb.c                    |    1 
 drivers/virtio/virtio_mem.c                        |    2 
 fs/Kconfig                                         |   15 
 fs/exec.c                                          |    3 
 fs/hfsplus/inode.c                                 |    5 
 fs/hfsplus/xattr.c                                 |    1 
 fs/nfsd/nfs4state.c                                |    2 
 fs/nilfs2/btree.c                                  |    1 
 fs/open.c                                          |   13 
 fs/proc/base.c                                     |    6 
 fs/proc/fd.c                                       |   20 
 fs/proc/kcore.c                                    |  136 ++
 fs/proc/task_mmu.c                                 |   34 
 fs/seq_file.c                                      |   43 
 fs/userfaultfd.c                                   |   15 
 include/asm-generic/bug.h                          |    3 
 include/linux/ascii85.h                            |    3 
 include/linux/bootmem_info.h                       |   68 +
 include/linux/compat.h                             |    2 
 include/linux/compiler-clang.h                     |   17 
 include/linux/compiler-gcc.h                       |    6 
 include/linux/compiler_types.h                     |    2 
 include/linux/huge_mm.h                            |   74 -
 include/linux/hugetlb.h                            |   80 +
 include/linux/hugetlb_cgroup.h                     |   19 
 include/linux/kcore.h                              |    3 
 include/linux/kernel.h                             |  227 ----
 include/linux/kprobes.h                            |    1 
 include/linux/kstrtox.h                            |  155 ++
 include/linux/memblock.h                           |    4 
 include/linux/memory_hotplug.h                     |   27 
 include/linux/mempolicy.h                          |    9 
 include/linux/memremap.h                           |    2 
 include/linux/migrate.h                            |   27 
 include/linux/mm.h                                 |   18 
 include/linux/mm_types.h                           |    2 
 include/linux/mmu_notifier.h                       |   26 
 include/linux/mmzone.h                             |   27 
 include/linux/mpi.h                                |    4 
 include/linux/page-flags.h                         |   22 
 include/linux/panic.h                              |   98 +
 include/linux/panic_notifier.h                     |   12 
 include/linux/pgtable.h                            |   44 
 include/linux/rmap.h                               |   13 
 include/linux/seq_file.h                           |   10 
 include/linux/shmem_fs.h                           |   19 
 include/linux/signal.h                             |    2 
 include/linux/string.h                             |    7 
 include/linux/string_helpers.h                     |   31 
 include/linux/sunrpc/cache.h                       |    1 
 include/linux/swap.h                               |   19 
 include/linux/swapops.h                            |  171 +--
 include/linux/thread_info.h                        |    1 
 include/linux/userfaultfd_k.h                      |    5 
 include/linux/vmalloc.h                            |   15 
 include/linux/zbud.h                               |   23 
 include/trace/events/vmscan.h                      |   41 
 include/uapi/asm-generic/mman-common.h             |    3 
 include/uapi/linux/mempolicy.h                     |    1 
 include/uapi/linux/userfaultfd.h                   |    7 
 init/main.c                                        |   42 
 ipc/msg.c                                          |    6 
 ipc/sem.c                                          |   25 
 ipc/shm.c                                          |    6 
 ipc/util.c                                         |   44 
 ipc/util.h                                         |    3 
 kernel/hung_task.c                                 |    1 
 kernel/kexec_core.c                                |    1 
 kernel/kprobes.c                                   |    2 
 kernel/panic.c                                     |    1 
 kernel/rcu/tree.c                                  |    2 
 kernel/signal.c                                    |   14 
 kernel/sysctl.c                                    |    4 
 kernel/trace/trace.c                               |    1 
 lib/Kconfig.debug                                  |   12 
 lib/decompress_bunzip2.c                           |    6 
 lib/decompress_unlz4.c                             |    8 
 lib/decompress_unlzo.c                             |    3 
 lib/decompress_unxz.c                              |    2 
 lib/decompress_unzstd.c                            |    4 
 lib/kstrtox.c                                      |    5 
 lib/lz4/lz4_decompress.c                           |    2 
 lib/math/Makefile                                  |    1 
 lib/math/rational-test.c                           |   56 +
 lib/math/rational.c                                |   16 
 lib/mpi/longlong.h                                 |    4 
 lib/mpi/mpicoder.c                                 |    6 
 lib/mpi/mpiutil.c                                  |    2 
 lib/parser.c                                       |    1 
 lib/string.c                                       |    2 
 lib/string_helpers.c                               |  142 +-
 lib/test-string_helpers.c                          |  157 ++-
 lib/test_hmm.c                                     |  127 ++
 lib/test_hmm_uapi.h                                |    2 
 lib/test_string.c                                  |    5 
 lib/vsprintf.c                                     |    1 
 lib/xz/xz_dec_bcj.c                                |    2 
 lib/xz/xz_dec_lzma2.c                              |    8 
 lib/zlib_inflate/inffast.c                         |    2 
 lib/zstd/huf.h                                     |    2 
 mm/Kconfig                                         |   16 
 mm/Makefile                                        |    2 
 mm/bootmem_info.c                                  |  127 ++
 mm/compaction.c                                    |   20 
 mm/debug_vm_pgtable.c                              |  109 --
 mm/gup.c                                           |   58 +
 mm/hmm.c                                           |   12 
 mm/huge_memory.c                                   |  269 ++---
 mm/hugetlb.c                                       |  369 +++++--
 mm/hugetlb_vmemmap.c                               |  332 ++++++
 mm/hugetlb_vmemmap.h                               |   53 -
 mm/internal.h                                      |   29 
 mm/kfence/core.c                                   |    4 
 mm/khugepaged.c                                    |   20 
 mm/madvise.c                                       |   66 +
 mm/mapping_dirty_helpers.c                         |    2 
 mm/memblock.c                                      |   28 
 mm/memcontrol.c                                    |    4 
 mm/memory-failure.c                                |   38 
 mm/memory.c                                        |  239 +++-
 mm/memory_hotplug.c                                |  161 ---
 mm/mempolicy.c                                     |  323 ++----
 mm/migrate.c                                       |  268 +----
 mm/mlock.c                                         |   12 
 mm/mmap_lock.c                                     |   59 -
 mm/mprotect.c                                      |   18 
 mm/nommu.c                                         |    5 
 mm/oom_kill.c                                      |    2 
 mm/page_alloc.c                                    |    5 
 mm/page_vma_mapped.c                               |   15 
 mm/rmap.c                                          |  644 +++++++++---
 mm/shmem.c                                         |  125 --
 mm/sparse-vmemmap.c                                |  432 +++++++-
 mm/sparse.c                                        |    1 
 mm/swap.c                                          |    2 
 mm/swapfile.c                                      |    2 
 mm/userfaultfd.c                                   |  249 ++--
 mm/util.c                                          |   40 
 mm/vmalloc.c                                       |   37 
 mm/vmscan.c                                        |   20 
 mm/workingset.c                                    |   10 
 mm/z3fold.c                                        |   39 
 mm/zbud.c                                          |  235 ++--
 mm/zsmalloc.c                                      |    5 
 mm/zswap.c                                         |   26 
 scripts/checkpatch.pl                              |   16 
 tools/testing/selftests/vm/.gitignore              |    3 
 tools/testing/selftests/vm/Makefile                |    5 
 tools/testing/selftests/vm/hmm-tests.c             |  158 +++
 tools/testing/selftests/vm/khugepaged.c            |    4 
 tools/testing/selftests/vm/madv_populate.c         |  342 ++++++
 tools/testing/selftests/vm/pkey-x86.h              |    1 
 tools/testing/selftests/vm/protection_keys.c       |   85 +
 tools/testing/selftests/vm/run_vmtests.sh          |   16 
 tools/testing/selftests/vm/userfaultfd.c           | 1094 ++++++++++-----------
 299 files changed, 6277 insertions(+), 3183 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-06-29  2:32 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-06-29  2:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

192 patches, based on 7cf3dead1ad70c72edb03e2d98e1f3dcd332cdb2.

Subsystems affected by this patch series:

  mm/gup
  mm/pagealloc
  kthread
  ia64
  scripts
  ntfs
  squashfs
  ocfs2
  z
  kernel/watchdog
  mm/slab
  mm/slub
  mm/kmemleak
  mm/dax
  mm/debug
  mm/pagecache
  mm/gup
  mm/swap
  mm/memcg
  mm/pagemap
  mm/mprotect
  mm/bootmem
  mm/dma
  mm/tracing
  mm/vmalloc
  mm/kasan
  mm/initialization
  mm/pagealloc
  mm/memory-failure

Subsystem: mm/gup

    Jann Horn <jannh@google.com>:
      mm/gup: fix try_grab_compound_head() race with split_huge_page()

Subsystem: mm/pagealloc

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/page_alloc: fix memory map initialization for descending nodes

    Mel Gorman <mgorman@techsingularity.net>:
      mm/page_alloc: correct return value of populated elements if bulk array is populated

Subsystem: kthread

    Jonathan Neuschäfer <j.neuschaefer@gmx.net>:
      kthread: switch to new kerneldoc syntax for named variable macro argument

    Petr Mladek <pmladek@suse.com>:
      kthread_worker: fix return value when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()

Subsystem: ia64

    Randy Dunlap <rdunlap@infradead.org>:
      ia64: headers: drop duplicated words

    Arnd Bergmann <arnd@arndb.de>:
      ia64: mca_drv: fix incorrect array size calculation

Subsystem: scripts

    "Steven Rostedt (VMware)" <rostedt@goodmis.org>:
    Patch series "streamline_config.pl: Fix Perl spacing":
      streamline_config.pl: make spacing consistent
      streamline_config.pl: add softtabstop=4 for vim users

    Colin Ian King <colin.king@canonical.com>:
      scripts/spelling.txt: add more spellings to spelling.txt

Subsystem: ntfs

    Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>:
      ntfs: fix validity check for file name attribute

Subsystem: squashfs

    Vincent Whitchurch <vincent.whitchurch@axis.com>:
      squashfs: add option to panic on errors

Subsystem: ocfs2

    Yang Yingliang <yangyingliang@huawei.com>:
      ocfs2: remove unnecessary INIT_LIST_HEAD()

Subsystem: z

    Dan Carpenter <dan.carpenter@oracle.com>:
      ocfs2: fix snprintf() checking

    Colin Ian King <colin.king@canonical.com>:
      ocfs2: remove redundant assignment to pointer queue

    Wan Jiabing <wanjiabing@vivo.com>:
      ocfs2: remove repeated uptodate check for buffer

    Chen Huang <chenhuang5@huawei.com>:
      ocfs2: replace simple_strtoull() with kstrtoull()

    Colin Ian King <colin.king@canonical.com>:
      ocfs2: remove redundant initialization of variable ret

Subsystem: kernel/watchdog

    Wang Qing <wangqing@vivo.com>:
      kernel: watchdog: modify the explanation related to watchdog thread
      doc: watchdog: modify the explanation related to watchdog thread
      doc: watchdog: modify the doc related to "watchdog/%u"

Subsystem: mm/slab

    gumingtao <gumingtao1225@gmail.com>:
      slab: use __func__ to trace function name

Subsystem: mm/slub

    Vlastimil Babka <vbabka@suse.cz>:
      kunit: make test->lock irq safe

    Oliver Glitta <glittao@gmail.com>:
      mm/slub, kunit: add a KUnit test for SLUB debugging functionality
      slub: remove resiliency_test() function

    Hyeonggon Yoo <42.hyeyoo@gmail.com>:
      mm, slub: change run-time assertion in kmalloc_index() to compile-time

    Stephen Boyd <swboyd@chromium.org>:
      slub: restore slub_debug=- behavior
      slub: actually use 'message' in restore_bytes()

    Joe Perches <joe@perches.com>:
      slub: indicate slab_fix() uses printf formats

    Stephen Boyd <swboyd@chromium.org>:
      slub: force on no_hash_pointers when slub_debug is enabled

    Faiyaz Mohammed <faiyazm@codeaurora.org>:
      mm: slub: move sysfs slab alloc/free interfaces to debugfs

    Georgi Djakov <quic_c_gdjako@quicinc.com>:
      mm/slub: add taint after the errors are printed

Subsystem: mm/kmemleak

    Yanfei Xu <yanfei.xu@windriver.com>:
      mm/kmemleak: fix possible wrong memory scanning period

Subsystem: mm/dax

    Jan Kara <jack@suse.cz>:
      dax: fix ENOMEM handling in grab_mapping_entry()

Subsystem: mm/debug

    Tang Bin <tangbin@cmss.chinamobile.com>:
      tools/vm/page_owner_sort.c: check malloc() return

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/debug_vm_pgtable: ensure THP availability via has_transparent_hugepage()

    Nicolas Saenz Julienne <nsaenzju@redhat.com>:
      mm: mmap_lock: use local locks instead of disabling preemption

    Gavin Shan <gshan@redhat.com>:
    Patch series "mm/page_reporting: Make page reporting work on arm64 with 64KB page size", v4:
      mm/page_reporting: fix code style in __page_reporting_request()
      mm/page_reporting: export reporting order as module parameter
      mm/page_reporting: allow driver to specify reporting order
      virtio_balloon: specify page reporting order if needed

Subsystem: mm/pagecache

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      mm: page-writeback: kill get_writeback_state() comments

    Chi Wu <wuchi.zero@gmail.com>:
      mm/page-writeback: Fix performance when BDI's share of ratio is 0.
      mm/page-writeback: update the comment of Dirty position control
      mm/page-writeback: use __this_cpu_inc() in account_page_dirtied()

    Roman Gushchin <guro@fb.com>:
    Patch series "cgroup, blkcg: prevent dirty inodes to pin dying memory cgroups", v9:
      writeback, cgroup: do not switch inodes with I_WILL_FREE flag
      writeback, cgroup: add smp_mb() to cgroup_writeback_umount()
      writeback, cgroup: increment isw_nr_in_flight before grabbing an inode
      writeback, cgroup: switch to rcu_work API in inode_switch_wbs()
      writeback, cgroup: keep list of inodes attached to bdi_writeback
      writeback, cgroup: split out the functional part of inode_switch_wbs_work_fn()
      writeback, cgroup: support switching multiple inodes at once
      writeback, cgroup: release dying cgwbs by switching attached inodes

    Christoph Hellwig <hch@lst.de>:
    Patch series "remove the implicit .set_page_dirty default":
      fs: unexport __set_page_dirty
      fs: move ramfs_aops to libfs
      mm: require ->set_page_dirty to be explicitly wired up

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Further set_page_dirty cleanups":
      mm/writeback: move __set_page_dirty() to core mm
      mm/writeback: use __set_page_dirty in __set_page_dirty_nobuffers
      iomap: use __set_page_dirty_nobuffers
      fs: remove anon_set_page_dirty()
      fs: remove noop_set_page_dirty()
      mm: move page dirtying prototypes from mm.h

Subsystem: mm/gup

    Peter Xu <peterx@redhat.com>:
    Patch series "mm/gup: Fix pin page write cache bouncing on has_pinned", v2:
      mm/gup_benchmark: support threading

    Andrea Arcangeli <aarcange@redhat.com>:
      mm: gup: allow FOLL_PIN to scale in SMP
      mm: gup: pack has_pinned in MMF_HAS_PINNED

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      mm: pagewalk: fix walk for hugepage tables

Subsystem: mm/swap

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "close various race windows for swap", v6:
      mm/swapfile: use percpu_ref to serialize against concurrent swapoff
      swap: fix do_swap_page() race with swapoff
      mm/swap: remove confusing checking for non_swap_entry() in swap_ra_info()
      mm/shmem: fix shmem_swapin() race with swapoff
    Patch series "Cleanups for swap", v2:
      mm/swapfile: move get_swap_page_of_type() under CONFIG_HIBERNATION
      mm/swap: remove unused local variable nr_shadows
      mm/swap_slots.c: delete meaningless forward declarations

    Huang Ying <ying.huang@intel.com>:
      mm, swap: remove unnecessary smp_rmb() in swap_type_to_swap_info()
      mm: free idle swap cache page after COW
      swap: check mapping_empty() for swap cache before being freed

Subsystem: mm/memcg

    Waiman Long <longman@redhat.com>:
    Patch series "mm/memcg: Reduce kmemcache memory accounting overhead", v6:
      mm/memcg: move mod_objcg_state() to memcontrol.c
      mm/memcg: cache vmstat data in percpu memcg_stock_pcp
      mm/memcg: improve refill_obj_stock() performance
      mm/memcg: optimize user context object stock access
    Patch series "mm: memcg/slab: Fix objcg pointer array handling problem", v4:
      mm: memcg/slab: properly set up gfp flags for objcg pointer array
      mm: memcg/slab: create a new set of kmalloc-cg-<n> caches
      mm: memcg/slab: disable cache merging for KMALLOC_NORMAL caches

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: fix root_mem_cgroup charging
    Patch series "memcontrol code cleanup and simplification", v3:
      mm: memcontrol: fix page charging in page replacement
      mm: memcontrol: bail out early when !mm in get_mem_cgroup_from_mm
      mm: memcontrol: remove the pgdata parameter of mem_cgroup_page_lruvec
      mm: memcontrol: simplify lruvec_holds_page_lru_lock
      mm: memcontrol: rename lruvec_holds_page_lru_lock to page_matches_lruvec
      mm: memcontrol: simplify the logic of objcg pinning memcg
      mm: memcontrol: move obj_cgroup_uncharge_pages() out of css_set_lock
      mm: vmscan: remove noinline_for_stack

    wenhuizhang <wenhui@gwmail.gwu.edu>:
      memcontrol: use flexible-array member

    Dan Schatzberg <schatzberg.dan@gmail.com>:
    Patch series "Charge loop device i/o to issuing cgroup", v14:
      loop: use worker per cgroup instead of kworker
      mm: charge active memcg when no mm is set
      loop: charge i/o to mem and blk cg

    Huilong Deng <denghuilong@cdjrlc.com>:
      mm: memcontrol: remove trailing semicolon in macros

Subsystem: mm/pagemap

    David Hildenbrand <david@redhat.com>:
    Patch series "perf/binfmt/mm: remove in-tree usage of MAP_EXECUTABLE":
      perf: MAP_EXECUTABLE does not indicate VM_MAYEXEC
      binfmt: remove in-tree usage of MAP_EXECUTABLE
      mm: ignore MAP_EXECUTABLE in ksys_mmap_pgoff()

    Gonzalo Matias Juarez Tello <gmjuareztello@gmail.com>:
      mm/mmap.c: logic of find_vma_intersection repeated in __do_munmap

    Liam Howlett <liam.howlett@oracle.com>:
      mm/mmap: introduce unlock_range() for code cleanup
      mm/mmap: use find_vma_intersection() in do_mmap() for overlap

    Liu Xiang <liu.xiang@zlingsmart.com>:
      mm/memory.c: fix comment of finish_mkwrite_fault()

    Liam Howlett <liam.howlett@oracle.com>:
    Patch series "mm: Add vma_lookup()", v2:
      mm: add vma_lookup(), update find_vma_intersection() comments
      drm/i915/selftests: use vma_lookup() in __igt_mmap()
      arch/arc/kernel/troubleshoot: use vma_lookup() instead of find_vma()
      arch/arm64/kvm: use vma_lookup() instead of find_vma_intersection()
      arch/powerpc/kvm/book3s_hv_uvmem: use vma_lookup() instead of find_vma_intersection()
      arch/powerpc/kvm/book3s: use vma_lookup() in kvmppc_hv_setup_htab_rma()
      arch/mips/kernel/traps: use vma_lookup() instead of find_vma()
      arch/m68k/kernel/sys_m68k: use vma_lookup() in sys_cacheflush()
      x86/sgx: use vma_lookup() in sgx_encl_find()
      virt/kvm: use vma_lookup() instead of find_vma_intersection()
      vfio: use vma_lookup() instead of find_vma_intersection()
      net/ipv5/tcp: use vma_lookup() in tcp_zerocopy_receive()
      drm/amdgpu: use vma_lookup() in amdgpu_ttm_tt_get_user_pages()
      media: videobuf2: use vma_lookup() in get_vaddr_frames()
      misc/sgi-gru/grufault: use vma_lookup() in gru_find_vma()
      kernel/events/uprobes: use vma_lookup() in find_active_uprobe()
      lib/test_hmm: use vma_lookup() in dmirror_migrate()
      mm/ksm: use vma_lookup() in find_mergeable_vma()
      mm/migrate: use vma_lookup() in do_pages_stat_array()
      mm/mremap: use vma_lookup() in vma_to_resize()
      mm/memory.c: use vma_lookup() in __access_remote_vm()
      mm/mempolicy: use vma_lookup() in __access_remote_vm()

    Chen Li <chenli@uniontech.com>:
      mm: update legacy flush_tlb_* to use vma

Subsystem: mm/mprotect

    Peter Collingbourne <pcc@google.com>:
      mm: improve mprotect(R|W) efficiency on pages referenced once

Subsystem: mm/bootmem

    Souptick Joarder <jrdr.linux@gmail.com>:
      h8300: remove unused variable

Subsystem: mm/dma

    YueHaibing <yuehaibing@huawei.com>:
      mm/dmapool: use DEVICE_ATTR_RO macro

Subsystem: mm/tracing

    Vincent Whitchurch <vincent.whitchurch@axis.com>:
      mm, tracing: unify PFN format strings

Subsystem: mm/vmalloc

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
    Patch series "vmalloc() vs bulk allocator", v2:
      mm/page_alloc: add an alloc_pages_bulk_array_node() helper
      mm/vmalloc: switch to bulk allocator in __vmalloc_area_node()
      mm/vmalloc: print a warning message first on failure
      mm/vmalloc: remove quoted strings split across lines

    Uladzislau Rezki <urezki@gmail.com>:
      mm/vmalloc: fallback to a single page allocator

    Rafael Aquini <aquini@redhat.com>:
      mm: vmalloc: add cond_resched() in __vunmap()

Subsystem: mm/kasan

    Alexander Potapenko <glider@google.com>:
      printk: introduce dump_stack_lvl()
      kasan: use dump_stack_lvl(KERN_ERR) to print stacks

    David Gow <davidgow@google.com>:
      kasan: test: improve failure message in KUNIT_EXPECT_KASAN_FAIL()

    Daniel Axtens <dja@axtens.net>:
    Patch series "KASAN core changes for ppc64 radix KASAN", v16:
      kasan: allow an architecture to disable inline instrumentation
      kasan: allow architectures to provide an outline readiness check
      mm: define default MAX_PTRS_PER_* in include/pgtable.h
      kasan: use MAX_PTRS_PER_* for early shadow tables

    Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>:
    Patch series "kasan: add memory corruption identification support for hw tag-based kasan", v4:
      kasan: rename CONFIG_KASAN_SW_TAGS_IDENTIFY to CONFIG_KASAN_TAGS_IDENTIFY
      kasan: integrate the common part of two KASAN tag-based modes
      kasan: add memory corruption identification support for hardware tag-based mode

Subsystem: mm/initialization

    Jungseung Lee <js07.lee@samsung.com>:
      mm: report which part of mem is being freed on initmem case

Subsystem: mm/pagealloc

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/mmzone.h: simplify is_highmem_idx()

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Constify struct page arguments":
      mm: make __dump_page static

    Aaron Tomlin <atomlin@redhat.com>:
      mm/page_alloc: bail out on fatal signal during reclaim/compaction retry attempt

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/debug: factor PagePoisoned out of __dump_page
      mm/page_owner: constify dump_page_owner
      mm: make compound_head const-preserving
      mm: constify get_pfnblock_flags_mask and get_pfnblock_migratetype
      mm: constify page_count and page_ref_count
      mm: optimise nth_page for contiguous memmap

    Heiner Kallweit <hkallweit1@gmail.com>:
      mm/page_alloc: switch to pr_debug

    Andrii Nakryiko <andrii@kernel.org>:
      kbuild: skip per-CPU BTF generation for pahole v1.18-v1.21

    Mel Gorman <mgorman@techsingularity.net>:
      mm/page_alloc: split per cpu page lists and zone stats
      mm/page_alloc: convert per-cpu list protection to local_lock
      mm/vmstat: convert NUMA statistics to basic NUMA counters
      mm/vmstat: inline NUMA event counter updates
      mm/page_alloc: batch the accounting updates in the bulk allocator
      mm/page_alloc: reduce duration that IRQs are disabled for VM counters
      mm/page_alloc: explicitly acquire the zone lock in __free_pages_ok
      mm/page_alloc: avoid conflating IRQs disabled with zone->lock
      mm/page_alloc: update PGFREE outside the zone lock in __free_pages_ok

    Minchan Kim <minchan@kernel.org>:
      mm: page_alloc: dump migrate-failed pages only at -EBUSY

    Mel Gorman <mgorman@techsingularity.net>:
    Patch series "Calculate pcp->high based on zone sizes and active CPUs", v2:
      mm/page_alloc: delete vm.percpu_pagelist_fraction
      mm/page_alloc: disassociate the pcp->high from pcp->batch
      mm/page_alloc: adjust pcp->high after CPU hotplug events
      mm/page_alloc: scale the number of pages that are batch freed
      mm/page_alloc: limit the number of pages on PCP lists when reclaim is active
      mm/page_alloc: introduce vm.percpu_pagelist_high_fraction

    Dong Aisheng <aisheng.dong@nxp.com>:
      mm: drop SECTION_SHIFT in code comments
      mm/page_alloc: improve memmap_pages dbg msg

    Liu Shixin <liushixin2@huawei.com>:
      mm/page_alloc: fix counting of managed_pages

    Mel Gorman <mgorman@techsingularity.net>:
    Patch series "Allow high order pages to be stored on PCP", v2:
      mm/page_alloc: move free_the_page

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "Remove DISCONTIGMEM memory model", v3:
      alpha: remove DISCONTIGMEM and NUMA
      arc: update comment about HIGHMEM implementation
      arc: remove support for DISCONTIGMEM
      m68k: remove support for DISCONTIGMEM
      mm: remove CONFIG_DISCONTIGMEM
      arch, mm: remove stale mentions of DISCONIGMEM
      docs: remove description of DISCONTIGMEM
      mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA
      mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM

    Mel Gorman <mgorman@techsingularity.net>:
      mm/page_alloc: allow high-order pages to be stored on the per-cpu lists
      mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes

Subsystem: mm/memory-failure

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm,hwpoison: send SIGBUS with error virutal address
      mm,hwpoison: make get_hwpoison_page() call get_any_page()

 Documentation/admin-guide/kernel-parameters.txt    |    6 
 Documentation/admin-guide/lockup-watchdogs.rst     |    4 
 Documentation/admin-guide/sysctl/kernel.rst        |   10 
 Documentation/admin-guide/sysctl/vm.rst            |   52 -
 Documentation/dev-tools/kasan.rst                  |    9 
 Documentation/vm/memory-model.rst                  |   45 
 arch/alpha/Kconfig                                 |   22 
 arch/alpha/include/asm/machvec.h                   |    6 
 arch/alpha/include/asm/mmzone.h                    |  100 --
 arch/alpha/include/asm/pgtable.h                   |    4 
 arch/alpha/include/asm/topology.h                  |   39 
 arch/alpha/kernel/core_marvel.c                    |   53 -
 arch/alpha/kernel/core_wildfire.c                  |   29 
 arch/alpha/kernel/pci_iommu.c                      |   29 
 arch/alpha/kernel/proto.h                          |    8 
 arch/alpha/kernel/setup.c                          |   16 
 arch/alpha/kernel/sys_marvel.c                     |    5 
 arch/alpha/kernel/sys_wildfire.c                   |    5 
 arch/alpha/mm/Makefile                             |    2 
 arch/alpha/mm/init.c                               |    3 
 arch/alpha/mm/numa.c                               |  223 ----
 arch/arc/Kconfig                                   |   13 
 arch/arc/include/asm/mmzone.h                      |   40 
 arch/arc/kernel/troubleshoot.c                     |    8 
 arch/arc/mm/init.c                                 |   21 
 arch/arm/include/asm/tlbflush.h                    |   13 
 arch/arm/mm/tlb-v6.S                               |    2 
 arch/arm/mm/tlb-v7.S                               |    2 
 arch/arm64/Kconfig                                 |    2 
 arch/arm64/kvm/mmu.c                               |    2 
 arch/h8300/kernel/setup.c                          |    2 
 arch/ia64/Kconfig                                  |    2 
 arch/ia64/include/asm/pal.h                        |    2 
 arch/ia64/include/asm/spinlock.h                   |    2 
 arch/ia64/include/asm/uv/uv_hub.h                  |    2 
 arch/ia64/kernel/efi_stub.S                        |    2 
 arch/ia64/kernel/mca_drv.c                         |    2 
 arch/ia64/kernel/topology.c                        |    5 
 arch/ia64/mm/numa.c                                |    5 
 arch/m68k/Kconfig.cpu                              |   10 
 arch/m68k/include/asm/mmzone.h                     |   10 
 arch/m68k/include/asm/page.h                       |    2 
 arch/m68k/include/asm/page_mm.h                    |   35 
 arch/m68k/include/asm/tlbflush.h                   |    2 
 arch/m68k/kernel/sys_m68k.c                        |    4 
 arch/m68k/mm/init.c                                |   20 
 arch/mips/Kconfig                                  |    2 
 arch/mips/include/asm/mmzone.h                     |    8 
 arch/mips/include/asm/page.h                       |    2 
 arch/mips/kernel/traps.c                           |    4 
 arch/mips/mm/init.c                                |    7 
 arch/nds32/include/asm/memory.h                    |    6 
 arch/openrisc/include/asm/tlbflush.h               |    2 
 arch/powerpc/Kconfig                               |    2 
 arch/powerpc/include/asm/mmzone.h                  |    4 
 arch/powerpc/kernel/setup_64.c                     |    2 
 arch/powerpc/kernel/smp.c                          |    2 
 arch/powerpc/kexec/core.c                          |    4 
 arch/powerpc/kvm/book3s_hv.c                       |    4 
 arch/powerpc/kvm/book3s_hv_uvmem.c                 |    2 
 arch/powerpc/mm/Makefile                           |    2 
 arch/powerpc/mm/mem.c                              |    4 
 arch/riscv/Kconfig                                 |    2 
 arch/s390/Kconfig                                  |    2 
 arch/s390/include/asm/pgtable.h                    |    2 
 arch/sh/include/asm/mmzone.h                       |    4 
 arch/sh/kernel/topology.c                          |    2 
 arch/sh/mm/Kconfig                                 |    2 
 arch/sh/mm/init.c                                  |    2 
 arch/sparc/Kconfig                                 |    2 
 arch/sparc/include/asm/mmzone.h                    |    4 
 arch/sparc/kernel/smp_64.c                         |    2 
 arch/sparc/mm/init_64.c                            |   12 
 arch/x86/Kconfig                                   |    2 
 arch/x86/ia32/ia32_aout.c                          |    4 
 arch/x86/kernel/cpu/mce/core.c                     |   13 
 arch/x86/kernel/cpu/sgx/encl.h                     |    4 
 arch/x86/kernel/setup_percpu.c                     |    6 
 arch/x86/mm/init_32.c                              |    4 
 arch/xtensa/include/asm/page.h                     |    4 
 arch/xtensa/include/asm/tlbflush.h                 |    4 
 drivers/base/node.c                                |   18 
 drivers/block/loop.c                               |  270 ++++-
 drivers/block/loop.h                               |   15 
 drivers/dax/device.c                               |    2 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c            |    4 
 drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c |    2 
 drivers/media/common/videobuf2/frame_vector.c      |    2 
 drivers/misc/sgi-gru/grufault.c                    |    4 
 drivers/vfio/vfio_iommu_type1.c                    |    2 
 drivers/virtio/virtio_balloon.c                    |   17 
 fs/adfs/inode.c                                    |    1 
 fs/affs/file.c                                     |    2 
 fs/bfs/file.c                                      |    1 
 fs/binfmt_aout.c                                   |    4 
 fs/binfmt_elf.c                                    |    2 
 fs/binfmt_elf_fdpic.c                              |   11 
 fs/binfmt_flat.c                                   |    2 
 fs/block_dev.c                                     |    1 
 fs/buffer.c                                        |   25 
 fs/configfs/inode.c                                |    8 
 fs/dax.c                                           |    3 
 fs/ecryptfs/mmap.c                                 |   13 
 fs/exfat/inode.c                                   |    1 
 fs/ext2/inode.c                                    |    4 
 fs/ext4/inode.c                                    |    2 
 fs/fat/inode.c                                     |    1 
 fs/fs-writeback.c                                  |  366 +++++---
 fs/fuse/dax.c                                      |    3 
 fs/gfs2/aops.c                                     |    2 
 fs/gfs2/meta_io.c                                  |    2 
 fs/hfs/inode.c                                     |    2 
 fs/hfsplus/inode.c                                 |    2 
 fs/hpfs/file.c                                     |    1 
 fs/iomap/buffered-io.c                             |   27 
 fs/jfs/inode.c                                     |    1 
 fs/kernfs/inode.c                                  |    8 
 fs/libfs.c                                         |   44 
 fs/minix/inode.c                                   |    1 
 fs/nilfs2/mdt.c                                    |    1 
 fs/ntfs/inode.c                                    |    2 
 fs/ocfs2/aops.c                                    |    4 
 fs/ocfs2/cluster/heartbeat.c                       |    7 
 fs/ocfs2/cluster/nodemanager.c                     |    2 
 fs/ocfs2/dlm/dlmmaster.c                           |    2 
 fs/ocfs2/filecheck.c                               |    6 
 fs/ocfs2/stackglue.c                               |    8 
 fs/omfs/file.c                                     |    1 
 fs/proc/task_mmu.c                                 |    2 
 fs/ramfs/inode.c                                   |    9 
 fs/squashfs/block.c                                |    5 
 fs/squashfs/squashfs_fs_sb.h                       |    1 
 fs/squashfs/super.c                                |   86 +
 fs/sysv/itree.c                                    |    1 
 fs/udf/file.c                                      |    1 
 fs/udf/inode.c                                     |    1 
 fs/ufs/inode.c                                     |    1 
 fs/xfs/xfs_aops.c                                  |    4 
 fs/zonefs/super.c                                  |    4 
 include/asm-generic/memory_model.h                 |   37 
 include/asm-generic/pgtable-nop4d.h                |    1 
 include/asm-generic/topology.h                     |    2 
 include/kunit/test.h                               |    5 
 include/linux/backing-dev-defs.h                   |   20 
 include/linux/cpuhotplug.h                         |    2 
 include/linux/fs.h                                 |    6 
 include/linux/gfp.h                                |   13 
 include/linux/iomap.h                              |    1 
 include/linux/kasan.h                              |    7 
 include/linux/kernel.h                             |    2 
 include/linux/kthread.h                            |    2 
 include/linux/memblock.h                           |    6 
 include/linux/memcontrol.h                         |   60 -
 include/linux/mm.h                                 |   53 -
 include/linux/mm_types.h                           |   10 
 include/linux/mman.h                               |    2 
 include/linux/mmdebug.h                            |    3 
 include/linux/mmzone.h                             |   96 +-
 include/linux/page-flags.h                         |   10 
 include/linux/page_owner.h                         |    6 
 include/linux/page_ref.h                           |    4 
 include/linux/page_reporting.h                     |    3 
 include/linux/pageblock-flags.h                    |    2 
 include/linux/pagemap.h                            |    4 
 include/linux/pgtable.h                            |   22 
 include/linux/printk.h                             |    5 
 include/linux/sched/coredump.h                     |    8 
 include/linux/slab.h                               |   59 +
 include/linux/swap.h                               |   19 
 include/linux/swapops.h                            |    5 
 include/linux/vmstat.h                             |   69 -
 include/linux/writeback.h                          |    1 
 include/trace/events/cma.h                         |    4 
 include/trace/events/filemap.h                     |    2 
 include/trace/events/kmem.h                        |   12 
 include/trace/events/page_pool.h                   |    4 
 include/trace/events/pagemap.h                     |    4 
 include/trace/events/vmscan.h                      |    2 
 kernel/cgroup/cgroup.c                             |    1 
 kernel/crash_core.c                                |    4 
 kernel/events/core.c                               |    2 
 kernel/events/uprobes.c                            |    4 
 kernel/fork.c                                      |    1 
 kernel/kthread.c                                   |   19 
 kernel/sysctl.c                                    |   16 
 kernel/watchdog.c                                  |   12 
 lib/Kconfig.debug                                  |   15 
 lib/Kconfig.kasan                                  |   16 
 lib/Makefile                                       |    1 
 lib/dump_stack.c                                   |   20 
 lib/kunit/test.c                                   |   18 
 lib/slub_kunit.c                                   |  152 +++
 lib/test_hmm.c                                     |    5 
 lib/test_kasan.c                                   |   11 
 lib/vsprintf.c                                     |    2 
 mm/Kconfig                                         |   38 
 mm/backing-dev.c                                   |   66 +
 mm/compaction.c                                    |    2 
 mm/debug.c                                         |   27 
 mm/debug_vm_pgtable.c                              |   63 +
 mm/dmapool.c                                       |    5 
 mm/filemap.c                                       |    2 
 mm/gup.c                                           |   81 +
 mm/hugetlb.c                                       |    2 
 mm/internal.h                                      |    9 
 mm/kasan/Makefile                                  |    4 
 mm/kasan/common.c                                  |    6 
 mm/kasan/generic.c                                 |    3 
 mm/kasan/hw_tags.c                                 |   22 
 mm/kasan/init.c                                    |    6 
 mm/kasan/kasan.h                                   |   12 
 mm/kasan/report.c                                  |    6 
 mm/kasan/report_hw_tags.c                          |    5 
 mm/kasan/report_sw_tags.c                          |   45 
 mm/kasan/report_tags.c                             |   51 +
 mm/kasan/shadow.c                                  |    6 
 mm/kasan/sw_tags.c                                 |   45 
 mm/kasan/tags.c                                    |   59 +
 mm/kfence/kfence_test.c                            |    5 
 mm/kmemleak.c                                      |   18 
 mm/ksm.c                                           |    6 
 mm/memblock.c                                      |    8 
 mm/memcontrol.c                                    |  385 ++++++--
 mm/memory-failure.c                                |  344 +++++--
 mm/memory.c                                        |   22 
 mm/memory_hotplug.c                                |    6 
 mm/mempolicy.c                                     |    4 
 mm/migrate.c                                       |    4 
 mm/mmap.c                                          |   54 -
 mm/mmap_lock.c                                     |   33 
 mm/mprotect.c                                      |   52 +
 mm/mremap.c                                        |    5 
 mm/nommu.c                                         |    2 
 mm/page-writeback.c                                |   89 +
 mm/page_alloc.c                                    |  950 +++++++++++++--------
 mm/page_ext.c                                      |    2 
 mm/page_owner.c                                    |    2 
 mm/page_reporting.c                                |   19 
 mm/page_reporting.h                                |    5 
 mm/pagewalk.c                                      |   58 +
 mm/shmem.c                                         |   18 
 mm/slab.h                                          |   24 
 mm/slab_common.c                                   |   60 -
 mm/slub.c                                          |  420 +++++----
 mm/sparse.c                                        |    2 
 mm/swap.c                                          |    4 
 mm/swap_slots.c                                    |    2 
 mm/swap_state.c                                    |   20 
 mm/swapfile.c                                      |  177 +--
 mm/vmalloc.c                                       |  181 ++--
 mm/vmscan.c                                        |   43 
 mm/vmstat.c                                        |  282 ++----
 mm/workingset.c                                    |    2 
 net/ipv4/tcp.c                                     |    4 
 scripts/kconfig/streamline_config.pl               |   76 -
 scripts/link-vmlinux.sh                            |    4 
 scripts/spelling.txt                               |   16 
 tools/testing/selftests/vm/gup_test.c              |   96 +-
 tools/vm/page_owner_sort.c                         |    4 
 virt/kvm/kvm_main.c                                |    2 
 260 files changed, 3989 insertions(+), 2996 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-06-25  1:38 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-06-25  1:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

24 patches, based on 4a09d388f2ab382f217a764e6a152b3f614246f6.

Subsystems affected by this patch series:

  mm/thp
  nilfs2
  mm/vmalloc
  kthread
  mm/hugetlb
  mm/memory-failure
  mm/pagealloc
  MAINTAINERS
  mailmap

Subsystem: mm/thp

    Hugh Dickins <hughd@google.com>:
    Patch series "mm: page_vma_mapped_walk() cleanup and THP fixes":
      mm: page_vma_mapped_walk(): use page for pvmw->page
      mm: page_vma_mapped_walk(): settle PageHuge on entry
      mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd
      mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block
      mm: page_vma_mapped_walk(): crossing page table boundary
      mm: page_vma_mapped_walk(): add a level of indentation
      mm: page_vma_mapped_walk(): use goto instead of while (1)
      mm: page_vma_mapped_walk(): get vma_address_end() earlier
      mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
      mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()

Subsystem: nilfs2

    Pavel Skripkin <paskripkin@gmail.com>:
      nilfs2: fix memory leak in nilfs_sysfs_delete_device_group

Subsystem: mm/vmalloc

    Claudio Imbrenda <imbrenda@linux.ibm.com>:
    Patch series "mm: add vmalloc_no_huge and use it", v4:
      mm/vmalloc: add vmalloc_no_huge
      KVM: s390: prepare for hugepage vmalloc

    Daniel Axtens <dja@axtens.net>:
      mm/vmalloc: unbreak kasan vmalloc support

Subsystem: kthread

    Petr Mladek <pmladek@suse.com>:
    Patch series "kthread_worker: Fix race between kthread_mod_delayed_work():
      kthread_worker: split code for canceling the delayed work timer
      kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()

Subsystem: mm/hugetlb

    Hugh Dickins <hughd@google.com>:
      mm, futex: fix shared futex pgoff on shmem huge page

Subsystem: mm/memory-failure

    Tony Luck <tony.luck@intel.com>:
    Patch series "mm,hwpoison: fix sending SIGBUS for Action Required MCE", v5:
      mm/memory-failure: use a mutex to avoid memory_failure() races

    Aili Yao <yaoaili@kingsoft.com>:
      mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm/hwpoison: do not lock page again when me_huge_page() successfully recovers

Subsystem: mm/pagealloc

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array

    Mel Gorman <mgorman@techsingularity.net>:
      mm/page_alloc: do bulk array bounds check after checking populated elements

Subsystem: MAINTAINERS

    Marek Behún <kabel@kernel.org>:
      MAINTAINERS: fix Marek's identity again

Subsystem: mailmap

    Marek Behún <kabel@kernel.org>:
      mailmap: add Marek's other e-mail address and identity without diacritics

 .mailmap                |    2 
 MAINTAINERS             |    4 
 arch/s390/kvm/pv.c      |    7 +
 fs/nilfs2/sysfs.c       |    1 
 include/linux/hugetlb.h |   16 ---
 include/linux/pagemap.h |   13 +-
 include/linux/vmalloc.h |    1 
 kernel/futex.c          |    3 
 kernel/kthread.c        |   81 ++++++++++------
 mm/hugetlb.c            |    5 -
 mm/memory-failure.c     |   83 +++++++++++------
 mm/page_alloc.c         |    6 +
 mm/page_vma_mapped.c    |  233 +++++++++++++++++++++++++++---------------------
 mm/vmalloc.c            |   41 ++++++--
 14 files changed, 297 insertions(+), 199 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-06-16  1:22 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-06-16  1:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


18 patches, based on 94f0b2d4a1d0c52035aef425da5e022bd2cb1c71.

Subsystems affected by this patch series:

  mm/memory-failure
  mm/swap
  mm/slub
  mm/hugetlb
  mm/memory-failure
  coredump
  mm/slub
  mm/thp
  mm/sparsemem

Subsystem: mm/memory-failure

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm,hwpoison: fix race with hugetlb page allocation

Subsystem: mm/swap

    Peter Xu <peterx@redhat.com>:
      mm/swap: fix pte_same_as_swp() not removing uffd-wp bit when compare

Subsystem: mm/slub

    Kees Cook <keescook@chromium.org>:
    Patch series "Actually fix freelist pointer vs redzoning", v4:
      mm/slub: clarify verification reporting
      mm/slub: fix redzoning for small allocations
      mm/slub: actually fix freelist pointer vs redzoning

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
      mm/hugetlb: expand restore_reserve_on_error functionality

Subsystem: mm/memory-failure

    yangerkun <yangerkun@huawei.com>:
      mm/memory-failure: make sure wait for page writeback in memory_failure

Subsystem: coredump

    Pingfan Liu <kernelfans@gmail.com>:
      crash_core, vmcoreinfo: append 'SECTION_SIZE_BITS' to vmcoreinfo

Subsystem: mm/slub

    Andrew Morton <akpm@linux-foundation.org>:
      mm/slub.c: include swab.h

Subsystem: mm/thp

    Xu Yu <xuyu@linux.alibaba.com>:
      mm, thp: use head page in __migration_entry_wait()

    Hugh Dickins <hughd@google.com>:
    Patch series "mm/thp: fix THP splitting unmap BUGs and related", v10:
      mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
      mm/thp: make is_huge_zero_pmd() safe and quicker
      mm/thp: try_to_unmap() use TTU_SYNC for safe splitting
      mm/thp: fix vma_address() if virtual address below file offset

    Jue Wang <juew@google.com>:
      mm/thp: fix page_address_in_vma() on file THP tails

    Hugh Dickins <hughd@google.com>:
      mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()

    Yang Shi <shy828301@gmail.com>:
      mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split

Subsystem: mm/sparsemem

    Miles Chen <miles.chen@mediatek.com>:
      mm/sparse: fix check_usemap_section_nr warnings

 Documentation/vm/slub.rst |   10 +--
 fs/hugetlbfs/inode.c      |    1 
 include/linux/huge_mm.h   |    8 ++
 include/linux/hugetlb.h   |    8 ++
 include/linux/mm.h        |    3 +
 include/linux/rmap.h      |    1 
 include/linux/swapops.h   |   15 +++--
 kernel/crash_core.c       |    1 
 mm/huge_memory.c          |   58 ++++++++++---------
 mm/hugetlb.c              |  137 +++++++++++++++++++++++++++++++++++++---------
 mm/internal.h             |   51 ++++++++++++-----
 mm/memory-failure.c       |   36 +++++++++++-
 mm/memory.c               |   41 +++++++++++++
 mm/migrate.c              |    1 
 mm/page_vma_mapped.c      |   27 +++++----
 mm/pgtable-generic.c      |    5 -
 mm/rmap.c                 |   41 +++++++++----
 mm/slab_common.c          |    3 -
 mm/slub.c                 |   37 +++++-------
 mm/sparse.c               |   13 +++-
 mm/swapfile.c             |    2 
 mm/truncate.c             |   43 ++++++--------
 22 files changed, 388 insertions(+), 154 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-06-05  3:00 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-06-05  3:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

13 patches, based on 16f0596fc1d78a1f3ae4628cff962bb297dc908c.

Subsystems affected by this patch series:

  mips
  mm/kfence
  init
  mm/debug
  mm/pagealloc
  mm/memory-hotplug
  mm/hugetlb
  proc
  mm/kasan
  mm/hugetlb
  lib
  ocfs2
  mailmap

Subsystem: mips

    Thomas Bogendoerfer <tsbogend@alpha.franken.de>:
      Revert "MIPS: make userspace mapping young by default"

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      kfence: use TASK_IDLE when awaiting allocation

Subsystem: init

    Mark Rutland <mark.rutland@arm.com>:
      pid: take a reference when initializing `cad_pid`

Subsystem: mm/debug

    Gerald Schaefer <gerald.schaefer@linux.ibm.com>:
      mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests()

Subsystem: mm/pagealloc

    Ding Hui <dinghui@sangfor.com.cn>:
      mm/page_alloc: fix counting of free pages after take off from buddy

Subsystem: mm/memory-hotplug

    David Hildenbrand <david@redhat.com>:
      drivers/base/memory: fix trying offlining memory blocks with memory holes on aarch64

Subsystem: mm/hugetlb

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      hugetlb: pass head page to remove_hugetlb_page()

Subsystem: proc

    David Matlack <dmatlack@google.com>:
      proc: add .gitignore for proc-subset-pid selftest

Subsystem: mm/kasan

    Yu Kuai <yukuai3@huawei.com>:
      mm/kasan/init.c: fix doc warning

Subsystem: mm/hugetlb

    Mina Almasry <almasrymina@google.com>:
      mm, hugetlb: fix simple resv_huge_pages underflow on UFFDIO_COPY

Subsystem: lib

    YueHaibing <yuehaibing@huawei.com>:
      lib: crc64: fix kernel-doc warning

Subsystem: ocfs2

    Junxiao Bi <junxiao.bi@oracle.com>:
      ocfs2: fix data corruption by fallocate

Subsystem: mailmap

    Michel Lespinasse <michel@lespinasse.org>:
      mailmap: use private address for Michel Lespinasse

 .mailmap                                |    3 +
 arch/mips/mm/cache.c                    |   30 ++++++++---------
 drivers/base/memory.c                   |    6 +--
 fs/ocfs2/file.c                         |   55 +++++++++++++++++++++++++++++---
 include/linux/pgtable.h                 |    8 ++++
 init/main.c                             |    2 -
 lib/crc64.c                             |    2 -
 mm/debug_vm_pgtable.c                   |    4 +-
 mm/hugetlb.c                            |   16 +++++++--
 mm/kasan/init.c                         |    4 +-
 mm/kfence/core.c                        |    6 +--
 mm/memory.c                             |    4 ++
 mm/page_alloc.c                         |    2 +
 tools/testing/selftests/proc/.gitignore |    1 
 14 files changed, 107 insertions(+), 36 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-05-23  0:41 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-05-23  0:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

10 patches, based on 4ff2473bdb4cf2bb7d208ccf4418d3d7e6b1652c.

Subsystems affected by this patch series:

  mm/pagealloc
  mm/gup
  ipc
  selftests
  mm/kasan
  kernel/watchdog
  bitmap
  procfs
  lib
  mm/userfaultfd

Subsystem: mm/pagealloc

    Arnd Bergmann <arnd@arndb.de>:
      mm/shuffle: fix section mismatch warning

Subsystem: mm/gup

    Michal Hocko <mhocko@suse.com>:
      Revert "mm/gup: check page posion status for coredump."

Subsystem: ipc

    Varad Gautam <varad.gautam@suse.com>:
      ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry

Subsystem: selftests

    Yang Yingliang <yangyingliang@huawei.com>:
      tools/testing/selftests/exec: fix link error

Subsystem: mm/kasan

    Alexander Potapenko <glider@google.com>:
      kasan: slab: always reset the tag in get_freepointer_safe()

Subsystem: kernel/watchdog

    Petr Mladek <pmladek@suse.com>:
      watchdog: reliable handling of timestamps

Subsystem: bitmap

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      linux/bits.h: fix compilation error with GENMASK

Subsystem: procfs

    Alexey Dobriyan <adobriyan@gmail.com>:
      proc: remove Alexey from MAINTAINERS

Subsystem: lib

    Zhen Lei <thunder.leizhen@huawei.com>:
      lib: kunit: suppress a compilation warning of frame size

Subsystem: mm/userfaultfd

    Mike Kravetz <mike.kravetz@oracle.com>:
      userfaultfd: hugetlbfs: fix new flag usage in error path

 MAINTAINERS                           |    1 -
 fs/hugetlbfs/inode.c                  |    2 +-
 include/linux/bits.h                  |    2 +-
 include/linux/const.h                 |    8 ++++++++
 include/linux/minmax.h                |   10 ++--------
 ipc/mqueue.c                          |    6 ++++--
 ipc/msg.c                             |    6 ++++--
 ipc/sem.c                             |    6 ++++--
 kernel/watchdog.c                     |   34 ++++++++++++++++++++--------------
 lib/Makefile                          |    1 +
 mm/gup.c                              |    4 ----
 mm/internal.h                         |   20 --------------------
 mm/shuffle.h                          |    4 ++--
 mm/slub.c                             |    1 +
 mm/userfaultfd.c                      |   28 ++++++++++++++--------------
 tools/include/linux/bits.h            |    2 +-
 tools/include/linux/const.h           |    8 ++++++++
 tools/testing/selftests/exec/Makefile |    6 +++---
 18 files changed, 74 insertions(+), 75 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-05-15  0:26 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-05-15  0:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

13 patches, based on bd3c9cdb21a2674dd0db70199df884828e37abd4.

Subsystems affected by this patch series:

  mm/hugetlb
  mm/slub
  resource
  squashfs
  mm/userfaultfd
  mm/ksm
  mm/pagealloc
  mm/kasan
  mm/pagemap
  hfsplus
  modprobe
  mm/ioremap

Subsystem: mm/hugetlb

    Peter Xu <peterx@redhat.com>:
    Patch series "mm/hugetlb: Fix issues on file sealing and fork", v2:
      mm/hugetlb: fix F_SEAL_FUTURE_WRITE
      mm/hugetlb: fix cow where page writtable in child

Subsystem: mm/slub

    Vlastimil Babka <vbabka@suse.cz>:
      mm, slub: move slub_debug static key enabling outside slab_mutex

Subsystem: resource

    Alistair Popple <apopple@nvidia.com>:
      kernel/resource: fix return code check in __request_free_mem_region

Subsystem: squashfs

    Phillip Lougher <phillip@squashfs.org.uk>:
      squashfs: fix divide error in calculate_skip()

Subsystem: mm/userfaultfd

    Axel Rasmussen <axelrasmussen@google.com>:
      userfaultfd: release page in error path to avoid BUG_ON

Subsystem: mm/ksm

    Hugh Dickins <hughd@google.com>:
      ksm: revert "use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()"

Subsystem: mm/pagealloc

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: fix struct page layout on 32-bit systems

Subsystem: mm/kasan

    Peter Collingbourne <pcc@google.com>:
      kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled

Subsystem: mm/pagemap

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/filemap: fix readahead return types

Subsystem: hfsplus

    Jouni Roivas <jouni.roivas@tuxera.com>:
      hfsplus: prevent corruption in shrinking truncate

Subsystem: modprobe

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      docs: admin-guide: update description for kernel.modprobe sysctl

Subsystem: mm/ioremap

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      mm/ioremap: fix iomap_max_page_shift

 Documentation/admin-guide/sysctl/kernel.rst |    9 ++++---
 fs/hfsplus/extents.c                        |    7 +++--
 fs/hugetlbfs/inode.c                        |    5 ++++
 fs/iomap/buffered-io.c                      |    4 +--
 fs/squashfs/file.c                          |    6 ++--
 include/linux/mm.h                          |   32 ++++++++++++++++++++++++++
 include/linux/mm_types.h                    |    4 +--
 include/linux/pagemap.h                     |    6 ++--
 include/net/page_pool.h                     |   12 +++++++++
 kernel/resource.c                           |    2 -
 lib/test_kasan.c                            |   29 ++++++++++++++++++-----
 mm/hugetlb.c                                |    1 
 mm/ioremap.c                                |    6 ++--
 mm/ksm.c                                    |    3 +-
 mm/shmem.c                                  |   34 ++++++++++++----------------
 mm/slab_common.c                            |   10 ++++++++
 mm/slub.c                                   |    9 -------
 net/core/page_pool.c                        |   12 +++++----
 18 files changed, 129 insertions(+), 62 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-05-07  1:01 incoming Andrew Morton
@ 2021-05-07  7:12 ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2021-05-07  7:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Thu, May 6, 2021 at 6:01 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> I've been wobbly about the secretmem patches due to doubts about
> whether the feature is sufficiently useful to justify inclusion, but
> developers are now weighing in with helpful information and I've asked Mike
> for an extensively updated [0/n] changelog.  This will take a few days
> to play out so it is possible that I will prevail upon you for a post-rc1
> merge.

Oh, much too late for this release by now.

> If that's a problem, there's always 5.13-rc1.

5.13-rc1 is two days from now, it would be for 5.14-rc1.. How time -
and version numbers - fly.

             Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-05-07  1:01 Andrew Morton
  2021-05-07  7:12 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-05-07  1:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


This is everything else from -mm for this merge window, with the
possible exception of Mike Rapoport's "secretmem" syscall patch series
(https://lkml.kernel.org/r/20210303162209.8609-1-rppt@kernel.org).

I've been wobbly about the secretmem patches due to doubts about
whether the feature is sufficiently useful to justify inclusion, but
developers are now weighing in with helpful information and I've asked Mike
for an extensively updated [0/n] changelog.  This will take a few days
to play out so it is possible that I will prevail upon you for a post-rc1
merge.  If that's a problem, there's always 5.13-rc1.

91 patches, based on 8ca5297e7e38f2dc8c753d33a5092e7be181fff0, plus
previously sent patches.

Thanks.



Subsystems affected by this patch series:

  alpha
  procfs
  sysctl
  misc
  core-kernel
  bitmap
  lib
  compat
  checkpatch
  epoll
  isofs
  nilfs2
  hpfs
  exit
  fork
  kexec
  gcov
  panic
  delayacct
  gdb
  resource
  selftests
  async
  initramfs
  ipc
  mm/cleanups
  drivers/char
  mm/slub
  spelling

Subsystem: alpha

    Randy Dunlap <rdunlap@infradead.org>:
      alpha: eliminate old-style function definitions
      alpha: csum_partial_copy.c: add function prototypes from <net/checksum.h>

Subsystem: procfs

    Colin Ian King <colin.king@canonical.com>:
      fs/proc/generic.c: fix incorrect pde_is_permanent check

    Alexey Dobriyan <adobriyan@gmail.com>:
      proc: save LOC in __xlate_proc_name()
      proc: mandate ->proc_lseek in "struct proc_ops"
      proc: delete redundant subset=pid check
      selftests: proc: test subset=pid

Subsystem: sysctl

    zhouchuangao <zhouchuangao@vivo.com>:
      proc/sysctl: fix function name error in comments

Subsystem: misc

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      include: remove pagemap.h from blkdev.h

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      kernel.h: drop inclusion in bitmap.h

    Wan Jiabing <wanjiabing@vivo.com>:
      linux/profile.h: remove unnecessary declaration

Subsystem: core-kernel

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      kernel/async.c: fix pr_debug statement
      kernel/cred.c: make init_groups static

Subsystem: bitmap

    Yury Norov <yury.norov@gmail.com>:
    Patch series "lib/find_bit: fast path for small bitmaps", v6:
      tools: disable -Wno-type-limits
      tools: bitmap: sync function declarations with the kernel
      tools: sync BITMAP_LAST_WORD_MASK() macro with the kernel
      arch: rearrange headers inclusion order in asm/bitops for m68k, sh and h8300
      lib: extend the scope of small_const_nbits() macro
      tools: sync small_const_nbits() macro with the kernel
      lib: inline _find_next_bit() wrappers
      tools: sync find_next_bit implementation
      lib: add fast path for find_next_*_bit()
      lib: add fast path for find_first_*_bit() and find_last_bit()
      tools: sync lib/find_bit implementation
      MAINTAINERS: add entry for the bitmap API

Subsystem: lib

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      lib/bch.c: fix a typo in the file bch.c

    Wang Qing <wangqing@vivo.com>:
      lib: fix inconsistent indenting in process_bit1()

    ToastC <mrtoastcheng@gmail.com>:
      lib/list_sort.c: fix typo in function description

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      lib/genalloc.c: Fix a typo

    Richard Fitzgerald <rf@opensource.cirrus.com>:
      lib: crc8: pointer to data block should be const

    Zqiang <qiang.zhang@windriver.com>:
      lib: stackdepot: turn depot_lock spinlock to raw_spinlock

    Alex Shi <alexs@kernel.org>:
      lib/percpu_counter: tame kernel-doc compile warning
      lib/genalloc: add parameter description to fix doc compile warning

    Randy Dunlap <rdunlap@infradead.org>:
      lib: parser: clean up kernel-doc

Subsystem: compat

    Masahiro Yamada <masahiroy@kernel.org>:
      include/linux/compat.h: remove unneeded declaration from COMPAT_SYSCALL_DEFINEx()

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: warn when missing newline in return sysfs_emit() formats

    Vincent Mailhol <mailhol.vincent@wanadoo.fr>:
      checkpatch: exclude four preprocessor sub-expressions from MACRO_ARG_REUSE

    Christophe JAILLET <christophe.jaillet@wanadoo.fr>:
      checkpatch: improve ALLOC_ARRAY_ARGS test

Subsystem: epoll

    Davidlohr Bueso <dave@stgolabs.net>:
    Patch series "fs/epoll: restore user-visible behavior upon event ready":
      kselftest: introduce new epoll test case
      fs/epoll: restore waking from ep_done_scan()

Subsystem: isofs

    "Gustavo A. R. Silva" <gustavoars@kernel.org>:
      isofs: fix fall-through warnings for Clang

Subsystem: nilfs2

    Liu xuzhi <liu.xuzhi@zte.com.cn>:
      fs/nilfs2: fix misspellings using codespell tool

    Lu Jialin <lujialin4@huawei.com>:
      nilfs2: fix typos in comments

Subsystem: hpfs

    "Gustavo A. R. Silva" <gustavoars@kernel.org>:
      hpfs: replace one-element array with flexible-array member

Subsystem: exit

    Jim Newsome <jnewsome@torproject.org>:
      do_wait: make PIDTYPE_PID case O(1) instead of O(n)

Subsystem: fork

    Rolf Eike Beer <eb@emlix.com>:
      kernel/fork.c: simplify copy_mm()

    Xiaofeng Cao <cxfcosmos@gmail.com>:
      kernel/fork.c: fix typos

Subsystem: kexec

    Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>:
      kernel/crash_core: add crashkernel=auto for vmcore creation

    Joe LeVeque <jolevequ@microsoft.com>:
      kexec: Add kexec reboot string

    Jia-Ju Bai <baijiaju1990@gmail.com>:
      kernel: kexec_file: fix error return code of kexec_calculate_store_digests()

    Pavel Tatashin <pasha.tatashin@soleen.com>:
      kexec: dump kmessage before machine_kexec

Subsystem: gcov

    Johannes Berg <johannes.berg@intel.com>:
      gcov: combine common code
      gcov: simplify buffer allocation
      gcov: use kvmalloc()

    Nick Desaulniers <ndesaulniers@google.com>:
      gcov: clang: drop support for clang-10 and older

Subsystem: panic

    He Ying <heying24@huawei.com>:
      smp: kernel/panic.c - silence warnings

Subsystem: delayacct

    Yafang Shao <laoar.shao@gmail.com>:
      delayacct: clear right task's flag after blkio completes

Subsystem: gdb

    Johannes Berg <johannes.berg@intel.com>:
      gdb: lx-symbols: store the abspath()

    Barry Song <song.bao.hua@hisilicon.com>:
    Patch series "scripts/gdb: clarify the platforms supporting lx_current and add arm64 support", v2:
      scripts/gdb: document lx_current is only supported by x86
      scripts/gdb: add lx_current support for arm64

Subsystem: resource

    David Hildenbrand <david@redhat.com>:
    Patch series "kernel/resource: make walk_system_ram_res() and walk_mem_res() search the whole tree", v2:
      kernel/resource: make walk_system_ram_res() find all busy IORESOURCE_SYSTEM_RAM resources
      kernel/resource: make walk_mem_res() find all busy IORESOURCE_MEM resources
      kernel/resource: remove first_lvl / siblings_only logic

    Alistair Popple <apopple@nvidia.com>:
      kernel/resource: allow region_intersects users to hold resource_lock
      kernel/resource: refactor __request_region to allow external locking
      kernel/resource: fix locking in request_free_mem_region

Subsystem: selftests

    Zhang Yunkai <zhang.yunkai@zte.com.cn>:
      selftests: remove duplicate include

Subsystem: async

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
      kernel/async.c: stop guarding pr_debug() statements
      kernel/async.c: remove async_unregister_domain()

Subsystem: initramfs

    Rasmus Villemoes <linux@rasmusvillemoes.dk>:
    Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3:
      init/initramfs.c: do unpacking asynchronously
      modules: add CONFIG_MODPROBE_PATH

Subsystem: ipc

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      ipc/sem.c: mundane typo fixes

Subsystem: mm/cleanups

    Shijie Luo <luoshijie1@huawei.com>:
      mm: fix some typos and code style problems

Subsystem: drivers/char

    David Hildenbrand <david@redhat.com>:
    Patch series "drivers/char: remove /dev/kmem for good":
      drivers/char: remove /dev/kmem for good
      mm: remove xlate_dev_kmem_ptr()
      mm/vmalloc: remove vwrite()

Subsystem: mm/slub

    Maninder Singh <maninder1.s@samsung.com>:
      arm: print alloc free paths for address in registers

Subsystem: spelling

    Drew Fustini <drew@beagleboard.org>:
      scripts/spelling.txt: add "overlfow"

    zuoqilin <zuoqilin@yulong.com>:
      scripts/spelling.txt: Add "diabled" typo

    Drew Fustini <drew@beagleboard.org>:
      scripts/spelling.txt: add "overflw"

    Colin Ian King <colin.king@canonical.com>:
      mm/slab.c: fix spelling mistake "disired" -> "desired"

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      include/linux/pgtable.h: few spelling fixes

    zhouchuangao <zhouchuangao@vivo.com>:
      kernel/umh.c: fix some spelling mistakes

    Xiaofeng Cao <cxfcosmos@gmail.com>:
      kernel/user_namespace.c: fix typos

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      kernel/up.c: fix typo

    Xiaofeng Cao <caoxiaofeng@yulong.com>:
      kernel/sys.c: fix typo

    dingsenjie <dingsenjie@yulong.com>:
      fs: fat: fix spelling typo of values

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      ipc/sem.c: spelling fix

    Masahiro Yamada <masahiroy@kernel.org>:
      treewide: remove editor modelines and cruft

    Ingo Molnar <mingo@kernel.org>:
      mm: fix typos in comments

    Lu Jialin <lujialin4@huawei.com>:
      mm: fix typos in comments

 Documentation/admin-guide/devices.txt                         |    2 
 Documentation/admin-guide/kdump/kdump.rst                     |    3 
 Documentation/admin-guide/kernel-parameters.txt               |   18 
 Documentation/dev-tools/gdb-kernel-debugging.rst              |    4 
 MAINTAINERS                                                   |   16 
 arch/Kconfig                                                  |   20 
 arch/alpha/include/asm/io.h                                   |    5 
 arch/alpha/kernel/pc873xx.c                                   |    4 
 arch/alpha/lib/csum_partial_copy.c                            |    1 
 arch/arm/configs/dove_defconfig                               |    1 
 arch/arm/configs/magician_defconfig                           |    1 
 arch/arm/configs/moxart_defconfig                             |    1 
 arch/arm/configs/mps2_defconfig                               |    1 
 arch/arm/configs/mvebu_v5_defconfig                           |    1 
 arch/arm/configs/xcep_defconfig                               |    1 
 arch/arm/include/asm/bug.h                                    |    1 
 arch/arm/include/asm/io.h                                     |    5 
 arch/arm/kernel/process.c                                     |   11 
 arch/arm/kernel/traps.c                                       |    1 
 arch/h8300/include/asm/bitops.h                               |    8 
 arch/hexagon/configs/comet_defconfig                          |    1 
 arch/hexagon/include/asm/io.h                                 |    1 
 arch/ia64/include/asm/io.h                                    |    1 
 arch/ia64/include/asm/uaccess.h                               |   18 
 arch/m68k/atari/time.c                                        |    7 
 arch/m68k/configs/amcore_defconfig                            |    1 
 arch/m68k/include/asm/bitops.h                                |    6 
 arch/m68k/include/asm/io_mm.h                                 |    5 
 arch/mips/include/asm/io.h                                    |    5 
 arch/openrisc/configs/or1ksim_defconfig                       |    1 
 arch/parisc/include/asm/io.h                                  |    5 
 arch/parisc/include/asm/pdc_chassis.h                         |    1 
 arch/powerpc/include/asm/io.h                                 |    5 
 arch/s390/include/asm/io.h                                    |    5 
 arch/sh/configs/edosk7705_defconfig                           |    1 
 arch/sh/configs/se7206_defconfig                              |    1 
 arch/sh/configs/sh2007_defconfig                              |    1 
 arch/sh/configs/sh7724_generic_defconfig                      |    1 
 arch/sh/configs/sh7770_generic_defconfig                      |    1 
 arch/sh/configs/sh7785lcr_32bit_defconfig                     |    1 
 arch/sh/include/asm/bitops.h                                  |    5 
 arch/sh/include/asm/io.h                                      |    5 
 arch/sparc/configs/sparc64_defconfig                          |    1 
 arch/sparc/include/asm/io_64.h                                |    5 
 arch/um/drivers/cow.h                                         |    7 
 arch/xtensa/configs/xip_kc705_defconfig                       |    1 
 block/blk-settings.c                                          |    1 
 drivers/auxdisplay/panel.c                                    |    7 
 drivers/base/firmware_loader/main.c                           |    2 
 drivers/block/brd.c                                           |    1 
 drivers/block/loop.c                                          |    1 
 drivers/char/Kconfig                                          |   10 
 drivers/char/mem.c                                            |  231 --------
 drivers/gpu/drm/qxl/qxl_drv.c                                 |    1 
 drivers/isdn/capi/kcapi_proc.c                                |    1 
 drivers/md/bcache/super.c                                     |    1 
 drivers/media/usb/pwc/pwc-uncompress.c                        |    3 
 drivers/net/ethernet/adaptec/starfire.c                       |    8 
 drivers/net/ethernet/amd/atarilance.c                         |    8 
 drivers/net/ethernet/amd/pcnet32.c                            |    7 
 drivers/net/wireless/intersil/hostap/hostap_proc.c            |    1 
 drivers/net/wireless/intersil/orinoco/orinoco_nortel.c        |    8 
 drivers/net/wireless/intersil/orinoco/orinoco_pci.c           |    8 
 drivers/net/wireless/intersil/orinoco/orinoco_plx.c           |    8 
 drivers/net/wireless/intersil/orinoco/orinoco_tmd.c           |    8 
 drivers/nvdimm/btt.c                                          |    1 
 drivers/nvdimm/pmem.c                                         |    1 
 drivers/parport/parport_ip32.c                                |   12 
 drivers/platform/x86/dell/dell_rbu.c                          |    3 
 drivers/scsi/53c700.c                                         |    1 
 drivers/scsi/53c700.h                                         |    1 
 drivers/scsi/ch.c                                             |    6 
 drivers/scsi/esas2r/esas2r_main.c                             |    1 
 drivers/scsi/ips.c                                            |   20 
 drivers/scsi/ips.h                                            |   20 
 drivers/scsi/lasi700.c                                        |    1 
 drivers/scsi/megaraid/mbox_defs.h                             |    2 
 drivers/scsi/megaraid/mega_common.h                           |    2 
 drivers/scsi/megaraid/megaraid_mbox.c                         |    2 
 drivers/scsi/megaraid/megaraid_mbox.h                         |    2 
 drivers/scsi/qla1280.c                                        |   12 
 drivers/scsi/scsicam.c                                        |    1 
 drivers/scsi/sni_53c710.c                                     |    1 
 drivers/video/fbdev/matrox/matroxfb_base.c                    |    9 
 drivers/video/fbdev/vga16fb.c                                 |   10 
 fs/configfs/configfs_internal.h                               |    4 
 fs/configfs/dir.c                                             |    4 
 fs/configfs/file.c                                            |    4 
 fs/configfs/inode.c                                           |    4 
 fs/configfs/item.c                                            |    4 
 fs/configfs/mount.c                                           |    4 
 fs/configfs/symlink.c                                         |    4 
 fs/eventpoll.c                                                |    6 
 fs/fat/fatent.c                                               |    2 
 fs/hpfs/hpfs.h                                                |    3 
 fs/isofs/rock.c                                               |    1 
 fs/nfs/dir.c                                                  |    7 
 fs/nfs/nfs4proc.c                                             |    6 
 fs/nfs/nfs4renewd.c                                           |    6 
 fs/nfs/nfs4state.c                                            |    6 
 fs/nfs/nfs4xdr.c                                              |    6 
 fs/nfsd/nfs4proc.c                                            |    6 
 fs/nfsd/nfs4xdr.c                                             |    6 
 fs/nfsd/xdr4.h                                                |    6 
 fs/nilfs2/cpfile.c                                            |    2 
 fs/nilfs2/ioctl.c                                             |    4 
 fs/nilfs2/segment.c                                           |    4 
 fs/nilfs2/the_nilfs.c                                         |    2 
 fs/ocfs2/acl.c                                                |    4 
 fs/ocfs2/acl.h                                                |    4 
 fs/ocfs2/alloc.c                                              |    4 
 fs/ocfs2/alloc.h                                              |    4 
 fs/ocfs2/aops.c                                               |    4 
 fs/ocfs2/aops.h                                               |    4 
 fs/ocfs2/blockcheck.c                                         |    4 
 fs/ocfs2/blockcheck.h                                         |    4 
 fs/ocfs2/buffer_head_io.c                                     |    4 
 fs/ocfs2/buffer_head_io.h                                     |    4 
 fs/ocfs2/cluster/heartbeat.c                                  |    4 
 fs/ocfs2/cluster/heartbeat.h                                  |    4 
 fs/ocfs2/cluster/masklog.c                                    |    4 
 fs/ocfs2/cluster/masklog.h                                    |    4 
 fs/ocfs2/cluster/netdebug.c                                   |    4 
 fs/ocfs2/cluster/nodemanager.c                                |    4 
 fs/ocfs2/cluster/nodemanager.h                                |    4 
 fs/ocfs2/cluster/ocfs2_heartbeat.h                            |    4 
 fs/ocfs2/cluster/ocfs2_nodemanager.h                          |    4 
 fs/ocfs2/cluster/quorum.c                                     |    4 
 fs/ocfs2/cluster/quorum.h                                     |    4 
 fs/ocfs2/cluster/sys.c                                        |    4 
 fs/ocfs2/cluster/sys.h                                        |    4 
 fs/ocfs2/cluster/tcp.c                                        |    4 
 fs/ocfs2/cluster/tcp.h                                        |    4 
 fs/ocfs2/cluster/tcp_internal.h                               |    4 
 fs/ocfs2/dcache.c                                             |    4 
 fs/ocfs2/dcache.h                                             |    4 
 fs/ocfs2/dir.c                                                |    4 
 fs/ocfs2/dir.h                                                |    4 
 fs/ocfs2/dlm/dlmapi.h                                         |    4 
 fs/ocfs2/dlm/dlmast.c                                         |    4 
 fs/ocfs2/dlm/dlmcommon.h                                      |    4 
 fs/ocfs2/dlm/dlmconvert.c                                     |    4 
 fs/ocfs2/dlm/dlmconvert.h                                     |    4 
 fs/ocfs2/dlm/dlmdebug.c                                       |    4 
 fs/ocfs2/dlm/dlmdebug.h                                       |    4 
 fs/ocfs2/dlm/dlmdomain.c                                      |    4 
 fs/ocfs2/dlm/dlmdomain.h                                      |    4 
 fs/ocfs2/dlm/dlmlock.c                                        |    4 
 fs/ocfs2/dlm/dlmmaster.c                                      |    4 
 fs/ocfs2/dlm/dlmrecovery.c                                    |    4 
 fs/ocfs2/dlm/dlmthread.c                                      |    4 
 fs/ocfs2/dlm/dlmunlock.c                                      |    4 
 fs/ocfs2/dlmfs/dlmfs.c                                        |    4 
 fs/ocfs2/dlmfs/userdlm.c                                      |    4 
 fs/ocfs2/dlmfs/userdlm.h                                      |    4 
 fs/ocfs2/dlmglue.c                                            |    4 
 fs/ocfs2/dlmglue.h                                            |    4 
 fs/ocfs2/export.c                                             |    4 
 fs/ocfs2/export.h                                             |    4 
 fs/ocfs2/extent_map.c                                         |    4 
 fs/ocfs2/extent_map.h                                         |    4 
 fs/ocfs2/file.c                                               |    4 
 fs/ocfs2/file.h                                               |    4 
 fs/ocfs2/filecheck.c                                          |    4 
 fs/ocfs2/filecheck.h                                          |    4 
 fs/ocfs2/heartbeat.c                                          |    4 
 fs/ocfs2/heartbeat.h                                          |    4 
 fs/ocfs2/inode.c                                              |    4 
 fs/ocfs2/inode.h                                              |    4 
 fs/ocfs2/journal.c                                            |    4 
 fs/ocfs2/journal.h                                            |    4 
 fs/ocfs2/localalloc.c                                         |    4 
 fs/ocfs2/localalloc.h                                         |    4 
 fs/ocfs2/locks.c                                              |    4 
 fs/ocfs2/locks.h                                              |    4 
 fs/ocfs2/mmap.c                                               |    4 
 fs/ocfs2/move_extents.c                                       |    4 
 fs/ocfs2/move_extents.h                                       |    4 
 fs/ocfs2/namei.c                                              |    4 
 fs/ocfs2/namei.h                                              |    4 
 fs/ocfs2/ocfs1_fs_compat.h                                    |    4 
 fs/ocfs2/ocfs2.h                                              |    4 
 fs/ocfs2/ocfs2_fs.h                                           |    4 
 fs/ocfs2/ocfs2_ioctl.h                                        |    4 
 fs/ocfs2/ocfs2_lockid.h                                       |    4 
 fs/ocfs2/ocfs2_lockingver.h                                   |    4 
 fs/ocfs2/refcounttree.c                                       |    4 
 fs/ocfs2/refcounttree.h                                       |    4 
 fs/ocfs2/reservations.c                                       |    4 
 fs/ocfs2/reservations.h                                       |    4 
 fs/ocfs2/resize.c                                             |    4 
 fs/ocfs2/resize.h                                             |    4 
 fs/ocfs2/slot_map.c                                           |    4 
 fs/ocfs2/slot_map.h                                           |    4 
 fs/ocfs2/stack_o2cb.c                                         |    4 
 fs/ocfs2/stack_user.c                                         |    4 
 fs/ocfs2/stackglue.c                                          |    4 
 fs/ocfs2/stackglue.h                                          |    4 
 fs/ocfs2/suballoc.c                                           |    4 
 fs/ocfs2/suballoc.h                                           |    4 
 fs/ocfs2/super.c                                              |    4 
 fs/ocfs2/super.h                                              |    4 
 fs/ocfs2/symlink.c                                            |    4 
 fs/ocfs2/symlink.h                                            |    4 
 fs/ocfs2/sysfile.c                                            |    4 
 fs/ocfs2/sysfile.h                                            |    4 
 fs/ocfs2/uptodate.c                                           |    4 
 fs/ocfs2/uptodate.h                                           |    4 
 fs/ocfs2/xattr.c                                              |    4 
 fs/ocfs2/xattr.h                                              |    4 
 fs/proc/generic.c                                             |   13 
 fs/proc/inode.c                                               |   18 
 fs/proc/proc_sysctl.c                                         |    2 
 fs/reiserfs/procfs.c                                          |   10 
 include/asm-generic/bitops/find.h                             |  108 +++
 include/asm-generic/bitops/le.h                               |   38 +
 include/asm-generic/bitsperlong.h                             |   12 
 include/asm-generic/io.h                                      |   11 
 include/linux/align.h                                         |   15 
 include/linux/async.h                                         |    1 
 include/linux/bitmap.h                                        |   11 
 include/linux/bitops.h                                        |   12 
 include/linux/blkdev.h                                        |    1 
 include/linux/compat.h                                        |    1 
 include/linux/configfs.h                                      |    4 
 include/linux/crc8.h                                          |    2 
 include/linux/cred.h                                          |    1 
 include/linux/delayacct.h                                     |   20 
 include/linux/fs.h                                            |    2 
 include/linux/genl_magic_func.h                               |    1 
 include/linux/genl_magic_struct.h                             |    1 
 include/linux/gfp.h                                           |    2 
 include/linux/init_task.h                                     |    1 
 include/linux/initrd.h                                        |    2 
 include/linux/kernel.h                                        |    9 
 include/linux/mm.h                                            |    2 
 include/linux/mmzone.h                                        |    2 
 include/linux/pgtable.h                                       |   10 
 include/linux/proc_fs.h                                       |    1 
 include/linux/profile.h                                       |    3 
 include/linux/smp.h                                           |    8 
 include/linux/swap.h                                          |    1 
 include/linux/vmalloc.h                                       |    7 
 include/uapi/linux/if_bonding.h                               |   11 
 include/uapi/linux/nfs4.h                                     |    6 
 include/xen/interface/elfnote.h                               |   10 
 include/xen/interface/hvm/hvm_vcpu.h                          |   10 
 include/xen/interface/io/xenbus.h                             |   10 
 init/Kconfig                                                  |   12 
 init/initramfs.c                                              |   38 +
 init/main.c                                                   |    1 
 ipc/sem.c                                                     |   12 
 kernel/async.c                                                |   68 --
 kernel/configs/android-base.config                            |    1 
 kernel/crash_core.c                                           |    7 
 kernel/cred.c                                                 |    2 
 kernel/exit.c                                                 |   67 ++
 kernel/fork.c                                                 |   23 
 kernel/gcov/Kconfig                                           |    1 
 kernel/gcov/base.c                                            |   49 +
 kernel/gcov/clang.c                                           |  282 ----------
 kernel/gcov/fs.c                                              |  146 ++++-
 kernel/gcov/gcc_4_7.c                                         |  173 ------
 kernel/gcov/gcov.h                                            |   14 
 kernel/kexec_core.c                                           |    4 
 kernel/kexec_file.c                                           |    4 
 kernel/kmod.c                                                 |    2 
 kernel/resource.c                                             |  198 ++++---
 kernel/sys.c                                                  |   14 
 kernel/umh.c                                                  |    8 
 kernel/up.c                                                   |    2 
 kernel/user_namespace.c                                       |    6 
 lib/bch.c                                                     |    2 
 lib/crc8.c                                                    |    2 
 lib/decompress_unlzma.c                                       |    2 
 lib/find_bit.c                                                |   68 --
 lib/genalloc.c                                                |    7 
 lib/list_sort.c                                               |    2 
 lib/parser.c                                                  |   61 +-
 lib/percpu_counter.c                                          |    2 
 lib/stackdepot.c                                              |    6 
 mm/balloon_compaction.c                                       |    4 
 mm/compaction.c                                               |    4 
 mm/filemap.c                                                  |    2 
 mm/gup.c                                                      |    2 
 mm/highmem.c                                                  |    2 
 mm/huge_memory.c                                              |    6 
 mm/hugetlb.c                                                  |    6 
 mm/internal.h                                                 |    2 
 mm/kasan/kasan.h                                              |    8 
 mm/kasan/quarantine.c                                         |    4 
 mm/kasan/shadow.c                                             |    4 
 mm/kfence/report.c                                            |    2 
 mm/khugepaged.c                                               |    2 
 mm/ksm.c                                                      |    6 
 mm/madvise.c                                                  |    4 
 mm/memcontrol.c                                               |   18 
 mm/memory-failure.c                                           |    2 
 mm/memory.c                                                   |   18 
 mm/mempolicy.c                                                |    6 
 mm/migrate.c                                                  |    8 
 mm/mmap.c                                                     |    4 
 mm/mprotect.c                                                 |    2 
 mm/mremap.c                                                   |    2 
 mm/nommu.c                                                    |   10 
 mm/oom_kill.c                                                 |    2 
 mm/page-writeback.c                                           |    4 
 mm/page_alloc.c                                               |   16 
 mm/page_owner.c                                               |    2 
 mm/page_vma_mapped.c                                          |    2 
 mm/percpu-internal.h                                          |    2 
 mm/percpu.c                                                   |    2 
 mm/pgalloc-track.h                                            |    6 
 mm/rmap.c                                                     |    2 
 mm/slab.c                                                     |    8 
 mm/slub.c                                                     |    2 
 mm/swap.c                                                     |    4 
 mm/swap_slots.c                                               |    2 
 mm/swap_state.c                                               |    2 
 mm/vmalloc.c                                                  |  124 ----
 mm/vmstat.c                                                   |    2 
 mm/z3fold.c                                                   |    2 
 mm/zpool.c                                                    |    2 
 mm/zsmalloc.c                                                 |    6 
 samples/configfs/configfs_sample.c                            |    2 
 scripts/checkpatch.pl                                         |   15 
 scripts/gdb/linux/cpus.py                                     |   23 
 scripts/gdb/linux/symbols.py                                  |    3 
 scripts/spelling.txt                                          |    3 
 tools/include/asm-generic/bitops/find.h                       |   85 ++-
 tools/include/asm-generic/bitsperlong.h                       |    3 
 tools/include/linux/bitmap.h                                  |   18 
 tools/lib/bitmap.c                                            |    4 
 tools/lib/find_bit.c                                          |   56 -
 tools/scripts/Makefile.include                                |    1 
 tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c |   44 +
 tools/testing/selftests/kvm/lib/sparsebit.c                   |    1 
 tools/testing/selftests/mincore/mincore_selftest.c            |    1 
 tools/testing/selftests/powerpc/mm/tlbie_test.c               |    1 
 tools/testing/selftests/proc/Makefile                         |    1 
 tools/testing/selftests/proc/proc-subset-pid.c                |  121 ++++
 tools/testing/selftests/proc/read.c                           |    4 
 tools/usb/hcd-tests.sh                                        |    2 
 343 files changed, 1383 insertions(+), 2119 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-05-05 17:44       ` incoming Andrew Morton
@ 2021-05-06  3:19         ` Anshuman Khandual
  0 siblings, 0 replies; 485+ messages in thread
From: Anshuman Khandual @ 2021-05-06  3:19 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds; +Cc: Konstantin Ryabitsev, Linux-MM, mm-commits



On 5/5/21 11:14 PM, Andrew Morton wrote:
> On Wed, 5 May 2021 10:10:33 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
>> On Tue, May 4, 2021 at 8:16 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>>> Let me resend right now with the same in-reply-to.  Hopefully they will
>>> land in the correct place.
>> Well, you re-sent it twice, and I have three copies in my own mailbox,
>> bot they still don't show up on the mm-commits mailing list.
>>
>> So the list hates them for some odd reason.
>>
>> I've picked them up locally, but adding Konstantin to the participants
>> to see if he can see what's up.
>>
>> Konstantin: patches 103/106/107 are missing on lore out of Andrew's
>> series of 143. Odd.
> It's weird.  They don't turn up on linux-mm either, and that's running
> at kvack.org, also majordomo.  They don't get through when sent with
> either heirloom-mailx or with sylpheed.
> 
> Also, it seems that when Anshuman originally sent the patch, linux-mm
> and linux-kernel didn't send it back out.  So perhaps a spam filter
> triggered?
> 
> I'm seeing
> 
> https://lore.kernel.org/linux-arm-kernel/1615278790-18053-3-git-send-email-anshuman.khandual@arm.com/
> 
> which is via linux-arm-kernel@lists.infradead.org but the linux-kernel
> server massacred that patch series.  Searching
> https://lkml.org/lkml/2021/3/9 for "anshuman" only shows 3 of the 7
> email series.

Yeah these patches faced problem from the very beginning getting
into the MM/LKML list for some strange reason.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-05-05 17:10     ` incoming Linus Torvalds
@ 2021-05-05 17:44       ` Andrew Morton
  2021-05-06  3:19         ` incoming Anshuman Khandual
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-05-05 17:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Konstantin Ryabitsev, Linux-MM, mm-commits

[-- Attachment #1: Type: text/plain, Size: 1387 bytes --]

On Wed, 5 May 2021 10:10:33 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, May 4, 2021 at 8:16 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > Let me resend right now with the same in-reply-to.  Hopefully they will
> > land in the correct place.
> 
> Well, you re-sent it twice, and I have three copies in my own mailbox,
> bot they still don't show up on the mm-commits mailing list.
> 
> So the list hates them for some odd reason.
> 
> I've picked them up locally, but adding Konstantin to the participants
> to see if he can see what's up.
> 
> Konstantin: patches 103/106/107 are missing on lore out of Andrew's
> series of 143. Odd.

It's weird.  They don't turn up on linux-mm either, and that's running
at kvack.org, also majordomo.  They don't get through when sent with
either heirloom-mailx or with sylpheed.

Also, it seems that when Anshuman originally sent the patch, linux-mm
and linux-kernel didn't send it back out.  So perhaps a spam filter
triggered?

I'm seeing

https://lore.kernel.org/linux-arm-kernel/1615278790-18053-3-git-send-email-anshuman.khandual@arm.com/

which is via linux-arm-kernel@lists.infradead.org but the linux-kernel
server massacred that patch series.  Searching
https://lkml.org/lkml/2021/3/9 for "anshuman" only shows 3 of the 7
email series.

One of the emails (as sent my me) is attached, if that helps.



[-- Attachment #2: x.txt --]
[-- Type: text/plain, Size: 21048 bytes --]

Return-Path: <akpm@linux-foundation.org>
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on y
X-Spam-Level: (none)
X-Spam-Status: No, score=-101.5 required=2.5 tests=BAYES_00,T_DKIM_INVALID,
	USER_IN_WHITELIST autolearn=ham autolearn_force=no version=3.4.1
Received: from localhost.localdomain (localhost.localdomain [127.0.0.1])
	by localhost.localdomain (8.15.2/8.15.2/Debian-8ubuntu1) with ESMTP id 1453H2fk032202
	for <akpm@localhost>; Tue, 4 May 2021 20:17:03 -0700
Received: from imap.fastmail.com [66.111.4.135]
	by localhost.localdomain with IMAP (fetchmail-6.3.26)
	for <akpm@localhost> (single-drop); Tue, 04 May 2021 20:17:03 -0700 (PDT)
Received: from compute1.internal (compute1.nyi.internal [10.202.2.41])
	 by sloti11d1t06 (Cyrus 3.5.0-alpha0-442-g5daca166b9-fm-20210428.001-g5daca166) with LMTPA;
	 Tue, 04 May 2021 23:16:31 -0400
X-Cyrus-Session-Id: sloti11d1t06-1620184591-1699471-2-6359664467419938249
X-Sieve: CMU Sieve 3.0
X-Resolved-to: akpm@mbx.kernel.org
X-Delivered-to: akpm@mbx.kernel.org
X-Mail-from: akpm@linux-foundation.org
Received: from mx6 ([10.202.2.205])
  by compute1.internal (LMTPProxy); Tue, 04 May 2021 23:16:31 -0400
Received: from mx6.messagingengine.com (localhost [127.0.0.1])
	by mailmx.nyi.internal (Postfix) with ESMTP id 40796C800E1
	for <akpm@mbx.kernel.org>; Tue,  4 May 2021 23:16:31 -0400 (EDT)
Received: from mx6.messagingengine.com (localhost [127.0.0.1])
    by mx6.messagingengine.com (Authentication Milter) with ESMTP
    id 14870833D7F;
    Tue, 4 May 2021 23:16:31 -0400
ARC-Seal: i=2; a=rsa-sha256; cv=pass; d=messagingengine.com; s=fm2; t=
    1620184591; b=FBo7Gf3JFN+4QYg5Byan0oNm6RESv+sIf5HcaslVNsUd9SOTGS
    yI0+IsXr1CUpGH783hE6fmgEq9SyfOwQVZjdikLaJS1+7u0JtfAYQFU3RORCtXlr
    djJWrScfjVa8nAHX4rQCtzvtPYuzx5w7cTgGgeILgoJMxgLj7EC9xcT8BIf68+9W
    Lw+ohAmcuiKhL2ez+de4SMuwdh3dh2FwAIHQOsSjEU1/NV+WGxMLwYbxWgTrqQGH
    RQIzFNdq30qslW9huK47+e80uHOX2tXwxtshwbThFEn458bdV5LL6Y8Oh4ZWMbv1
    tFgTt515DVedonZknxc07XsXtAjaJyB8bfHw==
ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=
    messagingengine.com; h=date:from:to:subject:message-id
    :in-reply-to; s=fm2; t=1620184591; bh=LuH7mbm3+zp863vKBEqKeoZtnp
    uFxYpIb5oTVwf56Es=; b=m5E1fbz2b+an/X406oY3BuG0Zm4/W05vWAki8Lsnud
    gPCc1LfPUFSuXaMppcEDPbLKprp4hH3T52itK4pivXMQCLEOyme7kVStaLMVTiky
    Xxqh5ZdhOWvygBfda/GjfuLBSbbj2gfm8HPKpbL7CA5foelknIBhJHDzGkJyxetZ
    YagZfVvtdo2OEwnC1mmjUCpKPO5+m5kaZO0ol6rPdl+TV0MKGhjLg+/i6Ia+0nFp
    zDwV4VeACvVcGb2xY7KG5Z+BtqVxeVFn+w5JcqpWUtxEKoSBR4bWARzjwHg6eouh
    7psOOKPTt/NzDKk+3f49lso5KlPiTF2xEU/+5SIttCkQ==
ARC-Authentication-Results: i=2; mx6.messagingengine.com;
    arc=pass (as.1.google.com=pass, ams.1.google.com=pass)
    smtp.remote-ip=209.85.215.198;
    bimi=skipped (DMARC did not pass);
    dkim=pass (1024-bit rsa key sha256) header.d=linux-foundation.org
    header.i=@linux-foundation.org header.b=Gdz/3wY9 header.a=rsa-sha256
    header.s=korg x-bits=1024;
    dmarc=none policy.published-domain-policy=none
    policy.applied-disposition=none policy.evaluated-disposition=none
    (p=none,d=none,d.eval=none) policy.policy-from=p
    header.from=linux-foundation.org;
    iprev=pass smtp.remote-ip=209.85.215.198 (mail-pg1-f198.google.com);
    spf=pass smtp.mailfrom=akpm@linux-foundation.org
    smtp.helo=mail-pg1-f198.google.com;
    x-aligned-from=pass (Address match);
    x-arc-spf=pass
    (google.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender)
    smtp.mailfrom=akpm@linux-foundation.org x-arc-instance=1
    x-arc-domain=google.com (Trusted from aar.1.google.com);
    x-csa=none;
    x-google-dkim=fail (message has been altered, 2048-bit rsa key)
    header.d=1e100.net header.i=@1e100.net header.b=VZuDOxUf;
    x-me-sender=none;
    x-ptr=pass smtp.helo=mail-pg1-f198.google.com
    policy.ptr=mail-pg1-f198.google.com;
    x-return-mx=pass header.domain=linux-foundation.org policy.is_org=yes
    (MX Records found: ASPMX.L.GOOGLE.COM,ALT1.ASPMX.L.GOOGLE.COM,ALT2.ASPMX.L.GOOGLE.COM,ALT3.ASPMX.L.GOOGLE.COM,ALT4.ASPMX.L.GOOGLE.COM);
    x-return-mx=pass smtp.domain=linux-foundation.org policy.is_org=yes
    (MX Records found: ASPMX.L.GOOGLE.COM,ALT1.ASPMX.L.GOOGLE.COM,ALT2.ASPMX.L.GOOGLE.COM,ALT3.ASPMX.L.GOOGLE.COM,ALT4.ASPMX.L.GOOGLE.COM);
    x-tls=pass smtp.version=TLSv1.3 smtp.cipher=TLS_AES_256_GCM_SHA384
    smtp.bits=256/256;
    x-vs=clean score=40 state=0
Authentication-Results: mx6.messagingengine.com;
    arc=pass (as.1.google.com=pass, ams.1.google.com=pass)
      smtp.remote-ip=209.85.215.198;
    bimi=skipped (DMARC did not pass);
    dkim=pass (1024-bit rsa key sha256) header.d=linux-foundation.org
      header.i=@linux-foundation.org header.b=Gdz/3wY9 header.a=rsa-sha256
      header.s=korg x-bits=1024;
    dmarc=none policy.published-domain-policy=none
      policy.applied-disposition=none policy.evaluated-disposition=none
      (p=none,d=none,d.eval=none) policy.policy-from=p
      header.from=linux-foundation.org;
    iprev=pass smtp.remote-ip=209.85.215.198 (mail-pg1-f198.google.com);
    spf=pass smtp.mailfrom=akpm@linux-foundation.org
      smtp.helo=mail-pg1-f198.google.com;
    x-aligned-from=pass (Address match);
    x-arc-spf=pass
      (google.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender)
      smtp.mailfrom=akpm@linux-foundation.org x-arc-instance=1
      x-arc-domain=google.com (Trusted from aar.1.google.com);
    x-csa=none;
    x-google-dkim=fail (message has been altered, 2048-bit rsa key)
      header.d=1e100.net header.i=@1e100.net header.b=VZuDOxUf;
    x-me-sender=none;
    x-ptr=pass smtp.helo=mail-pg1-f198.google.com
      policy.ptr=mail-pg1-f198.google.com;
    x-return-mx=pass header.domain=linux-foundation.org policy.is_org=yes
      (MX Records found: ASPMX.L.GOOGLE.COM,ALT1.ASPMX.L.GOOGLE.COM,ALT2.ASPMX.L.GOOGLE.COM,ALT3.ASPMX.L.GOOGLE.COM,ALT4.ASPMX.L.GOOGLE.COM);
    x-return-mx=pass smtp.domain=linux-foundation.org policy.is_org=yes
      (MX Records found: ASPMX.L.GOOGLE.COM,ALT1.ASPMX.L.GOOGLE.COM,ALT2.ASPMX.L.GOOGLE.COM,ALT3.ASPMX.L.GOOGLE.COM,ALT4.ASPMX.L.GOOGLE.COM);
    x-tls=pass smtp.version=TLSv1.3 smtp.cipher=TLS_AES_256_GCM_SHA384
      smtp.bits=256/256;
    x-vs=clean score=40 state=0
X-ME-VSCause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefjedgieegucetufdoteggodetrfdotf
    fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu
    rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucgoufhorhhtvggutfgvtg
    hiphdvucdlgedtmdenucfjughrpeffhffvuffkjggfsedttdertddtredtnecuhfhrohhm
    peetnhgurhgvficuofhorhhtohhnuceorghkphhmsehlihhnuhigqdhfohhunhgurghtih
    honhdrohhrgheqnecuggftrfgrthhtvghrnhepjeevfeduveffvddvudetkefhgeduveeu
    geevvdfhhfevhfekkedtieefgfduheeinecuffhomhgrihhnpehkvghrnhgvlhdrohhrgh
    enucfkphepvddtledrkeehrddvudehrdduleekpdduleekrddugeehrddvledrleelnecu
    uegrugftvghpuhhtkfhppeduleekrddugeehrddvledrleelnecuvehluhhsthgvrhfuih
    iivgeptdenucfrrghrrghmpehinhgvthepvddtledrkeehrddvudehrdduleekpdhhvghl
    ohepmhgrihhlqdhpghduqdhfudelkedrghhoohhglhgvrdgtohhmpdhmrghilhhfrhhomh
    epoegrkhhpmheslhhinhhugidqfhhouhhnuggrthhiohhnrdhorhhgqe
X-ME-VSScore: 40
X-ME-VSCategory: clean
X-ME-CSA: none
Received-SPF: pass
    (linux-foundation.org: Sender is authorized to use 'akpm@linux-foundation.org' in 'mfrom' identity (mechanism 'include:_spf.google.com' matched))
    receiver=mx6.messagingengine.com;
    identity=mailfrom;
    envelope-from="akpm@linux-foundation.org";
    helo=mail-pg1-f198.google.com;
    client-ip=209.85.215.198
Received: from mail-pg1-f198.google.com (mail-pg1-f198.google.com [209.85.215.198])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by mx6.messagingengine.com (Postfix) with ESMTPS
	for <akpm@mbx.kernel.org>; Tue,  4 May 2021 23:16:31 -0400 (EDT)
Received: by mail-pg1-f198.google.com with SMTP id g5-20020a63f4050000b02901f6c7b9a6d0so593624pgi.5
        for <akpm@mbx.kernel.org>; Tue, 04 May 2021 20:16:30 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:dkim-signature:date:from:to:subject:message-id
         :in-reply-to:user-agent;
        bh=LuH7mbm3+zp863vKBEqKeoZtnpuFxYpIb5oTVwf56Es=;
        b=VZuDOxUfeHXJz1/CiFfcxuMVHkmW5RznvqYS+Py8Ub6nHHXprQJGE9Ze3WgH+1ylSe
         NJLEC7xgv15SR9A+e/MT4RTj3OVOwtd1Zi2vPav39a9K4tP+2uL2Ei+5d7FtT3LLZsjo
         feek/DqCGSkJ/EC5woLyU9BBkfLUuQ9/2HiDCk10BMetEfWdor69Slb39NOXES8br02X
         25Btabu9ZCWroyjQj7W5gwGr5Z6Hs2nbnnfAb+e92FalcUD/4ql77lNzRcWGi4/9TT8s
         ntqI2g46Xv+k5LURaRH5CRBpxkkKgzcrioRPYFUHkEgOEWy1hPzg9QPk8ZO35Xm9R9d2
         vl3Q==
X-Gm-Message-State: AOAM531IlYUTVWcMrsTunnxZWB7SKeeOmoZj5mZ1A5tl7N/JlZUueN8L
	tvyRKnvxHr6a5mDaGHN9Tb1N/iCzT0U5oQgRVTxTnj1qFGibRa9+leLQNKX0aGlNg9JiaMfromb
	xyOlCUpVXOlVvchuwTUSTn7rXum+Hh3PWQZm5II/EX+0AkzKqez62Z8U=
X-Received: by 2002:a17:90a:a581:: with SMTP id b1mr32203271pjq.53.1620184589161;
        Tue, 04 May 2021 20:16:29 -0700 (PDT)
X-Google-Smtp-Source: ABdhPJxffoGdRqAjUagWoMVD5p/Lk1KTEDftEhkWh8ewatgDmZLlxh0lO1hxYIdYYwoO5dsJ/i0z
X-Received: by 2002:a17:90a:a581:: with SMTP id b1mr32203198pjq.53.1620184588109;
        Tue, 04 May 2021 20:16:28 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1620184588; cv=none;
        d=google.com; s=arc-20160816;
        b=Fr2b2AMXJr6OeNpSql45tq1korkuDOunp7t+DpARuEBnwvQnKfagyipQ93jywsRf/c
         /i/mP2eTmJwOLWNORClh1MGF/0VfBx1ULoB9W4CI3LpVgGFXGGFis8LTcvUYD5yvhlsV
         50rm2j34iS9lyo04FB/hbhGkwLtUhz2PGkLGuqHspTd+pUpUCf5SLxGJbZC5uCcUEsbO
         8WSDBWyvaCPjFzJQZK60gK70ticKW+fCG1xHtOG4qsFCbqEpFKBy8eVK83OBazo/dQDr
         DOheWNWyw2o/WMP4GpZMvZuj30dx3j8xnBahIpnMIQJaog6wLMcVX9pkQ8UJym3/PGNm
         pO/g==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
        h=user-agent:in-reply-to:message-id:subject:to:from:date
         :dkim-signature;
        bh=LuH7mbm3+zp863vKBEqKeoZtnpuFxYpIb5oTVwf56Es=;
        b=vVN16NPMKjoxSJQ6b36VXFCkZqnmG7wABfilgE069txZqmHpEMyZb8lRStkHy557LM
         Kn7UfJFP3xwsP8ZTCipVDZ6tpFW/hYFU9o4th9G8asWs+MOf9xpWX2LQZ1FTmaao2Fg5
         uCHypz39cnAh0Z1EJfNsTcaTGIrkbBd6zje+mtBgs8hnfH8HcWBYTPCHCCx950Z928tb
         XOPd/Igs7yzD1ioBiGXZj/ciwPbWVTaZXBg4JOZSApxkDMfuMyfyLLOs++EVkyxJHUme
         TmgwvLkixcwEtKF7gIeqEhwvOUSVvilLuJLFVaLumwTcjJ1amVfGcJhBE7LIM9C3SMpA
         rOOg==
ARC-Authentication-Results: i=1; mx.google.com;
       dkim=pass header.i=@linux-foundation.org header.s=korg header.b="Gdz/3wY9";
       spf=pass (google.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org
Received: from mail.kernel.org (mail.kernel.org. [198.145.29.99])
        by mx.google.com with ESMTPS id c85si20173199pfb.8.2021.05.04.20.16.27
        (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
        Tue, 04 May 2021 20:16:28 -0700 (PDT)
Received-SPF: pass (google.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) client-ip=198.145.29.99;
Authentication-Results: mx.google.com;
       dkim=pass header.i=@linux-foundation.org header.s=korg header.b="Gdz/3wY9";
       spf=pass (google.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org
Received: by mail.kernel.org (Postfix) with ESMTPSA id A4DB4610D2;
	Wed,  5 May 2021 03:16:26 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org;
	s=korg; t=1620184587;
	bh=TxN4wgKcKf2UUem+5pL09m9GL/7U592mEalo2U6vwAU=;
	h=Date:From:To:Subject:In-Reply-To:From;
	b=Gdz/3wY9ktH3hOmn2DAOkfh0JXwPdMJ8xsNQFa9eI25K39Z3iHdRGo9jX3QtMDtog
	 D4Zakt52CQCYsV91c9oCai8KnCTkkAjJq/Ez7p8UHpz97Go3yYYxqg6DDl6d8HCQvN
	 H47dTaZAgeH2sw29bjB9fRzNuTx7k4RAPlqZIpiE=
Date: Tue, 04 May 2021 20:16:26 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, anshuman.khandual@arm.com, aou@eecs.berkeley.edu, arnd@arndb.de, benh@kernel.crashing.org, borntraeger@de.ibm.com, bp@alien8.de, catalin.marinas@arm.com, dalias@libc.org, deller@gmx.de, gor@linux.ibm.com, hca@linux.ibm.com, hpa@zytor.com, James.Bottomley@HansenPartnership.com, linux-mm@kvack.org, linux@armlinux.org.uk, mingo@redhat.com, mm-commits@vger.kernel.org, mpe@ellerman.id.au, palmerdabbelt@google.com, paul.walmsley@sifive.com, paulus@samba.org, tglx@linutronix.de, torvalds@linux-foundation.org, tsbogend@alpha.franken.de, vgupta@synopsys.com, viro@zeniv.linux.org.uk, will@kernel.org, ysato@users.osdn.me
Subject: [patch 103/143] mm: generalize SYS_SUPPORTS_HUGETLBFS (rename as ARCH_SUPPORTS_HUGETLBFS)
Message-ID: <20210505031626.c8o4WL7KE%akpm@linux-foundation.org>
In-Reply-To: <20210504183219.a3cc46aee4013d77402276c5@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Gm-Original-To: akpm@linux-foundation.org

From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm: generalize SYS_SUPPORTS_HUGETLBFS (rename as ARCH_SUPPORTS_HUGETLBFS)

SYS_SUPPORTS_HUGETLBFS config has duplicate definitions on platforms that
subscribe it.  Instead, just make it a generic option which can be
selected on applicable platforms.  Also rename it as
ARCH_SUPPORTS_HUGETLBFS instead.  This reduces code duplication and makes
it cleaner.

Link: https://lkml.kernel.org/r/1617259448-22529-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>	[riscv]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm/Kconfig                       |    5 +----
 arch/arm64/Kconfig                     |    4 +---
 arch/mips/Kconfig                      |    6 +-----
 arch/parisc/Kconfig                    |    5 +----
 arch/powerpc/Kconfig                   |    3 ---
 arch/powerpc/platforms/Kconfig.cputype |    6 +++---
 arch/riscv/Kconfig                     |    5 +----
 arch/sh/Kconfig                        |    5 +----
 fs/Kconfig                             |    5 ++++-
 9 files changed, 13 insertions(+), 31 deletions(-)

--- a/arch/arm64/Kconfig~mm-generalize-sys_supports_hugetlbfs-rename-as-arch_supports_hugetlbfs
+++ a/arch/arm64/Kconfig
@@ -73,6 +73,7 @@ config ARM64
 	select ARCH_USE_QUEUED_SPINLOCKS
 	select ARCH_USE_SYM_ANNOTATIONS
 	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
+	select ARCH_SUPPORTS_HUGETLBFS
 	select ARCH_SUPPORTS_MEMORY_FAILURE
 	select ARCH_SUPPORTS_SHADOW_CALL_STACK if CC_HAVE_SHADOW_CALL_STACK
 	select ARCH_SUPPORTS_LTO_CLANG if CPU_LITTLE_ENDIAN
@@ -1072,9 +1073,6 @@ config HW_PERF_EVENTS
 	def_bool y
 	depends on ARM_PMU
 
-config SYS_SUPPORTS_HUGETLBFS
-	def_bool y
-
 config ARCH_HAS_FILTER_PGPROT
 	def_bool y
 
--- a/arch/arm/Kconfig~mm-generalize-sys_supports_hugetlbfs-rename-as-arch_supports_hugetlbfs
+++ a/arch/arm/Kconfig
@@ -31,6 +31,7 @@ config ARM
 	select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT if CPU_V7
 	select ARCH_SUPPORTS_ATOMIC_RMW
+	select ARCH_SUPPORTS_HUGETLBFS if ARM_LPAE
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select ARCH_USE_MEMTEST
@@ -1511,10 +1512,6 @@ config HW_PERF_EVENTS
 	def_bool y
 	depends on ARM_PMU
 
-config SYS_SUPPORTS_HUGETLBFS
-       def_bool y
-       depends on ARM_LPAE
-
 config HAVE_ARCH_TRANSPARENT_HUGEPAGE
        def_bool y
        depends on ARM_LPAE
--- a/arch/mips/Kconfig~mm-generalize-sys_supports_hugetlbfs-rename-as-arch_supports_hugetlbfs
+++ a/arch/mips/Kconfig
@@ -19,6 +19,7 @@ config MIPS
 	select ARCH_USE_MEMTEST
 	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_USE_QUEUED_SPINLOCKS
+	select ARCH_SUPPORTS_HUGETLBFS if CPU_SUPPORTS_HUGEPAGES
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select ARCH_WANT_LD_ORPHAN_WARN
@@ -1287,11 +1288,6 @@ config SYS_SUPPORTS_BIG_ENDIAN
 config SYS_SUPPORTS_LITTLE_ENDIAN
 	bool
 
-config SYS_SUPPORTS_HUGETLBFS
-	bool
-	depends on CPU_SUPPORTS_HUGEPAGES
-	default y
-
 config MIPS_HUGE_TLB_SUPPORT
 	def_bool HUGETLB_PAGE || TRANSPARENT_HUGEPAGE
 
--- a/arch/parisc/Kconfig~mm-generalize-sys_supports_hugetlbfs-rename-as-arch_supports_hugetlbfs
+++ a/arch/parisc/Kconfig
@@ -12,6 +12,7 @@ config PARISC
 	select ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
 	select ARCH_NO_SG_CHAIN
+	select ARCH_SUPPORTS_HUGETLBFS if PA20
 	select ARCH_SUPPORTS_MEMORY_FAILURE
 	select DMA_OPS
 	select RTC_CLASS
@@ -138,10 +139,6 @@ config PGTABLE_LEVELS
 	default 3 if 64BIT && PARISC_PAGE_SIZE_4KB
 	default 2
 
-config SYS_SUPPORTS_HUGETLBFS
-	def_bool y if PA20
-
-
 menu "Processor type and features"
 
 choice
--- a/arch/powerpc/Kconfig~mm-generalize-sys_supports_hugetlbfs-rename-as-arch_supports_hugetlbfs
+++ a/arch/powerpc/Kconfig
@@ -697,9 +697,6 @@ config ARCH_SPARSEMEM_DEFAULT
 	def_bool y
 	depends on PPC_BOOK3S_64
 
-config SYS_SUPPORTS_HUGETLBFS
-	bool
-
 config ILLEGAL_POINTER_VALUE
 	hex
 	# This is roughly half way between the top of user space and the bottom
--- a/arch/powerpc/platforms/Kconfig.cputype~mm-generalize-sys_supports_hugetlbfs-rename-as-arch_supports_hugetlbfs
+++ a/arch/powerpc/platforms/Kconfig.cputype
@@ -40,8 +40,8 @@ config PPC_85xx
 
 config PPC_8xx
 	bool "Freescale 8xx"
+	select ARCH_SUPPORTS_HUGETLBFS
 	select FSL_SOC
-	select SYS_SUPPORTS_HUGETLBFS
 	select PPC_HAVE_KUEP
 	select PPC_HAVE_KUAP
 	select HAVE_ARCH_VMAP_STACK
@@ -95,9 +95,9 @@ config PPC_BOOK3S_64
 	bool "Server processors"
 	select PPC_FPU
 	select PPC_HAVE_PMU_SUPPORT
-	select SYS_SUPPORTS_HUGETLBFS
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
+	select ARCH_SUPPORTS_HUGETLBFS
 	select ARCH_SUPPORTS_NUMA_BALANCING
 	select IRQ_WORK
 	select PPC_MM_SLICES
@@ -278,9 +278,9 @@ config FSL_BOOKE
 # this is for common code between PPC32 & PPC64 FSL BOOKE
 config PPC_FSL_BOOK3E
 	bool
+	select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
 	select FSL_EMB_PERFMON
 	select PPC_SMP_MUXED_IPI
-	select SYS_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64
 	select PPC_DOORBELL
 	default y if FSL_BOOKE
 
--- a/arch/riscv/Kconfig~mm-generalize-sys_supports_hugetlbfs-rename-as-arch_supports_hugetlbfs
+++ a/arch/riscv/Kconfig
@@ -30,6 +30,7 @@ config RISCV
 	select ARCH_HAS_STRICT_KERNEL_RWX if MMU
 	select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
+	select ARCH_SUPPORTS_HUGETLBFS if MMU
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
@@ -165,10 +166,6 @@ config ARCH_WANT_GENERAL_HUGETLB
 config ARCH_SUPPORTS_UPROBES
 	def_bool y
 
-config SYS_SUPPORTS_HUGETLBFS
-	depends on MMU
-	def_bool y
-
 config STACKTRACE_SUPPORT
 	def_bool y
 
--- a/arch/sh/Kconfig~mm-generalize-sys_supports_hugetlbfs-rename-as-arch_supports_hugetlbfs
+++ a/arch/sh/Kconfig
@@ -101,9 +101,6 @@ config SYS_SUPPORTS_APM_EMULATION
 	bool
 	select ARCH_SUSPEND_POSSIBLE
 
-config SYS_SUPPORTS_HUGETLBFS
-	bool
-
 config SYS_SUPPORTS_SMP
 	bool
 
@@ -175,12 +172,12 @@ config CPU_SH3
 
 config CPU_SH4
 	bool
+	select ARCH_SUPPORTS_HUGETLBFS if MMU
 	select CPU_HAS_INTEVT
 	select CPU_HAS_SR_RB
 	select CPU_HAS_FPU if !CPU_SH4AL_DSP
 	select SH_INTC
 	select SYS_SUPPORTS_SH_TMU
-	select SYS_SUPPORTS_HUGETLBFS if MMU
 
 config CPU_SH4A
 	bool
--- a/fs/Kconfig~mm-generalize-sys_supports_hugetlbfs-rename-as-arch_supports_hugetlbfs
+++ a/fs/Kconfig
@@ -223,10 +223,13 @@ config TMPFS_INODE64
 
 	  If unsure, say N.
 
+config ARCH_SUPPORTS_HUGETLBFS
+	def_bool n
+
 config HUGETLBFS
 	bool "HugeTLB file system support"
 	depends on X86 || IA64 || SPARC64 || (S390 && 64BIT) || \
-		   SYS_SUPPORTS_HUGETLBFS || BROKEN
+		   ARCH_SUPPORTS_HUGETLBFS || BROKEN
 	help
 	  hugetlbfs is a filesystem backing for HugeTLB pages, based on
 	  ramfs. For architectures that support it, say Y here and read
_

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-05-05  3:16   ` incoming Andrew Morton
@ 2021-05-05 17:10     ` Linus Torvalds
  2021-05-05 17:44       ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2021-05-05 17:10 UTC (permalink / raw)
  To: Andrew Morton, Konstantin Ryabitsev; +Cc: Linux-MM, mm-commits

On Tue, May 4, 2021 at 8:16 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> Let me resend right now with the same in-reply-to.  Hopefully they will
> land in the correct place.

Well, you re-sent it twice, and I have three copies in my own mailbox,
bot they still don't show up on the mm-commits mailing list.

So the list hates them for some odd reason.

I've picked them up locally, but adding Konstantin to the participants
to see if he can see what's up.

Konstantin: patches 103/106/107 are missing on lore out of Andrew's
series of 143. Odd.

             Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-05-05  1:47 ` incoming Linus Torvalds
@ 2021-05-05  3:16   ` Andrew Morton
  2021-05-05 17:10     ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-05-05  3:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux-MM, mm-commits

On Tue, 4 May 2021 18:47:19 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, May 4, 2021 at 6:32 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > 143 patches
> 
> Hmm. Only 140 seem to have made it to the list, with 103, 106 and 107 missing.
> 
> Maybe just some mail delay? But at least right now
> 
>    https://lore.kernel.org/mm-commits/
> 
> doesn't show them (and thus 'b4' doesn't work).
> 
> I'll check again later.
> 

Well that's strange.  I see all three via cc:me, but not on linux-mm or
mm-commits.

Let me resend right now with the same in-reply-to.  Hopefully they will
land in the correct place.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-05-05  1:32 incoming Andrew Morton
@ 2021-05-05  1:47 ` Linus Torvalds
  2021-05-05  3:16   ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2021-05-05  1:47 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

On Tue, May 4, 2021 at 6:32 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> 143 patches

Hmm. Only 140 seem to have made it to the list, with 103, 106 and 107 missing.

Maybe just some mail delay? But at least right now

   https://lore.kernel.org/mm-commits/

doesn't show them (and thus 'b4' doesn't work).

I'll check again later.

             Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-05-05  1:32 Andrew Morton
  2021-05-05  1:47 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-05-05  1:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


The remainder of the main mm/ queue.

143 patches, based on 8ca5297e7e38f2dc8c753d33a5092e7be181fff0, plus
previously sent patches.

Subsystems affected by this patch series:

  mm/pagecache
  mm/hugetlb
  mm/userfaultfd
  mm/vmscan
  mm/compaction
  mm/migration
  mm/cma
  mm/ksm
  mm/vmstat
  mm/mmap
  mm/kconfig
  mm/util
  mm/memory-hotplug
  mm/zswap
  mm/zsmalloc
  mm/highmem
  mm/cleanups
  mm/kfence

Subsystem: mm/pagecache

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Remove nrexceptional tracking", v2:
      mm: introduce and use mapping_empty()
      mm: stop accounting shadow entries
      dax: account DAX entries as nrpages
      mm: remove nrexceptional from inode

    Hugh Dickins <hughd@google.com>:
      mm: remove nrexceptional from inode: remove BUG_ON

Subsystem: mm/hugetlb

    Peter Xu <peterx@redhat.com>:
    Patch series "hugetlb: Disable huge pmd unshare for uffd-wp", v4:
      hugetlb: pass vma into huge_pte_alloc() and huge_pmd_share()
      hugetlb/userfaultfd: forbid huge pmd sharing when uffd enabled
      mm/hugetlb: move flush_hugetlb_tlb_range() into hugetlb.h
      hugetlb/userfaultfd: unshare all pmds for hugetlbfs when register wp

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/hugetlb: remove redundant reservation check condition in alloc_huge_page()

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm: generalize HUGETLB_PAGE_SIZE_VARIABLE

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Some cleanups for hugetlb":
      mm/hugetlb: use some helper functions to cleanup code
      mm/hugetlb: optimize the surplus state transfer code in move_hugetlb_state()
      mm/hugetlb_cgroup: remove unnecessary VM_BUG_ON_PAGE in hugetlb_cgroup_migrate()
      mm/hugetlb: simplify the code when alloc_huge_page() failed in hugetlb_no_page()
      mm/hugetlb: avoid calculating fault_mutex_hash in truncate_op case
    Patch series "Cleanup and fixup for khugepaged", v2:
      khugepaged: remove unneeded return value of khugepaged_collapse_pte_mapped_thps()
      khugepaged: reuse the smp_wmb() inside __SetPageUptodate()
      khugepaged: use helper khugepaged_test_exit() in __khugepaged_enter()
      khugepaged: fix wrong result value for trace_mm_collapse_huge_page_isolate()
      mm/huge_memory.c: remove unnecessary local variable ret2
    Patch series "Some cleanups for huge_memory", v3:
      mm/huge_memory.c: rework the function vma_adjust_trans_huge()
      mm/huge_memory.c: make get_huge_zero_page() return bool
      mm/huge_memory.c: rework the function do_huge_pmd_numa_page() slightly
      mm/huge_memory.c: remove redundant PageCompound() check
      mm/huge_memory.c: remove unused macro TRANSPARENT_HUGEPAGE_DEBUG_COW_FLAG
      mm/huge_memory.c: use helper function migration_entry_to_page()

    Yanfei Xu <yanfei.xu@windriver.com>:
      mm/khugepaged.c: replace barrier() with READ_ONCE() for a selective variable

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup for khugepaged":
      khugepaged: use helper function range_in_vma() in collapse_pte_mapped_thp()
      khugepaged: remove unnecessary out label in collapse_huge_page()
      khugepaged: remove meaningless !pte_present() check in khugepaged_scan_pmd()

    Zi Yan <ziy@nvidia.com>:
      mm: huge_memory: a new debugfs interface for splitting THP tests
      mm: huge_memory: debugfs for file-backed THP split

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup and fixup for hugetlb", v2:
      mm/hugeltb: remove redundant VM_BUG_ON() in region_add()
      mm/hugeltb: simplify the return code of __vma_reservation_common()
      mm/hugeltb: clarify (chg - freed) won't go negative in hugetlb_unreserve_pages()
      mm/hugeltb: handle the error case in hugetlb_fix_reserve_counts()
      mm/hugetlb: remove unused variable pseudo_vma in remove_inode_hugepages()

    Mike Kravetz <mike.kravetz@oracle.com>:
    Patch series "make hugetlb put_page safe for all calling contexts", v5:
      mm/cma: change cma mutex to irq safe spinlock
      hugetlb: no need to drop hugetlb_lock to call cma_release
      hugetlb: add per-hstate mutex to synchronize user adjustments
      hugetlb: create remove_hugetlb_page() to separate functionality
      hugetlb: call update_and_free_page without hugetlb_lock
      hugetlb: change free_pool_huge_page to remove_pool_huge_page
      hugetlb: make free_huge_page irq safe
      hugetlb: add lockdep_assert_held() calls for hugetlb_lock

    Oscar Salvador <osalvador@suse.de>:
    Patch series "Make alloc_contig_range handle Hugetlb pages", v10:
      mm,page_alloc: bail out earlier on -ENOMEM in alloc_contig_migrate_range
      mm,compaction: let isolate_migratepages_{range,block} return error codes
      mm,hugetlb: drop clearing of flag from prep_new_huge_page
      mm,hugetlb: split prep_new_huge_page functionality
      mm: make alloc_contig_range handle free hugetlb pages
      mm: make alloc_contig_range handle in-use hugetlb pages
      mm,page_alloc: drop unnecessary checks from pfn_range_valid_contig

Subsystem: mm/userfaultfd

    Axel Rasmussen <axelrasmussen@google.com>:
    Patch series "userfaultfd: add minor fault handling", v9:
      userfaultfd: add minor fault registration mode
      userfaultfd: disable huge PMD sharing for MINOR registered VMAs
      userfaultfd: hugetlbfs: only compile UFFD helpers if config enabled
      userfaultfd: add UFFDIO_CONTINUE ioctl
      userfaultfd: update documentation to describe minor fault handling
      userfaultfd/selftests: add test exercising minor fault handling

Subsystem: mm/vmscan

    Dave Hansen <dave.hansen@linux.intel.com>:
      mm/vmscan: move RECLAIM* bits to uapi header
      mm/vmscan: replace implicit RECLAIM_ZONE checks with explicit checks

    Yang Shi <shy828301@gmail.com>:
    Patch series "Make shrinker's nr_deferred memcg aware", v10:
      mm: vmscan: use nid from shrink_control for tracepoint
      mm: vmscan: consolidate shrinker_maps handling code
      mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation
      mm: vmscan: remove memcg_shrinker_map_size
      mm: vmscan: use kvfree_rcu instead of call_rcu
      mm: memcontrol: rename shrinker_map to shrinker_info
      mm: vmscan: add shrinker_info_protected() helper
      mm: vmscan: use a new flag to indicate shrinker is registered
      mm: vmscan: add per memcg shrinker nr_deferred
      mm: vmscan: use per memcg nr_deferred of shrinker
      mm: vmscan: don't need allocate shrinker->nr_deferred for memcg aware shrinkers
      mm: memcontrol: reparent nr_deferred when memcg offline
      mm: vmscan: shrink deferred objects proportional to priority

Subsystem: mm/compaction

    Pintu Kumar <pintu@codeaurora.org>:
      mm/compaction: remove unused variable sysctl_compact_memory

    Charan Teja Reddy <charante@codeaurora.org>:
      mm: compaction: update the COMPACT[STALL|FAIL] events properly

Subsystem: mm/migration

    Minchan Kim <minchan@kernel.org>:
      mm: disable LRU pagevec during the migration temporarily
      mm: replace migrate_[prep|finish] with lru_cache_[disable|enable]
      mm: fs: invalidate BH LRU during page migration

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup and fixup for mm/migrate.c", v3:
      mm/migrate.c: make putback_movable_page() static
      mm/migrate.c: remove unnecessary rc != MIGRATEPAGE_SUCCESS check in 'else' case
      mm/migrate.c: fix potential indeterminate pte entry in migrate_vma_insert_page()
      mm/migrate.c: use helper migrate_vma_collect_skip() in migrate_vma_collect_hole()
      Revert "mm: migrate: skip shared exec THP for NUMA balancing"

Subsystem: mm/cma

    Minchan Kim <minchan@kernel.org>:
      mm: vmstat: add cma statistics

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm: cma: use pr_err_ratelimited for CMA warning

    Liam Mark <lmark@codeaurora.org>:
      mm: cma: add trace events for CMA alloc perf testing

    Minchan Kim <minchan@kernel.org>:
      mm: cma: support sysfs
      mm: cma: add the CMA instance name to cma trace events
      mm: use proper type for cma_[alloc|release]

Subsystem: mm/ksm

    Miaohe Lin <linmiaohe@huawei.com>:
    Patch series "Cleanup and fixup for ksm":
      ksm: remove redundant VM_BUG_ON_PAGE() on stable_tree_search()
      ksm: use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()
      ksm: remove dedicated macro KSM_FLAG_MASK
      ksm: fix potential missing rmap_item for stable_node

    Chengyang Fan <cy.fan@huawei.com>:
      mm/ksm: remove unused parameter from remove_trailing_rmap_items()

Subsystem: mm/vmstat

    Hugh Dickins <hughd@google.com>:
      mm: restore node stat checking in /proc/sys/vm/stat_refresh
      mm: no more EINVAL from /proc/sys/vm/stat_refresh
      mm: /proc/sys/vm/stat_refresh skip checking known negative stats
      mm: /proc/sys/vm/stat_refresh stop checking monotonic numa stats

    Saravanan D <saravanand@fb.com>:
      x86/mm: track linear mapping split events

Subsystem: mm/mmap

    Liam Howlett <liam.howlett@oracle.com>:
      mm/mmap.c: don't unlock VMAs in remap_file_pages()

Subsystem: mm/kconfig

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm: some config cleanups", v2:
      mm: generalize ARCH_HAS_CACHE_LINE_SIZE
      mm: generalize SYS_SUPPORTS_HUGETLBFS (rename as ARCH_SUPPORTS_HUGETLBFS)
      mm: generalize ARCH_ENABLE_MEMORY_[HOTPLUG|HOTREMOVE]
      mm: drop redundant ARCH_ENABLE_[HUGEPAGE|THP]_MIGRATION
      mm: drop redundant ARCH_ENABLE_SPLIT_PMD_PTLOCK
      mm: drop redundant HAVE_ARCH_TRANSPARENT_HUGEPAGE

Subsystem: mm/util

    Joe Perches <joe@perches.com>:
      mm/util.c: reduce mem_dump_obj() object size

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      mm/util.c: fix typo

Subsystem: mm/memory-hotplug

    Pavel Tatashin <pasha.tatashin@soleen.com>:
    Patch series "prohibit pinning pages in ZONE_MOVABLE", v11:
      mm/gup: don't pin migrated cma pages in movable zone
      mm/gup: check every subpage of a compound page during isolation
      mm/gup: return an error on migration failure
      mm/gup: check for isolation errors
      mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN
      mm: apply per-task gfp constraints in fast path
      mm: honor PF_MEMALLOC_PIN for all movable pages
      mm/gup: do not migrate zero page
      mm/gup: migrate pinned pages out of movable zone
      memory-hotplug.rst: add a note about ZONE_MOVABLE and page pinning
      mm/gup: change index type to long as it counts pages
      mm/gup: longterm pin migration cleanup
      selftests/vm: gup_test: fix test flag
      selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages

    Mel Gorman <mgorman@techsingularity.net>:
      mm/memory_hotplug: remove broken locking of zone PCP structures during hot remove

    Oscar Salvador <osalvador@suse.de>:
    Patch series "Allocate memmap from hotadded memory (per device)", v10:
      drivers/base/memory: introduce memory_block_{online,offline}
      mm,memory_hotplug: relax fully spanned sections check

    David Hildenbrand <david@redhat.com>:
      mm,memory_hotplug: factor out adjusting present pages into adjust_present_page_count()

    Oscar Salvador <osalvador@suse.de>:
      mm,memory_hotplug: allocate memmap from the added memory range
      acpi,memhotplug: enable MHP_MEMMAP_ON_MEMORY when supported
      mm,memory_hotplug: add kernel boot option to enable memmap_on_memory
      x86/Kconfig: introduce ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
      arm64/Kconfig: introduce ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE

Subsystem: mm/zswap

    Zhiyuan Dai <daizhiyuan@phytium.com.cn>:
      mm/zswap.c: switch from strlcpy to strscpy

Subsystem: mm/zsmalloc

    zhouchuangao <zhouchuangao@vivo.com>:
      mm/zsmalloc: use BUG_ON instead of if condition followed by BUG.

Subsystem: mm/highmem

    Ira Weiny <ira.weiny@intel.com>:
    Patch series "btrfs: Convert kmap/memset/kunmap to memzero_user()":
      iov_iter: lift memzero_page() to highmem.h
      btrfs: use memzero_page() instead of open coded kmap pattern

    songqiang <songqiang@uniontech.com>:
      mm/highmem.c: fix coding style issue

Subsystem: mm/cleanups

    Zhiyuan Dai <daizhiyuan@phytium.com.cn>:
      mm/mempool: minor coding style tweaks

    Zhang Yunkai <zhang.yunkai@zte.com.cn>:
      mm/process_vm_access.c: remove duplicate include

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      kfence: zero guard page after out-of-bounds access
    Patch series "kfence: optimize timer scheduling", v2:
      kfence: await for allocation using wait_event
      kfence: maximize allocation wait timeout duration
      kfence: use power-efficient work queue to run delayed work

 Documentation/ABI/testing/sysfs-kernel-mm-cma     |   25 
 Documentation/admin-guide/kernel-parameters.txt   |   17 
 Documentation/admin-guide/mm/memory-hotplug.rst   |    9 
 Documentation/admin-guide/mm/userfaultfd.rst      |  105 +-
 arch/arc/Kconfig                                  |    9 
 arch/arm/Kconfig                                  |   10 
 arch/arm64/Kconfig                                |   34 
 arch/arm64/mm/hugetlbpage.c                       |    7 
 arch/ia64/Kconfig                                 |   14 
 arch/ia64/mm/hugetlbpage.c                        |    3 
 arch/mips/Kconfig                                 |    6 
 arch/mips/mm/hugetlbpage.c                        |    4 
 arch/parisc/Kconfig                               |    5 
 arch/parisc/mm/hugetlbpage.c                      |    2 
 arch/powerpc/Kconfig                              |   17 
 arch/powerpc/mm/hugetlbpage.c                     |    3 
 arch/powerpc/platforms/Kconfig.cputype            |   16 
 arch/riscv/Kconfig                                |    5 
 arch/s390/Kconfig                                 |   12 
 arch/s390/mm/hugetlbpage.c                        |    2 
 arch/sh/Kconfig                                   |    7 
 arch/sh/mm/Kconfig                                |    8 
 arch/sh/mm/hugetlbpage.c                          |    2 
 arch/sparc/mm/hugetlbpage.c                       |    2 
 arch/x86/Kconfig                                  |   33 
 arch/x86/mm/pat/set_memory.c                      |    8 
 drivers/acpi/acpi_memhotplug.c                    |    5 
 drivers/base/memory.c                             |  105 ++
 fs/Kconfig                                        |    5 
 fs/block_dev.c                                    |    2 
 fs/btrfs/compression.c                            |    5 
 fs/btrfs/extent_io.c                              |   22 
 fs/btrfs/inode.c                                  |   33 
 fs/btrfs/reflink.c                                |    6 
 fs/btrfs/zlib.c                                   |    5 
 fs/btrfs/zstd.c                                   |    5 
 fs/buffer.c                                       |   36 
 fs/dax.c                                          |    8 
 fs/gfs2/glock.c                                   |    3 
 fs/hugetlbfs/inode.c                              |    9 
 fs/inode.c                                        |   11 
 fs/proc/task_mmu.c                                |    3 
 fs/userfaultfd.c                                  |  149 +++
 include/linux/buffer_head.h                       |    4 
 include/linux/cma.h                               |    4 
 include/linux/compaction.h                        |    1 
 include/linux/fs.h                                |    2 
 include/linux/gfp.h                               |    2 
 include/linux/highmem.h                           |    7 
 include/linux/huge_mm.h                           |    3 
 include/linux/hugetlb.h                           |   37 
 include/linux/memcontrol.h                        |   27 
 include/linux/memory.h                            |    8 
 include/linux/memory_hotplug.h                    |   15 
 include/linux/memremap.h                          |    2 
 include/linux/migrate.h                           |   11 
 include/linux/mm.h                                |   28 
 include/linux/mmzone.h                            |   20 
 include/linux/pagemap.h                           |    5 
 include/linux/pgtable.h                           |   12 
 include/linux/sched.h                             |    2 
 include/linux/sched/mm.h                          |   27 
 include/linux/shrinker.h                          |    7 
 include/linux/swap.h                              |   21 
 include/linux/userfaultfd_k.h                     |   55 +
 include/linux/vm_event_item.h                     |    8 
 include/trace/events/cma.h                        |   92 +-
 include/trace/events/migrate.h                    |   25 
 include/trace/events/mmflags.h                    |    7 
 include/uapi/linux/mempolicy.h                    |    7 
 include/uapi/linux/userfaultfd.h                  |   36 
 init/Kconfig                                      |    5 
 kernel/sysctl.c                                   |    2 
 lib/Kconfig.kfence                                |    1 
 lib/iov_iter.c                                    |    8 
 mm/Kconfig                                        |   28 
 mm/Makefile                                       |    6 
 mm/cma.c                                          |   70 +
 mm/cma.h                                          |   25 
 mm/cma_debug.c                                    |    8 
 mm/cma_sysfs.c                                    |  112 ++
 mm/compaction.c                                   |  113 ++
 mm/filemap.c                                      |   24 
 mm/frontswap.c                                    |   12 
 mm/gup.c                                          |  264 +++---
 mm/gup_test.c                                     |   29 
 mm/gup_test.h                                     |    3 
 mm/highmem.c                                      |   11 
 mm/huge_memory.c                                  |  326 +++++++-
 mm/hugetlb.c                                      |  843 ++++++++++++++--------
 mm/hugetlb_cgroup.c                               |    9 
 mm/internal.h                                     |   10 
 mm/kfence/core.c                                  |   61 +
 mm/khugepaged.c                                   |   63 -
 mm/ksm.c                                          |   17 
 mm/list_lru.c                                     |    6 
 mm/memcontrol.c                                   |  137 ---
 mm/memory_hotplug.c                               |  220 +++++
 mm/mempolicy.c                                    |   16 
 mm/mempool.c                                      |    2 
 mm/migrate.c                                      |  103 --
 mm/mlock.c                                        |    4 
 mm/mmap.c                                         |   18 
 mm/oom_kill.c                                     |    2 
 mm/page_alloc.c                                   |   83 +-
 mm/process_vm_access.c                            |    1 
 mm/shmem.c                                        |    2 
 mm/sparse.c                                       |    4 
 mm/swap.c                                         |   69 +
 mm/swap_state.c                                   |    4 
 mm/swapfile.c                                     |    4 
 mm/truncate.c                                     |   19 
 mm/userfaultfd.c                                  |   39 -
 mm/util.c                                         |   26 
 mm/vmalloc.c                                      |    2 
 mm/vmscan.c                                       |  543 +++++++++-----
 mm/vmstat.c                                       |   45 -
 mm/workingset.c                                   |    1 
 mm/zsmalloc.c                                     |    6 
 mm/zswap.c                                        |    2 
 tools/testing/selftests/vm/.gitignore             |    1 
 tools/testing/selftests/vm/Makefile               |    1 
 tools/testing/selftests/vm/gup_test.c             |   38 
 tools/testing/selftests/vm/split_huge_page_test.c |  400 ++++++++++
 tools/testing/selftests/vm/userfaultfd.c          |  164 ++++
 125 files changed, 3596 insertions(+), 1668 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-04-30  5:52 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-04-30  5:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


A few misc subsystems and some of MM.


178 patches, based on 8ca5297e7e38f2dc8c753d33a5092e7be181fff0.

Subsystems affected by this patch series:

  ia64
  kbuild
  scripts
  sh
  ocfs2
  kfifo
  vfs
  kernel/watchdog
  mm/slab-generic
  mm/slub
  mm/kmemleak
  mm/debug
  mm/pagecache
  mm/msync
  mm/gup
  mm/memremap
  mm/memcg
  mm/pagemap
  mm/mremap
  mm/dma
  mm/sparsemem
  mm/vmalloc
  mm/documentation
  mm/kasan
  mm/initialization
  mm/pagealloc
  mm/memory-failure

Subsystem: ia64

    Zhang Yunkai <zhang.yunkai@zte.com.cn>:
      arch/ia64/kernel/head.S: remove duplicate include

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      arch/ia64/kernel/fsys.S: fix typos
      arch/ia64/include/asm/pgtable.h: minor typo fixes

    Valentin Schneider <valentin.schneider@arm.com>:
      ia64: ensure proper NUMA distance and possible map initialization

    Sergei Trofimovich <slyfox@gentoo.org>:
      ia64: drop unused IA64_FW_EMU ifdef
      ia64: simplify code flow around swiotlb init

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      ia64: trivial spelling fixes

    Sergei Trofimovich <slyfox@gentoo.org>:
      ia64: fix EFI_DEBUG build
      ia64: mca: always make IA64_MCA_DEBUG an expression
      ia64: drop marked broken DISCONTIGMEM and VIRTUAL_MEM_MAP
      ia64: module: fix symbolizer crash on fdescr

Subsystem: kbuild

    Luc Van Oostenryck <luc.vanoostenryck@gmail.com>:
      include/linux/compiler-gcc.h: sparse can do constant folding of __builtin_bswap*()

Subsystem: scripts

    Tom Saeger <tom.saeger@oracle.com>:
      scripts/spelling.txt: add entries for recent discoveries

    Wan Jiabing <wanjiabing@vivo.com>:
      scripts: a new script for checking duplicate struct declaration

Subsystem: sh

    Zhang Yunkai <zhang.yunkai@zte.com.cn>:
      arch/sh/include/asm/tlb.h: remove duplicate include

Subsystem: ocfs2

    Yang Li <yang.lee@linux.alibaba.com>:
      ocfs2: replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE

    Joseph Qi <joseph.qi@linux.alibaba.com>:
      ocfs2: map flags directly in flags_to_o2dlm()

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      ocfs2: fix a typo

    Jiapeng Chong <jiapeng.chong@linux.alibaba.com>:
      ocfs2/dlm: remove unused function

Subsystem: kfifo

    Dan Carpenter <dan.carpenter@oracle.com>:
      kfifo: fix ternary sign extension bugs

Subsystem: vfs

    Randy Dunlap <rdunlap@infradead.org>:
      vfs: fs_parser: clean up kernel-doc warnings

Subsystem: kernel/watchdog

    Petr Mladek <pmladek@suse.com>:
    Patch series "watchdog/softlockup: Report overall time and some cleanup", v2:
      watchdog: rename __touch_watchdog() to a better descriptive name
      watchdog: explicitly update timestamp when reporting softlockup
      watchdog/softlockup: report the overall time of softlockups
      watchdog/softlockup: remove logic that tried to prevent repeated reports
      watchdog: fix barriers when printing backtraces from all CPUs
      watchdog: cleanup handling of false positives

Subsystem: mm/slab-generic

    Rafael Aquini <aquini@redhat.com>:
      mm/slab_common: provide "slab_merge" option for !IS_ENABLED(CONFIG_SLAB_MERGE_DEFAULT) builds

Subsystem: mm/slub

    Vlastimil Babka <vbabka@suse.cz>:
      mm, slub: enable slub_debug static key when creating cache with explicit debug flags

    Oliver Glitta <glittao@gmail.com>:
      kunit: add a KUnit test for SLUB debugging functionality
      slub: remove resiliency_test() function

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      mm/slub.c: trivial typo fixes

Subsystem: mm/kmemleak

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      mm/kmemleak.c: fix a typo

Subsystem: mm/debug

    Georgi Djakov <georgi.djakov@linaro.org>:
      mm/page_owner: record the timestamp of all pages during free

    zhongjiang-ali <zhongjiang-ali@linux.alibaba.com>:
      mm, page_owner: remove unused parameter in __set_page_owner_handle

    Sergei Trofimovich <slyfox@gentoo.org>:
      mm: page_owner: fetch backtrace only for tracked pages
      mm: page_owner: use kstrtobool() to parse bool option
      mm: page_owner: detect page_owner recursion via task_struct
      mm: page_poison: print page info when corruption is caught

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/memtest: add ARCH_USE_MEMTEST

Subsystem: mm/pagecache

    Jens Axboe <axboe@kernel.dk>:
    Patch series "Improve IOCB_NOWAIT O_DIRECT reads", v3:
      mm: provide filemap_range_needs_writeback() helper
      mm: use filemap_range_needs_writeback() for O_DIRECT reads
      iomap: use filemap_range_needs_writeback() for O_DIRECT reads

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/filemap: use filemap_read_page in filemap_fault
      mm/filemap: drop check for truncated page after I/O

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: page-writeback: simplify memcg handling in test_clear_page_writeback()

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: move page_mapping_file to pagemap.h

    Rui Sun <sunrui26@huawei.com>:
      mm/filemap: update stale comment

Subsystem: mm/msync

    Nikita Ermakov <sh1r4s3@mail.si-head.nl>:
      mm/msync: exit early when the flags is an MS_ASYNC and start < vm_start

Subsystem: mm/gup

    Joao Martins <joao.m.martins@oracle.com>:
    Patch series "mm/gup: page unpining improvements", v4:
      mm/gup: add compound page list iterator
      mm/gup: decrement head page once for group of subpages
      mm/gup: add a range variant of unpin_user_pages_dirty_lock()
      RDMA/umem: batch page unpin in __ib_umem_release()

    Yang Shi <shy828301@gmail.com>:
      mm: gup: remove FOLL_SPLIT

Subsystem: mm/memremap

    Zhiyuan Dai <daizhiyuan@phytium.com.cn>:
      mm/memremap.c: fix improper SPDX comment style

Subsystem: mm/memcg

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: fix kernel stack account

    Shakeel Butt <shakeelb@google.com>:
      memcg: cleanup root memcg checks
      memcg: enable memcg oom-kill for __GFP_NOFAIL

    Johannes Weiner <hannes@cmpxchg.org>:
    Patch series "mm: memcontrol: switch to rstat", v3:
      mm: memcontrol: fix cpuhotplug statistics flushing
      mm: memcontrol: kill mem_cgroup_nodeinfo()
      mm: memcontrol: privatize memcg_page_state query functions
      cgroup: rstat: support cgroup1
      cgroup: rstat: punt root-level optimization to individual controllers
      mm: memcontrol: switch to rstat
      mm: memcontrol: consolidate lruvec stat flushing
      kselftests: cgroup: update kmem test for new vmstat implementation

    Shakeel Butt <shakeelb@google.com>:
      memcg: charge before adding to swapcache on swapin

    Muchun Song <songmuchun@bytedance.com>:
    Patch series "Use obj_cgroup APIs to charge kmem pages", v5:
      mm: memcontrol: slab: fix obtain a reference to a freeing memcg
      mm: memcontrol: introduce obj_cgroup_{un}charge_pages
      mm: memcontrol: directly access page->memcg_data in mm/page_alloc.c
      mm: memcontrol: change ug->dummy_page only if memcg changed
      mm: memcontrol: use obj_cgroup APIs to charge kmem pages
      mm: memcontrol: inline __memcg_kmem_{un}charge() into obj_cgroup_{un}charge_pages()
      mm: memcontrol: move PageMemcgKmem to the scope of CONFIG_MEMCG_KMEM

    Wan Jiabing <wanjiabing@vivo.com>:
      linux/memcontrol.h: remove duplicate struct declaration

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: page_counter: mitigate consequences of a page_counter underflow

Subsystem: mm/pagemap

    Wang Qing <wangqing@vivo.com>:
      mm/memory.c: do_numa_page(): delete bool "migrated"

    Zhiyuan Dai <daizhiyuan@phytium.com.cn>:
      mm/interval_tree: add comments to improve code readability

    Oscar Salvador <osalvador@suse.de>:
    Patch series "Cleanup and fixups for vmemmap handling", v6:
      x86/vmemmap: drop handling of 4K unaligned vmemmap range
      x86/vmemmap: drop handling of 1GB vmemmap ranges
      x86/vmemmap: handle unpopulated sub-pmd ranges
      x86/vmemmap: optimize for consecutive sections in partial populated PMDs

    Ovidiu Panait <ovidiu.panait@windriver.com>:
      mm, tracing: improve rss_stat tracepoint message

    Christoph Hellwig <hch@lst.de>:
    Patch series "add remap_pfn_range_notrack instead of reinventing it in i915", v2:
      mm: add remap_pfn_range_notrack
      mm: add a io_mapping_map_user helper
      i915: use io_mapping_map_user
      i915: fix remap_io_sg to verify the pgprot

    Huang Ying <ying.huang@intel.com>:
      NUMA balancing: reduce TLB flush via delaying mapping on hint page fault

Subsystem: mm/mremap

    Brian Geffon <bgeffon@google.com>:
    Patch series "mm: Extend MREMAP_DONTUNMAP to non-anonymous mappings", v5:
      mm: extend MREMAP_DONTUNMAP to non-anonymous mappings
      Revert "mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aio"
      selftests: add a MREMAP_DONTUNMAP selftest for shmem

Subsystem: mm/dma

    Zhiyuan Dai <daizhiyuan@phytium.com.cn>:
      mm/dmapool: switch from strlcpy to strscpy

Subsystem: mm/sparsemem

    Wang Wensheng <wangwensheng4@huawei.com>:
      mm/sparse: add the missing sparse_buffer_fini() in error branch

Subsystem: mm/vmalloc

    Christoph Hellwig <hch@lst.de>:
    Patch series "remap_vmalloc_range cleanups":
      samples/vfio-mdev/mdpy: use remap_vmalloc_range
      mm: unexport remap_vmalloc_range_partial

    Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>:
      mm/vmalloc: use rb_tree instead of list for vread() lookups

    Nicholas Piggin <npiggin@gmail.com>:
    Patch series "huge vmalloc mappings", v13:
      ARM: mm: add missing pud_page define to 2-level page tables
      mm/vmalloc: fix HUGE_VMAP regression by enabling huge pages in vmalloc_to_page
      mm: apply_to_pte_range warn and fail if a large pte is encountered
      mm/vmalloc: rename vmap_*_range vmap_pages_*_range
      mm/ioremap: rename ioremap_*_range to vmap_*_range
      mm: HUGE_VMAP arch support cleanup
      powerpc: inline huge vmap supported functions
      arm64: inline huge vmap supported functions
      x86: inline huge vmap supported functions
      mm/vmalloc: provide fallback arch huge vmap support functions
      mm: move vmap_range from mm/ioremap.c to mm/vmalloc.c
      mm/vmalloc: add vmap_range_noflush variant
      mm/vmalloc: hugepage vmalloc mappings
    Patch series "mm/vmalloc: cleanup after hugepage series", v2:
      mm/vmalloc: remove map_kernel_range
      kernel/dma: remove unnecessary unmap_kernel_range
      powerpc/xive: remove unnecessary unmap_kernel_range
      mm/vmalloc: remove unmap_kernel_range
      mm/vmalloc: improve allocation failure error messages

    Vijayanand Jitta <vjitta@codeaurora.org>:
      mm: vmalloc: prevent use after free in _vm_unmap_aliases

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      lib/test_vmalloc.c: remove two kvfree_rcu() tests
      lib/test_vmalloc.c: add a new 'nr_threads' parameter
      vm/test_vmalloc.sh: adapt for updated driver interface
      mm/vmalloc: refactor the preloading loagic
      mm/vmalloc: remove an empty line

Subsystem: mm/documentation

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/doc: fix fault_flag_allow_retry_first kerneldoc
      mm/doc: fix page_maybe_dma_pinned kerneldoc
      mm/doc: turn fault flags into an enum
      mm/doc: add mm.h and mm_types.h to the mm-api document

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
    Patch series "kernel-doc and MAINTAINERS clean-up":
      MAINTAINERS: assign pagewalk.h to MEMORY MANAGEMENT
      pagewalk: prefix struct kernel-doc descriptions

Subsystem: mm/kasan

    Zhiyuan Dai <daizhiyuan@phytium.com.cn>:
      mm/kasan: switch from strlcpy to strscpy

    Peter Collingbourne <pcc@google.com>:
      kasan: fix kasan_byte_accessible() to be consistent with actual checks

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: initialize shadow to TAG_INVALID for SW_TAGS
      mm, kasan: don't poison boot memory with tag-based modes
    Patch series "kasan: integrate with init_on_alloc/free", v3:
      arm64: kasan: allow to init memory when setting tags
      kasan: init memory in kasan_(un)poison for HW_TAGS
      kasan, mm: integrate page_alloc init with HW_TAGS
      kasan, mm: integrate slab init_on_alloc with HW_TAGS
      kasan, mm: integrate slab init_on_free with HW_TAGS
      kasan: docs: clean up sections
      kasan: docs: update overview section
      kasan: docs: update usage section
      kasan: docs: update error reports section
      kasan: docs: update boot parameters section
      kasan: docs: update GENERIC implementation details section
      kasan: docs: update SW_TAGS implementation details section
      kasan: docs: update HW_TAGS implementation details section
      kasan: docs: update shadow memory section
      kasan: docs: update ignoring accesses section
      kasan: docs: update tests section

    Walter Wu <walter-zh.wu@mediatek.com>:
      kasan: record task_work_add() call stack

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: detect false-positives in tests

    Zqiang <qiang.zhang@windriver.com>:
      irq_work: record irq_work_queue() call stack

Subsystem: mm/initialization

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      mm: move mem_init_print_info() into mm_init()

Subsystem: mm/pagealloc

    David Hildenbrand <david@redhat.com>:
      mm/page_alloc: drop pr_info_ratelimited() in alloc_contig_range()

    Minchan Kim <minchan@kernel.org>:
      mm: remove lru_add_drain_all in alloc_contig_range

    Yu Zhao <yuzhao@google.com>:
      include/linux/page-flags-layout.h: correctly determine LAST_CPUPID_WIDTH
      include/linux/page-flags-layout.h: cleanups

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Rationalise __alloc_pages wrappers", v3:
      mm/page_alloc: rename alloc_mask to alloc_gfp
      mm/page_alloc: rename gfp_mask to gfp
      mm/page_alloc: combine __alloc_pages and __alloc_pages_nodemask
      mm/mempolicy: rename alloc_pages_current to alloc_pages
      mm/mempolicy: rewrite alloc_pages documentation
      mm/mempolicy: rewrite alloc_pages_vma documentation
      mm/mempolicy: fix mpol_misplaced kernel-doc

    Minchan Kim <minchan@kernel.org>:
      mm: page_alloc: dump migrate-failed pages

    Geert Uytterhoeven <geert@linux-m68k.org>:
      mm/Kconfig: remove default DISCONTIGMEM_MANUAL

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      mm, page_alloc: avoid page_to_pfn() in move_freepages()

    zhouchuangao <zhouchuangao@vivo.com>:
      mm/page_alloc: duplicate include linux/vmalloc.h

    Mel Gorman <mgorman@techsingularity.net>:
    Patch series "Introduce a bulk order-0 page allocator with two in-tree users", v6:
      mm/page_alloc: rename alloced to allocated
      mm/page_alloc: add a bulk page allocator
      mm/page_alloc: add an array-based interface to the bulk page allocator

    Jesper Dangaard Brouer <brouer@redhat.com>:
      mm/page_alloc: optimize code layout for __alloc_pages_bulk
      mm/page_alloc: inline __rmqueue_pcplist

    Chuck Lever <chuck.lever@oracle.com>:
    Patch series "SUNRPC consumer for the bulk page allocator":
      SUNRPC: set rq_page_end differently
      SUNRPC: refresh rq_pages using a bulk page allocator

    Jesper Dangaard Brouer <brouer@redhat.com>:
      net: page_pool: refactor dma_map into own function page_pool_dma_map
      net: page_pool: use alloc_pages_bulk in refill code path

    Sergei Trofimovich <slyfox@gentoo.org>:
      mm: page_alloc: ignore init_on_free=1 for debug_pagealloc=1

    huxiang <huxiang@uniontech.com>:
      mm/page_alloc: redundant definition variables of pfn in for loop

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/mmzone.h: fix existing kernel-doc comments and link them to core-api

Subsystem: mm/memory-failure

    Jane Chu <jane.chu@oracle.com>:
      mm/memory-failure: unnecessary amount of unmapping

 Documentation/admin-guide/kernel-parameters.txt |    7 
 Documentation/admin-guide/mm/transhuge.rst      |    2 
 Documentation/core-api/cachetlb.rst             |    4 
 Documentation/core-api/mm-api.rst               |    6 
 Documentation/dev-tools/kasan.rst               |  355 +++++-----
 Documentation/vm/page_owner.rst                 |    2 
 Documentation/vm/transhuge.rst                  |    5 
 MAINTAINERS                                     |    1 
 arch/Kconfig                                    |   11 
 arch/alpha/mm/init.c                            |    1 
 arch/arc/mm/init.c                              |    1 
 arch/arm/Kconfig                                |    1 
 arch/arm/include/asm/pgtable-3level.h           |    2 
 arch/arm/include/asm/pgtable.h                  |    3 
 arch/arm/mm/copypage-v4mc.c                     |    1 
 arch/arm/mm/copypage-v6.c                       |    1 
 arch/arm/mm/copypage-xscale.c                   |    1 
 arch/arm/mm/init.c                              |    2 
 arch/arm64/Kconfig                              |    1 
 arch/arm64/include/asm/memory.h                 |    4 
 arch/arm64/include/asm/mte-kasan.h              |   39 -
 arch/arm64/include/asm/vmalloc.h                |   38 -
 arch/arm64/mm/init.c                            |    4 
 arch/arm64/mm/mmu.c                             |   36 -
 arch/csky/abiv1/cacheflush.c                    |    1 
 arch/csky/mm/init.c                             |    1 
 arch/h8300/mm/init.c                            |    2 
 arch/hexagon/mm/init.c                          |    1 
 arch/ia64/Kconfig                               |   23 
 arch/ia64/configs/bigsur_defconfig              |    1 
 arch/ia64/include/asm/meminit.h                 |   11 
 arch/ia64/include/asm/module.h                  |    6 
 arch/ia64/include/asm/page.h                    |   25 
 arch/ia64/include/asm/pgtable.h                 |    7 
 arch/ia64/kernel/Makefile                       |    2 
 arch/ia64/kernel/acpi.c                         |    7 
 arch/ia64/kernel/efi.c                          |   11 
 arch/ia64/kernel/fsys.S                         |    4 
 arch/ia64/kernel/head.S                         |    6 
 arch/ia64/kernel/ia64_ksyms.c                   |   12 
 arch/ia64/kernel/machine_kexec.c                |    2 
 arch/ia64/kernel/mca.c                          |    4 
 arch/ia64/kernel/module.c                       |   29 
 arch/ia64/kernel/pal.S                          |    6 
 arch/ia64/mm/Makefile                           |    1 
 arch/ia64/mm/contig.c                           |    4 
 arch/ia64/mm/discontig.c                        |   21 
 arch/ia64/mm/fault.c                            |   15 
 arch/ia64/mm/init.c                             |  221 ------
 arch/m68k/mm/init.c                             |    1 
 arch/microblaze/mm/init.c                       |    1 
 arch/mips/Kconfig                               |    1 
 arch/mips/loongson64/numa.c                     |    1 
 arch/mips/mm/cache.c                            |    1 
 arch/mips/mm/init.c                             |    1 
 arch/mips/sgi-ip27/ip27-memory.c                |    1 
 arch/nds32/mm/init.c                            |    1 
 arch/nios2/mm/cacheflush.c                      |    1 
 arch/nios2/mm/init.c                            |    1 
 arch/openrisc/mm/init.c                         |    2 
 arch/parisc/mm/init.c                           |    2 
 arch/powerpc/Kconfig                            |    1 
 arch/powerpc/include/asm/vmalloc.h              |   34 -
 arch/powerpc/kernel/isa-bridge.c                |    4 
 arch/powerpc/kernel/pci_64.c                    |    2 
 arch/powerpc/mm/book3s64/radix_pgtable.c        |   29 
 arch/powerpc/mm/ioremap.c                       |    2 
 arch/powerpc/mm/mem.c                           |    1 
 arch/powerpc/sysdev/xive/common.c               |    4 
 arch/riscv/mm/init.c                            |    1 
 arch/s390/mm/init.c                             |    2 
 arch/sh/include/asm/tlb.h                       |   10 
 arch/sh/mm/cache-sh4.c                          |    1 
 arch/sh/mm/cache-sh7705.c                       |    1 
 arch/sh/mm/init.c                               |    1 
 arch/sparc/include/asm/pgtable_32.h             |    3 
 arch/sparc/mm/init_32.c                         |    2 
 arch/sparc/mm/init_64.c                         |    1 
 arch/sparc/mm/tlb.c                             |    1 
 arch/um/kernel/mem.c                            |    1 
 arch/x86/Kconfig                                |    1 
 arch/x86/include/asm/vmalloc.h                  |   42 -
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c       |    2 
 arch/x86/mm/init_32.c                           |    2 
 arch/x86/mm/init_64.c                           |  222 ++++--
 arch/x86/mm/ioremap.c                           |   33 
 arch/x86/mm/pgtable.c                           |   13 
 arch/xtensa/Kconfig                             |    1 
 arch/xtensa/mm/init.c                           |    1 
 block/blk-cgroup.c                              |   17 
 drivers/gpu/drm/i915/Kconfig                    |    1 
 drivers/gpu/drm/i915/gem/i915_gem_mman.c        |    9 
 drivers/gpu/drm/i915/i915_drv.h                 |    3 
 drivers/gpu/drm/i915/i915_mm.c                  |  117 ---
 drivers/infiniband/core/umem.c                  |   12 
 drivers/pci/pci.c                               |    2 
 fs/aio.c                                        |    5 
 fs/fs_parser.c                                  |    2 
 fs/iomap/direct-io.c                            |   24 
 fs/ocfs2/blockcheck.c                           |    2 
 fs/ocfs2/dlm/dlmrecovery.c                      |    7 
 fs/ocfs2/stack_o2cb.c                           |   36 -
 fs/ocfs2/stackglue.c                            |    2 
 include/linux/compiler-gcc.h                    |    8 
 include/linux/fs.h                              |    2 
 include/linux/gfp.h                             |   45 -
 include/linux/io-mapping.h                      |    3 
 include/linux/io.h                              |    9 
 include/linux/kasan.h                           |   51 +
 include/linux/memcontrol.h                      |  271 ++++----
 include/linux/mm.h                              |   50 -
 include/linux/mmzone.h                          |   43 -
 include/linux/page-flags-layout.h               |   64 -
 include/linux/pagemap.h                         |   10 
 include/linux/pagewalk.h                        |    4 
 include/linux/sched.h                           |    4 
 include/linux/slab.h                            |    2 
 include/linux/slub_def.h                        |    2 
 include/linux/vmalloc.h                         |   73 +-
 include/linux/vmstat.h                          |   24 
 include/net/page_pool.h                         |    2 
 include/trace/events/kmem.h                     |   24 
 init/main.c                                     |    2 
 kernel/cgroup/cgroup.c                          |   34 -
 kernel/cgroup/rstat.c                           |   61 +
 kernel/dma/remap.c                              |    1 
 kernel/fork.c                                   |   13 
 kernel/irq_work.c                               |    7 
 kernel/task_work.c                              |    3 
 kernel/watchdog.c                               |  102 +--
 lib/Kconfig.debug                               |   14 
 lib/Makefile                                    |    1 
 lib/test_kasan.c                                |   59 -
 lib/test_slub.c                                 |  124 +++
 lib/test_vmalloc.c                              |  128 +--
 mm/Kconfig                                      |    4 
 mm/Makefile                                     |    1 
 mm/debug_vm_pgtable.c                           |    4 
 mm/dmapool.c                                    |    2 
 mm/filemap.c                                    |   61 +
 mm/gup.c                                        |  145 +++-
 mm/hugetlb.c                                    |    2 
 mm/internal.h                                   |   25 
 mm/interval_tree.c                              |    2 
 mm/io-mapping.c                                 |   29 
 mm/ioremap.c                                    |  361 ++--------
 mm/kasan/common.c                               |   53 -
 mm/kasan/generic.c                              |   12 
 mm/kasan/kasan.h                                |   28 
 mm/kasan/report_generic.c                       |    2 
 mm/kasan/shadow.c                               |   10 
 mm/kasan/sw_tags.c                              |   12 
 mm/kmemleak.c                                   |    2 
 mm/memcontrol.c                                 |  798 ++++++++++++------------
 mm/memory-failure.c                             |    2 
 mm/memory.c                                     |  191 +++--
 mm/mempolicy.c                                  |   78 --
 mm/mempool.c                                    |    4 
 mm/memremap.c                                   |    2 
 mm/migrate.c                                    |    2 
 mm/mm_init.c                                    |    4 
 mm/mmap.c                                       |    6 
 mm/mremap.c                                     |    6 
 mm/msync.c                                      |    6 
 mm/page-writeback.c                             |    9 
 mm/page_alloc.c                                 |  430 +++++++++---
 mm/page_counter.c                               |    8 
 mm/page_owner.c                                 |   68 --
 mm/page_poison.c                                |    6 
 mm/percpu-vm.c                                  |    7 
 mm/slab.c                                       |   43 -
 mm/slab.h                                       |   24 
 mm/slab_common.c                                |   10 
 mm/slub.c                                       |  215 ++----
 mm/sparse.c                                     |    1 
 mm/swap_state.c                                 |   13 
 mm/util.c                                       |   10 
 mm/vmalloc.c                                    |  728 ++++++++++++++++-----
 net/core/page_pool.c                            |  127 ++-
 net/sunrpc/svc_xprt.c                           |   38 -
 samples/kfifo/bytestream-example.c              |    8 
 samples/kfifo/inttype-example.c                 |    8 
 samples/kfifo/record-example.c                  |    8 
 samples/vfio-mdev/mdpy.c                        |    4 
 scripts/checkdeclares.pl                        |   53 +
 scripts/spelling.txt                            |   26 
 tools/testing/selftests/cgroup/test_kmem.c      |   22 
 tools/testing/selftests/vm/mremap_dontunmap.c   |   52 +
 tools/testing/selftests/vm/test_vmalloc.sh      |   21 
 189 files changed, 3642 insertions(+), 3013 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-04-23 21:28 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-04-23 21:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


5 patches, based on 5bfc75d92efd494db37f5c4c173d3639d4772966.

Subsystems affected by this patch series:

  coda
  overlayfs
  mm/pagecache
  mm/memcg

Subsystem: coda

    Christian König <christian.koenig@amd.com>:
      coda: fix reference counting in coda_file_mmap error path

Subsystem: overlayfs

    Christian König <christian.koenig@amd.com>:
      ovl: fix reference counting in ovl_mmap error path

Subsystem: mm/pagecache

    Hugh Dickins <hughd@google.com>:
      mm/filemap: fix find_lock_entries hang on 32-bit THP
      mm/filemap: fix mapping_seek_hole_data on THP & 32-bit

Subsystem: mm/memcg

    Vasily Averin <vvs@virtuozzo.com>:
      tools/cgroup/slabinfo.py: updated to work on current kernel

 fs/coda/file.c                 |    6 +++---
 fs/overlayfs/file.c            |   11 +----------
 mm/filemap.c                   |   31 +++++++++++++++++++------------
 tools/cgroup/memcg_slabinfo.py |    8 ++++----
 4 files changed, 27 insertions(+), 29 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-04-16 22:45 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-04-16 22:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

12 patches, based on 06c2aac4014c38247256fe49c61b7f55890271e7.

Subsystems affected by this patch series:

  mm/documentation
  mm/kasan
  csky
  ia64
  mm/pagemap
  gcov
  lib

Subsystem: mm/documentation

    Randy Dunlap <rdunlap@infradead.org>:
      mm: eliminate "expecting prototype" kernel-doc warnings

Subsystem: mm/kasan

    Arnd Bergmann <arnd@arndb.de>:
      kasan: fix hwasan build for gcc

    Walter Wu <walter-zh.wu@mediatek.com>:
      kasan: remove redundant config option

Subsystem: csky

    Randy Dunlap <rdunlap@infradead.org>:
      csky: change a Kconfig symbol name to fix e1000 build error

Subsystem: ia64

    Randy Dunlap <rdunlap@infradead.org>:
      ia64: remove duplicate entries in generic_defconfig
      ia64: fix discontig.c section mismatches

    John Paul Adrian Glaubitz <glaubitz () physik ! fu-berlin ! de>:
      ia64: tools: remove inclusion of ia64-specific version of errno.h header

    John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>:
      ia64: tools: remove duplicate definition of ia64_mf() on ia64

Subsystem: mm/pagemap

    Zack Rusin <zackr@vmware.com>:
      mm/mapping_dirty_helpers: guard hugepage pud's usage

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      mm: ptdump: fix build failure

Subsystem: gcov

    Johannes Berg <johannes.berg@intel.com>:
      gcov: clang: fix clang-11+ build

Subsystem: lib

    Randy Dunlap <rdunlap@infradead.org>:
      lib: remove "expecting prototype" kernel-doc warnings

 arch/arm64/kernel/sleep.S             |    2 +-
 arch/csky/Kconfig                     |    2 +-
 arch/csky/include/asm/page.h          |    2 +-
 arch/ia64/configs/generic_defconfig   |    2 --
 arch/ia64/mm/discontig.c              |    6 +++---
 arch/x86/kernel/acpi/wakeup_64.S      |    2 +-
 include/linux/kasan.h                 |    2 +-
 kernel/gcov/clang.c                   |    2 +-
 lib/Kconfig.kasan                     |    9 ++-------
 lib/earlycpio.c                       |    4 ++--
 lib/lru_cache.c                       |    3 ++-
 lib/parman.c                          |    4 ++--
 lib/radix-tree.c                      |   11 ++++++-----
 mm/kasan/common.c                     |    2 +-
 mm/kasan/kasan.h                      |    2 +-
 mm/kasan/report_generic.c             |    2 +-
 mm/mapping_dirty_helpers.c            |    2 ++
 mm/mmu_gather.c                       |   29 +++++++++++++++++++----------
 mm/oom_kill.c                         |    2 +-
 mm/ptdump.c                           |    2 +-
 mm/shuffle.c                          |    4 ++--
 scripts/Makefile.kasan                |   22 ++++++++++++++--------
 security/Kconfig.hardening            |    4 ++--
 tools/arch/ia64/include/asm/barrier.h |    3 ---
 tools/include/uapi/asm/errno.h        |    2 --
 25 files changed, 67 insertions(+), 60 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-04-09 20:26 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-04-09 20:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

16 patches, based on 17e7124aad766b3f158943acb51467f86220afe9.

Subsystems affected by this patch series:

  MAINTAINERS
  mailmap
  mm/kasan
  mm/gup
  nds32
  gcov
  ocfs2
  ia64
  mm/pagecache
  mm/kasan
  mm/kfence
  lib

Subsystem: MAINTAINERS

    Marek Behún <kabel@kernel.org>:
      MAINTAINERS: update CZ.NIC's Turris information
      treewide: change my e-mail address, fix my name

Subsystem: mailmap

    Jordan Crouse <jordan@cosmicpenguin.net>:
      mailmap: update email address for Jordan Crouse

    Matthew Wilcox <willy@infradead.org>:
      .mailmap: fix old email addresses

Subsystem: mm/kasan

    Arnd Bergmann <arnd@arndb.de>:
      kasan: fix hwasan build for gcc

    Walter Wu <walter-zh.wu@mediatek.com>:
      kasan: remove redundant config option

Subsystem: mm/gup

    Aili Yao <yaoaili@kingsoft.com>:
      mm/gup: check page posion status for coredump.

Subsystem: nds32

    Mike Rapoport <rppt@linux.ibm.com>:
      nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff

Subsystem: gcov

    Nick Desaulniers <ndesaulniers@google.com>:
      gcov: re-fix clang-11+ support

Subsystem: ocfs2

    Wengang Wang <wen.gang.wang@oracle.com>:
      ocfs2: fix deadlock between setattr and dio_end_io_write

Subsystem: ia64

    Sergei Trofimovich <slyfox@gentoo.org>:
      ia64: fix user_stack_pointer() for ptrace()

Subsystem: mm/pagecache

    Jack Qiu <jack.qiu@huawei.com>:
      fs: direct-io: fix missing sdio->boundary

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: fix conflict with page poisoning

    Andrew Morton <akpm@linux-foundation.org>:
      lib/test_kasan_module.c: suppress unused var warning

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      kfence, x86: fix preemptible warning on KPTI-enabled systems

Subsystem: lib

    Julian Braha <julianbraha@gmail.com>:
      lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS

 .mailmap                                                            |    7 ++
 Documentation/ABI/testing/debugfs-moxtet                            |    4 -
 Documentation/ABI/testing/debugfs-turris-mox-rwtm                   |    2 
 Documentation/ABI/testing/sysfs-bus-moxtet-devices                  |    6 +-
 Documentation/ABI/testing/sysfs-class-led-driver-turris-omnia       |    2 
 Documentation/ABI/testing/sysfs-firmware-turris-mox-rwtm            |   10 +--
 Documentation/devicetree/bindings/leds/cznic,turris-omnia-leds.yaml |    2 
 MAINTAINERS                                                         |   13 +++-
 arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts              |    2 
 arch/arm64/kernel/sleep.S                                           |    2 
 arch/ia64/include/asm/ptrace.h                                      |    8 --
 arch/nds32/mm/cacheflush.c                                          |    2 
 arch/x86/include/asm/kfence.h                                       |    7 ++
 arch/x86/kernel/acpi/wakeup_64.S                                    |    2 
 drivers/bus/moxtet.c                                                |    4 -
 drivers/firmware/turris-mox-rwtm.c                                  |    4 -
 drivers/gpio/gpio-moxtet.c                                          |    4 -
 drivers/leds/leds-turris-omnia.c                                    |    4 -
 drivers/mailbox/armada-37xx-rwtm-mailbox.c                          |    4 -
 drivers/watchdog/armada_37xx_wdt.c                                  |    4 -
 fs/direct-io.c                                                      |    5 +
 fs/ocfs2/aops.c                                                     |   11 ---
 fs/ocfs2/file.c                                                     |    8 ++
 include/dt-bindings/bus/moxtet.h                                    |    2 
 include/linux/armada-37xx-rwtm-mailbox.h                            |    2 
 include/linux/kasan.h                                               |    2 
 include/linux/moxtet.h                                              |    2 
 kernel/gcov/clang.c                                                 |   29 ++++++----
 lib/Kconfig.debug                                                   |    6 +-
 lib/Kconfig.kasan                                                   |    9 ---
 lib/test_kasan_module.c                                             |    2 
 mm/gup.c                                                            |    4 +
 mm/internal.h                                                       |   20 ++++++
 mm/kasan/common.c                                                   |    2 
 mm/kasan/kasan.h                                                    |    2 
 mm/kasan/report_generic.c                                           |    2 
 mm/page_poison.c                                                    |    4 +
 scripts/Makefile.kasan                                              |   18 ++++--
 security/Kconfig.hardening                                          |    4 -
 39 files changed, 136 insertions(+), 91 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-03-25  4:36 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-03-25  4:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


14 patches, based on 7acac4b3196caee5e21fb5ea53f8bc124e6a16fc.

Subsystems affected by this patch series:

  mm/hugetlb
  mm/kasan
  mm/gup
  mm/selftests
  mm/z3fold
  squashfs
  ia64
  gcov
  mm/kfence
  mm/memblock
  mm/highmem
  mailmap

Subsystem: mm/hugetlb

    Miaohe Lin <linmiaohe@huawei.com>:
      hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: fix per-page tags for non-page_alloc pages

Subsystem: mm/gup

    Sean Christopherson <seanjc@google.com>:
      mm/mmu_notifiers: ensure range_end() is paired with range_start()

Subsystem: mm/selftests

    Rong Chen <rong.a.chen@intel.com>:
      selftests/vm: fix out-of-tree build

Subsystem: mm/z3fold

    Thomas Hebb <tommyhebb@gmail.com>:
      z3fold: prevent reclaim/free race for headless pages

Subsystem: squashfs

    Sean Nyekjaer <sean@geanix.com>:
      squashfs: fix inode lookup sanity checks

    Phillip Lougher <phillip@squashfs.org.uk>:
      squashfs: fix xattr id and id lookup sanity checks

Subsystem: ia64

    Sergei Trofimovich <slyfox@gentoo.org>:
      ia64: mca: allocate early mca with GFP_ATOMIC
      ia64: fix format strings for err_inject

Subsystem: gcov

    Nick Desaulniers <ndesaulniers@google.com>:
      gcov: fix clang-11+ support

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      kfence: make compatible with kmemleak

Subsystem: mm/memblock

    Mike Rapoport <rppt@linux.ibm.com>:
      mm: memblock: fix section mismatch warning again

Subsystem: mm/highmem

    Ira Weiny <ira.weiny@intel.com>:
      mm/highmem: fix CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP

Subsystem: mailmap

    Andrey Konovalov <andreyknvl@google.com>:
      mailmap: update Andrey Konovalov's email address

 .mailmap                            |    1 
 arch/ia64/kernel/err_inject.c       |   22 +++++------
 arch/ia64/kernel/mca.c              |    2 -
 fs/squashfs/export.c                |    8 +++-
 fs/squashfs/id.c                    |    6 ++-
 fs/squashfs/squashfs_fs.h           |    1 
 fs/squashfs/xattr_id.c              |    6 ++-
 include/linux/hugetlb_cgroup.h      |   15 ++++++-
 include/linux/memblock.h            |    4 +-
 include/linux/mm.h                  |   18 +++++++--
 include/linux/mmu_notifier.h        |   10 ++---
 kernel/gcov/clang.c                 |   69 ++++++++++++++++++++++++++++++++++++
 mm/highmem.c                        |    4 +-
 mm/hugetlb.c                        |   41 +++++++++++++++++++--
 mm/hugetlb_cgroup.c                 |   10 ++++-
 mm/kfence/core.c                    |    9 ++++
 mm/kmemleak.c                       |    3 +
 mm/mmu_notifier.c                   |   23 ++++++++++++
 mm/z3fold.c                         |   16 +++++++-
 tools/testing/selftests/vm/Makefile |    4 +-
 20 files changed, 230 insertions(+), 42 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-03-13  5:06 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-03-13  5:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


29 patches, based on f78d76e72a4671ea52d12752d92077788b4f5d50.

Subsystems affected by this patch series:

  mm/memblock
  core-kernel
  kconfig
  mm/pagealloc
  fork
  mm/hugetlb
  mm/highmem
  binfmt
  MAINTAINERS
  kbuild
  mm/kfence
  mm/oom-kill
  mm/madvise
  mm/kasan
  mm/userfaultfd
  mm/memory-failure
  ia64
  mm/memcg
  mm/zram

Subsystem: mm/memblock

    Arnd Bergmann <arnd@arndb.de>:
      memblock: fix section mismatch warning

Subsystem: core-kernel

    Arnd Bergmann <arnd@arndb.de>:
      stop_machine: mark helpers __always_inline

Subsystem: kconfig

    Masahiro Yamada <masahiroy@kernel.org>:
      init/Kconfig: make COMPILE_TEST depend on HAS_IOMEM

Subsystem: mm/pagealloc

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/page_alloc.c: refactor initialization of struct page for holes in memory layout

Subsystem: fork

    Fenghua Yu <fenghua.yu@intel.com>:
      mm/fork: clear PASID for new mm

Subsystem: mm/hugetlb

    Peter Xu <peterx@redhat.com>:
    Patch series "mm/hugetlb: Early cow on fork, and a few cleanups", v5:
      hugetlb: dedup the code to add a new file_region
      hugetlb: break earlier in add_reservation_in_range() when we can
      mm: introduce page_needs_cow_for_dma() for deciding whether cow
      mm: use is_cow_mapping() across tree where proper
      hugetlb: do early cow when page pinned on src mm

Subsystem: mm/highmem

    OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>:
      mm/highmem.c: fix zero_user_segments() with start > end

Subsystem: binfmt

    Lior Ribak <liorribak@gmail.com>:
      binfmt_misc: fix possible deadlock in bm_register_write

Subsystem: MAINTAINERS

    Vlastimil Babka <vbabka@suse.cz>:
      MAINTAINERS: exclude uapi directories in API/ABI section

Subsystem: kbuild

    Arnd Bergmann <arnd@arndb.de>:
      linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*

Subsystem: mm/kfence

    Marco Elver <elver@google.com>:
      kfence: fix printk format for ptrdiff_t
      kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations
      kfence: fix reports if constant function prefixes exist

Subsystem: mm/oom-kill

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      include/linux/sched/mm.h: use rcu_dereference in in_vfork()

Subsystem: mm/madvise

    Suren Baghdasaryan <surenb@google.com>:
      mm/madvise: replace ptrace attach requirement for process_madvise

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC
      kasan: fix KASAN_STACK dependency for HW_TAGS

Subsystem: mm/userfaultfd

    Nadav Amit <namit@vmware.com>:
      mm/userfaultfd: fix memory corruption due to writeprotect

Subsystem: mm/memory-failure

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm, hwpoison: do not lock page again when me_huge_page() successfully recovers

Subsystem: ia64

    Sergei Trofimovich <slyfox@gentoo.org>:
      ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls
      ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign

Subsystem: mm/memcg

    Zhou Guanghui <zhouguanghui1@huawei.com>:
      mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument
      mm/memcg: set memcg when splitting page

Subsystem: mm/zram

    Minchan Kim <minchan@kernel.org>:
      zram: fix return value on writeback_store
      zram: fix broken page writeback

 MAINTAINERS                                |    4 
 arch/ia64/include/asm/syscall.h            |    2 
 arch/ia64/kernel/ptrace.c                  |   24 +++-
 drivers/block/zram/zram_drv.c              |   17 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c |    4 
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c   |    2 
 fs/binfmt_misc.c                           |   29 ++---
 fs/proc/task_mmu.c                         |    2 
 include/linux/compiler-clang.h             |    6 +
 include/linux/memblock.h                   |    4 
 include/linux/memcontrol.h                 |    6 -
 include/linux/mm.h                         |   21 +++
 include/linux/mm_types.h                   |    1 
 include/linux/sched/mm.h                   |    3 
 include/linux/stop_machine.h               |   11 +
 init/Kconfig                               |    3 
 kernel/fork.c                              |    8 +
 lib/Kconfig.kasan                          |    1 
 mm/highmem.c                               |   17 ++
 mm/huge_memory.c                           |   10 -
 mm/hugetlb.c                               |  123 +++++++++++++++------
 mm/internal.h                              |    5 
 mm/kfence/report.c                         |   30 +++--
 mm/madvise.c                               |   13 ++
 mm/memcontrol.c                            |   15 +-
 mm/memory-failure.c                        |    4 
 mm/memory.c                                |   16 +-
 mm/page_alloc.c                            |  167 ++++++++++++++---------------
 mm/slab.c                                  |    2 
 29 files changed, 334 insertions(+), 216 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-02-26 17:55 ` incoming Linus Torvalds
@ 2021-02-26 19:16   ` Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-02-26 19:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, Linux-MM

On Fri, 26 Feb 2021 09:55:27 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, Feb 25, 2021 at 5:14 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > - The rest of MM.
> >
> >   Includes kfence - another runtime memory validator.  Not as
> >   thorough as KASAN, but it has unmeasurable overhead and is intended
> >   to be usable in production builds.
> >
> > - Everything else
> 
> Just to clarify: you have nothing else really pending?

Yes, that's it from me for -rc1.




^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-02-26  1:14 incoming Andrew Morton
@ 2021-02-26 17:55 ` Linus Torvalds
  2021-02-26 19:16   ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2021-02-26 17:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Thu, Feb 25, 2021 at 5:14 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> - The rest of MM.
>
>   Includes kfence - another runtime memory validator.  Not as
>   thorough as KASAN, but it has unmeasurable overhead and is intended
>   to be usable in production builds.
>
> - Everything else

Just to clarify: you have nothing else really pending?

I'm hoping to just do -rc1 this weekend after all - despite my late
start due to loss of power for several days.

I'll allow late stragglers with good reason through, but the fewer of
those there are, the better, of course.

Thanks,
                   Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-02-26  1:14 Andrew Morton
  2021-02-26 17:55 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-02-26  1:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- The rest of MM.

  Includes kfence - another runtime memory validator.  Not as
  thorough as KASAN, but it has unmeasurable overhead and is intended
  to be usable in production builds.

- Everything else


118 patches, based on 6fbd6cf85a3be127454a1ad58525a3adcf8612ab.

Subsystems affected by this patch series:

  mm/thp
  mm/cma
  mm/vmstat
  mm/memory-hotplug
  mm/mlock
  mm/rmap
  mm/zswap
  mm/zsmalloc
  mm/cleanups
  mm/kfence
  mm/kasan2
  alpha
  procfs
  sysctl
  misc
  core-kernel
  MAINTAINERS
  lib
  bitops
  checkpatch
  init
  coredump
  seq_file
  gdb
  ubsan
  initramfs
  mm/pagemap2

Subsystem: mm/thp

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Overhaul multi-page lookups for THP", v4:
      mm: make pagecache tagged lookups return only head pages
      mm/shmem: use pagevec_lookup in shmem_unlock_mapping
      mm/swap: optimise get_shadow_from_swap_cache
      mm: add FGP_ENTRY
      mm/filemap: rename find_get_entry to mapping_get_entry
      mm/filemap: add helper for finding pages
      mm/filemap: add mapping_seek_hole_data
      iomap: use mapping_seek_hole_data
      mm: add and use find_lock_entries
      mm: add an 'end' parameter to find_get_entries
      mm: add an 'end' parameter to pagevec_lookup_entries
      mm: remove nr_entries parameter from pagevec_lookup_entries
      mm: pass pvec directly to find_get_entries
      mm: remove pagevec_lookup_entries

    Rik van Riel <riel@surriel.com>:
    Patch series "mm,thp,shm: limit shmem THP alloc gfp_mask", v6:
      mm,thp,shmem: limit shmem THP alloc gfp_mask
      mm,thp,shm: limit gfp mask to no more than specified
      mm,thp,shmem: make khugepaged obey tmpfs mount flags
      mm,shmem,thp: limit shmem THP allocations to requested zones

Subsystem: mm/cma

    Roman Gushchin <guro@fb.com>:
      mm: cma: allocate cma areas bottom-up

    David Hildenbrand <david@redhat.com>:
      mm/cma: expose all pages to the buddy if activation of an area fails
      mm/page_alloc: count CMA pages per zone and print them in /proc/zoneinfo

    Patrick Daly <pdaly@codeaurora.org>:
      mm: cma: print region name on failure

Subsystem: mm/vmstat

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: vmstat: fix NOHZ wakeups for node stat changes
      mm: vmstat: add some comments on internal storage of byte items

    Jiang Biao <benbjiang@tencent.com>:
      mm/vmstat.c: erase latency in vmstat_shepherd

Subsystem: mm/memory-hotplug

    Dan Williams <dan.j.williams@intel.com>:
    Patch series "mm: Fix pfn_to_online_page() with respect to ZONE_DEVICE", v4:
      mm: move pfn_to_online_page() out of line
      mm: teach pfn_to_online_page() to consider subsection validity
      mm: teach pfn_to_online_page() about ZONE_DEVICE section collisions
      mm: fix memory_failure() handling of dax-namespace metadata

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/memory_hotplug: rename all existing 'memhp' into 'mhp'

    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug: MEMHP_MERGE_RESOURCE -> MHP_MERGE_RESOURCE

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/memory_hotplug: use helper function zone_end_pfn() to get end_pfn

    David Hildenbrand <david@redhat.com>:
      drivers/base/memory: don't store phys_device in memory blocks
      Documentation: sysfs/memory: clarify some memory block device properties

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm/memory_hotplug: Pre-validate the address range with platform", v5:
      mm/memory_hotplug: prevalidate the address range being added with platform
      arm64/mm: define arch_get_mappable_range()
      s390/mm: define arch_get_mappable_range()

    David Hildenbrand <david@redhat.com>:
      virtio-mem: check against mhp_get_pluggable_range() which memory we can hotplug

Subsystem: mm/mlock

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/mlock: stop counting mlocked pages when none vma is found

Subsystem: mm/rmap

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/rmap: correct some obsolete comments of anon_vma
      mm/rmap: remove unneeded semicolon in page_not_mapped()
      mm/rmap: fix obsolete comment in __page_check_anon_rmap()
      mm/rmap: use page_not_mapped in try_to_unmap()
      mm/rmap: correct obsolete comment of page_get_anon_vma()
      mm/rmap: fix potential pte_unmap on an not mapped pte

Subsystem: mm/zswap

    Randy Dunlap <rdunlap@infradead.org>:
      mm: zswap: clean up confusing comment

    Tian Tao <tiantao6@hisilicon.com>:
    Patch series "Fix the compatibility of zsmalloc and zswap":
      mm/zswap: add the flag can_sleep_mapped
      mm: set the sleep_mapped to true for zbud and z3fold

Subsystem: mm/zsmalloc

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/zsmalloc.c: convert to use kmem_cache_zalloc in cache_alloc_zspage()

    Rokudo Yan <wu-yan@tcl.com>:
      zsmalloc: account the number of compacted pages correctly

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/zsmalloc.c: use page_private() to access page->private

Subsystem: mm/cleanups

    Guo Ren <guoren@linux.alibaba.com>:
      mm: page-flags.h: Typo fix (It -> If)

    Daniel Vetter <daniel.vetter@ffwll.ch>:
      mm/dmapool: use might_alloc()
      mm/backing-dev.c: use might_alloc()

    Stephen Zhang <stephenzhangzsd@gmail.com>:
      mm/early_ioremap.c: use __func__ instead of function name

Subsystem: mm/kfence

    Alexander Potapenko <glider@google.com>:
    Patch series "KFENCE: A low-overhead sampling-based memory safety error detector", v7:
      mm: add Kernel Electric-Fence infrastructure
      x86, kfence: enable KFENCE for x86

    Marco Elver <elver@google.com>:
      arm64, kfence: enable KFENCE for ARM64
      kfence: use pt_regs to generate stack trace on faults

    Alexander Potapenko <glider@google.com>:
      mm, kfence: insert KFENCE hooks for SLAB
      mm, kfence: insert KFENCE hooks for SLUB
      kfence, kasan: make KFENCE compatible with KASAN

    Marco Elver <elver@google.com>:
      kfence, Documentation: add KFENCE documentation
      kfence: add test suite
      MAINTAINERS: add entry for KFENCE
      kfence: report sensitive information based on no_hash_pointers

    Alexander Potapenko <glider@google.com>:
    Patch series "Add error_report_end tracepoint to KFENCE and KASAN", v3:
      tracing: add error_report_end trace point
      kfence: use error_report_end tracepoint
      kasan: use error_report_end tracepoint

Subsystem: mm/kasan2

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kasan: optimizations and fixes for HW_TAGS", v4:
      kasan, mm: don't save alloc stacks twice
      kasan, mm: optimize kmalloc poisoning
      kasan: optimize large kmalloc poisoning
      kasan: clean up setting free info in kasan_slab_free
      kasan: unify large kfree checks
      kasan: rework krealloc tests
      kasan, mm: fail krealloc on freed objects
      kasan, mm: optimize krealloc poisoning
      kasan: ensure poisoning size alignment
      arm64: kasan: simplify and inline MTE functions
      kasan: inline HW_TAGS helper functions
      kasan: clarify that only first bug is reported in HW_TAGS

Subsystem: alpha

    Randy Dunlap <rdunlap@infradead.org>:
      alpha: remove CONFIG_EXPERIMENTAL from defconfigs

Subsystem: procfs

    Helge Deller <deller@gmx.de>:
      proc/wchan: use printk format instead of lookup_symbol_name()

    Josef Bacik <josef@toxicpanda.com>:
      proc: use kvzalloc for our kernel buffer

Subsystem: sysctl

    Lin Feng <linf@wangsu.com>:
      sysctl.c: fix underflow value setting risk in vm_table

Subsystem: misc

    Randy Dunlap <rdunlap@infradead.org>:
      include/linux: remove repeated words

    Miguel Ojeda <ojeda@kernel.org>:
      treewide: Miguel has moved

Subsystem: core-kernel

    Hubert Jasudowicz <hubert.jasudowicz@gmail.com>:
      groups: use flexible-array member in struct group_info
      groups: simplify struct group_info allocation

    Randy Dunlap <rdunlap@infradead.org>:
      kernel: delete repeated words in comments

Subsystem: MAINTAINERS

    Vlastimil Babka <vbabka@suse.cz>:
      MAINTAINERS: add uapi directories to API/ABI section

Subsystem: lib

    Huang Shijie <sjhuang@iluvatar.ai>:
      lib/genalloc.c: change return type to unsigned long for bitmap_set_ll

    Francis Laniel <laniel_francis@privacyrequired.com>:
      string.h: move fortified functions definitions in a dedicated header.

    Yogesh Lal <ylal@codeaurora.org>:
      lib: stackdepot: add support to configure STACK_HASH_SIZE

    Vijayanand Jitta <vjitta@codeaurora.org>:
      lib: stackdepot: add support to disable stack depot
      lib: stackdepot: fix ignoring return value warning

    Masahiro Yamada <masahiroy@kernel.org>:
      lib/cmdline: remove an unneeded local variable in next_arg()

Subsystem: bitops

    Geert Uytterhoeven <geert+renesas@glider.be>:
      include/linux/bitops.h: spelling s/synomyn/synonym/

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: improve blank line after declaration test

    Peng Wang <rocking@linux.alibaba.com>:
      checkpatch: ignore warning designated initializers using NR_CPUS

    Dwaipayan Ray <dwaipayanray1@gmail.com>:
      checkpatch: trivial style fixes

    Joe Perches <joe@perches.com>:
      checkpatch: prefer ftrace over function entry/exit printks
      checkpatch: improve TYPECAST_INT_CONSTANT test message

    Aditya Srivastava <yashsri421@gmail.com>:
      checkpatch: add warning for avoiding .L prefix symbols in assembly files

    Joe Perches <joe@perches.com>:
      checkpatch: add kmalloc_array_node to unnecessary OOM message check

    Chris Down <chris@chrisdown.name>:
      checkpatch: don't warn about colon termination in linker scripts

    Song Liu <songliubraving@fb.com>:
      checkpatch: do not apply "initialise globals to 0" check to BPF progs

Subsystem: init

    Masahiro Yamada <masahiroy@kernel.org>:
      init/version.c: remove Version_<LINUX_VERSION_CODE> symbol
      init: clean up early_param_on_off() macro

    Bhaskar Chowdhury <unixbhaskar@gmail.com>:
      init/Kconfig: fix a typo in CC_VERSION_TEXT help text

Subsystem: coredump

    Ira Weiny <ira.weiny@intel.com>:
      fs/coredump: use kmap_local_page()

Subsystem: seq_file

    NeilBrown <neilb@suse.de>:
    Patch series "Fix some seq_file users that were recently broken":
      seq_file: document how per-entry resources are managed.
      x86: fix seq_file iteration for pat/memtype.c

Subsystem: gdb

    George Prekas <prekageo@amazon.com>:
      scripts/gdb: fix list_for_each

    Sumit Garg <sumit.garg@linaro.org>:
      kgdb: fix to kill breakpoints on initmem after boot

Subsystem: ubsan

    Andrey Ryabinin <ryabinin.a.a@gmail.com>:
      ubsan: remove overflow checks

Subsystem: initramfs

    Florian Fainelli <f.fainelli@gmail.com>:
      initramfs: panic with memory information

Subsystem: mm/pagemap2

    Huang Pei <huangpei@loongson.cn>:
      MIPS: make userspace mapping young by default

 .mailmap                                            |    1 
 CREDITS                                             |    9 
 Documentation/ABI/testing/sysfs-devices-memory      |   58 -
 Documentation/admin-guide/auxdisplay/cfag12864b.rst |    2 
 Documentation/admin-guide/auxdisplay/ks0108.rst     |    2 
 Documentation/admin-guide/kernel-parameters.txt     |    6 
 Documentation/admin-guide/mm/memory-hotplug.rst     |   20 
 Documentation/dev-tools/index.rst                   |    1 
 Documentation/dev-tools/kasan.rst                   |    8 
 Documentation/dev-tools/kfence.rst                  |  318 +++++++
 Documentation/filesystems/seq_file.rst              |    6 
 MAINTAINERS                                         |   26 
 arch/alpha/configs/defconfig                        |    1 
 arch/arm64/Kconfig                                  |    1 
 arch/arm64/include/asm/cache.h                      |    1 
 arch/arm64/include/asm/kasan.h                      |    1 
 arch/arm64/include/asm/kfence.h                     |   26 
 arch/arm64/include/asm/mte-def.h                    |    2 
 arch/arm64/include/asm/mte-kasan.h                  |   65 +
 arch/arm64/include/asm/mte.h                        |    2 
 arch/arm64/kernel/mte.c                             |   46 -
 arch/arm64/lib/mte.S                                |   16 
 arch/arm64/mm/fault.c                               |    8 
 arch/arm64/mm/mmu.c                                 |   23 
 arch/mips/mm/cache.c                                |   30 
 arch/s390/mm/init.c                                 |    1 
 arch/s390/mm/vmem.c                                 |   14 
 arch/x86/Kconfig                                    |    1 
 arch/x86/include/asm/kfence.h                       |   76 +
 arch/x86/mm/fault.c                                 |   10 
 arch/x86/mm/pat/memtype.c                           |    4 
 drivers/auxdisplay/cfag12864b.c                     |    4 
 drivers/auxdisplay/cfag12864bfb.c                   |    4 
 drivers/auxdisplay/ks0108.c                         |    4 
 drivers/base/memory.c                               |   35 
 drivers/block/zram/zram_drv.c                       |    2 
 drivers/hv/hv_balloon.c                             |    2 
 drivers/virtio/virtio_mem.c                         |   43 
 drivers/xen/balloon.c                               |    2 
 fs/coredump.c                                       |    4 
 fs/iomap/seek.c                                     |  125 --
 fs/proc/base.c                                      |   21 
 fs/proc/proc_sysctl.c                               |    4 
 include/linux/bitops.h                              |    2 
 include/linux/cfag12864b.h                          |    2 
 include/linux/cred.h                                |    2 
 include/linux/fortify-string.h                      |  302 ++++++
 include/linux/gfp.h                                 |    2 
 include/linux/init.h                                |    4 
 include/linux/kasan.h                               |   25 
 include/linux/kfence.h                              |  230 +++++
 include/linux/kgdb.h                                |    2 
 include/linux/khugepaged.h                          |    2 
 include/linux/ks0108.h                              |    2 
 include/linux/mdev.h                                |    2 
 include/linux/memory.h                              |    3 
 include/linux/memory_hotplug.h                      |   33 
 include/linux/memremap.h                            |    6 
 include/linux/mmzone.h                              |   49 -
 include/linux/page-flags.h                          |    4 
 include/linux/pagemap.h                             |   10 
 include/linux/pagevec.h                             |   10 
 include/linux/pgtable.h                             |    8 
 include/linux/ptrace.h                              |    2 
 include/linux/rmap.h                                |    3 
 include/linux/slab_def.h                            |    3 
 include/linux/slub_def.h                            |    3 
 include/linux/stackdepot.h                          |    9 
 include/linux/string.h                              |  282 ------
 include/linux/vmstat.h                              |    6 
 include/linux/zpool.h                               |    3 
 include/linux/zsmalloc.h                            |    2 
 include/trace/events/error_report.h                 |   74 +
 include/uapi/linux/firewire-cdev.h                  |    2 
 include/uapi/linux/input.h                          |    2 
 init/Kconfig                                        |    2 
 init/initramfs.c                                    |   19 
 init/main.c                                         |    6 
 init/version.c                                      |    8 
 kernel/debug/debug_core.c                           |   11 
 kernel/events/core.c                                |    8 
 kernel/events/uprobes.c                             |    2 
 kernel/groups.c                                     |    7 
 kernel/locking/rtmutex.c                            |    4 
 kernel/locking/rwsem.c                              |    2 
 kernel/locking/semaphore.c                          |    2 
 kernel/sched/fair.c                                 |    2 
 kernel/sched/membarrier.c                           |    2 
 kernel/sysctl.c                                     |    8 
 kernel/trace/Makefile                               |    1 
 kernel/trace/error_report-traces.c                  |   12 
 lib/Kconfig                                         |    9 
 lib/Kconfig.debug                                   |    1 
 lib/Kconfig.kfence                                  |   84 +
 lib/Kconfig.ubsan                                   |   17 
 lib/cmdline.c                                       |    7 
 lib/genalloc.c                                      |    3 
 lib/stackdepot.c                                    |   41 
 lib/test_kasan.c                                    |  111 ++
 lib/test_ubsan.c                                    |   49 -
 lib/ubsan.c                                         |   68 -
 mm/Makefile                                         |    1 
 mm/backing-dev.c                                    |    3 
 mm/cma.c                                            |   64 -
 mm/dmapool.c                                        |    3 
 mm/early_ioremap.c                                  |   12 
 mm/filemap.c                                        |  361 +++++---
 mm/huge_memory.c                                    |    6 
 mm/internal.h                                       |    6 
 mm/kasan/common.c                                   |  213 +++-
 mm/kasan/generic.c                                  |    3 
 mm/kasan/hw_tags.c                                  |    2 
 mm/kasan/kasan.h                                    |   97 +-
 mm/kasan/report.c                                   |    8 
 mm/kasan/shadow.c                                   |   78 +
 mm/kfence/Makefile                                  |    6 
 mm/kfence/core.c                                    |  875 +++++++++++++++++++-
 mm/kfence/kfence.h                                  |  126 ++
 mm/kfence/kfence_test.c                             |  860 +++++++++++++++++++
 mm/kfence/report.c                                  |  350 ++++++--
 mm/khugepaged.c                                     |   22 
 mm/memory-failure.c                                 |    6 
 mm/memory.c                                         |    4 
 mm/memory_hotplug.c                                 |  178 +++-
 mm/memremap.c                                       |   23 
 mm/mlock.c                                          |    2 
 mm/page_alloc.c                                     |    1 
 mm/rmap.c                                           |   24 
 mm/shmem.c                                          |  160 +--
 mm/slab.c                                           |   38 
 mm/slab_common.c                                    |   29 
 mm/slub.c                                           |   63 +
 mm/swap.c                                           |   54 -
 mm/swap_state.c                                     |    7 
 mm/truncate.c                                       |  141 ---
 mm/vmstat.c                                         |   35 
 mm/z3fold.c                                         |    1 
 mm/zbud.c                                           |    1 
 mm/zpool.c                                          |   13 
 mm/zsmalloc.c                                       |   22 
 mm/zswap.c                                          |   57 +
 samples/auxdisplay/cfag12864b-example.c             |    2 
 scripts/Makefile.ubsan                              |    2 
 scripts/checkpatch.pl                               |  152 ++-
 scripts/gdb/linux/lists.py                          |    5 
 145 files changed, 5046 insertions(+), 1682 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-02-25  9:12       ` incoming Andrey Ryabinin
@ 2021-02-25 11:07         ` Walter Wu
  0 siblings, 0 replies; 485+ messages in thread
From: Walter Wu @ 2021-02-25 11:07 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Arnd Bergmann, Linus Torvalds, Andrew Morton, Dmitry Vyukov,
	Nathan Chancellor, Arnd Bergmann, Andrey Konovalov, Linux-MM,
	mm-commits, Andrey Ryabinin, Alexander Potapenko

Hi Andrey,

On Thu, 2021-02-25 at 12:12 +0300, Andrey Ryabinin wrote:
> On Thu, Feb 25, 2021 at 11:53 AM Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > On Wed, Feb 24, 2021 at 10:37 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > On Wed, Feb 24, 2021 at 1:30 PM Linus Torvalds
> > > <torvalds@linux-foundation.org> wrote:
> > > >
> > > > Hmm. I haven't bisected things yet, but I suspect it's something with
> > > > the KASAN patches. With this all applied, I get:
> > > >
> > > >   lib/crypto/curve25519-hacl64.c: In function ‘ladder_cmult.constprop’:
> > > >   lib/crypto/curve25519-hacl64.c:601:1: warning: the frame size of
> > > > 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
> > > >
> > > > and
> > > >
> > > >   lib/bitfield_kunit.c: In function ‘test_bitfields_constants’:
> > > >   lib/bitfield_kunit.c:93:1: warning: the frame size of 11200 bytes is
> > > > larger than 2048 bytes [-Wframe-larger-than=]
> > > >
> > > > which is obviously not really acceptable. A 11kB stack frame _will_
> > > > cause issues.
> > >
> > > A quick bisect shoes that this was introduced by "[patch 101/173]
> > > kasan: remove redundant config option".
> > >
> > > I didn't check what part of that patch screws up, but it's definitely
> > > doing something bad.
> >
> > I'm not sure why that patch surfaced the bug, but it's worth pointing
> > out that the underlying problem is asan-stack in combination
> > with the structleak plugin. This will happen for every user of kunit.
> >
> 
> The patch didn't update KASAN_STACK dependency in kconfig:
>         config GCC_PLUGIN_STRUCTLEAK_BYREF
> ....
>                depends on !(KASAN && KASAN_STACK=1)
> 
> This 'depends on'  stopped working with the patch

Thanks for pointing out this problem. I will re-send that patch.


Walter

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-02-25  8:53     ` incoming Arnd Bergmann
@ 2021-02-25  9:12       ` Andrey Ryabinin
  2021-02-25 11:07         ` incoming Walter Wu
  0 siblings, 1 reply; 485+ messages in thread
From: Andrey Ryabinin @ 2021-02-25  9:12 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linus Torvalds, Andrew Morton, Walter Wu, Dmitry Vyukov,
	Nathan Chancellor, Arnd Bergmann, Andrey Konovalov, Linux-MM,
	mm-commits, Andrey Ryabinin, Alexander Potapenko

On Thu, Feb 25, 2021 at 11:53 AM Arnd Bergmann <arnd@kernel.org> wrote:
>
> On Wed, Feb 24, 2021 at 10:37 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Wed, Feb 24, 2021 at 1:30 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > Hmm. I haven't bisected things yet, but I suspect it's something with
> > > the KASAN patches. With this all applied, I get:
> > >
> > >   lib/crypto/curve25519-hacl64.c: In function ‘ladder_cmult.constprop’:
> > >   lib/crypto/curve25519-hacl64.c:601:1: warning: the frame size of
> > > 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
> > >
> > > and
> > >
> > >   lib/bitfield_kunit.c: In function ‘test_bitfields_constants’:
> > >   lib/bitfield_kunit.c:93:1: warning: the frame size of 11200 bytes is
> > > larger than 2048 bytes [-Wframe-larger-than=]
> > >
> > > which is obviously not really acceptable. A 11kB stack frame _will_
> > > cause issues.
> >
> > A quick bisect shoes that this was introduced by "[patch 101/173]
> > kasan: remove redundant config option".
> >
> > I didn't check what part of that patch screws up, but it's definitely
> > doing something bad.
>
> I'm not sure why that patch surfaced the bug, but it's worth pointing
> out that the underlying problem is asan-stack in combination
> with the structleak plugin. This will happen for every user of kunit.
>

The patch didn't update KASAN_STACK dependency in kconfig:
        config GCC_PLUGIN_STRUCTLEAK_BYREF
....
               depends on !(KASAN && KASAN_STACK=1)

This 'depends on'  stopped working with the patch


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-02-24 21:37   ` incoming Linus Torvalds
@ 2021-02-25  8:53     ` Arnd Bergmann
  2021-02-25  9:12       ` incoming Andrey Ryabinin
  0 siblings, 1 reply; 485+ messages in thread
From: Arnd Bergmann @ 2021-02-25  8:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Walter Wu, Dmitry Vyukov, Nathan Chancellor,
	Arnd Bergmann, Andrey Konovalov, Linux-MM, mm-commits,
	Andrey Ryabinin, Alexander Potapenko

On Wed, Feb 24, 2021 at 10:37 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Wed, Feb 24, 2021 at 1:30 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Hmm. I haven't bisected things yet, but I suspect it's something with
> > the KASAN patches. With this all applied, I get:
> >
> >   lib/crypto/curve25519-hacl64.c: In function ‘ladder_cmult.constprop’:
> >   lib/crypto/curve25519-hacl64.c:601:1: warning: the frame size of
> > 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
> >
> > and
> >
> >   lib/bitfield_kunit.c: In function ‘test_bitfields_constants’:
> >   lib/bitfield_kunit.c:93:1: warning: the frame size of 11200 bytes is
> > larger than 2048 bytes [-Wframe-larger-than=]
> >
> > which is obviously not really acceptable. A 11kB stack frame _will_
> > cause issues.
>
> A quick bisect shoes that this was introduced by "[patch 101/173]
> kasan: remove redundant config option".
>
> I didn't check what part of that patch screws up, but it's definitely
> doing something bad.

I'm not sure why that patch surfaced the bug, but it's worth pointing
out that the underlying problem is asan-stack in combination
with the structleak plugin. This will happen for every user of kunit.

I sent a series[1] out earlier this year to turn off the structleak
plugin as an alternative workaround, but need to follow up on
the remaining patches. Someone suggested adding a more
generic way to turn off the plugin for a file instead of open-coding
the CLFAGS_REMOVE_*.o Makefile bit, which would help.

I am also still hoping that someone can come up with a way
to make kunit work better with the structleak plugin, as there
shouldn't be a fundamental reason why it can't work, just that
it the code pattern triggers a particularly bad case in the compiler.

      Arnd

[1] https://lore.kernel.org/lkml/20210125124533.101339-1-arnd@kernel.org/


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-02-24 21:30 ` incoming Linus Torvalds
@ 2021-02-24 21:37   ` Linus Torvalds
  2021-02-25  8:53     ` incoming Arnd Bergmann
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2021-02-24 21:37 UTC (permalink / raw)
  To: Andrew Morton, Walter Wu, Dmitry Vyukov, Nathan Chancellor,
	Arnd Bergmann, Andrey Konovalov
  Cc: Linux-MM, mm-commits, Andrey Ryabinin, Alexander Potapenko

On Wed, Feb 24, 2021 at 1:30 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Hmm. I haven't bisected things yet, but I suspect it's something with
> the KASAN patches. With this all applied, I get:
>
>   lib/crypto/curve25519-hacl64.c: In function ‘ladder_cmult.constprop’:
>   lib/crypto/curve25519-hacl64.c:601:1: warning: the frame size of
> 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]
>
> and
>
>   lib/bitfield_kunit.c: In function ‘test_bitfields_constants’:
>   lib/bitfield_kunit.c:93:1: warning: the frame size of 11200 bytes is
> larger than 2048 bytes [-Wframe-larger-than=]
>
> which is obviously not really acceptable. A 11kB stack frame _will_
> cause issues.

A quick bisect shoes that this was introduced by "[patch 101/173]
kasan: remove redundant config option".

I didn't check what part of that patch screws up, but it's definitely
doing something bad.

I will drop that patch.

               Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-02-24 19:58 incoming Andrew Morton
@ 2021-02-24 21:30 ` Linus Torvalds
  2021-02-24 21:37   ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2021-02-24 21:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

On Wed, Feb 24, 2021 at 11:58 AM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> A few small subsystems and some of MM.

Hmm. I haven't bisected things yet, but I suspect it's something with
the KASAN patches. With this all applied, I get:

  lib/crypto/curve25519-hacl64.c: In function ‘ladder_cmult.constprop’:
  lib/crypto/curve25519-hacl64.c:601:1: warning: the frame size of
2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]

and

  lib/bitfield_kunit.c: In function ‘test_bitfields_constants’:
  lib/bitfield_kunit.c:93:1: warning: the frame size of 11200 bytes is
larger than 2048 bytes [-Wframe-larger-than=]

which is obviously not really acceptable. A 11kB stack frame _will_
cause issues.

              Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-02-24 19:58 Andrew Morton
  2021-02-24 21:30 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-02-24 19:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


A few small subsystems and some of MM.


173 patches, based on c03c21ba6f4e95e406a1a7b4c34ef334b977c194.

Subsystems affected by this patch series:

  hexagon
  scripts
  ntfs
  ocfs2
  vfs
  mm/slab-generic
  mm/slab
  mm/slub
  mm/debug
  mm/pagecache
  mm/swap
  mm/memcg
  mm/pagemap
  mm/mprotect
  mm/mremap
  mm/page-reporting
  mm/vmalloc
  mm/kasan
  mm/pagealloc
  mm/memory-failure
  mm/hugetlb
  mm/vmscan
  mm/z3fold
  mm/compaction
  mm/mempolicy
  mm/oom-kill
  mm/hugetlbfs
  mm/migration

Subsystem: hexagon

    Randy Dunlap <rdunlap@infradead.org>:
      hexagon: remove CONFIG_EXPERIMENTAL from defconfigs

Subsystem: scripts

    tangchunyou <tangchunyou@yulong.com>:
      scripts/spelling.txt: increase error-prone spell checking

    zuoqilin <zuoqilin@yulong.com>:
      scripts/spelling.txt: check for "exeeds"

    dingsenjie <dingsenjie@yulong.com>:
      scripts/spelling.txt: add "allocted" and "exeeds" typo

    Colin Ian King <colin.king@canonical.com>:
      scripts/spelling.txt: add more spellings to spelling.txt

Subsystem: ntfs

    Randy Dunlap <rdunlap@infradead.org>:
      ntfs: layout.h: delete duplicated words

    Rustam Kovhaev <rkovhaev@gmail.com>:
      ntfs: check for valid standard information attribute

Subsystem: ocfs2

    Yi Li <yili@winhong.com>:
      ocfs2: remove redundant conditional before iput

    guozh <guozh88@chinatelecom.cn>:
      ocfs2: clean up some definitions which are not used any more

    Dan Carpenter <dan.carpenter@oracle.com>:
      ocfs2: fix a use after free on error

    Jiapeng Chong <jiapeng.chong@linux.alibaba.com>:
      ocfs2: simplify the calculation of variables

Subsystem: vfs

    Randy Dunlap <rdunlap@infradead.org>:
      fs: delete repeated words in comments

    Alexey Dobriyan <adobriyan@gmail.com>:
      ramfs: support O_TMPFILE

Subsystem: mm/slab-generic

    Jacob Wen <jian.w.wen@oracle.com>:
      mm, tracing: record slab name for kmem_cache_free()

    Nikolay Borisov <nborisov@suse.com>:
      mm/sl?b.c: remove ctor argument from kmem_cache_flags

Subsystem: mm/slab

    Zhiyuan Dai <daizhiyuan@phytium.com.cn>:
      mm/slab: minor coding style tweaks

Subsystem: mm/slub

    Johannes Berg <johannes.berg@intel.com>:
      mm/slub: disable user tracing for kmemleak caches by default

    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "mm, slab, slub: remove cpu and memory hotplug locks":
      mm, slub: stop freeing kmem_cache_node structures on node offline
      mm, slab, slub: stop taking memory hotplug lock
      mm, slab, slub: stop taking cpu hotplug lock
      mm, slub: splice cpu and page freelists in deactivate_slab()
      mm, slub: remove slub_memcg_sysfs boot param and CONFIG_SLUB_MEMCG_SYSFS_ON

    Zhiyuan Dai <daizhiyuan@phytium.com.cn>:
      mm/slub: minor coding style tweaks

Subsystem: mm/debug

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/debug: improve memcg debugging

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm/debug_vm_pgtable: Some minor updates", v3:
      mm/debug_vm_pgtable/basic: add validation for dirtiness after write protect
      mm/debug_vm_pgtable/basic: iterate over entire protection_map[]

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/page_owner: use helper function zone_end_pfn() to get end_pfn

Subsystem: mm/pagecache

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm/filemap: remove unused parameter and change to void type for replace_page_cache_page()

    Pavel Begunkov <asml.silence@gmail.com>:
      mm/filemap: don't revert iter on -EIOCBQUEUED

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Refactor generic_file_buffered_read", v5:
      mm/filemap: rename generic_file_buffered_read subfunctions
      mm/filemap: remove dynamically allocated array from filemap_read
      mm/filemap: convert filemap_get_pages to take a pagevec
      mm/filemap: use head pages in generic_file_buffered_read
      mm/filemap: pass a sleep state to put_and_wait_on_page_locked
      mm/filemap: support readpage splitting a page
      mm/filemap: inline __wait_on_page_locked_async into caller
      mm/filemap: don't call ->readpage if IOCB_WAITQ is set
      mm/filemap: change filemap_read_page calling conventions
      mm/filemap: change filemap_create_page calling conventions
      mm/filemap: convert filemap_update_page to return an errno
      mm/filemap: move the iocb checks into filemap_update_page
      mm/filemap: add filemap_range_uptodate
      mm/filemap: split filemap_readahead out of filemap_get_pages
      mm/filemap: restructure filemap_get_pages
      mm/filemap: don't relock the page after calling readpage

    Christoph Hellwig <hch@lst.de>:
      mm/filemap: rename generic_file_buffered_read to filemap_read
      mm/filemap: simplify generic_file_read_iter

    Yang Guo <guoyang2@huawei.com>:
      fs/buffer.c: add checking buffer head stat before clear

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm: backing-dev: Remove duplicated macro definition

Subsystem: mm/swap

    Yang Li <abaci-bugfix@linux.alibaba.com>:
      mm/swap_slots.c: remove redundant NULL check

    Stephen Zhang <stephenzhangzsd@gmail.com>:
      mm/swapfile.c: fix debugging information problem

    Georgi Djakov <georgi.djakov@linaro.org>:
      mm/page_io: use pr_alert_ratelimited for swap read/write errors

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      mm/swap_state: constify static struct attribute_group

    Yu Zhao <yuzhao@google.com>:
      mm/swap: don't SetPageWorkingset unconditionally during swapin

Subsystem: mm/memcg

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: pre-allocate obj_cgroups for slab caches with SLAB_ACCOUNT

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: optimize per-lruvec stats counter memory usage
    Patch series "Convert all THP vmstat counters to pages", v6:
      mm: memcontrol: fix NR_ANON_THPS accounting in charge moving
      mm: memcontrol: convert NR_ANON_THPS account to pages
      mm: memcontrol: convert NR_FILE_THPS account to pages
      mm: memcontrol: convert NR_SHMEM_THPS account to pages
      mm: memcontrol: convert NR_SHMEM_PMDMAPPED account to pages
      mm: memcontrol: convert NR_FILE_PMDMAPPED account to pages
      mm: memcontrol: make the slab calculation consistent

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/memcg: revise the using condition of lock_page_lruvec function series
      mm/memcg: remove rcu locking for lock_page_lruvec function series

    Shakeel Butt <shakeelb@google.com>:
      mm: memcg: add swapcache stat for memcg v2

    Roman Gushchin <guro@fb.com>:
      mm: kmem: make __memcg_kmem_(un)charge static

    Feng Tang <feng.tang@intel.com>:
      mm: page_counter: re-layout structure to reduce false sharing

    Yang Li <abaci-bugfix@linux.alibaba.com>:
      mm/memcontrol: remove redundant NULL check

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: replace the loop with a list_for_each_entry()

    Shakeel Butt <shakeelb@google.com>:
      mm/list_lru.c: remove kvfree_rcu_local()

    Johannes Weiner <hannes@cmpxchg.org>:
      fs: buffer: use raw page_memcg() on locked page

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: fix swap undercounting in cgroup2
      mm: memcontrol: fix get_active_memcg return value
      mm: memcontrol: fix slub memory accounting

Subsystem: mm/pagemap

    Adrian Huang <ahuang12@lenovo.com>:
      mm/mmap.c: remove unnecessary local variable

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/memory.c: fix potential pte_unmap_unlock pte error
      mm/pgtable-generic.c: simplify the VM_BUG_ON condition in pmdp_huge_clear_flush()
      mm/pgtable-generic.c: optimize the VM_BUG_ON condition in pmdp_huge_clear_flush()
      mm/memory.c: fix potential pte_unmap_unlock pte error

Subsystem: mm/mprotect

    Tianjia Zhang <tianjia.zhang@linux.alibaba.com>:
      mm/mprotect.c: optimize error detection in do_mprotect_pkey()

Subsystem: mm/mremap

    Li Xinhai <lixinhai.lxh@gmail.com>:
      mm: rmap: explicitly reset vma->anon_vma in unlink_anon_vmas()
      mm: mremap: unlink anon_vmas when mremap with MREMAP_DONTUNMAP success

Subsystem: mm/page-reporting

    sh <sh_def@163.com>:
      mm/page_reporting: use list_entry_is_head() in page_reporting_cycle()

Subsystem: mm/vmalloc

    Yang Li <abaci-bugfix@linux.alibaba.com>:
      vmalloc: remove redundant NULL check

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kasan: HW_TAGS tests support and fixes", v4:
      kasan: prefix global functions with kasan_
      kasan: clarify HW_TAGS impact on TBI
      kasan: clean up comments in tests
      kasan: add macros to simplify checking test constraints
      kasan: add match-all tag tests
      kasan, arm64: allow using KUnit tests with HW_TAGS mode
      kasan: rename CONFIG_TEST_KASAN_MODULE
      kasan: add compiler barriers to KUNIT_EXPECT_KASAN_FAIL
      kasan: adapt kmalloc_uaf2 test to HW_TAGS mode
      kasan: fix memory corruption in kasan_bitops_tags test
      kasan: move _RET_IP_ to inline wrappers
      kasan: fix bug detection via ksize for HW_TAGS mode
      kasan: add proper page allocator tests
      kasan: add a test for kmem_cache_alloc/free_bulk
      kasan: don't run tests when KASAN is not enabled

    Walter Wu <walter-zh.wu@mediatek.com>:
      kasan: remove redundant config option

Subsystem: mm/pagealloc

    Baoquan He <bhe@redhat.com>:
    Patch series "mm: clean up names and parameters of memmap_init_xxxx functions", v5:
      mm: fix prototype warning from kernel test robot
      mm: rename memmap_init() and memmap_init_zone()
      mm: simplify parater of function memmap_init_zone()
      mm: simplify parameter of setup_usemap()
      mm: remove unneeded local variable in free_area_init_core

    David Hildenbrand <david@redhat.com>:
    Patch series "mm: simplify free_highmem_page() and free_reserved_page()":
      video: fbdev: acornfb: remove free_unused_pages()
      mm: simplify free_highmem_page() and free_reserved_page()

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/gfp: add kernel-doc for gfp_t

Subsystem: mm/memory-failure

    Aili Yao <yaoaili@kingsoft.com>:
      mm,hwpoison: send SIGBUS to PF_MCE_EARLY processes on action required events

Subsystem: mm/hugetlb

    Bibo Mao <maobibo@loongson.cn>:
      mm/huge_memory.c: update tlb entry if pmd is changed
      MIPS: do not call flush_tlb_all when setting pmd entry

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/hugetlb: fix potential double free in hugetlb_register_node() error path

    Li Xinhai <lixinhai.lxh@gmail.com>:
      mm/hugetlb.c: fix unnecessary address expansion of pmd sharing

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/hugetlb: avoid unnecessary hugetlb_acct_memory() call
      mm/hugetlb: use helper huge_page_order and pages_per_huge_page
      mm/hugetlb: fix use after free when subpool max_hpages accounting is not enabled

    Jiapeng Zhong <abaci-bugfix@linux.alibaba.com>:
      mm/hugetlb: simplify the calculation of variables

    Joao Martins <joao.m.martins@oracle.com>:
    Patch series "mm/hugetlb: follow_hugetlb_page() improvements", v2:
      mm/hugetlb: grab head page refcount once for group of subpages
      mm/hugetlb: refactor subpage recording

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/hugetlb: fix some comment typos

    Yanfei Xu <yanfei.xu@windriver.com>:
      mm/hugetlb: remove redundant check in preparing and destroying gigantic page

    Zhiyuan Dai <daizhiyuan@phytium.com.cn>:
      mm/hugetlb.c: fix typos in comments

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/huge_memory.c: remove unused return value of set_huge_zero_page()

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      mm/pmem: avoid inserting hugepage PTE entry with fsdax if hugepage support is disabled

    Miaohe Lin <linmiaohe@huawei.com>:
      hugetlb_cgroup: use helper pages_per_huge_page() in hugetlb_cgroup
      mm/hugetlb: use helper function range_in_vma() in page_table_shareable()
      mm/hugetlb: remove unnecessary VM_BUG_ON_PAGE on putback_active_hugepage()
      mm/hugetlb: use helper huge_page_size() to get hugepage size

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlb: fix update_and_free_page contig page struct assumption
      hugetlb: fix copy_huge_page_from_user contig page struct assumption

    Chen Wandun <chenwandun@huawei.com>:
      mm/hugetlb: suppress wrong warning info when alloc gigantic page

Subsystem: mm/vmscan

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/vmscan: __isolate_lru_page_prepare() cleanup

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/workingset.c: avoid unnecessary max_nodes estimation in count_shadow_nodes()

    Yu Zhao <yuzhao@google.com>:
    Patch series "mm: lru related cleanups", v2:
      mm/vmscan.c: use add_page_to_lru_list()
      include/linux/mm_inline.h: shuffle lru list addition and deletion functions
      mm: don't pass "enum lru_list" to lru list addition functions
      mm/swap.c: don't pass "enum lru_list" to trace_mm_lru_insertion()
      mm/swap.c: don't pass "enum lru_list" to del_page_from_lru_list()
      mm: add __clear_page_lru_flags() to replace page_off_lru()
      mm: VM_BUG_ON lru page flags
      include/linux/mm_inline.h: fold page_lru_base_type() into its sole caller
      include/linux/mm_inline.h: fold __update_lru_size() into its sole caller
      mm/vmscan.c: make lruvec_lru_size() static

    Oscar Salvador <osalvador@suse.de>:
      mm: workingset: clarify eviction order and distance calculation

    Mike Kravetz <mike.kravetz@oracle.com>:
    Patch series "create hugetlb flags to consolidate state", v3:
      hugetlb: use page.private for hugetlb specific page flags
      hugetlb: convert page_huge_active() HPageMigratable flag
      hugetlb: convert PageHugeTemporary() to HPageTemporary flag
      hugetlb: convert PageHugeFreed to HPageFreed flag
      include/linux/hugetlb.h: add synchronization information for new hugetlb specific flags
      hugetlb: fix uninitialized subpool pointer

    Dave Hansen <dave.hansen@linux.intel.com>:
      mm/vmscan: restore zone_reclaim_mode ABI

Subsystem: mm/z3fold

    Miaohe Lin <linmiaohe@huawei.com>:
      z3fold: remove unused attribute for release_z3fold_page
      z3fold: simplify the zhdr initialization code in init_z3fold_page()

Subsystem: mm/compaction

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/compaction: remove rcu_read_lock during page compaction

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/compaction: remove duplicated VM_BUG_ON_PAGE !PageLocked

    Charan Teja Reddy <charante@codeaurora.org>:
      mm/compaction: correct deferral logic for proactive compaction

    Wonhyuk Yang <vvghjk1234@gmail.com>:
      mm/compaction: fix misbehaviors of fast_find_migrateblock()

    Vlastimil Babka <vbabka@suse.cz>:
      mm, compaction: make fast_isolate_freepages() stay within zone

Subsystem: mm/mempolicy

    Huang Ying <ying.huang@intel.com>:
      numa balancing: migrate on fault among multiple bound nodes

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/mempolicy: use helper range_in_vma() in queue_pages_test_walk()

Subsystem: mm/oom-kill

    Tang Yizhou <tangyizhou@huawei.com>:
      mm, oom: fix a comment in dump_task()

Subsystem: mm/hugetlbfs

    Mike Kravetz <mike.kravetz@oracle.com>:
      mm/hugetlb: change hugetlb_reserve_pages() to type bool
      hugetlbfs: remove special hugetlbfs_set_page_dirty()

    Miaohe Lin <linmiaohe@huawei.com>:
      hugetlbfs: remove useless BUG_ON(!inode) in hugetlbfs_setattr()
      hugetlbfs: use helper macro default_hstate in init_hugetlbfs_fs
      hugetlbfs: correct obsolete function name in hugetlbfs_read_iter()
      hugetlbfs: remove meaningless variable avoid_reserve
      hugetlbfs: make hugepage size conversion more readable
      hugetlbfs: correct some obsolete comments about inode i_mutex
      hugetlbfs: fix some comment typos
      hugetlbfs: remove unneeded return value of hugetlb_vmtruncate()

Subsystem: mm/migration

    Chengyang Fan <cy.fan@huawei.com>:
      mm/migrate: remove unneeded semicolons

 Documentation/admin-guide/cgroup-v2.rst         |    4 
 Documentation/admin-guide/kernel-parameters.txt |    8 
 Documentation/admin-guide/sysctl/vm.rst         |   10 
 Documentation/core-api/mm-api.rst               |    7 
 Documentation/dev-tools/kasan.rst               |   24 
 Documentation/vm/arch_pgtable_helpers.rst       |    8 
 arch/arm64/include/asm/memory.h                 |    1 
 arch/arm64/include/asm/mte-kasan.h              |   12 
 arch/arm64/kernel/mte.c                         |   12 
 arch/arm64/kernel/sleep.S                       |    2 
 arch/arm64/mm/fault.c                           |   20 
 arch/hexagon/configs/comet_defconfig            |    1 
 arch/ia64/include/asm/pgtable.h                 |    6 
 arch/ia64/mm/init.c                             |   18 
 arch/mips/mm/pgtable-32.c                       |    1 
 arch/mips/mm/pgtable-64.c                       |    1 
 arch/x86/kernel/acpi/wakeup_64.S                |    2 
 drivers/base/node.c                             |   33 
 drivers/video/fbdev/acornfb.c                   |   34 
 fs/block_dev.c                                  |    2 
 fs/btrfs/file.c                                 |    2 
 fs/buffer.c                                     |    7 
 fs/dcache.c                                     |    4 
 fs/direct-io.c                                  |    4 
 fs/exec.c                                       |    4 
 fs/fhandle.c                                    |    2 
 fs/fuse/dev.c                                   |    6 
 fs/hugetlbfs/inode.c                            |   72 --
 fs/ntfs/inode.c                                 |    6 
 fs/ntfs/layout.h                                |    4 
 fs/ocfs2/cluster/heartbeat.c                    |    8 
 fs/ocfs2/dlm/dlmast.c                           |   10 
 fs/ocfs2/dlm/dlmcommon.h                        |    4 
 fs/ocfs2/refcounttree.c                         |    2 
 fs/ocfs2/super.c                                |    2 
 fs/pipe.c                                       |    2 
 fs/proc/meminfo.c                               |   10 
 fs/proc/vmcore.c                                |    7 
 fs/ramfs/inode.c                                |   13 
 include/linux/fs.h                              |    4 
 include/linux/gfp.h                             |   14 
 include/linux/highmem-internal.h                |    5 
 include/linux/huge_mm.h                         |   15 
 include/linux/hugetlb.h                         |   98 ++
 include/linux/kasan-checks.h                    |    6 
 include/linux/kasan.h                           |   39 -
 include/linux/memcontrol.h                      |   43 -
 include/linux/migrate.h                         |    2 
 include/linux/mm.h                              |   28 
 include/linux/mm_inline.h                       |  123 +--
 include/linux/mmzone.h                          |   30 
 include/linux/page-flags.h                      |    6 
 include/linux/page_counter.h                    |    9 
 include/linux/pagemap.h                         |    5 
 include/linux/swap.h                            |    8 
 include/trace/events/kmem.h                     |   24 
 include/trace/events/pagemap.h                  |   11 
 include/uapi/linux/mempolicy.h                  |    4 
 init/Kconfig                                    |   14 
 lib/Kconfig.kasan                               |   14 
 lib/Makefile                                    |    2 
 lib/test_kasan.c                                |  446 ++++++++----
 lib/test_kasan_module.c                         |    5 
 mm/backing-dev.c                                |    6 
 mm/compaction.c                                 |   73 +-
 mm/debug.c                                      |   10 
 mm/debug_vm_pgtable.c                           |   86 ++
 mm/filemap.c                                    |  859 +++++++++++-------------
 mm/gup.c                                        |    5 
 mm/huge_memory.c                                |   28 
 mm/hugetlb.c                                    |  376 ++++------
 mm/hugetlb_cgroup.c                             |    6 
 mm/kasan/common.c                               |   60 -
 mm/kasan/generic.c                              |   40 -
 mm/kasan/hw_tags.c                              |   16 
 mm/kasan/kasan.h                                |   87 +-
 mm/kasan/quarantine.c                           |   22 
 mm/kasan/report.c                               |   15 
 mm/kasan/report_generic.c                       |   10 
 mm/kasan/report_hw_tags.c                       |    8 
 mm/kasan/report_sw_tags.c                       |    8 
 mm/kasan/shadow.c                               |   27 
 mm/kasan/sw_tags.c                              |   22 
 mm/khugepaged.c                                 |    6 
 mm/list_lru.c                                   |   12 
 mm/memcontrol.c                                 |  309 ++++----
 mm/memory-failure.c                             |   34 
 mm/memory.c                                     |   24 
 mm/memory_hotplug.c                             |   11 
 mm/mempolicy.c                                  |   18 
 mm/mempool.c                                    |    2 
 mm/migrate.c                                    |   10 
 mm/mlock.c                                      |    3 
 mm/mmap.c                                       |    4 
 mm/mprotect.c                                   |    7 
 mm/mremap.c                                     |    8 
 mm/oom_kill.c                                   |    5 
 mm/page_alloc.c                                 |   70 -
 mm/page_io.c                                    |   12 
 mm/page_owner.c                                 |    4 
 mm/page_reporting.c                             |    2 
 mm/pgtable-generic.c                            |    9 
 mm/rmap.c                                       |   35 
 mm/shmem.c                                      |    2 
 mm/slab.c                                       |   21 
 mm/slab.h                                       |   20 
 mm/slab_common.c                                |   40 -
 mm/slob.c                                       |    2 
 mm/slub.c                                       |  169 ++--
 mm/swap.c                                       |   54 -
 mm/swap_slots.c                                 |    3 
 mm/swap_state.c                                 |   31 
 mm/swapfile.c                                   |    8 
 mm/vmscan.c                                     |  100 +-
 mm/vmstat.c                                     |   14 
 mm/workingset.c                                 |    7 
 mm/z3fold.c                                     |   11 
 scripts/Makefile.kasan                          |   10 
 scripts/spelling.txt                            |   30 
 tools/objtool/check.c                           |    2 
 120 files changed, 2249 insertions(+), 1954 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-02-13  4:52 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-02-13  4:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

6 patches, based on dcc0b49040c70ad827a7f3d58a21b01fdb14e749.

Subsystems affected by this patch series:

  mm/pagemap
  scripts
  MAINTAINERS
  h8300

Subsystem: mm/pagemap

    Mike Rapoport <rppt@linux.ibm.com>:
      m68k: make __pfn_to_phys() and __phys_to_pfn() available for !MMU

Subsystem: scripts

    Rong Chen <rong.a.chen@intel.com>:
      scripts/recordmcount.pl: support big endian for ARCH sh

Subsystem: MAINTAINERS

    Andrey Konovalov <andreyknvl@google.com>:
      MAINTAINERS: update KASAN file list
      MAINTAINERS: update Andrey Konovalov's email address
      MAINTAINERS: add Andrey Konovalov to KASAN reviewers

Subsystem: h8300

    Randy Dunlap <rdunlap@infradead.org>:
      h8300: fix PREEMPTION build, TI_PRE_COUNT undefined

 MAINTAINERS                     |    8 +++++---
 arch/h8300/kernel/asm-offsets.c |    3 +++
 arch/m68k/include/asm/page.h    |    2 +-
 scripts/recordmcount.pl         |    6 +++++-
 4 files changed, 14 insertions(+), 5 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-02-09 21:41 incoming Andrew Morton
@ 2021-02-10 19:30 ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2021-02-10 19:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

Hah. This series shows a small deficiency in your scripting wrt the diffstat:

On Tue, Feb 9, 2021 at 1:41 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
>  .mailmap                          |    1
...
>  mm/slub.c                         |   18 +++++++++-
>  17 files changed, 172 insertions(+), 49 deletions(-)

It actually has 18 files changed, but one of them is a pure rename (no
change to the content), and apparently your diffstat tool can't handle
that case.

It *should* have ended with

 ...
 mm/slub.c                                          | 18 +++++-
 .../selftests/vm/{run_vmtests => run_vmtests.sh}   |  0
 18 files changed, 172 insertions(+), 49 deletions(-)
 rename tools/testing/selftests/vm/{run_vmtests => run_vmtests.sh} (100%)

if you'd done a proper "git diff -M --stat --summary" of the series.

[ Ok, by default git would actually have said

    18 files changed, 171 insertions(+), 48 deletions(-)

  but it looks like you use the patience diff option, which gives that
extra insertion/deletion line because it generates the diff a bit
differently ]

Not a big deal,, but it made me briefly wonder "why doesn't my
diffstat match yours".

           Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-02-09 21:41 Andrew Morton
  2021-02-10 19:30 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-02-09 21:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

14 patches, based on e0756cfc7d7cd08c98a53b6009c091a3f6a50be6.

Subsystems affected by this patch series:

  squashfs
  mm/kasan
  firmware
  mm/mremap
  mm/tmpfs
  mm/selftests
  MAINTAINERS
  mm/memcg
  mm/slub
  nilfs2

Subsystem: squashfs

    Phillip Lougher <phillip@squashfs.org.uk>:
    Patch series "Squashfs: fix BIO migration regression and add sanity checks":
      squashfs: avoid out of bounds writes in decompressors
      squashfs: add more sanity checks in id lookup
      squashfs: add more sanity checks in inode lookup
      squashfs: add more sanity checks in xattr id lookup

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: fix stack traces dependency for HW_TAGS

Subsystem: firmware

    Fangrui Song <maskray@google.com>:
      firmware_loader: align .builtin_fw to 8

Subsystem: mm/mremap

    Arnd Bergmann <arnd@arndb.de>:
      mm/mremap: fix BUILD_BUG_ON() error in get_extent

Subsystem: mm/tmpfs

    Seth Forshee <seth.forshee@canonical.com>:
      tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
      tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha

Subsystem: mm/selftests

    Rong Chen <rong.a.chen@intel.com>:
      selftests/vm: rename file run_vmtests to run_vmtests.sh

Subsystem: MAINTAINERS

    Andrey Ryabinin <ryabinin.a.a@gmail.com>:
      MAINTAINERS: update Andrey Ryabinin's email address

Subsystem: mm/memcg

    Johannes Weiner <hannes@cmpxchg.org>:
      Revert "mm: memcontrol: avoid workload stalls when lowering memory.high"

Subsystem: mm/slub

    Vlastimil Babka <vbabka@suse.cz>:
      mm, slub: better heuristic for number of cpus when calculating slab order

Subsystem: nilfs2

    Joachim Henke <joachim.henke@t-systems.com>:
      nilfs2: make splice write available again

 .mailmap                          |    1 
 Documentation/dev-tools/kasan.rst |    3 -
 MAINTAINERS                       |    2 -
 fs/Kconfig                        |    4 +-
 fs/nilfs2/file.c                  |    1 
 fs/squashfs/block.c               |    8 ++++
 fs/squashfs/export.c              |   41 +++++++++++++++++++----
 fs/squashfs/id.c                  |   40 ++++++++++++++++++-----
 fs/squashfs/squashfs_fs_sb.h      |    1 
 fs/squashfs/super.c               |    6 +--
 fs/squashfs/xattr.h               |   10 +++++
 fs/squashfs/xattr_id.c            |   66 ++++++++++++++++++++++++++++++++------
 include/asm-generic/vmlinux.lds.h |    2 -
 mm/kasan/hw_tags.c                |    8 +---
 mm/memcontrol.c                   |    5 +-
 mm/mremap.c                       |    5 +-
 mm/slub.c                         |   18 +++++++++-
 17 files changed, 172 insertions(+), 49 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-02-05  2:31 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-02-05  2:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

18 patches, based on 5c279c4cf206e03995e04fd3404fa95ffd243a97.

Subsystems affected by this patch series:

  mm/hugetlb
  mm/compaction
  mm/vmalloc
  gcov
  mm/shmem
  mm/memblock
  mailmap
  mm/pagecache
  mm/kasan
  ubsan
  mm/hugetlb
  MAINTAINERS

Subsystem: mm/hugetlb

    Muchun Song <songmuchun@bytedance.com>:
      mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
      mm: hugetlb: fix a race between freeing and dissolving the page
      mm: hugetlb: fix a race between isolating and freeing page
      mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
      mm: migrate: do not migrate HugeTLB page whose refcount is one

Subsystem: mm/compaction

    Rokudo Yan <wu-yan@tcl.com>:
      mm, compaction: move high_pfn to the for loop scope

Subsystem: mm/vmalloc

    Rick Edgecombe <rick.p.edgecombe@intel.com>:
      mm/vmalloc: separate put pages and flush VM flags

Subsystem: gcov

    Johannes Berg <johannes.berg@intel.com>:
      init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov

Subsystem: mm/shmem

    Hugh Dickins <hughd@google.com>:
      mm: thp: fix MADV_REMOVE deadlock on shmem THP

Subsystem: mm/memblock

    Roman Gushchin <guro@fb.com>:
      memblock: do not start bottom-up allocations with kernel_end

Subsystem: mailmap

    Viresh Kumar <viresh.kumar@linaro.org>:
      mailmap: fix name/email for Viresh Kumar

    Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>:
      mailmap: add entries for Manivannan Sadhasivam

Subsystem: mm/pagecache

    Waiman Long <longman@redhat.com>:
      mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()

Subsystem: mm/kasan

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
    Patch series "kasan: Fix metadata detection for KASAN_HW_TAGS", v5:
      kasan: add explicit preconditions to kasan_report()
      kasan: make addr_has_metadata() return true for valid addresses

Subsystem: ubsan

    Nathan Chancellor <nathan@kernel.org>:
      ubsan: implement __ubsan_handle_alignment_assumption

Subsystem: mm/hugetlb

    Muchun Song <songmuchun@bytedance.com>:
      mm: hugetlb: fix missing put_page in gather_surplus_pages()

Subsystem: MAINTAINERS

    Nathan Chancellor <nathan@kernel.org>:
      MAINTAINERS/.mailmap: use my @kernel.org address

 .mailmap                |    5 ++++
 MAINTAINERS             |    2 -
 fs/hugetlbfs/inode.c    |    3 +-
 include/linux/hugetlb.h |    2 +
 include/linux/kasan.h   |    7 ++++++
 include/linux/vmalloc.h |    9 +-------
 init/Kconfig            |    1 
 init/main.c             |    8 ++++++-
 kernel/gcov/Kconfig     |    2 -
 lib/ubsan.c             |   31 ++++++++++++++++++++++++++++
 lib/ubsan.h             |    6 +++++
 mm/compaction.c         |    3 +-
 mm/filemap.c            |    4 +++
 mm/huge_memory.c        |   37 ++++++++++++++++++++-------------
 mm/hugetlb.c            |   53 ++++++++++++++++++++++++++++++++++++++++++------
 mm/kasan/kasan.h        |    2 -
 mm/memblock.c           |   49 +++++---------------------------------------
 mm/migrate.c            |    6 +++++
 18 files changed, 153 insertions(+), 77 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-01-24  5:00 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2021-01-24  5:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

19 patches, based on e1ae4b0be15891faf46d390e9f3dc9bd71a8cae1.

Subsystems affected by this patch series:

  mm/pagealloc
  mm/memcg
  mm/kasan
  ubsan
  mm/memory-failure
  mm/highmem
  proc
  MAINTAINERS

Subsystem: mm/pagealloc

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "mm: fix initialization of struct page for holes in memory layout", v3:
      x86/setup: don't remove E820_TYPE_RAM for pfn 0
      mm: fix initialization of struct page for holes in memory layout

Subsystem: mm/memcg

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: optimize objcg stock draining

    Shakeel Butt <shakeelb@google.com>:
      mm: memcg: fix memcg file_dirty numa stat
      mm: fix numa stats for thp migration

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: prevent starvation when writing memory.high

Subsystem: mm/kasan

    Lecopzer Chen <lecopzer@gmail.com>:
      kasan: fix unaligned address is unhandled in kasan_remove_zero_shadow
      kasan: fix incorrect arguments passing in kasan_add_zero_shadow

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: fix HW_TAGS boot parameters
      kasan, mm: fix conflicts with init_on_alloc/free
      kasan, mm: fix resetting page_alloc tags for HW_TAGS

Subsystem: ubsan

    Arnd Bergmann <arnd@arndb.de>:
      ubsan: disable unsigned-overflow check for i386

Subsystem: mm/memory-failure

    Dan Williams <dan.j.williams@intel.com>:
      mm: fix page reference leak in soft_offline_page()

Subsystem: mm/highmem

    Thomas Gleixner <tglx@linutronix.de>:
    Patch series "mm/highmem: Fix fallout from generic kmap_local conversions":
      sparc/mm/highmem: flush cache and TLB
      mm/highmem: prepare for overriding set_pte_at()
      mips/mm/highmem: use set_pte() for kmap_local()
      powerpc/mm/highmem: use __set_pte_at() for kmap_local()

Subsystem: proc

    Xiaoming Ni <nixiaoming@huawei.com>:
      proc_sysctl: fix oops caused by incorrect command parameters

Subsystem: MAINTAINERS

    Nathan Chancellor <natechancellor@gmail.com>:
      MAINTAINERS: add a couple more files to the Clang/LLVM section

 Documentation/dev-tools/kasan.rst  |   27 ++---------
 MAINTAINERS                        |    2 
 arch/mips/include/asm/highmem.h    |    1 
 arch/powerpc/include/asm/highmem.h |    2 
 arch/sparc/include/asm/highmem.h   |    9 ++-
 arch/x86/kernel/setup.c            |   20 +++-----
 fs/proc/proc_sysctl.c              |    7 ++-
 lib/Kconfig.ubsan                  |    1 
 mm/highmem.c                       |    7 ++-
 mm/kasan/hw_tags.c                 |   77 +++++++++++++--------------------
 mm/kasan/init.c                    |   23 +++++----
 mm/memcontrol.c                    |   11 +---
 mm/memory-failure.c                |   20 ++++++--
 mm/migrate.c                       |   27 ++++++-----
 mm/page_alloc.c                    |   86 ++++++++++++++++++++++---------------
 mm/slub.c                          |    7 +--
 16 files changed, 173 insertions(+), 154 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2021-01-12 23:48 incoming Andrew Morton
@ 2021-01-15 23:32 ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2021-01-15 23:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

On Tue, Jan 12, 2021 at 3:48 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> 10 patches, based on e609571b5ffa3528bf85292de1ceaddac342bc1c.

Whee. I had completely dropped the ball on this - I had built my usual
"akpm" branch with the patches, but then had completely forgotten
about it after doing my basic build tests.

I tend to leave it for a while to see if people send belated ACK/NAK's
for the patches, but that "for a while" is typically "overnight", not
several days.

So if you ever notice that I haven't merged your patch submission, and
you haven't seen me comment on them, feel free to ping me to remind
me.

Because it might just have gotten lost in the shuffle for some random
reason. Admittedly it's rare - I think this is the first time I just
randomly noticed three days later that I'd never done the actual merge
of the patch-series).

               Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2021-01-12 23:48 Andrew Morton
  2021-01-15 23:32 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2021-01-12 23:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

10 patches, based on e609571b5ffa3528bf85292de1ceaddac342bc1c.

Subsystems affected by this patch series:

  mm/slub
  mm/pagealloc
  mm/memcg
  mm/kasan
  mm/vmalloc
  mm/migration
  mm/hugetlb
  MAINTAINERS
  mm/memory-failure
  mm/process_vm_access

Subsystem: mm/slub

    Jann Horn <jannh@google.com>:
      mm, slub: consider rest of partial list if acquire_slab() fails

Subsystem: mm/pagealloc

    Hailong liu <liu.hailong6@zte.com.cn>:
      mm/page_alloc: add a missing mm_page_alloc_zone_locked() tracepoint

Subsystem: mm/memcg

    Hugh Dickins <hughd@google.com>:
      mm/memcontrol: fix warning in mem_cgroup_page_lruvec()

Subsystem: mm/kasan

    Hailong Liu <liu.hailong6@zte.com.cn>:
      arm/kasan: fix the array size of kasan_early_shadow_pte[]

Subsystem: mm/vmalloc

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/vmalloc.c: fix potential memory leak

Subsystem: mm/migration

    Jan Stancek <jstancek@redhat.com>:
      mm: migrate: initialize err in do_migrate_pages

Subsystem: mm/hugetlb

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/hugetlb: fix potential missing huge page size info

Subsystem: MAINTAINERS

    Vlastimil Babka <vbabka@suse.cz>:
      MAINTAINERS: add Vlastimil as slab allocators maintainer

Subsystem: mm/memory-failure

    Oscar Salvador <osalvador@suse.de>:
      mm,hwpoison: fix printing of page flags

Subsystem: mm/process_vm_access

    Andrew Morton <akpm@linux-foundation.org>:
      mm/process_vm_access.c: include compat.h

 MAINTAINERS                |    1 +
 include/linux/kasan.h      |    6 +++++-
 include/linux/memcontrol.h |    2 +-
 mm/hugetlb.c               |    2 +-
 mm/kasan/init.c            |    3 ++-
 mm/memory-failure.c        |    2 +-
 mm/mempolicy.c             |    2 +-
 mm/page_alloc.c            |   31 ++++++++++++++++---------------
 mm/process_vm_access.c     |    1 +
 mm/slub.c                  |    2 +-
 mm/vmalloc.c               |    4 +++-
 11 files changed, 33 insertions(+), 23 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-12-29 23:13 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-12-29 23:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

16 patches, based on dea8dcf2a9fa8cc540136a6cd885c3beece16ec3.

Subsystems affected by this patch series:

  mm/selftests
  mm/hugetlb
  kbuild
  checkpatch
  mm/pagecache
  mm/mremap
  mm/kasan
  misc
  lib
  mm/slub

Subsystem: mm/selftests

    Harish <harish@linux.ibm.com>:
      selftests/vm: fix building protection keys test

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
      mm/hugetlb: fix deadlock in hugetlb_cow error path

Subsystem: kbuild

    Masahiro Yamada <masahiroy@kernel.org>:
      Revert "kbuild: avoid static_assert for genksyms"

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: prefer strscpy to strlcpy

Subsystem: mm/pagecache

    Souptick Joarder <jrdr.linux@gmail.com>:
      mm: add prototype for __add_to_page_cache_locked()

    Baoquan He <bhe@redhat.com>:
      mm: memmap defer init doesn't work as expected

Subsystem: mm/mremap

    Kalesh Singh <kaleshsingh@google.com>:
      mm/mremap.c: fix extent calculation

    Nicholas Piggin <npiggin@gmail.com>:
      mm: generalise COW SMC TLB flushing race comment

Subsystem: mm/kasan

    Walter Wu <walter-zh.wu@mediatek.com>:
      kasan: fix null pointer dereference in kasan_record_aux_stack

Subsystem: misc

    Randy Dunlap <rdunlap@infradead.org>:
      local64.h: make <asm/local64.h> mandatory

    Huang Shijie <sjhuang@iluvatar.ai>:
      sizes.h: add SZ_8G/SZ_16G/SZ_32G macros

    Josh Poimboeuf <jpoimboe@redhat.com>:
      kdev_t: always inline major/minor helper functions

Subsystem: lib

    Huang Shijie <sjhuang@iluvatar.ai>:
      lib/genalloc: fix the overflow when size is too big

    Ilya Leoshkevich <iii@linux.ibm.com>:
      lib/zlib: fix inflating zlib streams on s390

    Randy Dunlap <rdunlap@infradead.org>:
      zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c

Subsystem: mm/slub

    Roman Gushchin <guro@fb.com>:
      mm: slub: call account_slab_page() after slab page initialization

 arch/alpha/include/asm/local64.h    |    1 -
 arch/arc/include/asm/Kbuild         |    1 -
 arch/arm/include/asm/Kbuild         |    1 -
 arch/arm64/include/asm/Kbuild       |    1 -
 arch/csky/include/asm/Kbuild        |    1 -
 arch/h8300/include/asm/Kbuild       |    1 -
 arch/hexagon/include/asm/Kbuild     |    1 -
 arch/ia64/include/asm/local64.h     |    1 -
 arch/ia64/mm/init.c                 |    4 ++--
 arch/m68k/include/asm/Kbuild        |    1 -
 arch/microblaze/include/asm/Kbuild  |    1 -
 arch/mips/include/asm/Kbuild        |    1 -
 arch/nds32/include/asm/Kbuild       |    1 -
 arch/openrisc/include/asm/Kbuild    |    1 -
 arch/parisc/include/asm/Kbuild      |    1 -
 arch/powerpc/include/asm/Kbuild     |    1 -
 arch/riscv/include/asm/Kbuild       |    1 -
 arch/s390/include/asm/Kbuild        |    1 -
 arch/sh/include/asm/Kbuild          |    1 -
 arch/sparc/include/asm/Kbuild       |    1 -
 arch/x86/include/asm/local64.h      |    1 -
 arch/xtensa/include/asm/Kbuild      |    1 -
 include/asm-generic/Kbuild          |    1 +
 include/linux/build_bug.h           |    5 -----
 include/linux/kdev_t.h              |   22 +++++++++++-----------
 include/linux/mm.h                  |   12 ++++++++++--
 include/linux/sizes.h               |    3 +++
 lib/genalloc.c                      |   25 +++++++++++++------------
 lib/zlib_dfltcc/Makefile            |    2 +-
 lib/zlib_dfltcc/dfltcc.c            |    6 +++++-
 lib/zlib_dfltcc/dfltcc_deflate.c    |    3 +++
 lib/zlib_dfltcc/dfltcc_inflate.c    |    4 ++--
 lib/zlib_dfltcc/dfltcc_syms.c       |   17 -----------------
 mm/hugetlb.c                        |   22 +++++++++++++++++++++-
 mm/kasan/generic.c                  |    2 ++
 mm/memory.c                         |    8 +++++---
 mm/memory_hotplug.c                 |    2 +-
 mm/mremap.c                         |    4 +++-
 mm/page_alloc.c                     |    8 +++++---
 mm/slub.c                           |    5 ++---
 scripts/checkpatch.pl               |    6 ++++++
 tools/testing/selftests/vm/Makefile |   10 +++++-----
 42 files changed, 101 insertions(+), 91 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-12-22 19:58 incoming Andrew Morton
@ 2020-12-22 21:43 ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-12-22 21:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

On Tue, Dec 22, 2020 at 11:58 AM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> 60 patches, based on 8653b778e454a7708847aeafe689bce07aeeb94e.

I see that you enabled renaming in the patches. Lovely.

Can you also enable it in the diffstat?

>  74 files changed, 2869 insertions(+), 1553 deletions(-)

With -M in the diffstat, you should have seen

 72 files changed, 2775 insertions(+), 1460 deletions(-)

and if you add "--summary", you'll also see the rename part ofthe file
create/delete summary:

 rename mm/kasan/{tags_report.c => report_sw_tags.c} (78%)

which is often nice to see in addition to the line stats..

           Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-12-22 19:58 Andrew Morton
  2020-12-22 21:43 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-12-22 19:58 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


60 patches, based on 8653b778e454a7708847aeafe689bce07aeeb94e.

Subsystems affected by this patch series:

  mm/kasan

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kasan: add hardware tag-based mode for arm64", v11:
      kasan: drop unnecessary GPL text from comment headers
      kasan: KASAN_VMALLOC depends on KASAN_GENERIC
      kasan: group vmalloc code
      kasan: shadow declarations only for software modes
      kasan: rename (un)poison_shadow to (un)poison_range
      kasan: rename KASAN_SHADOW_* to KASAN_GRANULE_*
      kasan: only build init.c for software modes
      kasan: split out shadow.c from common.c
      kasan: define KASAN_MEMORY_PER_SHADOW_PAGE
      kasan: rename report and tags files
      kasan: don't duplicate config dependencies
      kasan: hide invalid free check implementation
      kasan: decode stack frame only with KASAN_STACK_ENABLE
      kasan, arm64: only init shadow for software modes
      kasan, arm64: only use kasan_depth for software modes
      kasan, arm64: move initialization message
      kasan, arm64: rename kasan_init_tags and mark as __init
      kasan: rename addr_has_shadow to addr_has_metadata
      kasan: rename print_shadow_for_address to print_memory_metadata
      kasan: rename SHADOW layout macros to META
      kasan: separate metadata_fetch_row for each mode
      kasan: introduce CONFIG_KASAN_HW_TAGS

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
      arm64: enable armv8.5-a asm-arch option
      arm64: mte: add in-kernel MTE helpers
      arm64: mte: reset the page tag in page->flags
      arm64: mte: add in-kernel tag fault handler
      arm64: kasan: allow enabling in-kernel MTE
      arm64: mte: convert gcr_user into an exclude mask
      arm64: mte: switch GCR_EL1 in kernel entry and exit
      kasan, mm: untag page address in free_reserved_area

    Andrey Konovalov <andreyknvl@google.com>:
      arm64: kasan: align allocations for HW_TAGS
      arm64: kasan: add arch layer for memory tagging helpers
      kasan: define KASAN_GRANULE_SIZE for HW_TAGS
      kasan, x86, s390: update undef CONFIG_KASAN
      kasan, arm64: expand CONFIG_KASAN checks
      kasan, arm64: implement HW_TAGS runtime
      kasan, arm64: print report from tag fault handler
      kasan, mm: reset tags when accessing metadata
      kasan, arm64: enable CONFIG_KASAN_HW_TAGS
      kasan: add documentation for hardware tag-based mode

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
      kselftest/arm64: check GCR_EL1 after context switch

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kasan: boot parameters for hardware tag-based mode", v4:
      kasan: simplify quarantine_put call site
      kasan: rename get_alloc/free_info
      kasan: introduce set_alloc_info
      kasan, arm64: unpoison stack only with CONFIG_KASAN_STACK
      kasan: allow VMAP_STACK for HW_TAGS mode
      kasan: remove __kasan_unpoison_stack
      kasan: inline kasan_reset_tag for tag-based modes
      kasan: inline random_tag for HW_TAGS
      kasan: open-code kasan_unpoison_slab
      kasan: inline (un)poison_range and check_invalid_free
      kasan: add and integrate kasan boot parameters
      kasan, mm: check kasan_enabled in annotations
      kasan, mm: rename kasan_poison_kfree
      kasan: don't round_up too much
      kasan: simplify assign_tag and set_tag calls
      kasan: clarify comment in __kasan_kfree_large
      kasan: sanitize objects when metadata doesn't fit
      kasan, mm: allow cache merging with no metadata
      kasan: update documentation

 Documentation/dev-tools/kasan.rst                         |  274 ++-
 arch/Kconfig                                              |    8 
 arch/arm64/Kconfig                                        |    9 
 arch/arm64/Makefile                                       |    7 
 arch/arm64/include/asm/assembler.h                        |    2 
 arch/arm64/include/asm/cache.h                            |    3 
 arch/arm64/include/asm/esr.h                              |    1 
 arch/arm64/include/asm/kasan.h                            |   17 
 arch/arm64/include/asm/memory.h                           |   15 
 arch/arm64/include/asm/mte-def.h                          |   16 
 arch/arm64/include/asm/mte-kasan.h                        |   67 
 arch/arm64/include/asm/mte.h                              |   22 
 arch/arm64/include/asm/processor.h                        |    2 
 arch/arm64/include/asm/string.h                           |    5 
 arch/arm64/include/asm/uaccess.h                          |   23 
 arch/arm64/kernel/asm-offsets.c                           |    3 
 arch/arm64/kernel/cpufeature.c                            |    3 
 arch/arm64/kernel/entry.S                                 |   41 
 arch/arm64/kernel/head.S                                  |    2 
 arch/arm64/kernel/hibernate.c                             |    5 
 arch/arm64/kernel/image-vars.h                            |    2 
 arch/arm64/kernel/kaslr.c                                 |    3 
 arch/arm64/kernel/module.c                                |    6 
 arch/arm64/kernel/mte.c                                   |  124 +
 arch/arm64/kernel/setup.c                                 |    2 
 arch/arm64/kernel/sleep.S                                 |    2 
 arch/arm64/kernel/smp.c                                   |    2 
 arch/arm64/lib/mte.S                                      |   16 
 arch/arm64/mm/copypage.c                                  |    9 
 arch/arm64/mm/fault.c                                     |   59 
 arch/arm64/mm/kasan_init.c                                |   41 
 arch/arm64/mm/mteswap.c                                   |    9 
 arch/arm64/mm/proc.S                                      |   23 
 arch/arm64/mm/ptdump.c                                    |    6 
 arch/s390/boot/string.c                                   |    1 
 arch/x86/boot/compressed/misc.h                           |    1 
 arch/x86/kernel/acpi/wakeup_64.S                          |    2 
 include/linux/kasan-checks.h                              |    2 
 include/linux/kasan.h                                     |  423 ++++-
 include/linux/mm.h                                        |   24 
 include/linux/moduleloader.h                              |    3 
 include/linux/page-flags-layout.h                         |    2 
 include/linux/sched.h                                     |    2 
 include/linux/string.h                                    |    2 
 init/init_task.c                                          |    2 
 kernel/fork.c                                             |    4 
 lib/Kconfig.kasan                                         |   71 
 lib/test_kasan.c                                          |    2 
 lib/test_kasan_module.c                                   |    2 
 mm/kasan/Makefile                                         |   33 
 mm/kasan/common.c                                         | 1006 +++-----------
 mm/kasan/generic.c                                        |   72 -
 mm/kasan/generic_report.c                                 |   13 
 mm/kasan/hw_tags.c                                        |  276 +++
 mm/kasan/init.c                                           |   25 
 mm/kasan/kasan.h                                          |  195 ++
 mm/kasan/quarantine.c                                     |   35 
 mm/kasan/report.c                                         |  363 +----
 mm/kasan/report_generic.c                                 |  169 ++
 mm/kasan/report_hw_tags.c                                 |   44 
 mm/kasan/report_sw_tags.c                                 |   22 
 mm/kasan/shadow.c                                         |  528 +++++++
 mm/kasan/sw_tags.c                                        |   34 
 mm/kasan/tags.c                                           |    7 
 mm/kasan/tags_report.c                                    |    7 
 mm/mempool.c                                              |    4 
 mm/page_alloc.c                                           |    9 
 mm/page_poison.c                                          |    2 
 mm/ptdump.c                                               |   13 
 mm/slab_common.c                                          |    5 
 mm/slub.c                                                 |   29 
 scripts/Makefile.lib                                      |    2 
 tools/testing/selftests/arm64/mte/Makefile                |    2 
 tools/testing/selftests/arm64/mte/check_gcr_el1_cswitch.c |  155 ++
 74 files changed, 2869 insertions(+), 1553 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-12-18 22:00 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-12-18 22:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


78 patches, based on a409ed156a90093a03fe6a93721ddf4c591eac87.

Subsystems affected by this patch series:

  mm/memcg
  epoll
  mm/kasan
  mm/cleanups
  epoll

Subsystem: mm/memcg

    Alex Shi <alex.shi@linux.alibaba.com>:
    Patch series "bail out early for memcg disable":
      mm/memcg: bail early from swap accounting if memcg disabled
      mm/memcg: warning on !memcg after readahead page charged

    Wei Yang <richard.weiyang@gmail.com>:
      mm/memcg: remove unused definitions

    Shakeel Butt <shakeelb@google.com>:
      mm, kvm: account kvm_vcpu_mmap to kmemcg

    Hui Su <sh_def@163.com>:
      mm/memcontrol:rewrite mem_cgroup_page_lruvec()

Subsystem: epoll

    Soheil Hassas Yeganeh <soheil@google.com>:
    Patch series "simplify ep_poll":
      epoll: check for events when removing a timed out thread from the wait queue
      epoll: simplify signal handling
      epoll: pull fatal signal checks into ep_send_events()
      epoll: move eavail next to the list_empty_careful check
      epoll: simplify and optimize busy loop logic
      epoll: pull all code between fetch_events and send_event into the loop
      epoll: replace gotos with a proper loop
      epoll: eliminate unnecessary lock for zero timeout

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kasan: add hardware tag-based mode for arm64", v11:
      kasan: drop unnecessary GPL text from comment headers
      kasan: KASAN_VMALLOC depends on KASAN_GENERIC
      kasan: group vmalloc code
      kasan: shadow declarations only for software modes
      kasan: rename (un)poison_shadow to (un)poison_range
      kasan: rename KASAN_SHADOW_* to KASAN_GRANULE_*
      kasan: only build init.c for software modes
      kasan: split out shadow.c from common.c
      kasan: define KASAN_MEMORY_PER_SHADOW_PAGE
      kasan: rename report and tags files
      kasan: don't duplicate config dependencies
      kasan: hide invalid free check implementation
      kasan: decode stack frame only with KASAN_STACK_ENABLE
      kasan, arm64: only init shadow for software modes
      kasan, arm64: only use kasan_depth for software modes
      kasan, arm64: move initialization message
      kasan, arm64: rename kasan_init_tags and mark as __init
      kasan: rename addr_has_shadow to addr_has_metadata
      kasan: rename print_shadow_for_address to print_memory_metadata
      kasan: rename SHADOW layout macros to META
      kasan: separate metadata_fetch_row for each mode
      kasan: introduce CONFIG_KASAN_HW_TAGS

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
      arm64: enable armv8.5-a asm-arch option
      arm64: mte: add in-kernel MTE helpers
      arm64: mte: reset the page tag in page->flags
      arm64: mte: add in-kernel tag fault handler
      arm64: kasan: allow enabling in-kernel MTE
      arm64: mte: convert gcr_user into an exclude mask
      arm64: mte: switch GCR_EL1 in kernel entry and exit
      kasan, mm: untag page address in free_reserved_area

    Andrey Konovalov <andreyknvl@google.com>:
      arm64: kasan: align allocations for HW_TAGS
      arm64: kasan: add arch layer for memory tagging helpers
      kasan: define KASAN_GRANULE_SIZE for HW_TAGS
      kasan, x86, s390: update undef CONFIG_KASAN
      kasan, arm64: expand CONFIG_KASAN checks
      kasan, arm64: implement HW_TAGS runtime
      kasan, arm64: print report from tag fault handler
      kasan, mm: reset tags when accessing metadata
      kasan, arm64: enable CONFIG_KASAN_HW_TAGS
      kasan: add documentation for hardware tag-based mode

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
      kselftest/arm64: check GCR_EL1 after context switch

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kasan: boot parameters for hardware tag-based mode", v4:
      kasan: simplify quarantine_put call site
      kasan: rename get_alloc/free_info
      kasan: introduce set_alloc_info
      kasan, arm64: unpoison stack only with CONFIG_KASAN_STACK
      kasan: allow VMAP_STACK for HW_TAGS mode
      kasan: remove __kasan_unpoison_stack
      kasan: inline kasan_reset_tag for tag-based modes
      kasan: inline random_tag for HW_TAGS
      kasan: open-code kasan_unpoison_slab
      kasan: inline (un)poison_range and check_invalid_free
      kasan: add and integrate kasan boot parameters
      kasan, mm: check kasan_enabled in annotations
      kasan, mm: rename kasan_poison_kfree
      kasan: don't round_up too much
      kasan: simplify assign_tag and set_tag calls
      kasan: clarify comment in __kasan_kfree_large
      kasan: sanitize objects when metadata doesn't fit
      kasan, mm: allow cache merging with no metadata
      kasan: update documentation

Subsystem: mm/cleanups

    Colin Ian King <colin.king@canonical.com>:
      mm/Kconfig: fix spelling mistake "whats" -> "what's"

Subsystem: epoll

    Willem de Bruijn <willemb@google.com>:
    Patch series "add epoll_pwait2 syscall", v4:
      epoll: convert internal api to timespec64
      epoll: add syscall epoll_pwait2
      epoll: wire up syscall epoll_pwait2
      selftests/filesystems: expand epoll with epoll_pwait2

 Documentation/dev-tools/kasan.rst                             |  274 +-
 arch/Kconfig                                                  |    8 
 arch/alpha/kernel/syscalls/syscall.tbl                        |    1 
 arch/arm/tools/syscall.tbl                                    |    1 
 arch/arm64/Kconfig                                            |    9 
 arch/arm64/Makefile                                           |    7 
 arch/arm64/include/asm/assembler.h                            |    2 
 arch/arm64/include/asm/cache.h                                |    3 
 arch/arm64/include/asm/esr.h                                  |    1 
 arch/arm64/include/asm/kasan.h                                |   17 
 arch/arm64/include/asm/memory.h                               |   15 
 arch/arm64/include/asm/mte-def.h                              |   16 
 arch/arm64/include/asm/mte-kasan.h                            |   67 
 arch/arm64/include/asm/mte.h                                  |   22 
 arch/arm64/include/asm/processor.h                            |    2 
 arch/arm64/include/asm/string.h                               |    5 
 arch/arm64/include/asm/uaccess.h                              |   23 
 arch/arm64/include/asm/unistd.h                               |    2 
 arch/arm64/include/asm/unistd32.h                             |    2 
 arch/arm64/kernel/asm-offsets.c                               |    3 
 arch/arm64/kernel/cpufeature.c                                |    3 
 arch/arm64/kernel/entry.S                                     |   41 
 arch/arm64/kernel/head.S                                      |    2 
 arch/arm64/kernel/hibernate.c                                 |    5 
 arch/arm64/kernel/image-vars.h                                |    2 
 arch/arm64/kernel/kaslr.c                                     |    3 
 arch/arm64/kernel/module.c                                    |    6 
 arch/arm64/kernel/mte.c                                       |  124 +
 arch/arm64/kernel/setup.c                                     |    2 
 arch/arm64/kernel/sleep.S                                     |    2 
 arch/arm64/kernel/smp.c                                       |    2 
 arch/arm64/lib/mte.S                                          |   16 
 arch/arm64/mm/copypage.c                                      |    9 
 arch/arm64/mm/fault.c                                         |   59 
 arch/arm64/mm/kasan_init.c                                    |   41 
 arch/arm64/mm/mteswap.c                                       |    9 
 arch/arm64/mm/proc.S                                          |   23 
 arch/arm64/mm/ptdump.c                                        |    6 
 arch/ia64/kernel/syscalls/syscall.tbl                         |    1 
 arch/m68k/kernel/syscalls/syscall.tbl                         |    1 
 arch/microblaze/kernel/syscalls/syscall.tbl                   |    1 
 arch/mips/kernel/syscalls/syscall_n32.tbl                     |    1 
 arch/mips/kernel/syscalls/syscall_n64.tbl                     |    1 
 arch/mips/kernel/syscalls/syscall_o32.tbl                     |    1 
 arch/parisc/kernel/syscalls/syscall.tbl                       |    1 
 arch/powerpc/kernel/syscalls/syscall.tbl                      |    1 
 arch/s390/boot/string.c                                       |    1 
 arch/s390/kernel/syscalls/syscall.tbl                         |    1 
 arch/sh/kernel/syscalls/syscall.tbl                           |    1 
 arch/sparc/kernel/syscalls/syscall.tbl                        |    1 
 arch/x86/boot/compressed/misc.h                               |    1 
 arch/x86/entry/syscalls/syscall_32.tbl                        |    1 
 arch/x86/entry/syscalls/syscall_64.tbl                        |    1 
 arch/x86/kernel/acpi/wakeup_64.S                              |    2 
 arch/x86/kvm/x86.c                                            |    2 
 arch/xtensa/kernel/syscalls/syscall.tbl                       |    1 
 fs/eventpoll.c                                                |  359 ++-
 include/linux/compat.h                                        |    6 
 include/linux/kasan-checks.h                                  |    2 
 include/linux/kasan.h                                         |  423 ++--
 include/linux/memcontrol.h                                    |  137 -
 include/linux/mm.h                                            |   24 
 include/linux/mmdebug.h                                       |   13 
 include/linux/moduleloader.h                                  |    3 
 include/linux/page-flags-layout.h                             |    2 
 include/linux/sched.h                                         |    2 
 include/linux/string.h                                        |    2 
 include/linux/syscalls.h                                      |    5 
 include/uapi/asm-generic/unistd.h                             |    4 
 init/init_task.c                                              |    2 
 kernel/fork.c                                                 |    4 
 kernel/sys_ni.c                                               |    2 
 lib/Kconfig.kasan                                             |   71 
 lib/test_kasan.c                                              |    2 
 lib/test_kasan_module.c                                       |    2 
 mm/Kconfig                                                    |    2 
 mm/kasan/Makefile                                             |   33 
 mm/kasan/common.c                                             | 1006 ++--------
 mm/kasan/generic.c                                            |   72 
 mm/kasan/generic_report.c                                     |   13 
 mm/kasan/hw_tags.c                                            |  294 ++
 mm/kasan/init.c                                               |   25 
 mm/kasan/kasan.h                                              |  204 +-
 mm/kasan/quarantine.c                                         |   35 
 mm/kasan/report.c                                             |  363 +--
 mm/kasan/report_generic.c                                     |  169 +
 mm/kasan/report_hw_tags.c                                     |   44 
 mm/kasan/report_sw_tags.c                                     |   22 
 mm/kasan/shadow.c                                             |  541 +++++
 mm/kasan/sw_tags.c                                            |   34 
 mm/kasan/tags.c                                               |    7 
 mm/kasan/tags_report.c                                        |    7 
 mm/memcontrol.c                                               |   53 
 mm/mempool.c                                                  |    4 
 mm/page_alloc.c                                               |    9 
 mm/page_poison.c                                              |    2 
 mm/ptdump.c                                                   |   13 
 mm/slab_common.c                                              |    5 
 mm/slub.c                                                     |   29 
 scripts/Makefile.lib                                          |    2 
 tools/testing/selftests/arm64/mte/Makefile                    |    2 
 tools/testing/selftests/arm64/mte/check_gcr_el1_cswitch.c     |  155 +
 tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c |   72 
 virt/kvm/coalesced_mmio.c                                     |    2 
 virt/kvm/kvm_main.c                                           |    2 
 105 files changed, 3268 insertions(+), 1873 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-12-16  4:41 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-12-16  4:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- lots of little subsystems

- a few post-linux-next MM material.  Most of this awaits more merging
  of other trees.


95 patches, based on 489e9fea66f31086f85d9a18e61e4791d94a56a4.

Subsystems affected by this patch series:

  mm/swap
  mm/memory-hotplug
  alpha
  procfs
  misc
  core-kernel
  bitmap
  lib
  lz4
  bitops
  checkpatch
  nilfs
  kdump
  rapidio
  gcov
  bfs
  relay
  resource
  ubsan
  reboot
  fault-injection
  lzo
  apparmor
  mm/pagemap
  mm/cleanups
  mm/gup

Subsystem: mm/swap

    Zhaoyang Huang <huangzhaoyang@gmail.com>:
      mm: fix a race on nr_swap_pages

Subsystem: mm/memory-hotplug

    Laurent Dufour <ldufour@linux.ibm.com>:
      mm/memory_hotplug: quieting offline operation

Subsystem: alpha

    Thomas Gleixner <tglx@linutronix.de>:
      alpha: replace bogus in_interrupt()

Subsystem: procfs

    Randy Dunlap <rdunlap@infradead.org>:
      procfs: delete duplicated words + other fixes

    Anand K Mistry <amistry@google.com>:
      proc: provide details on indirect branch speculation

    Alexey Dobriyan <adobriyan@gmail.com>:
      proc: fix lookup in /proc/net subdirectories after setns(2)

    Hui Su <sh_def@163.com>:
      fs/proc: make pde_get() return nothing

Subsystem: misc

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      asm-generic: force inlining of get_order() to work around gcc10 poor decision

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      kernel.h: split out mathematical helpers

Subsystem: core-kernel

    Hui Su <sh_def@163.com>:
      kernel/acct.c: use #elif instead of #end and #elif

Subsystem: bitmap

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      include/linux/bitmap.h: convert bitmap_empty() / bitmap_full() to return boolean

    "Ma, Jianpeng" <jianpeng.ma@intel.com>:
      bitmap: remove unused function declaration

Subsystem: lib

    Geert Uytterhoeven <geert@linux-m68k.org>:
      lib/test_free_pages.c: add basic progress indicators

    "Gustavo A. R. Silva" <gustavoars@kernel.org>:
    Patch series "] lib/stackdepot.c: Replace one-element array with flexible-array member":
      lib/stackdepot.c: replace one-element array with flexible-array member
      lib/stackdepot.c: use flex_array_size() helper in memcpy()
      lib/stackdepot.c: use array_size() helper in jhash2()

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      lib/test_lockup.c: minimum fix to get it compiled on PREEMPT_RT

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      lib/list_kunit: follow new file name convention for KUnit tests
      lib/linear_ranges_kunit: follow new file name convention for KUnit tests
      lib/bits_kunit: follow new file name convention for KUnit tests
      lib/cmdline: fix get_option() for strings starting with hyphen
      lib/cmdline: allow NULL to be an output for get_option()
      lib/cmdline_kunit: add a new test suite for cmdline API

    Jakub Jelinek <jakub@redhat.com>:
      ilog2: improve ilog2 for constant arguments

    Nick Desaulniers <ndesaulniers@google.com>:
      lib/string: remove unnecessary #undefs

    Daniel Axtens <dja@axtens.net>:
    Patch series "Fortify strscpy()", v7:
      lib: string.h: detect intra-object overflow in fortified string functions
      lkdtm: tests for FORTIFY_SOURCE

    Francis Laniel <laniel_francis@privacyrequired.com>:
      string.h: add FORTIFY coverage for strscpy()
      drivers/misc/lkdtm: add new file in LKDTM to test fortified strscpy
      drivers/misc/lkdtm/lkdtm.h: correct wrong filenames in comment

    Alexey Dobriyan <adobriyan@gmail.com>:
      lib: cleanup kstrto*() usage

Subsystem: lz4

    Gao Xiang <hsiangkao@redhat.com>:
      lib/lz4: explicitly support in-place decompression

Subsystem: bitops

    Syed Nayyar Waris <syednwaris@gmail.com>:
    Patch series "Introduce the for_each_set_clump macro", v12:
      bitops: introduce the for_each_set_clump macro
      lib/test_bitmap.c: add for_each_set_clump test cases
      gpio: thunderx: utilize for_each_set_clump macro
      gpio: xilinx: utilize generic bitmap_get_value and _set_value

Subsystem: checkpatch

    Dwaipayan Ray <dwaipayanray1@gmail.com>:
      checkpatch: add new exception to repeated word check

    Aditya Srivastava <yashsri421@gmail.com>:
      checkpatch: fix false positives in REPEATED_WORD warning

    Łukasz Stelmach <l.stelmach@samsung.com>:
      checkpatch: ignore generated CamelCase defines and enum values

    Joe Perches <joe@perches.com>:
      checkpatch: prefer static const declarations
      checkpatch: allow --fix removal of unnecessary break statements

    Dwaipayan Ray <dwaipayanray1@gmail.com>:
      checkpatch: extend attributes check to handle more patterns

    Tom Rix <trix@redhat.com>:
      checkpatch: add a fixer for missing newline at eof

    Joe Perches <joe@perches.com>:
      checkpatch: update __attribute__((section("name"))) quote removal

    Aditya Srivastava <yashsri421@gmail.com>:
      checkpatch: add fix option for GERRIT_CHANGE_ID

    Joe Perches <joe@perches.com>:
      checkpatch: add __alias and __weak to suggested __attribute__ conversions

    Dwaipayan Ray <dwaipayanray1@gmail.com>:
      checkpatch: improve email parsing
      checkpatch: fix spelling errors and remove repeated word

    Aditya Srivastava <yashsri421@gmail.com>:
      checkpatch: avoid COMMIT_LOG_LONG_LINE warning for signature tags

    Dwaipayan Ray <dwaipayanray1@gmail.com>:
      checkpatch: fix unescaped left brace

    Aditya Srivastava <yashsri421@gmail.com>:
      checkpatch: add fix option for ASSIGNMENT_CONTINUATIONS
      checkpatch: add fix option for LOGICAL_CONTINUATIONS
      checkpatch: add fix and improve warning msg for non-standard signature

    Dwaipayan Ray <dwaipayanray1@gmail.com>:
      checkpatch: add warning for unnecessary use of %h[xudi] and %hh[xudi]
      checkpatch: add warning for lines starting with a '#' in commit log
      checkpatch: fix TYPO_SPELLING check for words with apostrophe

    Joe Perches <joe@perches.com>:
      checkpatch: add printk_once and printk_ratelimit to prefer pr_<level> warning

Subsystem: nilfs

    Alex Shi <alex.shi@linux.alibaba.com>:
      fs/nilfs2: remove some unused macros to tame gcc

Subsystem: kdump

    Alexander Egorenkov <egorenar@linux.ibm.com>:
      kdump: append uts_namespace.name offset to VMCOREINFO

Subsystem: rapidio

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      rapidio: remove unused rio_get_asm() and rio_get_device()

Subsystem: gcov

    Nick Desaulniers <ndesaulniers@google.com>:
      gcov: remove support for GCC < 4.9

    Alex Shi <alex.shi@linux.alibaba.com>:
      gcov: fix kernel-doc markup issue

Subsystem: bfs

    Randy Dunlap <rdunlap@infradead.org>:
      bfs: don't use WARNING: string when it's just info.

Subsystem: relay

    Jani Nikula <jani.nikula@intel.com>:
    Patch series "relay: cleanup and const callbacks", v2:
      relay: remove unused buf_mapped and buf_unmapped callbacks
      relay: require non-NULL callbacks in relay_open()
      relay: make create_buf_file and remove_buf_file callbacks mandatory
      relay: allow the use of const callback structs
      drm/i915: make relay callbacks const
      ath10k: make relay callbacks const
      ath11k: make relay callbacks const
      ath9k: make relay callbacks const
      blktrace: make relay callbacks const

Subsystem: resource

    Mauro Carvalho Chehab <mchehab+huawei@kernel.org>:
      kernel/resource.c: fix kernel-doc markups

Subsystem: ubsan

    Kees Cook <keescook@chromium.org>:
    Patch series "Clean up UBSAN Makefile", v2:
      ubsan: remove redundant -Wno-maybe-uninitialized
      ubsan: move cc-option tests into Kconfig
      ubsan: disable object-size sanitizer under GCC
      ubsan: disable UBSAN_TRAP for all*config
      ubsan: enable for all*config builds
      ubsan: remove UBSAN_MISC in favor of individual options
      ubsan: expand tests and reporting

    Dmitry Vyukov <dvyukov@google.com>:
      kcov: don't instrument with UBSAN

    Zou Wei <zou_wei@huawei.com>:
      lib/ubsan.c: mark type_check_kinds with static keyword

Subsystem: reboot

    Matteo Croce <mcroce@microsoft.com>:
      reboot: refactor and comment the cpu selection code
      reboot: allow to specify reboot mode via sysfs
      reboot: remove cf9_safe from allowed types and rename cf9_force
    Patch series "reboot: sysfs improvements":
      reboot: allow to override reboot type if quirks are found
      reboot: hide from sysfs not applicable settings

Subsystem: fault-injection

    Barnabás Pőcze <pobrn@protonmail.com>:
      fault-injection: handle EI_ETYPE_TRUE

Subsystem: lzo

    Jason Yan <yanaijie@huawei.com>:
      lib/lzo/lzo1x_compress.c: make lzogeneric1x_1_compress() static

Subsystem: apparmor

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      apparmor: remove duplicate macro list_entry_is_head()

Subsystem: mm/pagemap

    Christoph Hellwig <hch@lst.de>:
    Patch series "simplify follow_pte a bit":
      mm: unexport follow_pte_pmd
      mm: simplify follow_pte{,pmd}

Subsystem: mm/cleanups

    Haitao Shi <shihaitao1@huawei.com>:
      mm: fix some spelling mistakes in comments

Subsystem: mm/gup

    Jann Horn <jannh@google.com>:
      mmap locking API: don't check locking if the mm isn't live yet
      mm/gup: assert that the mmap lock is held in __get_user_pages()

 Documentation/ABI/testing/sysfs-kernel-reboot    |   32 
 Documentation/admin-guide/kdump/vmcoreinfo.rst   |    6 
 Documentation/dev-tools/ubsan.rst                |    1 
 Documentation/filesystems/proc.rst               |    2 
 MAINTAINERS                                      |    5 
 arch/alpha/kernel/process.c                      |    2 
 arch/powerpc/kernel/vmlinux.lds.S                |    4 
 arch/s390/pci/pci_mmio.c                         |    4 
 drivers/gpio/gpio-thunderx.c                     |   11 
 drivers/gpio/gpio-xilinx.c                       |   61 -
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c       |    2 
 drivers/misc/lkdtm/Makefile                      |    1 
 drivers/misc/lkdtm/bugs.c                        |   50 +
 drivers/misc/lkdtm/core.c                        |    3 
 drivers/misc/lkdtm/fortify.c                     |   82 ++
 drivers/misc/lkdtm/lkdtm.h                       |   19 
 drivers/net/wireless/ath/ath10k/spectral.c       |    2 
 drivers/net/wireless/ath/ath11k/spectral.c       |    2 
 drivers/net/wireless/ath/ath9k/common-spectral.c |    2 
 drivers/rapidio/rio.c                            |   81 --
 fs/bfs/inode.c                                   |    2 
 fs/dax.c                                         |    9 
 fs/exec.c                                        |    8 
 fs/nfs/callback_proc.c                           |    5 
 fs/nilfs2/segment.c                              |    5 
 fs/proc/array.c                                  |   28 
 fs/proc/base.c                                   |    2 
 fs/proc/generic.c                                |   24 
 fs/proc/internal.h                               |   10 
 fs/proc/proc_net.c                               |   20 
 include/asm-generic/bitops/find.h                |   19 
 include/asm-generic/getorder.h                   |    2 
 include/linux/bitmap.h                           |   67 +-
 include/linux/bitops.h                           |   24 
 include/linux/dcache.h                           |    1 
 include/linux/iommu-helper.h                     |    4 
 include/linux/kernel.h                           |  173 -----
 include/linux/log2.h                             |    3 
 include/linux/math.h                             |  177 +++++
 include/linux/mm.h                               |    6 
 include/linux/mm_types.h                         |   10 
 include/linux/mmap_lock.h                        |   16 
 include/linux/proc_fs.h                          |    8 
 include/linux/rcu_node_tree.h                    |    2 
 include/linux/relay.h                            |   29 
 include/linux/rio_drv.h                          |    3 
 include/linux/string.h                           |   75 +-
 include/linux/units.h                            |    2 
 kernel/Makefile                                  |    3 
 kernel/acct.c                                    |    7 
 kernel/crash_core.c                              |    1 
 kernel/fail_function.c                           |    6 
 kernel/gcov/gcc_4_7.c                            |   10 
 kernel/reboot.c                                  |  308 ++++++++-
 kernel/relay.c                                   |  111 ---
 kernel/resource.c                                |   24 
 kernel/trace/blktrace.c                          |    2 
 lib/Kconfig.debug                                |   11 
 lib/Kconfig.ubsan                                |  154 +++-
 lib/Makefile                                     |    7 
 lib/bits_kunit.c                                 |   75 ++
 lib/cmdline.c                                    |   20 
 lib/cmdline_kunit.c                              |  100 +++
 lib/errname.c                                    |    1 
 lib/error-inject.c                               |    2 
 lib/errseq.c                                     |    1 
 lib/find_bit.c                                   |   17 
 lib/linear_ranges_kunit.c                        |  228 +++++++
 lib/list-test.c                                  |  748 -----------------------
 lib/list_kunit.c                                 |  748 +++++++++++++++++++++++
 lib/lz4/lz4_decompress.c                         |    6 
 lib/lz4/lz4defs.h                                |    1 
 lib/lzo/lzo1x_compress.c                         |    2 
 lib/math/div64.c                                 |    4 
 lib/math/int_pow.c                               |    2 
 lib/math/int_sqrt.c                              |    3 
 lib/math/reciprocal_div.c                        |    9 
 lib/stackdepot.c                                 |   11 
 lib/string.c                                     |    4 
 lib/test_bitmap.c                                |  143 ++++
 lib/test_bits.c                                  |   75 --
 lib/test_firmware.c                              |    9 
 lib/test_free_pages.c                            |    5 
 lib/test_kmod.c                                  |   26 
 lib/test_linear_ranges.c                         |  228 -------
 lib/test_lockup.c                                |   16 
 lib/test_ubsan.c                                 |   74 ++
 lib/ubsan.c                                      |    2 
 mm/filemap.c                                     |    2 
 mm/gup.c                                         |    2 
 mm/huge_memory.c                                 |    2 
 mm/khugepaged.c                                  |    2 
 mm/memblock.c                                    |    2 
 mm/memory.c                                      |   36 -
 mm/memory_hotplug.c                              |    2 
 mm/migrate.c                                     |    2 
 mm/page_ext.c                                    |    2 
 mm/swapfile.c                                    |   11 
 scripts/Makefile.ubsan                           |   49 -
 scripts/checkpatch.pl                            |  495 +++++++++++----
 security/apparmor/apparmorfs.c                   |    3 
 tools/testing/selftests/lkdtm/tests.txt          |    1 
 102 files changed, 3022 insertions(+), 1899 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-12-15 22:49   ` incoming Linus Torvalds
@ 2020-12-15 22:55     ` Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-12-15 22:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux-MM, mm-commits

On Tue, 15 Dec 2020 14:49:24 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Dec 15, 2020 at 2:48 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I will try to apply it on top of my merge of your previous series instead.
> 
> Yes, then it applies cleanly. So apparently we just have different
> concepts of what really constitutes a "base" for applying your series.
> 

oop, sorry, yes, the "based on" thing was wrong because I had two
series in flight simultaneously.  I've never tried that before..


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-12-15 22:48 ` incoming Linus Torvalds
@ 2020-12-15 22:49   ` Linus Torvalds
  2020-12-15 22:55     ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2020-12-15 22:49 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

On Tue, Dec 15, 2020 at 2:48 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I will try to apply it on top of my merge of your previous series instead.

Yes, then it applies cleanly. So apparently we just have different
concepts of what really constitutes a "base" for applying your series.

              Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-12-15 20:32 incoming Andrew Morton
  2020-12-15 21:00 ` incoming Linus Torvalds
@ 2020-12-15 22:48 ` Linus Torvalds
  2020-12-15 22:49   ` incoming Linus Torvalds
  1 sibling, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2020-12-15 22:48 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

On Tue, Dec 15, 2020 at 12:32 PM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> - more MM work: a memcg scalability improvememt
>
> 19 patches, based on 148842c98a24e508aecb929718818fbf4c2a6ff3.

With your re-send, I get all patches, but they don't actually apply cleanly.

Is that base correct?

I get

  error: patch failed: mm/huge_memory.c:2750
  error: mm/huge_memory.c: patch does not apply
  Patch failed at 0004 mm/thp: narrow lru locking

for that patch "[patch 04/19] mm/thp: narrow lru locking", and that's
definitely true: the patch fragment has

@@ -2750,7 +2751,7 @@ int split_huge_page_to_list(struct page
                                __dec_lruvec_page_state(head, NR_FILE_THPS);
                }

-               __split_huge_page(page, list, end, flags);
+               __split_huge_page(page, list, end);
                ret = 0;
        } else {
                if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) {

but that __dec_lruvec_page_state() conversion was done by your
previous commit series.

So I have the feeling that what you actually mean by "base" isn't
actually really the base for that series at all..

I will try to apply it on top of my merge of your previous series instead.

              Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-12-15 20:32 incoming Andrew Morton
@ 2020-12-15 21:00 ` Linus Torvalds
  2020-12-15 22:48 ` incoming Linus Torvalds
  1 sibling, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-12-15 21:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux-MM, mm-commits

On Tue, Dec 15, 2020 at 12:32 PM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> - more MM work: a memcg scalability improvememt
>
> 19 patches, based on 148842c98a24e508aecb929718818fbf4c2a6ff3.

I'm not seeing patch 10/19 at all.

And patch 19/19 is corrupted and has an attachment with a '^P'
character in it. I could fix it up, but with the missing patch in the
middle I'm not going to even try. 'b4' is also very unhappy about that
patch 19/19.

I don't know what went wrong, but I'll ignore this send - please
re-send the series at your leisure, ok?

            Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-12-15 20:32 Andrew Morton
  2020-12-15 21:00 ` incoming Linus Torvalds
  2020-12-15 22:48 ` incoming Linus Torvalds
  0 siblings, 2 replies; 485+ messages in thread
From: Andrew Morton @ 2020-12-15 20:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


- more MM work: a memcg scalability improvememt

19 patches, based on 148842c98a24e508aecb929718818fbf4c2a6ff3.

Subsystems affected by this patch series:


    Alex Shi <alex.shi@linux.alibaba.com>:
    Patch series "per memcg lru lock", v21:
      mm/thp: move lru_add_page_tail() to huge_memory.c
      mm/thp: use head for head page in lru_add_page_tail()
      mm/thp: simplify lru_add_page_tail()
      mm/thp: narrow lru locking
      mm/vmscan: remove unnecessary lruvec adding
      mm/rmap: stop store reordering issue on page->mapping

    Hugh Dickins <hughd@google.com>:
      mm: page_idle_get_page() does not need lru_lock

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/memcg: add debug checking in lock_page_memcg
      mm/swap.c: fold vm event PGROTATED into pagevec_move_tail_fn
      mm/lru: move lock into lru_note_cost
      mm/vmscan: remove lruvec reget in move_pages_to_lru
      mm/mlock: remove lru_lock on TestClearPageMlocked
      mm/mlock: remove __munlock_isolate_lru_page()
      mm/lru: introduce TestClearPageLRU()
      mm/compaction: do page isolation first in compaction
      mm/swap.c: serialize memcg changes in pagevec_lru_move_fn
      mm/lru: replace pgdat lru_lock with lruvec lock

    Alexander Duyck <alexander.h.duyck@linux.intel.com>:
      mm/lru: introduce relock_page_lruvec()

    Hugh Dickins <hughd@google.com>:
      mm/lru: revise the comments of lru_lock

 Documentation/admin-guide/cgroup-v1/memcg_test.rst |   15 -
 Documentation/admin-guide/cgroup-v1/memory.rst     |   23 -
 Documentation/trace/events-kmem.rst                |    2 
 Documentation/vm/unevictable-lru.rst               |   22 -
 include/linux/memcontrol.h                         |  110 +++++++
 include/linux/mm_types.h                           |    2 
 include/linux/mmzone.h                             |    6 
 include/linux/page-flags.h                         |    1 
 include/linux/swap.h                               |    4 
 mm/compaction.c                                    |   98 ++++---
 mm/filemap.c                                       |    4 
 mm/huge_memory.c                                   |  109 ++++---
 mm/memcontrol.c                                    |   84 +++++-
 mm/mlock.c                                         |   93 ++----
 mm/mmzone.c                                        |    1 
 mm/page_alloc.c                                    |    1 
 mm/page_idle.c                                     |    4 
 mm/rmap.c                                          |   12 
 mm/swap.c                                          |  292 ++++++++-------------
 mm/vmscan.c                                        |  239 ++++++++---------
 mm/workingset.c                                    |    2 
 21 files changed, 644 insertions(+), 480 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-12-15  3:30   ` incoming Linus Torvalds
@ 2020-12-15 14:04     ` Konstantin Ryabitsev
  0 siblings, 0 replies; 485+ messages in thread
From: Konstantin Ryabitsev @ 2020-12-15 14:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, mm-commits, Linux-MM

On Mon, Dec 14, 2020 at 07:30:54PM -0800, Linus Torvalds wrote:
> > All the patches except for _one_ get a nice little green check-mark
> > next to them when I use 'git am' on this series.
> >
> > The one that did not was [patch 192/200].
> >
> > I have no idea why
> 
> Hmm. It looks like that patch is the only one in the series with the
> ">From" marker in the commit message, from the silly "clarify that
> this isn't the first line in a new message in mbox format".
> 
> And "b4 am" has turned the single ">" into two, making the stupid
> marker worse, and actually corrupting the end result.

It's a bug in b4 that I overlooked. Public-inbox emits mboxrd-formatted 
.mbox files, while Python's mailbox.mbox consumes mboxo only. The main 
distinction between the two is precisely that mboxrd will convert 
">From " into ">>From " in an attempt to avoid corruption during
escape/unescape (it didn't end up fixing the problem 100% and mostly 
introduced incompatibilities like this one).

I have a fix in master/stable-0.6.y and I'll release a 0.6.2 before the 
end of the week.

Thanks for the report.

-K


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-12-15  3:25 ` incoming Linus Torvalds
@ 2020-12-15  3:30   ` Linus Torvalds
  2020-12-15 14:04     ` incoming Konstantin Ryabitsev
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2020-12-15  3:30 UTC (permalink / raw)
  To: Andrew Morton, Konstantin Ryabitsev; +Cc: mm-commits, Linux-MM

On Mon, Dec 14, 2020 at 7:25 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> All the patches except for _one_ get a nice little green check-mark
> next to them when I use 'git am' on this series.
>
> The one that did not was [patch 192/200].
>
> I have no idea why

Hmm. It looks like that patch is the only one in the series with the
">From" marker in the commit message, from the silly "clarify that
this isn't the first line in a new message in mbox format".

And "b4 am" has turned the single ">" into two, making the stupid
marker worse, and actually corrupting the end result.

Coincidence? Or cause?

            Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-12-15  3:02 incoming Andrew Morton
@ 2020-12-15  3:25 ` Linus Torvalds
  2020-12-15  3:30   ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2020-12-15  3:25 UTC (permalink / raw)
  To: Andrew Morton, Konstantin Ryabitsev; +Cc: mm-commits, Linux-MM

On Mon, Dec 14, 2020 at 7:02 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> 200 patches, based on 2c85ebc57b3e1817b6ce1a6b703928e113a90442.

I haven't actually processed the patches yet, but I have a question
for Konstantin wrt b4.

All the patches except for _one_ get a nice little green check-mark
next to them when I use 'git am' on this series.

The one that did not was [patch 192/200].

I have no idea why - and it doesn't matter a lot to me, it just stood
out as being different. I'm assuming Andrew has started doing patch
attestation, and that patch failed. But if so, maybe Konstantin wants
to know what went wrong.

Konstantin?

            Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-12-15  3:02 Andrew Morton
  2020-12-15  3:25 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-12-15  3:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- a few random little subsystems

- almost all of the MM patches which are staged ahead of linux-next
  material.  I'll trickle to post-linux-next work in as the dependents
  get merged up.


200 patches, based on 2c85ebc57b3e1817b6ce1a6b703928e113a90442.

Subsystems affected by this patch series:

  kthread
  kbuild
  ide
  ntfs
  ocfs2
  arch
  mm/slab-generic
  mm/slab
  mm/slub
  mm/dax
  mm/debug
  mm/pagecache
  mm/gup
  mm/swap
  mm/shmem
  mm/memcg
  mm/pagemap
  mm/mremap
  mm/hmm
  mm/vmalloc
  mm/documentation
  mm/kasan
  mm/pagealloc
  mm/memory-failure
  mm/hugetlb
  mm/vmscan
  mm/z3fold
  mm/compaction
  mm/oom-kill
  mm/migration
  mm/cma
  mm/page-poison
  mm/userfaultfd
  mm/zswap
  mm/zsmalloc
  mm/uaccess
  mm/zram
  mm/cleanups

Subsystem: kthread

    Rob Clark <robdclark@chromium.org>:
      kthread: add kthread_work tracepoints

    Petr Mladek <pmladek@suse.com>:
      kthread_worker: document CPU hotplug handling

Subsystem: kbuild

    Petr Vorel <petr.vorel@gmail.com>:
      uapi: move constants from <linux/kernel.h> to <linux/const.h>

Subsystem: ide

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      ide/falcon: remove in_interrupt() usage
      ide: remove BUG_ON(in_interrupt() || irqs_disabled()) from ide_unregister()

Subsystem: ntfs

    Alex Shi <alex.shi@linux.alibaba.com>:
      fs/ntfs: remove unused varibles
      fs/ntfs: remove unused variable attr_len

Subsystem: ocfs2

    Tom Rix <trix@redhat.com>:
      fs/ocfs2/cluster/tcp.c: remove unneeded break

    Mauricio Faria de Oliveira <mfo@canonical.com>:
      ocfs2: ratelimit the 'max lookup times reached' notice

Subsystem: arch

    Colin Ian King <colin.king@canonical.com>:
      arch/Kconfig: fix spelling mistakes

Subsystem: mm/slab-generic

    Hui Su <sh_def@163.com>:
      mm/slab_common.c: use list_for_each_entry in dump_unreclaimable_slab()

    Bartosz Golaszewski <bgolaszewski@baylibre.com>:
    Patch series "slab: provide and use krealloc_array()", v3:
      mm: slab: clarify krealloc()'s behavior with __GFP_ZERO
      mm: slab: provide krealloc_array()
      ALSA: pcm: use krealloc_array()
      vhost: vringh: use krealloc_array()
      pinctrl: use krealloc_array()
      edac: ghes: use krealloc_array()
      drm: atomic: use krealloc_array()
      hwtracing: intel: use krealloc_array()
      dma-buf: use krealloc_array()

    Vlastimil Babka <vbabka@suse.cz>:
      mm, slab, slub: clear the slab_cache field when freeing page

Subsystem: mm/slab

    Alexander Popov <alex.popov@linux.com>:
      mm/slab: rerform init_on_free earlier

Subsystem: mm/slub

    Vlastimil Babka <vbabka@suse.cz>:
      mm, slub: use kmem_cache_debug_flags() in deactivate_slab()

    Bharata B Rao <bharata@linux.ibm.com>:
      mm/slub: let number of online CPUs determine the slub page order

Subsystem: mm/dax

    Dan Williams <dan.j.williams@intel.com>:
      device-dax/kmem: use struct_size()

Subsystem: mm/debug

    Zhenhua Huang <zhenhuah@codeaurora.org>:
      mm: fix page_owner initializing issue for arm32

    Liam Mark <lmark@codeaurora.org>:
      mm/page_owner: record timestamp and pid

Subsystem: mm/pagecache

    Kent Overstreet <kent.overstreet@gmail.com>:
    Patch series "generic_file_buffered_read() improvements", v2:
      mm/filemap/c: break generic_file_buffered_read up into multiple functions
      mm/filemap.c: generic_file_buffered_read() now uses find_get_pages_contig

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/truncate: add parameter explanation for invalidate_mapping_pagevec

    Hailong Liu <carver4lio@163.com>:
      mm/filemap.c: remove else after a return

Subsystem: mm/gup

    John Hubbard <jhubbard@nvidia.com>:
    Patch series "selftests/vm: gup_test, hmm-tests, assorted improvements", v3:
      mm/gup_benchmark: rename to mm/gup_test
      selftests/vm: use a common gup_test.h
      selftests/vm: rename run_vmtests --> run_vmtests.sh
      selftests/vm: minor cleanup: Makefile and gup_test.c
      selftests/vm: only some gup_test items are really benchmarks
      selftests/vm: gup_test: introduce the dump_pages() sub-test
      selftests/vm: run_vmtests.sh: update and clean up gup_test invocation
      selftests/vm: hmm-tests: remove the libhugetlbfs dependency
      selftests/vm: 2x speedup for run_vmtests.sh

    Barry Song <song.bao.hua@hisilicon.com>:
      mm/gup_test.c: mark gup_test_init as __init function
      mm/gup_test: GUP_TEST depends on DEBUG_FS

    Jason Gunthorpe <jgg@nvidia.com>:
    Patch series "Add a seqcount between gup_fast and copy_page_range()", v4:
      mm/gup: reorganize internal_get_user_pages_fast()
      mm/gup: prevent gup_fast from racing with COW during fork
      mm/gup: remove the vma allocation from gup_longterm_locked()
      mm/gup: combine put_compound_head() and unpin_user_page()

Subsystem: mm/swap

    Ralph Campbell <rcampbell@nvidia.com>:
      mm: handle zone device pages in release_pages()

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/swapfile.c: use helper function swap_count() in add_swap_count_continuation()
      mm/swap_state: skip meaningless swap cache readahead when ra_info.win == 0
      mm/swapfile.c: remove unnecessary out label in __swap_duplicate()
      mm/swapfile.c: use memset to fill the swap_map with SWAP_HAS_CACHE

    Jeff Layton <jlayton@kernel.org>:
      mm: remove pagevec_lookup_range_nr_tag()

Subsystem: mm/shmem

    Hui Su <sh_def@163.com>:
      mm/shmem.c: make shmem_mapping() inline

    Randy Dunlap <rdunlap@infradead.org>:
      tmpfs: fix Documentation nits

Subsystem: mm/memcg

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: add file_thp, shmem_thp to memory.stat

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: remove unused mod_memcg_obj_state()

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: memcontrol: eliminate redundant check in __mem_cgroup_insert_exceeded()

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcg/slab: fix return of child memcg objcg for root memcg
      mm: memcg/slab: fix use after free in obj_cgroup_charge

    Shakeel Butt <shakeelb@google.com>:
      mm/rmap: always do TTU_IGNORE_ACCESS

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/memcg: update page struct member in comments

    Roman Gushchin <guro@fb.com>:
      mm: memcg: fix obsolete code comments
    Patch series "mm: memcg: deprecate cgroup v1 non-hierarchical mode", v1:
      mm: memcg: deprecate the non-hierarchical mode
      docs: cgroup-v1: reflect the deprecation of the non-hierarchical mode
      cgroup: remove obsoleted broken_hierarchy and warned_broken_hierarchy

    Hui Su <sh_def@163.com>:
      mm/page_counter: use page_counter_read in page_counter_set_max

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
      mm: memcg: remove obsolete memcg_has_children()

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcg/slab: rename *_lruvec_slab_state to *_lruvec_kmem_state

    Kaixu Xia <kaixuxia@tencent.com>:
      mm: memcontrol: sssign boolean values to a bool variable

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/memcg: remove incorrect comment

    Shakeel Butt <shakeelb@google.com>:
    Patch series "memcg: add pagetable comsumption to memory.stat", v2:
      mm: move lruvec stats update functions to vmstat.h
      mm: memcontrol: account pagetables per node

Subsystem: mm/pagemap

    Dan Williams <dan.j.williams@intel.com>:
      xen/unpopulated-alloc: consolidate pgmap manipulation

    Kalesh Singh <kaleshsingh@google.com>:
    Patch series "Speed up mremap on large regions", v4:
      kselftests: vm: add mremap tests
      mm: speedup mremap on 1GB or larger regions
      arm64: mremap speedup - enable HAVE_MOVE_PUD
      x86: mremap speedup - Enable HAVE_MOVE_PUD

    John Hubbard <jhubbard@nvidia.com>:
      mm: cleanup: remove unused tsk arg from __access_remote_vm

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/mapping_dirty_helpers: enhance the kernel-doc markups
      mm/page_vma_mapped.c: add colon to fix kernel-doc markups error for check_pte

    Axel Rasmussen <axelrasmussen@google.com>:
      mm: mmap_lock: add tracepoints around lock acquisition

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      sparc: fix handling of page table constructor failure
      mm: move free_unref_page to mm/internal.h

Subsystem: mm/mremap

    Dmitry Safonov <dima@arista.com>:
    Patch series "mremap: move_vma() fixes":
      mm/mremap: account memory on do_munmap() failure
      mm/mremap: for MREMAP_DONTUNMAP check security_vm_enough_memory_mm()
      mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aio
      vm_ops: rename .split() callback to .may_split()
      mremap: check if it's possible to split original vma
      mm: forbid splitting special mappings

Subsystem: mm/hmm

    Daniel Vetter <daniel.vetter@ffwll.ch>:
      mm: track mmu notifiers in fs_reclaim_acquire/release
      mm: extract might_alloc() debug check
      locking/selftests: add testcases for fs_reclaim

Subsystem: mm/vmalloc

    Andrew Morton <akpm@linux-foundation.org>:
      mm/vmalloc.c:__vmalloc_area_node(): avoid 32-bit overflow

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc: use free_vm_area() if an allocation fails
      mm/vmalloc: rework the drain logic

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/vmalloc: add 'align' parameter explanation for pvm_determine_end_from_reverse

    Baolin Wang <baolin.wang@linux.alibaba.com>:
      mm/vmalloc.c: remove unnecessary return statement

    Waiman Long <longman@redhat.com>:
      mm/vmalloc: Fix unlock order in s_stop()

Subsystem: mm/documentation

    Alex Shi <alex.shi@linux.alibaba.com>:
      docs/vm: remove unused 3 items explanation for /proc/vmstat

Subsystem: mm/kasan

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
      mm/vmalloc.c: fix kasan shadow poisoning size

    Walter Wu <walter-zh.wu@mediatek.com>:
    Patch series "kasan: add workqueue stack for generic KASAN", v5:
      workqueue: kasan: record workqueue stack
      kasan: print workqueue stack
      lib/test_kasan.c: add workqueue test case
      kasan: update documentation for generic kasan

    Marco Elver <elver@google.com>:
      lkdtm: disable KASAN for rodata.o

Subsystem: mm/pagealloc

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "arch, mm: deprecate DISCONTIGMEM", v2:
      alpha: switch from DISCONTIGMEM to SPARSEMEM
      ia64: remove custom __early_pfn_to_nid()
      ia64: remove 'ifdef CONFIG_ZONE_DMA32' statements
      ia64: discontig: paging_init(): remove local max_pfn calculation
      ia64: split virtual map initialization out of paging_init()
      ia64: forbid using VIRTUAL_MEM_MAP with FLATMEM
      ia64: make SPARSEMEM default and disable DISCONTIGMEM
      arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
      arm, arm64: move free_unused_memmap() to generic mm
      arc: use FLATMEM with freeing of unused memory map instead of DISCONTIGMEM
      m68k/mm: make node data and node setup depend on CONFIG_DISCONTIGMEM
      m68k/mm: enable use of generic memory_model.h for !DISCONTIGMEM
      m68k: deprecate DISCONTIGMEM
    Patch series "arch, mm: improve robustness of direct map manipulation", v7:
      mm: introduce debug_pagealloc_{map,unmap}_pages() helpers
      PM: hibernate: make direct map manipulations more explicit
      arch, mm: restore dependency of __kernel_map_pages() on DEBUG_PAGEALLOC
      arch, mm: make kernel_page_present() always available

    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "disable pcplists during memory offline", v3:
      mm, page_alloc: clean up pageset high and batch update
      mm, page_alloc: calculate pageset high and batch once per zone
      mm, page_alloc: remove setup_pageset()
      mm, page_alloc: simplify pageset_update()
      mm, page_alloc: cache pageset high and batch in struct zone
      mm, page_alloc: move draining pcplists to page isolation users
      mm, page_alloc: disable pcplists during memory offline

    Miaohe Lin <linmiaohe@huawei.com>:
      include/linux/page-flags.h: remove unused __[Set|Clear]PagePrivate

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/page-flags: fix comment
      mm/page_alloc: add __free_pages() documentation

    Zou Wei <zou_wei@huawei.com>:
      mm/page_alloc: mark some symbols with static keyword

    David Hildenbrand <david@redhat.com>:
      mm/page_alloc: clear all pages in post_alloc_hook() with init_on_alloc=1

    Lin Feng <linf@wangsu.com>:
      init/main: fix broken buffer_init when DEFERRED_STRUCT_PAGE_INIT set

    Lorenzo Stoakes <lstoakes@gmail.com>:
      mm: page_alloc: refactor setup_per_zone_lowmem_reserve()

    Muchun Song <songmuchun@bytedance.com>:
      mm/page_alloc: speed up the iteration of max_order

Subsystem: mm/memory-failure

    Oscar Salvador <osalvador@suse.de>:
    Patch series "HWpoison: further fixes and cleanups", v5:
      mm,hwpoison: drain pcplists before bailing out for non-buddy zero-refcount page
      mm,hwpoison: take free pages off the buddy freelists
      mm,hwpoison: drop unneeded pcplist draining
    Patch series "HWPoison: Refactor get page interface", v2:
      mm,hwpoison: refactor get_any_page
      mm,hwpoison: disable pcplists before grabbing a refcount
      mm,hwpoison: remove drain_all_pages from shake_page
      mm,memory_failure: always pin the page in madvise_inject_error
      mm,hwpoison: return -EBUSY when migration fails

Subsystem: mm/hugetlb

    Hui Su <sh_def@163.com>:
      mm/hugetlb.c: just use put_page_testzero() instead of page_count()

    Ralph Campbell <rcampbell@nvidia.com>:
      include/linux/huge_mm.h: remove extern keyword

    Alex Shi <alex.shi@linux.alibaba.com>:
      khugepaged: add parameter explanations for kernel-doc markup

    Liu Xiang <liu.xiang@zlingsmart.com>:
      mm: hugetlb: fix type of delta parameter and related local variables in gather_surplus_pages()

    Oscar Salvador <osalvador@suse.de>:
      mm,hugetlb: remove unneeded initialization

    Dan Carpenter <dan.carpenter@oracle.com>:
      hugetlb: fix an error code in hugetlb_reserve_pages()

Subsystem: mm/vmscan

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: don't wake kswapd prematurely when watermark boosting is disabled

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
      mm/vmscan: drop unneeded assignment in kswapd()

    "logic.yu" <hymmsx.yu@gmail.com>:
      mm/vmscan.c: remove the filename in the top of file comment

    Muchun Song <songmuchun@bytedance.com>:
      mm/page_isolation: do not isolate the max order page

Subsystem: mm/z3fold

    Vitaly Wool <vitaly.wool@konsulko.com>:
    Patch series "z3fold: stability / rt fixes":
      z3fold: simplify freeing slots
      z3fold: stricter locking and more careful reclaim
      z3fold: remove preempt disabled sections for RT

Subsystem: mm/compaction

    Yanfei Xu <yanfei.xu@windriver.com>:
      mm/compaction: rename 'start_pfn' to 'iteration_start_pfn' in compact_zone()

    Hui Su <sh_def@163.com>:
      mm/compaction: move compaction_suitable's comment to right place
      mm/compaction: make defer_compaction and compaction_deferred static

Subsystem: mm/oom-kill

    Hui Su <sh_def@163.com>:
      mm/oom_kill: change comment and rename is_dump_unreclaim_slabs()

Subsystem: mm/migration

    Long Li <lonuxli.64@gmail.com>:
      mm/migrate.c: fix comment spelling

    Ralph Campbell <rcampbell@nvidia.com>:
      mm/migrate.c: optimize migrate_vma_pages() mmu notifier

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: support THPs in zero_user_segments

    Yang Shi <shy828301@gmail.com>:
    Patch series "mm: misc migrate cleanup and improvement", v3:
      mm: truncate_complete_page() does not exist any more
      mm: migrate: simplify the logic for handling permanent failure
      mm: migrate: skip shared exec THP for NUMA balancing
      mm: migrate: clean up migrate_prep{_local}
      mm: migrate: return -ENOSYS if THP migration is unsupported

    Stephen Zhang <starzhangzsd@gmail.com>:
      mm: migrate: remove unused parameter in migrate_vma_insert_page()

Subsystem: mm/cma

    Lecopzer Chen <lecopzer.chen@mediatek.com>:
      mm/cma.c: remove redundant cma_mutex lock

    Charan Teja Reddy <charante@codeaurora.org>:
      mm: cma: improve pr_debug log in cma_release()

Subsystem: mm/page-poison

    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "cleanup page poisoning", v3:
      mm, page_alloc: do not rely on the order of page_poison and init_on_alloc/free parameters
      mm, page_poison: use static key more efficiently
      kernel/power: allow hibernation with page_poison sanity checking
      mm, page_poison: remove CONFIG_PAGE_POISONING_NO_SANITY
      mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO

Subsystem: mm/userfaultfd

    Lokesh Gidra <lokeshgidra@google.com>:
    Patch series "Control over userfaultfd kernel-fault handling", v6:
      userfaultfd: add UFFD_USER_MODE_ONLY
      userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob

    Axel Rasmussen <axelrasmussen@google.com>:
      userfaultfd: selftests: make __{s,u}64 format specifiers portable

    Peter Xu <peterx@redhat.com>:
    Patch series "userfaultfd: selftests: Small fixes":
      userfaultfd/selftests: always dump something in modes
      userfaultfd/selftests: fix retval check for userfaultfd_open()
      userfaultfd/selftests: hint the test runner on required privilege

Subsystem: mm/zswap

    Joe Perches <joe@perches.com>:
      mm/zswap: make struct kernel_param_ops definitions const

    YueHaibing <yuehaibing@huawei.com>:
      mm/zswap: fix passing zero to 'PTR_ERR' warning

    Barry Song <song.bao.hua@hisilicon.com>:
      mm/zswap: move to use crypto_acomp API for hardware acceleration

Subsystem: mm/zsmalloc

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/zsmalloc.c: rework the list_add code in insert_zspage()

Subsystem: mm/uaccess

    Colin Ian King <colin.king@canonical.com>:
      mm/process_vm_access: remove redundant initialization of iov_r

Subsystem: mm/zram

    Minchan Kim <minchan@kernel.org>:
      zram: support page writeback
      zram: add stat to gather incompressible pages since zram set up

    Rui Salvaterra <rsalvaterra@gmail.com>:
      zram: break the strict dependency from lzo

Subsystem: mm/cleanups

    Mauro Carvalho Chehab <mchehab+huawei@kernel.org>:
      mm: fix kernel-doc markups

    Joe Perches <joe@perches.com>:
    Patch series "mm: Convert sysfs sprintf family to sysfs_emit", v2:
      mm: use sysfs_emit for struct kobject * uses
      mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening
      mm:backing-dev: use sysfs_emit in macro defining functions
      mm: shmem: convert shmem_enabled_show to use sysfs_emit_at
      mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at

    "Gustavo A. R. Silva" <gustavoars@kernel.org>:
      mm: fix fall-through warnings for Clang

    Alexey Dobriyan <adobriyan@gmail.com>:
      mm: cleanup kstrto*() usage

 /mmap_lock.h                                         |  107 ++
 a/Documentation/admin-guide/blockdev/zram.rst        |    6 
 a/Documentation/admin-guide/cgroup-v1/memcg_test.rst |    8 
 a/Documentation/admin-guide/cgroup-v1/memory.rst     |   42 
 a/Documentation/admin-guide/cgroup-v2.rst            |   11 
 a/Documentation/admin-guide/mm/transhuge.rst         |   15 
 a/Documentation/admin-guide/sysctl/vm.rst            |   15 
 a/Documentation/core-api/memory-allocation.rst       |    4 
 a/Documentation/core-api/pin_user_pages.rst          |    8 
 a/Documentation/dev-tools/kasan.rst                  |    5 
 a/Documentation/filesystems/tmpfs.rst                |    8 
 a/Documentation/vm/memory-model.rst                  |    3 
 a/Documentation/vm/page_owner.rst                    |   12 
 a/arch/Kconfig                                       |   21 
 a/arch/alpha/Kconfig                                 |    8 
 a/arch/alpha/include/asm/mmzone.h                    |   14 
 a/arch/alpha/include/asm/page.h                      |    7 
 a/arch/alpha/include/asm/pgtable.h                   |   12 
 a/arch/alpha/include/asm/sparsemem.h                 |   18 
 a/arch/alpha/kernel/setup.c                          |    1 
 a/arch/arc/Kconfig                                   |    3 
 a/arch/arc/include/asm/page.h                        |   20 
 a/arch/arc/mm/init.c                                 |   29 
 a/arch/arm/Kconfig                                   |   12 
 a/arch/arm/kernel/vdso.c                             |    9 
 a/arch/arm/mach-bcm/Kconfig                          |    1 
 a/arch/arm/mach-davinci/Kconfig                      |    1 
 a/arch/arm/mach-exynos/Kconfig                       |    1 
 a/arch/arm/mach-highbank/Kconfig                     |    1 
 a/arch/arm/mach-omap2/Kconfig                        |    1 
 a/arch/arm/mach-s5pv210/Kconfig                      |    1 
 a/arch/arm/mach-tango/Kconfig                        |    1 
 a/arch/arm/mm/init.c                                 |   78 -
 a/arch/arm64/Kconfig                                 |    9 
 a/arch/arm64/include/asm/cacheflush.h                |    1 
 a/arch/arm64/include/asm/pgtable.h                   |    1 
 a/arch/arm64/kernel/vdso.c                           |   41 
 a/arch/arm64/mm/init.c                               |   68 -
 a/arch/arm64/mm/pageattr.c                           |   12 
 a/arch/ia64/Kconfig                                  |   11 
 a/arch/ia64/include/asm/meminit.h                    |    2 
 a/arch/ia64/mm/contig.c                              |   88 --
 a/arch/ia64/mm/discontig.c                           |   44 -
 a/arch/ia64/mm/init.c                                |   14 
 a/arch/ia64/mm/numa.c                                |   30 
 a/arch/m68k/Kconfig.cpu                              |   31 
 a/arch/m68k/include/asm/page.h                       |    2 
 a/arch/m68k/include/asm/page_mm.h                    |    7 
 a/arch/m68k/include/asm/virtconvert.h                |    7 
 a/arch/m68k/mm/init.c                                |   10 
 a/arch/mips/vdso/genvdso.c                           |    4 
 a/arch/nds32/mm/mm-nds32.c                           |    6 
 a/arch/powerpc/Kconfig                               |    5 
 a/arch/riscv/Kconfig                                 |    4 
 a/arch/riscv/include/asm/pgtable.h                   |    2 
 a/arch/riscv/include/asm/set_memory.h                |    1 
 a/arch/riscv/mm/pageattr.c                           |   31 
 a/arch/s390/Kconfig                                  |    4 
 a/arch/s390/configs/debug_defconfig                  |    2 
 a/arch/s390/configs/defconfig                        |    2 
 a/arch/s390/kernel/vdso.c                            |   11 
 a/arch/sparc/Kconfig                                 |    4 
 a/arch/sparc/mm/init_64.c                            |    2 
 a/arch/x86/Kconfig                                   |    5 
 a/arch/x86/entry/vdso/vma.c                          |   17 
 a/arch/x86/include/asm/set_memory.h                  |    1 
 a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c          |    2 
 a/arch/x86/kernel/tboot.c                            |    1 
 a/arch/x86/mm/pat/set_memory.c                       |    6 
 a/drivers/base/node.c                                |    2 
 a/drivers/block/zram/Kconfig                         |   42 
 a/drivers/block/zram/zcomp.c                         |    2 
 a/drivers/block/zram/zram_drv.c                      |   29 
 a/drivers/block/zram/zram_drv.h                      |    1 
 a/drivers/dax/device.c                               |    4 
 a/drivers/dax/kmem.c                                 |    2 
 a/drivers/dma-buf/sync_file.c                        |    3 
 a/drivers/edac/ghes_edac.c                           |    4 
 a/drivers/firmware/efi/efi.c                         |    1 
 a/drivers/gpu/drm/drm_atomic.c                       |    3 
 a/drivers/hwtracing/intel_th/msu.c                   |    2 
 a/drivers/ide/falconide.c                            |    2 
 a/drivers/ide/ide-probe.c                            |    3 
 a/drivers/misc/lkdtm/Makefile                        |    1 
 a/drivers/pinctrl/pinctrl-utils.c                    |    2 
 a/drivers/vhost/vringh.c                             |    3 
 a/drivers/virtio/virtio_balloon.c                    |    6 
 a/drivers/xen/unpopulated-alloc.c                    |   14 
 a/fs/aio.c                                           |    5 
 a/fs/ntfs/file.c                                     |    5 
 a/fs/ntfs/inode.c                                    |    2 
 a/fs/ntfs/logfile.c                                  |    3 
 a/fs/ocfs2/cluster/tcp.c                             |    1 
 a/fs/ocfs2/namei.c                                   |    4 
 a/fs/proc/kcore.c                                    |    2 
 a/fs/proc/meminfo.c                                  |    2 
 a/fs/userfaultfd.c                                   |   20 
 a/include/linux/cgroup-defs.h                        |   15 
 a/include/linux/compaction.h                         |   12 
 a/include/linux/fs.h                                 |    2 
 a/include/linux/gfp.h                                |    2 
 a/include/linux/highmem.h                            |   19 
 a/include/linux/huge_mm.h                            |   93 --
 a/include/linux/memcontrol.h                         |  148 ---
 a/include/linux/migrate.h                            |    4 
 a/include/linux/mm.h                                 |  118 +-
 a/include/linux/mm_types.h                           |    8 
 a/include/linux/mmap_lock.h                          |   94 ++
 a/include/linux/mmzone.h                             |   50 -
 a/include/linux/page-flags.h                         |    6 
 a/include/linux/page_ext.h                           |    8 
 a/include/linux/pagevec.h                            |    3 
 a/include/linux/poison.h                             |    4 
 a/include/linux/rmap.h                               |    1 
 a/include/linux/sched/mm.h                           |   16 
 a/include/linux/set_memory.h                         |    5 
 a/include/linux/shmem_fs.h                           |    6 
 a/include/linux/slab.h                               |   18 
 a/include/linux/vmalloc.h                            |    8 
 a/include/linux/vmstat.h                             |  104 ++
 a/include/trace/events/sched.h                       |   84 +
 a/include/uapi/linux/const.h                         |    5 
 a/include/uapi/linux/ethtool.h                       |    2 
 a/include/uapi/linux/kernel.h                        |    9 
 a/include/uapi/linux/lightnvm.h                      |    2 
 a/include/uapi/linux/mroute6.h                       |    2 
 a/include/uapi/linux/netfilter/x_tables.h            |    2 
 a/include/uapi/linux/netlink.h                       |    2 
 a/include/uapi/linux/sysctl.h                        |    2 
 a/include/uapi/linux/userfaultfd.h                   |    9 
 a/init/main.c                                        |    6 
 a/ipc/shm.c                                          |    8 
 a/kernel/cgroup/cgroup.c                             |   12 
 a/kernel/fork.c                                      |    3 
 a/kernel/kthread.c                                   |   29 
 a/kernel/power/hibernate.c                           |    2 
 a/kernel/power/power.h                               |    2 
 a/kernel/power/snapshot.c                            |   52 +
 a/kernel/ptrace.c                                    |    2 
 a/kernel/workqueue.c                                 |    3 
 a/lib/locking-selftest.c                             |   47 +
 a/lib/test_kasan_module.c                            |   29 
 a/mm/Kconfig                                         |   25 
 a/mm/Kconfig.debug                                   |   28 
 a/mm/Makefile                                        |    4 
 a/mm/backing-dev.c                                   |    8 
 a/mm/cma.c                                           |    6 
 a/mm/compaction.c                                    |   29 
 a/mm/filemap.c                                       |  823 ++++++++++---------
 a/mm/gup.c                                           |  329 ++-----
 a/mm/gup_benchmark.c                                 |  210 ----
 a/mm/gup_test.c                                      |  299 ++++++
 a/mm/gup_test.h                                      |   40 
 a/mm/highmem.c                                       |   52 +
 a/mm/huge_memory.c                                   |   86 +
 a/mm/hugetlb.c                                       |   28 
 a/mm/init-mm.c                                       |    1 
 a/mm/internal.h                                      |    5 
 a/mm/kasan/generic.c                                 |    3 
 a/mm/kasan/report.c                                  |    4 
 a/mm/khugepaged.c                                    |   58 -
 a/mm/ksm.c                                           |   50 -
 a/mm/madvise.c                                       |   14 
 a/mm/mapping_dirty_helpers.c                         |    6 
 a/mm/memblock.c                                      |   80 +
 a/mm/memcontrol.c                                    |  170 +--
 a/mm/memory-failure.c                                |  322 +++----
 a/mm/memory.c                                        |   24 
 a/mm/memory_hotplug.c                                |   44 -
 a/mm/mempolicy.c                                     |    8 
 a/mm/migrate.c                                       |  183 ++--
 a/mm/mm_init.c                                       |    1 
 a/mm/mmap.c                                          |   22 
 a/mm/mmap_lock.c                                     |  230 +++++
 a/mm/mmu_notifier.c                                  |    7 
 a/mm/mmzone.c                                        |   14 
 a/mm/mremap.c                                        |  282 ++++--
 a/mm/nommu.c                                         |    8 
 a/mm/oom_kill.c                                      |   14 
 a/mm/page_alloc.c                                    |  517 ++++++-----
 a/mm/page_counter.c                                  |    4 
 a/mm/page_ext.c                                      |   10 
 a/mm/page_isolation.c                                |   18 
 a/mm/page_owner.c                                    |   17 
 a/mm/page_poison.c                                   |   56 -
 a/mm/page_vma_mapped.c                               |    9 
 a/mm/process_vm_access.c                             |    2 
 a/mm/rmap.c                                          |    9 
 a/mm/shmem.c                                         |   39 
 a/mm/slab.c                                          |   10 
 a/mm/slab.h                                          |    9 
 a/mm/slab_common.c                                   |   10 
 a/mm/slob.c                                          |    6 
 a/mm/slub.c                                          |  156 +--
 a/mm/swap.c                                          |   12 
 a/mm/swap_state.c                                    |    7 
 a/mm/swapfile.c                                      |   14 
 a/mm/truncate.c                                      |   18 
 a/mm/vmalloc.c                                       |  105 +-
 a/mm/vmscan.c                                        |   21 
 a/mm/vmstat.c                                        |    6 
 a/mm/workingset.c                                    |    8 
 a/mm/z3fold.c                                        |  215 ++--
 a/mm/zsmalloc.c                                      |   11 
 a/mm/zswap.c                                         |  193 +++-
 a/sound/core/pcm_lib.c                               |    4 
 a/tools/include/linux/poison.h                       |    6 
 a/tools/testing/selftests/vm/.gitignore              |    4 
 a/tools/testing/selftests/vm/Makefile                |   41 
 a/tools/testing/selftests/vm/check_config.sh         |   31 
 a/tools/testing/selftests/vm/config                  |    2 
 a/tools/testing/selftests/vm/gup_benchmark.c         |  143 ---
 a/tools/testing/selftests/vm/gup_test.c              |  258 +++++
 a/tools/testing/selftests/vm/hmm-tests.c             |   10 
 a/tools/testing/selftests/vm/mremap_test.c           |  344 +++++++
 a/tools/testing/selftests/vm/run_vmtests             |   51 -
 a/tools/testing/selftests/vm/userfaultfd.c           |   94 --
 217 files changed, 4817 insertions(+), 3369 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-12-11 21:35 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-12-11 21:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

8 patches, based on 33dc9614dc208291d0c4bcdeb5d30d481dcd2c4c.

Subsystems affected by this patch series:

  mm/pagecache
  proc
  selftests
  kbuild
  mm/kasan
  mm/hugetlb

Subsystem: mm/pagecache

    Andrew Morton <akpm@linux-foundation.org>:
      revert "mm/filemap: add static for function __add_to_page_cache_locked"

Subsystem: proc

    Miles Chen <miles.chen@mediatek.com>:
      proc: use untagged_addr() for pagemap_read addresses

Subsystem: selftests

    Arnd Bergmann <arnd@arndb.de>:
      selftest/fpu: avoid clang warning

Subsystem: kbuild

    Arnd Bergmann <arnd@arndb.de>:
      kbuild: avoid static_assert for genksyms
      initramfs: fix clang build failure
      elfcore: fix building with clang

Subsystem: mm/kasan

    Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>:
      kasan: fix object remaining in offline per-cpu quarantine

Subsystem: mm/hugetlb

    Gerald Schaefer <gerald.schaefer@linux.ibm.com>:
      mm/hugetlb: clear compound_nr before freeing gigantic pages

 fs/proc/task_mmu.c        |    8 ++++++--
 include/linux/build_bug.h |    5 +++++
 include/linux/elfcore.h   |   22 ++++++++++++++++++++++
 init/initramfs.c          |    2 +-
 kernel/Makefile           |    1 -
 kernel/elfcore.c          |   26 --------------------------
 lib/Makefile              |    3 ++-
 mm/filemap.c              |    2 +-
 mm/hugetlb.c              |    1 +
 mm/kasan/quarantine.c     |   39 +++++++++++++++++++++++++++++++++++++++
 10 files changed, 77 insertions(+), 32 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-12-06  6:14 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-12-06  6:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

12 patches, based on 33256ce194110874d4bc90078b577c59f9076c59.

Subsystems affected by this patch series:

  lib
  coredump
  mm/memcg
  mm/zsmalloc
  mm/swap
  mailmap
  mm/selftests
  mm/pagecache
  mm/hugetlb
  mm/pagemap

Subsystem: lib

    Randy Dunlap <rdunlap@infradead.org>:
      zlib: export S390 symbols for zlib modules

Subsystem: coredump

    Menglong Dong <dong.menglong@zte.com.cn>:
      coredump: fix core_pattern parse error

Subsystem: mm/memcg

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: fix obj_cgroup_charge() return value handling

    Yang Shi <shy828301@gmail.com>:
      mm: list_lru: set shrinker map bit when child nr_items is not zero

Subsystem: mm/zsmalloc

    Minchan Kim <minchan@kernel.org>:
      mm/zsmalloc.c: drop ZSMALLOC_PGTABLE_MAPPING

Subsystem: mm/swap

    Qian Cai <qcai@redhat.com>:
      mm/swapfile: do not sleep with a spin lock held

Subsystem: mailmap

    Uwe Kleine-König <u.kleine-koenig@pengutronix.de>:
      mailmap: add two more addresses of Uwe Kleine-König

Subsystem: mm/selftests

    Xingxing Su <suxingxing@loongson.cn>:
      tools/testing/selftests/vm: fix build error

    Axel Rasmussen <axelrasmussen@google.com>:
      userfaultfd: selftests: fix SIGSEGV if huge mmap fails

Subsystem: mm/pagecache

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/filemap: add static for function __add_to_page_cache_locked

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlb_cgroup: fix offline of hugetlb cgroup with reservations

Subsystem: mm/pagemap

    Liu Zixian <liuzixian4@huawei.com>:
      mm/mmap.c: fix mmap return value when vma is merged after call_mmap()

 .mailmap                                 |    2 +
 arch/arm/configs/omap2plus_defconfig     |    1 
 fs/coredump.c                            |    3 +
 include/linux/zsmalloc.h                 |    1 
 lib/zlib_dfltcc/dfltcc_inflate.c         |    3 +
 mm/Kconfig                               |   13 -------
 mm/filemap.c                             |    2 -
 mm/hugetlb_cgroup.c                      |    8 +---
 mm/list_lru.c                            |   10 ++---
 mm/mmap.c                                |   26 ++++++--------
 mm/slab.h                                |   40 +++++++++++++---------
 mm/swapfile.c                            |    4 +-
 mm/zsmalloc.c                            |   54 -------------------------------
 tools/testing/selftests/vm/Makefile      |    4 ++
 tools/testing/selftests/vm/userfaultfd.c |   25 +++++++++-----
 15 files changed, 75 insertions(+), 121 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-11-22  6:16 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-11-22  6:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

8 patches, based on a349e4c659609fd20e4beea89e5c4a4038e33a95.

Subsystems affected by this patch series:

  mm/madvise
  kbuild
  mm/pagemap
  mm/readahead
  mm/memcg
  mm/userfaultfd
  vfs-akpm
  mm/madvise

Subsystem: mm/madvise

    Eric Dumazet <edumazet@google.com>:
      mm/madvise: fix memory leak from process_madvise

Subsystem: kbuild

    Nick Desaulniers <ndesaulniers@google.com>:
      compiler-clang: remove version check for BPF Tracing

Subsystem: mm/pagemap

    Dan Williams <dan.j.williams@intel.com>:
      mm: fix phys_to_target_node() and memory_add_physaddr_to_nid() exports

Subsystem: mm/readahead

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: fix readahead_page_batch for retry entries

Subsystem: mm/memcg

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcg/slab: fix root memcg vmstats

Subsystem: mm/userfaultfd

    Gerald Schaefer <gerald.schaefer@linux.ibm.com>:
      mm/userfaultfd: do not access vma->vm_mm after calling handle_userfault()

Subsystem: vfs-akpm

    Yicong Yang <yangyicong@hisilicon.com>:
      libfs: fix error cast of negative value in simple_attr_write()

Subsystem: mm/madvise

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: fix madvise WILLNEED performance problem

 arch/ia64/include/asm/sparsemem.h    |    6 ++++++
 arch/powerpc/include/asm/mmzone.h    |    5 +++++
 arch/powerpc/include/asm/sparsemem.h |    5 ++---
 arch/powerpc/mm/mem.c                |    1 +
 arch/x86/include/asm/sparsemem.h     |   10 ++++++++++
 arch/x86/mm/numa.c                   |    2 ++
 drivers/dax/Kconfig                  |    1 -
 fs/libfs.c                           |    6 ++++--
 include/linux/compiler-clang.h       |    2 ++
 include/linux/memory_hotplug.h       |   14 --------------
 include/linux/numa.h                 |   30 +++++++++++++++++++++++++++++-
 include/linux/pagemap.h              |    2 ++
 mm/huge_memory.c                     |    9 ++++-----
 mm/madvise.c                         |    4 +---
 mm/memcontrol.c                      |    9 +++++++--
 mm/memory_hotplug.c                  |   18 ------------------
 16 files changed, 75 insertions(+), 49 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-11-14  6:51 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-11-14  6:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

14 patches, based on 9e6a39eae450b81c8b2c8cbbfbdf8218e9b40c81.

Subsystems affected by this patch series:

  mm/migration
  mm/vmscan
  mailmap
  mm/slub
  mm/gup
  kbuild
  reboot
  kernel/watchdog
  mm/memcg
  mm/hugetlbfs
  panic
  ocfs2

Subsystem: mm/migration

    Zi Yan <ziy@nvidia.com>:
      mm/compaction: count pages and stop correctly during page isolation
      mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate

Subsystem: mm/vmscan

    Nicholas Piggin <npiggin@gmail.com>:
      mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit

Subsystem: mailmap

    Dmitry Baryshkov <dbaryshkov@gmail.com>:
      mailmap: fix entry for Dmitry Baryshkov/Eremin-Solenikov

Subsystem: mm/slub

    Laurent Dufour <ldufour@linux.ibm.com>:
      mm/slub: fix panic in slab_alloc_node()

Subsystem: mm/gup

    Jason Gunthorpe <jgg@nvidia.com>:
      mm/gup: use unpin_user_pages() in __gup_longterm_locked()

Subsystem: kbuild

    Arvind Sankar <nivedita@alum.mit.edu>:
      compiler.h: fix barrier_data() on clang

Subsystem: reboot

    Matteo Croce <mcroce@microsoft.com>:
    Patch series "fix parsing of reboot= cmdline", v3:
      Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint"
      reboot: fix overflow parsing reboot cpu number

Subsystem: kernel/watchdog

    Santosh Sivaraj <santosh@fossix.org>:
      kernel/watchdog: fix watchdog_allowed_mask not used warning

Subsystem: mm/memcg

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: fix missing wakeup polling thread

Subsystem: mm/hugetlbfs

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlbfs: fix anon huge page migration race

Subsystem: panic

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      panic: don't dump stack twice on warn

Subsystem: ocfs2

    Wengang Wang <wen.gang.wang@oracle.com>:
      ocfs2: initialize ip_next_orphan

 .mailmap                       |    5 +-
 fs/ocfs2/super.c               |    1 
 include/asm-generic/barrier.h  |    1 
 include/linux/compiler-clang.h |    6 --
 include/linux/compiler-gcc.h   |   19 --------
 include/linux/compiler.h       |   18 +++++++-
 include/linux/memcontrol.h     |   11 ++++-
 kernel/panic.c                 |    3 -
 kernel/reboot.c                |   28 ++++++------
 kernel/watchdog.c              |    4 -
 mm/compaction.c                |   12 +++--
 mm/gup.c                       |   14 ++++--
 mm/hugetlb.c                   |   90 ++---------------------------------------
 mm/memory-failure.c            |   36 +++++++---------
 mm/migrate.c                   |   46 +++++++++++---------
 mm/rmap.c                      |    5 --
 mm/slub.c                      |    2 
 mm/vmscan.c                    |    5 +-
 18 files changed, 119 insertions(+), 187 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-11-02  1:06 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-11-02  1:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

15 patches, based on 3cea11cd5e3b00d91caf0b4730194039b45c5891.

Subsystems affected by this patch series:

  mm/memremap
  mm/memcg
  mm/slab-generic
  mm/kasan
  mm/mempolicy
  signals
  lib
  mm/pagecache
  kthread
  mm/oom-kill
  mm/pagemap
  epoll
  core-kernel

Subsystem: mm/memremap

    Ralph Campbell <rcampbell@nvidia.com>:
      mm/mremap_pages: fix static key devmap_managed_key updates

Subsystem: mm/memcg

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlb_cgroup: fix reservation accounting

    zhongjiang-ali <zhongjiang-ali@linux.alibaba.com>:
      mm: memcontrol: correct the NR_ANON_THPS counter of hierarchical memcg

    Roman Gushchin <guro@fb.com>:
      mm: memcg: link page counters to root if use_hierarchy is false

Subsystem: mm/slab-generic

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: adopt KUNIT tests to SW_TAGS mode

Subsystem: mm/mempolicy

    Shijie Luo <luoshijie1@huawei.com>:
      mm: mempolicy: fix potential pte_unmap_unlock pte error

Subsystem: signals

    Oleg Nesterov <oleg@redhat.com>:
      ptrace: fix task_join_group_stop() for the case when current is traced

Subsystem: lib

    Vasily Gorbik <gor@linux.ibm.com>:
      lib/crc32test: remove extra local_irq_disable/enable

Subsystem: mm/pagecache

    Jason Yan <yanaijie@huawei.com>:
      mm/truncate.c: make __invalidate_mapping_pages() static

Subsystem: kthread

    Zqiang <qiang.zhang@windriver.com>:
      kthread_worker: prevent queuing delayed work from timer_fn when it is being canceled

Subsystem: mm/oom-kill

    Charles Haithcock <chaithco@redhat.com>:
      mm, oom: keep oom_adj under or at upper limit when printing

Subsystem: mm/pagemap

    Jason Gunthorpe <jgg@nvidia.com>:
      mm: always have io_remap_pfn_range() set pgprot_decrypted()

Subsystem: epoll

    Soheil Hassas Yeganeh <soheil@google.com>:
      epoll: check ep_events_available() upon timeout
      epoll: add a selftest for epoll timeout race

Subsystem: core-kernel

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
      kernel/hung_task.c: make type annotations consistent

 fs/eventpoll.c                                                |   16 +
 fs/proc/base.c                                                |    2 
 include/linux/mm.h                                            |    9 
 include/linux/pgtable.h                                       |    4 
 kernel/hung_task.c                                            |    3 
 kernel/kthread.c                                              |    3 
 kernel/signal.c                                               |   19 -
 lib/crc32test.c                                               |    4 
 lib/test_kasan.c                                              |  149 +++++++---
 mm/hugetlb.c                                                  |   20 -
 mm/memcontrol.c                                               |   25 +
 mm/mempolicy.c                                                |    6 
 mm/memremap.c                                                 |   39 +-
 mm/truncate.c                                                 |    2 
 tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c |   95 ++++++
 15 files changed, 290 insertions(+), 106 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-10-17 23:13 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-10-17 23:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


40 patches, based on 9d9af1007bc08971953ae915d88dc9bb21344b53.

Subsystems affected by this patch series:

  ia64
  mm/memcg
  mm/migration
  mm/pagemap
  mm/gup
  mm/madvise
  mm/vmalloc
  misc

Subsystem: ia64

    Krzysztof Kozlowski <krzk@kernel.org>:
      ia64: fix build error with !COREDUMP

Subsystem: mm/memcg

    Roman Gushchin <guro@fb.com>:
      mm, memcg: rework remote charging API to support nesting
    Patch series "mm: kmem: kernel memory accounting in an interrupt context":
      mm: kmem: move memcg_kmem_bypass() calls to get_mem/obj_cgroup_from_current()
      mm: kmem: remove redundant checks from get_obj_cgroup_from_current()
      mm: kmem: prepare remote memcg charging infra for interrupt contexts
      mm: kmem: enable kernel memcg accounting from interrupt contexts

Subsystem: mm/migration

    Joonsoo Kim <iamjoonsoo.kim@lge.com>:
      mm/memory-failure: remove a wrapper for alloc_migration_target()
      mm/memory_hotplug: remove a wrapper for alloc_migration_target()

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/migrate: avoid possible unnecessary process right check in kernel_move_pages()

Subsystem: mm/pagemap

    "Liam R. Howlett" <Liam.Howlett@Oracle.com>:
      mm/mmap: add inline vma_next() for readability of mmap code
      mm/mmap: add inline munmap_vma_range() for code readability

Subsystem: mm/gup

    Jann Horn <jannh@google.com>:
      mm/gup_benchmark: take the mmap lock around GUP
      binfmt_elf: take the mmap lock around find_extend_vma()
      mm/gup: assert that the mmap lock is held in __get_user_pages()

    John Hubbard <jhubbard@nvidia.com>:
    Patch series "selftests/vm: gup_test, hmm-tests, assorted improvements", v2:
      mm/gup_benchmark: rename to mm/gup_test
      selftests/vm: use a common gup_test.h
      selftests/vm: rename run_vmtests --> run_vmtests.sh
      selftests/vm: minor cleanup: Makefile and gup_test.c
      selftests/vm: only some gup_test items are really benchmarks
      selftests/vm: gup_test: introduce the dump_pages() sub-test
      selftests/vm: run_vmtests.sh: update and clean up gup_test invocation
      selftests/vm: hmm-tests: remove the libhugetlbfs dependency
      selftests/vm: 10x speedup for hmm-tests

Subsystem: mm/madvise

    Minchan Kim <minchan@kernel.org>:
    Patch series "introduce memory hinting API for external process", v9:
      mm/madvise: pass mm to do_madvise
      pid: move pidfd_get_pid() to pid.c
      mm/madvise: introduce process_madvise() syscall: an external memory hinting API

Subsystem: mm/vmalloc

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "remove alloc_vm_area", v4:
      mm: update the documentation for vfree

    Christoph Hellwig <hch@lst.de>:
      mm: add a VM_MAP_PUT_PAGES flag for vmap
      mm: add a vmap_pfn function
      mm: allow a NULL fn callback in apply_to_page_range
      zsmalloc: switch from alloc_vm_area to get_vm_area
      drm/i915: use vmap in shmem_pin_map
      drm/i915: stop using kmap in i915_gem_object_map
      drm/i915: use vmap in i915_gem_object_map
      xen/xenbus: use apply_to_page_range directly in xenbus_map_ring_pv
      x86/xen: open code alloc_vm_area in arch_gnttab_valloc
      mm: remove alloc_vm_area
    Patch series "two small vmalloc cleanups":
      mm: cleanup the gfp_mask handling in __vmalloc_area_node
      mm: remove the filename in the top of file comment in vmalloc.c

Subsystem: misc

    Tian Tao <tiantao6@hisilicon.com>:
      mm: remove duplicate include statement in mmu.c

 Documentation/core-api/pin_user_pages.rst   |    8 
 arch/alpha/kernel/syscalls/syscall.tbl      |    1 
 arch/arm/mm/mmu.c                           |    1 
 arch/arm/tools/syscall.tbl                  |    1 
 arch/arm64/include/asm/unistd.h             |    2 
 arch/arm64/include/asm/unistd32.h           |    2 
 arch/ia64/kernel/Makefile                   |    2 
 arch/ia64/kernel/syscalls/syscall.tbl       |    1 
 arch/m68k/kernel/syscalls/syscall.tbl       |    1 
 arch/microblaze/kernel/syscalls/syscall.tbl |    1 
 arch/mips/kernel/syscalls/syscall_n32.tbl   |    1 
 arch/mips/kernel/syscalls/syscall_n64.tbl   |    1 
 arch/mips/kernel/syscalls/syscall_o32.tbl   |    1 
 arch/parisc/kernel/syscalls/syscall.tbl     |    1 
 arch/powerpc/kernel/syscalls/syscall.tbl    |    1 
 arch/s390/configs/debug_defconfig           |    2 
 arch/s390/configs/defconfig                 |    2 
 arch/s390/kernel/syscalls/syscall.tbl       |    1 
 arch/sh/kernel/syscalls/syscall.tbl         |    1 
 arch/sparc/kernel/syscalls/syscall.tbl      |    1 
 arch/x86/entry/syscalls/syscall_32.tbl      |    1 
 arch/x86/entry/syscalls/syscall_64.tbl      |    1 
 arch/x86/xen/grant-table.c                  |   27 +-
 arch/xtensa/kernel/syscalls/syscall.tbl     |    1 
 drivers/gpu/drm/i915/Kconfig                |    1 
 drivers/gpu/drm/i915/gem/i915_gem_pages.c   |  136 ++++------
 drivers/gpu/drm/i915/gt/shmem_utils.c       |   78 +-----
 drivers/xen/xenbus/xenbus_client.c          |   30 +-
 fs/binfmt_elf.c                             |    3 
 fs/buffer.c                                 |    6 
 fs/io_uring.c                               |    2 
 fs/notify/fanotify/fanotify.c               |    5 
 fs/notify/inotify/inotify_fsnotify.c        |    5 
 include/linux/memcontrol.h                  |   12 
 include/linux/mm.h                          |    2 
 include/linux/pid.h                         |    1 
 include/linux/sched/mm.h                    |   43 +--
 include/linux/syscalls.h                    |    2 
 include/linux/vmalloc.h                     |    7 
 include/uapi/asm-generic/unistd.h           |    4 
 kernel/exit.c                               |   19 -
 kernel/pid.c                                |   19 +
 kernel/sys_ni.c                             |    1 
 mm/Kconfig                                  |   24 +
 mm/Makefile                                 |    2 
 mm/gup.c                                    |    2 
 mm/gup_benchmark.c                          |  225 ------------------
 mm/gup_test.c                               |  295 +++++++++++++++++++++--
 mm/gup_test.h                               |   40 ++-
 mm/madvise.c                                |  125 ++++++++--
 mm/memcontrol.c                             |   83 ++++--
 mm/memory-failure.c                         |   18 -
 mm/memory.c                                 |   16 -
 mm/memory_hotplug.c                         |   46 +--
 mm/migrate.c                                |   71 +++--
 mm/mmap.c                                   |   74 ++++-
 mm/nommu.c                                  |    7 
 mm/percpu.c                                 |    3 
 mm/slab.h                                   |    3 
 mm/vmalloc.c                                |  147 +++++------
 mm/zsmalloc.c                               |   10 
 tools/testing/selftests/vm/.gitignore       |    3 
 tools/testing/selftests/vm/Makefile         |   40 ++-
 tools/testing/selftests/vm/check_config.sh  |   31 ++
 tools/testing/selftests/vm/config           |    2 
 tools/testing/selftests/vm/gup_benchmark.c  |  143 -----------
 tools/testing/selftests/vm/gup_test.c       |  260 ++++++++++++++++++--
 tools/testing/selftests/vm/hmm-tests.c      |   12 
 tools/testing/selftests/vm/run_vmtests      |  334 --------------------------
 tools/testing/selftests/vm/run_vmtests.sh   |  350 +++++++++++++++++++++++++++-
 70 files changed, 1580 insertions(+), 1224 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-10-16  2:40 incoming Andrew Morton
@ 2020-10-16  3:03 ` Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-10-16  3:03 UTC (permalink / raw)
  To: Linus Torvalds, mm-commits, linux-mm

And... I forgot to set in-reply-to :(

Shall resend, omitting linux-mm.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-10-16  2:40 Andrew Morton
  2020-10-16  3:03 ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-10-16  2:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- most of the rest of mm/

- various other subsystems

156 patches, based on 578a7155c5a1894a789d4ece181abf9d25dc6b0d.

Subsystems affected by this patch series:

  mm/dax
  mm/debug
  mm/thp
  mm/readahead
  mm/page-poison
  mm/util
  mm/memory-hotplug
  mm/zram
  mm/cleanups
  misc
  core-kernel
  get_maintainer
  MAINTAINERS
  lib
  bitops
  checkpatch
  binfmt
  ramfs
  autofs
  nilfs
  rapidio
  panic
  relay
  kgdb
  ubsan
  romfs
  fault-injection

Subsystem: mm/dax

    Dan Williams <dan.j.williams@intel.com>:
      device-dax/kmem: fix resource release

Subsystem: mm/debug

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
    Patch series "mm/debug_vm_pgtable fixes", v4:
      powerpc/mm: add DEBUG_VM WARN for pmd_clear
      powerpc/mm: move setting pte specific flags to pfn_pte
      mm/debug_vm_pgtable/ppc64: avoid setting top bits in radom value
      mm/debug_vm_pgtables/hugevmap: use the arch helper to identify huge vmap support.
      mm/debug_vm_pgtable/savedwrite: enable savedwrite test with CONFIG_NUMA_BALANCING
      mm/debug_vm_pgtable/THP: mark the pte entry huge before using set_pmd/pud_at
      mm/debug_vm_pgtable/set_pte/pmd/pud: don't use set_*_at to update an existing pte entry
      mm/debug_vm_pgtable/locks: move non page table modifying test together
      mm/debug_vm_pgtable/locks: take correct page table lock
      mm/debug_vm_pgtable/thp: use page table depost/withdraw with THP
      mm/debug_vm_pgtable/pmd_clear: don't use pmd/pud_clear on pte entries
      mm/debug_vm_pgtable/hugetlb: disable hugetlb test on ppc64
      mm/debug_vm_pgtable: avoid none pte in pte_clear_test
      mm/debug_vm_pgtable: avoid doing memory allocation with pgtable_t mapped.

Subsystem: mm/thp

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Fix read-only THP for non-tmpfs filesystems":
      XArray: add xa_get_order
      XArray: add xas_split
      mm/filemap: fix storing to a THP shadow entry
    Patch series "Remove assumptions of THP size":
      mm/filemap: fix page cache removal for arbitrary sized THPs
      mm/memory: remove page fault assumption of compound page size
      mm/page_owner: change split_page_owner to take a count

    "Kirill A. Shutemov" <kirill@shutemov.name>:
      mm/huge_memory: fix total_mapcount assumption of page size
      mm/huge_memory: fix split assumption of page size

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/huge_memory: fix page_trans_huge_mapcount assumption of THP size
      mm/huge_memory: fix can_split_huge_page assumption of THP size
      mm/rmap: fix assumptions of THP size
      mm/truncate: fix truncation for pages of arbitrary size
      mm/page-writeback: support tail pages in wait_for_stable_page
      mm/vmscan: allow arbitrary sized pages to be paged out
      fs: add a filesystem flag for THPs
      fs: do not update nr_thps for mappings which support THPs

    Huang Ying <ying.huang@intel.com>:
      mm: fix a race during THP splitting

Subsystem: mm/readahead

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Readahead patches for 5.9/5.10":
      mm/readahead: add DEFINE_READAHEAD
      mm/readahead: make page_cache_ra_unbounded take a readahead_control
      mm/readahead: make do_page_cache_ra take a readahead_control

    David Howells <dhowells@redhat.com>:
      mm/readahead: make ondemand_readahead take a readahead_control
      mm/readahead: pass readahead_control to force_page_cache_ra

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/readahead: add page_cache_sync_ra and page_cache_async_ra

    David Howells <dhowells@redhat.com>:
      mm/filemap: fold ra_submit into do_sync_mmap_readahead
      mm/readahead: pass a file_ra_state into force_page_cache_ra

Subsystem: mm/page-poison

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
    Patch series "HWPOISON: soft offline rework", v7:
      mm,hwpoison: cleanup unused PageHuge() check
      mm, hwpoison: remove recalculating hpage
      mm,hwpoison-inject: don't pin for hwpoison_filter

    Oscar Salvador <osalvador@suse.de>:
      mm,hwpoison: unexport get_hwpoison_page and make it static
      mm,hwpoison: refactor madvise_inject_error
      mm,hwpoison: kill put_hwpoison_page
      mm,hwpoison: unify THP handling for hard and soft offline
      mm,hwpoison: rework soft offline for free pages
      mm,hwpoison: rework soft offline for in-use pages
      mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page
      mm,hwpoison: return 0 if the page is already poisoned in soft-offline

    Naoya Horiguchi <naoya.horiguchi@nec.com>:
      mm,hwpoison: introduce MF_MSG_UNSPLIT_THP
      mm,hwpoison: double-check page count in __get_any_page()

    Oscar Salvador <osalvador@suse.de>:
      mm,hwpoison: try to narrow window race for free pages

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/page_poison.c: replace bool variable with static key

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/vmstat.c: use helper macro abs()

Subsystem: mm/util

    Bartosz Golaszewski <bgolaszewski@baylibre.com>:
      mm/util.c: update the kerneldoc for kstrdup_const()

    Jann Horn <jannh@google.com>:
      mm/mmu_notifier: fix mmget() assert in __mmu_interval_notifier_insert

Subsystem: mm/memory-hotplug

    David Hildenbrand <david@redhat.com>:
    Patch series "mm/memory_hotplug: online_pages()/offline_pages() cleanups", v2:
      mm/memory_hotplug: inline __offline_pages() into offline_pages()
      mm/memory_hotplug: enforce section granularity when onlining/offlining
      mm/memory_hotplug: simplify page offlining
      mm/page_alloc: simplify __offline_isolated_pages()
      mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages()
      mm/page_isolation: simplify return value of start_isolate_page_range()
      mm/memory_hotplug: simplify page onlining
      mm/page_alloc: drop stale pageblock comment in memmap_init_zone*()
      mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone()
      mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory
    Patch series "selective merging of system ram resources", v4:
      kernel/resource: make release_mem_region_adjustable() never fail
      kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED
      mm/memory_hotplug: guard more declarations by CONFIG_MEMORY_HOTPLUG
      mm/memory_hotplug: prepare passing flags to add_memory() and friends
      mm/memory_hotplug: MEMHP_MERGE_RESOURCE to specify merging of System RAM resources
      virtio-mem: try to merge system ram resources
      xen/balloon: try to merge system ram resources
      hv_balloon: try to merge system ram resources
      kernel/resource: make iomem_resource implicit in release_mem_region_adjustable()

    Laurent Dufour <ldufour@linux.ibm.com>:
      mm: don't panic when links can't be created in sysfs

    David Hildenbrand <david@redhat.com>:
    Patch series "mm: place pages to the freelist tail when onlining and undoing isolation", v2:
      mm/page_alloc: convert "report" flag of __free_one_page() to a proper flag
      mm/page_alloc: place pages to tail in __putback_isolated_page()
      mm/page_alloc: move pages to tail in move_to_free_list()
      mm/page_alloc: place pages to tail in __free_pages_core()
      mm/memory_hotplug: update comment regarding zone shuffling

Subsystem: mm/zram

    Douglas Anderson <dianders@chromium.org>:
      zram: failing to decompress is WARN_ON worthy

Subsystem: mm/cleanups

    YueHaibing <yuehaibing@huawei.com>:
      mm/slab.h: remove duplicate include

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      mm/page_reporting.c: drop stale list head check in page_reporting_cycle

    Ira Weiny <ira.weiny@intel.com>:
      mm/highmem.c: clean up endif comments

    Yu Zhao <yuzhao@google.com>:
      mm: use self-explanatory macros rather than "2"

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: fix some broken comments

    Chen Tao <chentao3@hotmail.com>:
      mm: fix some comments formatting

    Xiaofei Tan <tanxiaofei@huawei.com>:
      mm/workingset.c: fix some doc warnings

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: use helper function put_write_access()

    Mike Rapoport <rppt@linux.ibm.com>:
      include/linux/mmzone.h: remove unused early_pfn_valid()

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: rename page_order() to buddy_order()

Subsystem: misc

    Randy Dunlap <rdunlap@infradead.org>:
      fs: configfs: delete repeated words in comments

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      kernel.h: split out min()/max() et al. helpers

Subsystem: core-kernel

    Liao Pingfang <liao.pingfang@zte.com.cn>:
      kernel/sys.c: replace do_brk with do_brk_flags in comment of prctl_set_mm_map()

    Randy Dunlap <rdunlap@infradead.org>:
      kernel/: fix repeated words in comments
      kernel: acct.c: fix some kernel-doc nits

Subsystem: get_maintainer

    Joe Perches <joe@perches.com>:
      get_maintainer: add test for file in VCS

Subsystem: MAINTAINERS

    Joe Perches <joe@perches.com>:
      get_maintainer: exclude MAINTAINERS file(s) from --git-fallback

    Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>:
      MAINTAINERS: jarkko.sakkinen@linux.intel.com -> jarkko@kernel.org

Subsystem: lib

    Randy Dunlap <rdunlap@infradead.org>:
      lib: bitmap: delete duplicated words
      lib: libcrc32c: delete duplicated words
      lib: decompress_bunzip2: delete duplicated words
      lib: dynamic_queue_limits: delete duplicated words + fix typo
      lib: earlycpio: delete duplicated words
      lib: radix-tree: delete duplicated words
      lib: syscall: delete duplicated words
      lib: test_sysctl: delete duplicated words
      lib/mpi/mpi-bit.c: fix spello of "functions"

    Stephen Boyd <swboyd@chromium.org>:
      lib/idr.c: document calling context for IDA APIs mustn't use locks
      lib/idr.c: document that ida_simple_{get,remove}() are deprecated

    Christophe JAILLET <christophe.jaillet@wanadoo.fr>:
      lib/scatterlist.c: avoid a double memset

    Miaohe Lin <linmiaohe@huawei.com>:
      lib/percpu_counter.c: use helper macro abs()

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      include/linux/list.h: add a macro to test if entry is pointing to the head

    Dan Carpenter <dan.carpenter@oracle.com>:
      lib/test_hmm.c: fix an error code in dmirror_allocate_chunk()

    Tobias Jordan <kernel@cdqe.de>:
      lib/crc32.c: fix trivial typo in preprocessor condition

Subsystem: bitops

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      bitops: simplify get_count_order_long()
      bitops: use the same mechanism for get_count_order[_long]

Subsystem: checkpatch

    Jerome Forissier <jerome@forissier.org>:
      checkpatch: add --kconfig-prefix

    Joe Perches <joe@perches.com>:
      checkpatch: move repeated word test
      checkpatch: add test for comma use that should be semicolon

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      const_structs.checkpatch: add phy_ops

    Nicolas Boichat <drinkcat@chromium.org>:
      checkpatch: warn if trace_printk and friends are called

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      const_structs.checkpatch: add pinctrl_ops and pinmux_ops

    Joe Perches <joe@perches.com>:
      checkpatch: warn on self-assignments
      checkpatch: allow not using -f with files that are in git

    Dwaipayan Ray <dwaipayanray1@gmail.com>:
      checkpatch: extend author Signed-off-by check for split From: header

    Joe Perches <joe@perches.com>:
      checkpatch: emit a warning on embedded filenames

    Dwaipayan Ray <dwaipayanray1@gmail.com>:
      checkpatch: fix multi-statement macro checks for while blocks.

    Łukasz Stelmach <l.stelmach@samsung.com>:
      checkpatch: fix false positive on empty block comment lines

    Dwaipayan Ray <dwaipayanray1@gmail.com>:
      checkpatch: add new warnings to author signoff checks.

Subsystem: binfmt

    Chris Kennelly <ckennelly@google.com>:
    Patch series "Selecting Load Addresses According to p_align", v3:
      fs/binfmt_elf: use PT_LOAD p_align values for suitable start address
      tools/testing/selftests: add self-test for verifying load alignment

    Jann Horn <jannh@google.com>:
    Patch series "Fix ELF / FDPIC ELF core dumping, and use mmap_lock properly in there", v5:
      binfmt_elf_fdpic: stop using dump_emit() on user pointers on !MMU
      coredump: let dump_emit() bail out on short writes
      coredump: refactor page range dumping into common helper
      coredump: rework elf/elf_fdpic vma_dump_size() into common helper
      binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot
      mm/gup: take mmap_lock in get_dump_page()
      mm: remove the now-unnecessary mmget_still_valid() hack

Subsystem: ramfs

    Matthew Wilcox (Oracle) <willy@infradead.org>:
      ramfs: fix nommu mmap with gaps in the page cache

Subsystem: autofs

    Matthew Wilcox <willy@infradead.org>:
      autofs: harden ioctl table

Subsystem: nilfs

    Wang Hai <wanghai38@huawei.com>:
      nilfs2: fix some kernel-doc warnings for nilfs2

Subsystem: rapidio

    Souptick Joarder <jrdr.linux@gmail.com>:
      rapidio: fix error handling path

    Jing Xiangfeng <jingxiangfeng@huawei.com>:
      rapidio: fix the missed put_device() for rio_mport_add_riodev

Subsystem: panic

    Alexey Kardashevskiy <aik@ozlabs.ru>:
      panic: dump registers on panic_on_warn

Subsystem: relay

    Sudip Mukherjee <sudipm.mukherjee@gmail.com>:
      kernel/relay.c: drop unneeded initialization

Subsystem: kgdb

    Ritesh Harjani <riteshh@linux.ibm.com>:
      scripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command
      scripts/gdb/tasks: add headers and improve spacing format

Subsystem: ubsan

    Elena Petrova <lenaptr@google.com>:
      sched.h: drop in_ubsan field when UBSAN is in trap mode

    George Popescu <georgepope@android.com>:
      ubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang

Subsystem: romfs

    Libing Zhou <libing.zhou@nokia-sbell.com>:
      ROMFS: support inode blocks calculation

Subsystem: fault-injection

    Albert van der Linde <alinde@google.com>:
    Patch series "add fault injection to user memory access", v3:
      lib, include/linux: add usercopy failure capability
      lib, uaccess: add failure injection to usercopy functions

 .mailmap                                          |    1 
 Documentation/admin-guide/kernel-parameters.txt   |    1 
 Documentation/core-api/xarray.rst                 |   14 
 Documentation/fault-injection/fault-injection.rst |    7 
 MAINTAINERS                                       |    6 
 arch/ia64/mm/init.c                               |    4 
 arch/powerpc/include/asm/book3s/64/pgtable.h      |   29 +
 arch/powerpc/include/asm/nohash/pgtable.h         |    5 
 arch/powerpc/mm/pgtable.c                         |    5 
 arch/powerpc/platforms/powernv/memtrace.c         |    2 
 arch/powerpc/platforms/pseries/hotplug-memory.c   |    2 
 drivers/acpi/acpi_memhotplug.c                    |    3 
 drivers/base/memory.c                             |    3 
 drivers/base/node.c                               |   33 +-
 drivers/block/zram/zram_drv.c                     |    2 
 drivers/dax/kmem.c                                |   50 ++-
 drivers/hv/hv_balloon.c                           |    4 
 drivers/infiniband/core/uverbs_main.c             |    3 
 drivers/rapidio/devices/rio_mport_cdev.c          |   18 -
 drivers/s390/char/sclp_cmd.c                      |    2 
 drivers/vfio/pci/vfio_pci.c                       |   38 +-
 drivers/virtio/virtio_mem.c                       |    5 
 drivers/xen/balloon.c                             |    4 
 fs/autofs/dev-ioctl.c                             |    8 
 fs/binfmt_elf.c                                   |  267 +++-------------
 fs/binfmt_elf_fdpic.c                             |  176 ++--------
 fs/configfs/dir.c                                 |    2 
 fs/configfs/file.c                                |    2 
 fs/coredump.c                                     |  238 +++++++++++++-
 fs/ext4/verity.c                                  |    4 
 fs/f2fs/verity.c                                  |    4 
 fs/inode.c                                        |    2 
 fs/nilfs2/bmap.c                                  |    2 
 fs/nilfs2/cpfile.c                                |    6 
 fs/nilfs2/page.c                                  |    1 
 fs/nilfs2/sufile.c                                |    4 
 fs/proc/task_mmu.c                                |   18 -
 fs/ramfs/file-nommu.c                             |    2 
 fs/romfs/super.c                                  |    1 
 fs/userfaultfd.c                                  |   28 -
 include/linux/bitops.h                            |   13 
 include/linux/blkdev.h                            |    1 
 include/linux/bvec.h                              |    6 
 include/linux/coredump.h                          |   13 
 include/linux/fault-inject-usercopy.h             |   22 +
 include/linux/fs.h                                |   28 -
 include/linux/idr.h                               |   13 
 include/linux/ioport.h                            |   15 
 include/linux/jiffies.h                           |    3 
 include/linux/kernel.h                            |  150 ---------
 include/linux/list.h                              |   29 +
 include/linux/memory_hotplug.h                    |   42 +-
 include/linux/minmax.h                            |  153 +++++++++
 include/linux/mm.h                                |    5 
 include/linux/mmzone.h                            |   17 -
 include/linux/node.h                              |   16 
 include/linux/nodemask.h                          |    2 
 include/linux/page-flags.h                        |    6 
 include/linux/page_owner.h                        |    6 
 include/linux/pagemap.h                           |  111 ++++++
 include/linux/sched.h                             |    2 
 include/linux/sched/mm.h                          |   25 -
 include/linux/uaccess.h                           |   12 
 include/linux/vmstat.h                            |    2 
 include/linux/xarray.h                            |   22 +
 include/ras/ras_event.h                           |    3 
 kernel/acct.c                                     |   10 
 kernel/cgroup/cpuset.c                            |    2 
 kernel/dma/direct.c                               |    2 
 kernel/fork.c                                     |    4 
 kernel/futex.c                                    |    2 
 kernel/irq/timings.c                              |    2 
 kernel/jump_label.c                               |    2 
 kernel/kcsan/encoding.h                           |    2 
 kernel/kexec_core.c                               |    2 
 kernel/kexec_file.c                               |    2 
 kernel/kthread.c                                  |    2 
 kernel/livepatch/state.c                          |    2 
 kernel/panic.c                                    |   12 
 kernel/pid_namespace.c                            |    2 
 kernel/power/snapshot.c                           |    2 
 kernel/range.c                                    |    3 
 kernel/relay.c                                    |    2 
 kernel/resource.c                                 |  114 +++++--
 kernel/smp.c                                      |    2 
 kernel/sys.c                                      |    2 
 kernel/user_namespace.c                           |    2 
 lib/Kconfig.debug                                 |    7 
 lib/Kconfig.ubsan                                 |   14 
 lib/Makefile                                      |    1 
 lib/bitmap.c                                      |    2 
 lib/crc32.c                                       |    2 
 lib/decompress_bunzip2.c                          |    2 
 lib/dynamic_queue_limits.c                        |    4 
 lib/earlycpio.c                                   |    2 
 lib/fault-inject-usercopy.c                       |   39 ++
 lib/find_bit.c                                    |    1 
 lib/hexdump.c                                     |    1 
 lib/idr.c                                         |    9 
 lib/iov_iter.c                                    |    5 
 lib/libcrc32c.c                                   |    2 
 lib/math/rational.c                               |    2 
 lib/math/reciprocal_div.c                         |    1 
 lib/mpi/mpi-bit.c                                 |    2 
 lib/percpu_counter.c                              |    2 
 lib/radix-tree.c                                  |    2 
 lib/scatterlist.c                                 |    2 
 lib/strncpy_from_user.c                           |    3 
 lib/syscall.c                                     |    2 
 lib/test_hmm.c                                    |    2 
 lib/test_sysctl.c                                 |    2 
 lib/test_xarray.c                                 |   65 ++++
 lib/usercopy.c                                    |    5 
 lib/xarray.c                                      |  208 ++++++++++++
 mm/Kconfig                                        |    2 
 mm/compaction.c                                   |    6 
 mm/debug_vm_pgtable.c                             |  267 ++++++++--------
 mm/filemap.c                                      |   58 ++-
 mm/gup.c                                          |   73 ++--
 mm/highmem.c                                      |    4 
 mm/huge_memory.c                                  |   47 +-
 mm/hwpoison-inject.c                              |   18 -
 mm/internal.h                                     |   47 +-
 mm/khugepaged.c                                   |    2 
 mm/madvise.c                                      |   52 ---
 mm/memory-failure.c                               |  357 ++++++++++------------
 mm/memory.c                                       |    7 
 mm/memory_hotplug.c                               |  223 +++++--------
 mm/memremap.c                                     |    3 
 mm/migrate.c                                      |   11 
 mm/mmap.c                                         |    7 
 mm/mmu_notifier.c                                 |    2 
 mm/page-writeback.c                               |    1 
 mm/page_alloc.c                                   |  289 +++++++++++------
 mm/page_isolation.c                               |   16 
 mm/page_owner.c                                   |   10 
 mm/page_poison.c                                  |   20 -
 mm/page_reporting.c                               |    4 
 mm/readahead.c                                    |  174 ++++------
 mm/rmap.c                                         |   10 
 mm/shmem.c                                        |    2 
 mm/shuffle.c                                      |    2 
 mm/slab.c                                         |    2 
 mm/slab.h                                         |    1 
 mm/slub.c                                         |    2 
 mm/sparse.c                                       |    2 
 mm/swap_state.c                                   |    2 
 mm/truncate.c                                     |    6 
 mm/util.c                                         |    3 
 mm/vmscan.c                                       |    5 
 mm/vmstat.c                                       |    8 
 mm/workingset.c                                   |    2 
 scripts/Makefile.ubsan                            |   10 
 scripts/checkpatch.pl                             |  238 ++++++++++----
 scripts/const_structs.checkpatch                  |    3 
 scripts/gdb/linux/proc.py                         |   15 
 scripts/gdb/linux/tasks.py                        |    9 
 scripts/get_maintainer.pl                         |    9 
 tools/testing/selftests/exec/.gitignore           |    1 
 tools/testing/selftests/exec/Makefile             |    9 
 tools/testing/selftests/exec/load_address.c       |   68 ++++
 161 files changed, 2532 insertions(+), 1864 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-10-13 23:46 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-10-13 23:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

181 patches, based on 029f56db6ac248769f2c260bfaf3c3c0e23e904c.

Subsystems affected by this patch series:

  kbuild
  scripts
  ntfs
  ocfs2
  vfs
  mm/slab
  mm/slub
  mm/kmemleak
  mm/dax
  mm/debug
  mm/pagecache
  mm/fadvise
  mm/gup
  mm/swap
  mm/memremap
  mm/memcg
  mm/selftests
  mm/pagemap
  mm/mincore
  mm/hmm
  mm/dma
  mm/memory-failure
  mm/vmalloc
  mm/documentation
  mm/kasan
  mm/pagealloc
  mm/hugetlb
  mm/vmscan
  mm/z3fold
  mm/zbud
  mm/compaction
  mm/mempolicy
  mm/mempool
  mm/memblock
  mm/oom-kill
  mm/migration

Subsystem: kbuild

    Nick Desaulniers <ndesaulniers@google.com>:
    Patch series "set clang minimum version to 10.0.1", v3:
      compiler-clang: add build check for clang 10.0.1
      Revert "kbuild: disable clang's default use of -fmerge-all-constants"
      Revert "arm64: bti: Require clang >= 10.0.1 for in-kernel BTI support"
      Revert "arm64: vdso: Fix compilation with clang older than 8"
      Partially revert "ARM: 8905/1: Emit __gnu_mcount_nc when using Clang 10.0.0 or newer"

    Marco Elver <elver@google.com>:
      kasan: remove mentions of unsupported Clang versions

    Nick Desaulniers <ndesaulniers@google.com>:
      compiler-gcc: improve version error
      compiler.h: avoid escaped section names
      export.h: fix section name for CONFIG_TRIM_UNUSED_KSYMS for Clang

    Lukas Bulwahn <lukas.bulwahn@gmail.com>:
      kbuild: doc: describe proper script invocation

Subsystem: scripts

    Wang Qing <wangqing@vivo.com>:
      scripts/spelling.txt: increase error-prone spell checking

    Naoki Hayama <naoki.hayama@lineo.co.jp>:
      scripts/spelling.txt: add "arbitrary" typo

    Borislav Petkov <bp@suse.de>:
      scripts/decodecode: add the capability to supply the program counter

Subsystem: ntfs

    Rustam Kovhaev <rkovhaev@gmail.com>:
      ntfs: add check for mft record size in superblock

Subsystem: ocfs2

    Randy Dunlap <rdunlap@infradead.org>:
      ocfs2: delete repeated words in comments

    Gang He <ghe@suse.com>:
      ocfs2: fix potential soft lockup during fstrim

Subsystem: vfs

    Randy Dunlap <rdunlap@infradead.org>:
      fs/xattr.c: fix kernel-doc warnings for setxattr & removexattr

    Luo Jiaxing <luojiaxing@huawei.com>:
      fs_parse: mark fs_param_bad_value() as static

Subsystem: mm/slab

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/slab.c: clean code by removing redundant if condition

    tangjianqiang <wyqt1985@gmail.com>:
      include/linux/slab.h: fix a typo error in comment

Subsystem: mm/slub

    Abel Wu <wuyun.wu@huawei.com>:
      mm/slub.c: branch optimization in free slowpath
      mm/slub: fix missing ALLOC_SLOWPATH stat when bulk alloc
      mm/slub: make add_full() condition more explicit

Subsystem: mm/kmemleak

    Davidlohr Bueso <dave@stgolabs.net>:
      mm/kmemleak: rely on rcu for task stack scanning

    Hui Su <sh_def@163.com>:
      mm,kmemleak-test.c: move kmemleak-test.c to samples dir

Subsystem: mm/dax

    Dan Williams <dan.j.williams@intel.com>:
    Patch series "device-dax: Support sub-dividing soft-reserved ranges", v5:
      x86/numa: cleanup configuration dependent command-line options
      x86/numa: add 'nohmat' option
      efi/fake_mem: arrange for a resource entry per efi_fake_mem instance
      ACPI: HMAT: refactor hmat_register_target_device to hmem_register_device
      resource: report parent to walk_iomem_res_desc() callback
      mm/memory_hotplug: introduce default phys_to_target_node() implementation
      ACPI: HMAT: attach a device for each soft-reserved range
      device-dax: drop the dax_region.pfn_flags attribute
      device-dax: move instance creation parameters to 'struct dev_dax_data'
      device-dax: make pgmap optional for instance creation
      device-dax/kmem: introduce dax_kmem_range()
      device-dax/kmem: move resource name tracking to drvdata
      device-dax/kmem: replace release_resource() with release_mem_region()
      device-dax: add an allocation interface for device-dax instances
      device-dax: introduce 'struct dev_dax' typed-driver operations
      device-dax: introduce 'seed' devices
      drivers/base: make device_find_child_by_name() compatible with sysfs inputs
      device-dax: add resize support
      mm/memremap_pages: convert to 'struct range'
      mm/memremap_pages: support multiple ranges per invocation
      device-dax: add dis-contiguous resource support
      device-dax: introduce 'mapping' devices

    Joao Martins <joao.m.martins@oracle.com>:
      device-dax: make align a per-device property

    Dan Williams <dan.j.williams@intel.com>:
      device-dax: add an 'align' attribute

    Joao Martins <joao.m.martins@oracle.com>:
      dax/hmem: introduce dax_hmem.region_idle parameter
      device-dax: add a range mapping allocation attribute

Subsystem: mm/debug

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/debug.c: do not dereference i_ino blindly

    John Hubbard <jhubbard@nvidia.com>:
      mm, dump_page: rename head_mapcount() --> head_compound_mapcount()

Subsystem: mm/pagecache

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Return head pages from find_*_entry", v2:
      mm: factor find_get_incore_page out of mincore_page
      mm: use find_get_incore_page in memcontrol
      mm: optimise madvise WILLNEED
      proc: optimise smaps for shmem entries
      i915: use find_lock_page instead of find_lock_entry
      mm: convert find_get_entry to return the head page
      mm/shmem: return head page from find_lock_entry
      mm: add find_lock_head
      mm/filemap: fix filemap_map_pages for THP

Subsystem: mm/fadvise

    Yafang Shao <laoar.shao@gmail.com>:
      mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED

Subsystem: mm/gup

    Barry Song <song.bao.hua@hisilicon.com>:
      mm/gup_benchmark: update the documentation in Kconfig
      mm/gup_benchmark: use pin_user_pages for FOLL_LONGTERM flag
      mm/gup: don't permit users to call get_user_pages with FOLL_LONGTERM

    John Hubbard <jhubbard@nvidia.com>:
      mm/gup: protect unpin_user_pages() against npages==-ERRNO

Subsystem: mm/swap

    Gao Xiang <hsiangkao@redhat.com>:
      swap: rename SWP_FS to SWAP_FS_OPS to avoid ambiguity

    Yu Zhao <yuzhao@google.com>:
      mm: remove activate_page() from unuse_pte()
      mm: remove superfluous __ClearPageActive()

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/swap.c: fix confusing comment in release_pages()
      mm/swap_slots.c: remove always zero and unused return value of enable_swap_slots_cache()
      mm/page_io.c: remove useless out label in __swap_writepage()
      mm/swap.c: fix incomplete comment in lru_cache_add_inactive_or_unevictable()
      mm/swapfile.c: remove unnecessary goto out in _swap_info_get()
      mm/swapfile.c: fix potential memory leak in sys_swapon

Subsystem: mm/memremap

    Ira Weiny <ira.weiny@intel.com>:
      mm/memremap.c: convert devmap static branch to {inc,dec}

Subsystem: mm/memcg

    "Gustavo A. R. Silva" <gustavoars@kernel.org>:
      mm: memcontrol: use flex_array_size() helper in memcpy()
      mm: memcontrol: use the preferred form for passing the size of a structure type

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: fix racy access to page->mem_cgroup in mem_cgroup_from_obj()

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: memcontrol: correct the comment of mem_cgroup_iter()

    Waiman Long <longman@redhat.com>:
    Patch series "mm/memcg: Miscellaneous cleanups and streamlining", v2:
      mm/memcg: clean up obsolete enum charge_type
      mm/memcg: simplify mem_cgroup_get_max()
      mm/memcg: unify swap and memsw page counters

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: add the missing numa_stat interface for cgroup v2

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/page_counter: correct the obsolete func name in the comment of page_counter_try_charge()
      mm: memcontrol: reword obsolete comment of mem_cgroup_unmark_under_oom()

    Bharata B Rao <bharata@linux.ibm.com>:
      mm: memcg/slab: uncharge during kmem_cache_free_bulk()

    Ralph Campbell <rcampbell@nvidia.com>:
      mm/memcg: fix device private memcg accounting

Subsystem: mm/selftests

    John Hubbard <jhubbard@nvidia.com>:
    Patch series "selftests/vm: fix some minor aggravating factors in the Makefile":
      selftests/vm: fix false build success on the second and later attempts
      selftests/vm: fix incorrect gcc invocation in some cases

Subsystem: mm/pagemap

    Matthew Wilcox <willy@infradead.org>:
      mm: account PMD tables like PTE tables

    Yanfei Xu <yanfei.xu@windriver.com>:
      mm/memory.c: fix typo in __do_fault() comment
      mm/memory.c: replace vmf->vma with variable vma

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      mm/mmap: rename __vma_unlink_common() to __vma_unlink()
      mm/mmap: leverage vma_rb_erase_ignore() to implement vma_rb_erase()

    Chinwen Chang <chinwen.chang@mediatek.com>:
    Patch series "Try to release mmap_lock temporarily in smaps_rollup", v4:
      mmap locking API: add mmap_lock_is_contended()
      mm: smaps*: extend smap_gather_stats to support specified beginning
      mm: proc: smaps_rollup: do not stall write attempts on mmap_lock

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Fix PageDoubleMap":
      mm: move PageDoubleMap bit
      mm: simplify PageDoubleMap with PF_SECOND policy

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      mm/mmap: leave adjust_next as virtual address instead of page frame number

    Randy Dunlap <rdunlap@infradead.org>:
      mm/memory.c: fix spello of "function"

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      mm/mmap: not necessary to check mapping separately
      mm/mmap: check on file instead of the rb_root_cached of its address_space

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: use helper function mapping_allow_writable()
      mm/mmap.c: use helper function allow_write_access() in __remove_shared_vm_struct()

    Liao Pingfang <liao.pingfang@zte.com.cn>:
      mm/mmap.c: replace do_brk with do_brk_flags in comment of insert_vm_struct()

    Peter Xu <peterx@redhat.com>:
      mm: remove src/dst mm parameter in copy_page_range()

Subsystem: mm/mincore

    yuleixzhang <yulei.kernel@gmail.com>:
      include/linux/huge_mm.h: remove mincore_huge_pmd declaration

Subsystem: mm/hmm

    Ralph Campbell <rcampbell@nvidia.com>:
      tools/testing/selftests/vm/hmm-tests.c: use the new SKIP() macro
      lib/test_hmm.c: remove unused dmirror_zero_page

Subsystem: mm/dma

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      mm/dmapool.c: replace open-coded list_for_each_entry_safe()
      mm/dmapool.c: replace hard coded function name with __func__

Subsystem: mm/memory-failure

    Xianting Tian <tian.xianting@h3c.com>:
      mm/memory-failure: do pgoff calculation before for_each_process()

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/memory-failure.c: remove unused macro `writeback'

Subsystem: mm/vmalloc

    Hui Su <sh_def@163.com>:
      mm/vmalloc.c: update the comment in __vmalloc_area_node()
      mm/vmalloc.c: fix the comment of find_vm_area

Subsystem: mm/documentation

    Alexander Gordeev <agordeev@linux.ibm.com>:
      docs/vm: fix 'mm_count' vs 'mm_users' counter confusion

Subsystem: mm/kasan

    Patricia Alfonso <trishalfonso@google.com>:
    Patch series "KASAN-KUnit Integration", v14:
      kasan/kunit: add KUnit Struct to Current Task
      KUnit: KASAN Integration
      KASAN: port KASAN Tests to KUnit
      KASAN: Testing Documentation

    David Gow <davidgow@google.com>:
      mm: kasan: do not panic if both panic_on_warn and kasan_multishot set

Subsystem: mm/pagealloc

    David Hildenbrand <david@redhat.com>:
    Patch series "mm / virtio-mem: support ZONE_MOVABLE", v5:
      mm/page_alloc: tweak comments in has_unmovable_pages()
      mm/page_isolation: exit early when pageblock is isolated in set_migratetype_isolate()
      mm/page_isolation: drop WARN_ON_ONCE() in set_migratetype_isolate()
      mm/page_isolation: cleanup set_migratetype_isolate()
      virtio-mem: don't special-case ZONE_MOVABLE
      mm: document semantics of ZONE_MOVABLE

    Li Xinhai <lixinhai.lxh@gmail.com>:
      mm, isolation: avoid checking unmovable pages across pageblock boundary

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/page_alloc.c: clean code by removing unnecessary initialization
      mm/page_alloc.c: micro-optimization remove unnecessary branch
      mm/page_alloc.c: fix early params garbage value accesses
      mm/page_alloc.c: clean code by merging two functions

    Yanfei Xu <yanfei.xu@windriver.com>:
      mm/page_alloc.c: __perform_reclaim should return 'unsigned long'

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mmzone: clean code by removing unused macro parameter

    Ralph Campbell <rcampbell@nvidia.com>:
      mm: move call to compound_head() in release_pages()

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/page_alloc.c: fix freeing non-compound pages

    Michal Hocko <mhocko@suse.com>:
      include/linux/gfp.h: clarify usage of GFP_ATOMIC in !preemptible contexts

Subsystem: mm/hugetlb

    Baoquan He <bhe@redhat.com>:
    Patch series "mm/hugetlb: Small cleanup and improvement", v2:
      mm/hugetlb.c: make is_hugetlb_entry_hwpoisoned return bool
      mm/hugetlb.c: remove the unnecessary non_swap_entry()
      doc/vm: fix typo in the hugetlb admin documentation

    Wei Yang <richard.weiyang@linux.alibaba.com>:
    Patch series "mm/hugetlb: code refine and simplification", v4:
      mm/hugetlb: not necessary to coalesce regions recursively
      mm/hugetlb: remove VM_BUG_ON(!nrg) in get_file_region_entry_from_cache()
      mm/hugetlb: use list_splice to merge two list at once
      mm/hugetlb: count file_region to be added when regions_needed != NULL
      mm/hugetlb: a page from buddy is not on any list
      mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page
      mm/hugetlb: take the free hpage during the iteration directly

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlb: add lockdep check for i_mmap_rwsem held in huge_pmd_share

Subsystem: mm/vmscan

    Chunxin Zang <zangchunxin@bytedance.com>:
      mm/vmscan: fix infinite loop in drop_slab_node

    Hui Su <sh_def@163.com>:
      mm/vmscan: fix comments for isolate_lru_page()

Subsystem: mm/z3fold

    Hui Su <sh_def@163.com>:
      mm/z3fold.c: use xx_zalloc instead xx_alloc and memset

Subsystem: mm/zbud

    Xiang Chen <chenxiang66@hisilicon.com>:
      mm/zbud: remove redundant initialization

Subsystem: mm/compaction

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/compaction.c: micro-optimization remove unnecessary branch
      include/linux/compaction.h: clean code by removing unused enum value

    John Hubbard <jhubbard@nvidia.com>:
      selftests/vm: 8x compaction_test speedup

Subsystem: mm/mempolicy

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      mm/mempolicy: remove or narrow the lock on current
      mm: remove unused alloc_page_vma_node()

Subsystem: mm/mempool

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/mempool: add 'else' to split mutually exclusive case

Subsystem: mm/memblock

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "memblock: seasonal cleaning^w cleanup", v3:
      KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
      dma-contiguous: simplify cma_early_percent_memory()
      arm, xtensa: simplify initialization of high memory pages
      arm64: numa: simplify dummy_numa_init()
      h8300, nds32, openrisc: simplify detection of memory extents
      riscv: drop unneeded node initialization
      mircoblaze: drop unneeded NUMA and sparsemem initializations
      memblock: make for_each_memblock_type() iterator private
      memblock: make memblock_debug and related functionality private
      memblock: reduce number of parameters in for_each_mem_range()
      arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
      arch, drivers: replace for_each_membock() with for_each_mem_range()
      x86/setup: simplify initrd relocation and reservation
      x86/setup: simplify reserve_crashkernel()
      memblock: remove unused memblock_mem_size()
      memblock: implement for_each_reserved_mem_region() using __next_mem_region()
      memblock: use separate iterators for memory and reserved regions

Subsystem: mm/oom-kill

    Suren Baghdasaryan <surenb@google.com>:
      mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

Subsystem: mm/migration

    Ralph Campbell <rcampbell@nvidia.com>:
      mm/migrate: remove cpages-- in migrate_vma_finalize()
      mm/migrate: remove obsolete comment about device public

 .clang-format                                |    7 
 Documentation/admin-guide/cgroup-v2.rst      |   69 +
 Documentation/admin-guide/mm/hugetlbpage.rst |    2 
 Documentation/dev-tools/kasan.rst            |   74 +
 Documentation/dev-tools/kmemleak.rst         |    2 
 Documentation/kbuild/makefiles.rst           |   20 
 Documentation/vm/active_mm.rst               |    2 
 Documentation/x86/x86_64/boot-options.rst    |    4 
 MAINTAINERS                                  |    2 
 Makefile                                     |    9 
 arch/arm/Kconfig                             |    2 
 arch/arm/include/asm/tlb.h                   |    1 
 arch/arm/kernel/setup.c                      |   18 
 arch/arm/mm/init.c                           |   59 -
 arch/arm/mm/mmu.c                            |   39 
 arch/arm/mm/pmsa-v7.c                        |   23 
 arch/arm/mm/pmsa-v8.c                        |   17 
 arch/arm/xen/mm.c                            |    7 
 arch/arm64/Kconfig                           |    2 
 arch/arm64/kernel/machine_kexec_file.c       |    6 
 arch/arm64/kernel/setup.c                    |    4 
 arch/arm64/kernel/vdso/Makefile              |    7 
 arch/arm64/mm/init.c                         |   11 
 arch/arm64/mm/kasan_init.c                   |   10 
 arch/arm64/mm/mmu.c                          |   11 
 arch/arm64/mm/numa.c                         |   15 
 arch/c6x/kernel/setup.c                      |    9 
 arch/h8300/kernel/setup.c                    |    8 
 arch/microblaze/mm/init.c                    |   23 
 arch/mips/cavium-octeon/dma-octeon.c         |   14 
 arch/mips/kernel/setup.c                     |   31 
 arch/mips/netlogic/xlp/setup.c               |    2 
 arch/nds32/kernel/setup.c                    |    8 
 arch/openrisc/kernel/setup.c                 |    9 
 arch/openrisc/mm/init.c                      |    8 
 arch/powerpc/kernel/fadump.c                 |   61 -
 arch/powerpc/kexec/file_load_64.c            |   16 
 arch/powerpc/kvm/book3s_hv_builtin.c         |   12 
 arch/powerpc/kvm/book3s_hv_uvmem.c           |   14 
 arch/powerpc/mm/book3s64/hash_utils.c        |   16 
 arch/powerpc/mm/book3s64/radix_pgtable.c     |   10 
 arch/powerpc/mm/kasan/kasan_init_32.c        |    8 
 arch/powerpc/mm/mem.c                        |   31 
 arch/powerpc/mm/numa.c                       |    7 
 arch/powerpc/mm/pgtable_32.c                 |    8 
 arch/riscv/mm/init.c                         |   36 
 arch/riscv/mm/kasan_init.c                   |   10 
 arch/s390/kernel/setup.c                     |   27 
 arch/s390/mm/page-states.c                   |    6 
 arch/s390/mm/vmem.c                          |    7 
 arch/sh/mm/init.c                            |    9 
 arch/sparc/mm/init_64.c                      |   12 
 arch/x86/include/asm/numa.h                  |    8 
 arch/x86/kernel/e820.c                       |   16 
 arch/x86/kernel/setup.c                      |   56 -
 arch/x86/mm/numa.c                           |   13 
 arch/x86/mm/numa_emulation.c                 |    3 
 arch/x86/xen/enlighten_pv.c                  |    2 
 arch/xtensa/mm/init.c                        |   55 -
 drivers/acpi/numa/hmat.c                     |   76 -
 drivers/acpi/numa/srat.c                     |    9 
 drivers/base/core.c                          |    2 
 drivers/bus/mvebu-mbus.c                     |   12 
 drivers/dax/Kconfig                          |    6 
 drivers/dax/Makefile                         |    3 
 drivers/dax/bus.c                            | 1237 +++++++++++++++++++++++----
 drivers/dax/bus.h                            |   34 
 drivers/dax/dax-private.h                    |   74 +
 drivers/dax/device.c                         |  164 +--
 drivers/dax/hmem.c                           |   56 -
 drivers/dax/hmem/Makefile                    |    8 
 drivers/dax/hmem/device.c                    |  100 ++
 drivers/dax/hmem/hmem.c                      |   93 +-
 drivers/dax/kmem.c                           |  236 ++---
 drivers/dax/pmem/compat.c                    |    2 
 drivers/dax/pmem/core.c                      |   36 
 drivers/firmware/efi/x86_fake_mem.c          |   12 
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c    |    4 
 drivers/gpu/drm/nouveau/nouveau_dmem.c       |   15 
 drivers/irqchip/irq-gic-v3-its.c             |    2 
 drivers/nvdimm/badrange.c                    |   26 
 drivers/nvdimm/claim.c                       |   13 
 drivers/nvdimm/nd.h                          |    3 
 drivers/nvdimm/pfn_devs.c                    |   13 
 drivers/nvdimm/pmem.c                        |   27 
 drivers/nvdimm/region.c                      |   21 
 drivers/pci/p2pdma.c                         |   12 
 drivers/virtio/virtio_mem.c                  |   47 -
 drivers/xen/unpopulated-alloc.c              |   45 
 fs/fs_parser.c                               |    2 
 fs/ntfs/inode.c                              |    6 
 fs/ocfs2/alloc.c                             |    6 
 fs/ocfs2/localalloc.c                        |    2 
 fs/proc/base.c                               |    3 
 fs/proc/task_mmu.c                           |  104 +-
 fs/xattr.c                                   |   22 
 include/acpi/acpi_numa.h                     |   14 
 include/kunit/test.h                         |    5 
 include/linux/acpi.h                         |    2 
 include/linux/compaction.h                   |    3 
 include/linux/compiler-clang.h               |    8 
 include/linux/compiler-gcc.h                 |    2 
 include/linux/compiler.h                     |    2 
 include/linux/dax.h                          |    8 
 include/linux/export.h                       |    2 
 include/linux/fs.h                           |    4 
 include/linux/gfp.h                          |    6 
 include/linux/huge_mm.h                      |    3 
 include/linux/kasan.h                        |    6 
 include/linux/memblock.h                     |   90 +
 include/linux/memcontrol.h                   |   13 
 include/linux/memory_hotplug.h               |   23 
 include/linux/memremap.h                     |   15 
 include/linux/mm.h                           |   36 
 include/linux/mmap_lock.h                    |    5 
 include/linux/mmzone.h                       |   37 
 include/linux/numa.h                         |   11 
 include/linux/oom.h                          |    1 
 include/linux/page-flags.h                   |   42 
 include/linux/pagemap.h                      |   43 
 include/linux/range.h                        |    6 
 include/linux/sched.h                        |    4 
 include/linux/sched/coredump.h               |    1 
 include/linux/slab.h                         |    2 
 include/linux/swap.h                         |   10 
 include/linux/swap_slots.h                   |    2 
 kernel/dma/contiguous.c                      |   11 
 kernel/fork.c                                |   25 
 kernel/resource.c                            |   11 
 lib/Kconfig.debug                            |    9 
 lib/Kconfig.kasan                            |   31 
 lib/Makefile                                 |    5 
 lib/kunit/test.c                             |   13 
 lib/test_free_pages.c                        |   42 
 lib/test_hmm.c                               |   65 -
 lib/test_kasan.c                             |  732 ++++++---------
 lib/test_kasan_module.c                      |  111 ++
 mm/Kconfig                                   |    4 
 mm/Makefile                                  |    1 
 mm/compaction.c                              |    5 
 mm/debug.c                                   |   18 
 mm/dmapool.c                                 |   46 -
 mm/fadvise.c                                 |    9 
 mm/filemap.c                                 |   78 -
 mm/gup.c                                     |   44 
 mm/gup_benchmark.c                           |   23 
 mm/huge_memory.c                             |    4 
 mm/hugetlb.c                                 |  100 +-
 mm/internal.h                                |    3 
 mm/kasan/report.c                            |   34 
 mm/kmemleak-test.c                           |   99 --
 mm/kmemleak.c                                |    8 
 mm/madvise.c                                 |   21 
 mm/memblock.c                                |  102 --
 mm/memcontrol.c                              |  262 +++--
 mm/memory-failure.c                          |    5 
 mm/memory.c                                  |  147 +--
 mm/memory_hotplug.c                          |   10 
 mm/mempolicy.c                               |    8 
 mm/mempool.c                                 |   18 
 mm/memremap.c                                |  344 ++++---
 mm/migrate.c                                 |    3 
 mm/mincore.c                                 |   28 
 mm/mmap.c                                    |   45 
 mm/oom_kill.c                                |    2 
 mm/page_alloc.c                              |   82 -
 mm/page_counter.c                            |    2 
 mm/page_io.c                                 |   14 
 mm/page_isolation.c                          |   41 
 mm/shmem.c                                   |   19 
 mm/slab.c                                    |    4 
 mm/slab.h                                    |   50 -
 mm/slub.c                                    |   33 
 mm/sparse.c                                  |   10 
 mm/swap.c                                    |   14 
 mm/swap_slots.c                              |    3 
 mm/swap_state.c                              |   38 
 mm/swapfile.c                                |   12 
 mm/truncate.c                                |   58 -
 mm/vmalloc.c                                 |    6 
 mm/vmscan.c                                  |    5 
 mm/z3fold.c                                  |    3 
 mm/zbud.c                                    |    1 
 samples/Makefile                             |    1 
 samples/kmemleak/Makefile                    |    3 
 samples/kmemleak/kmemleak-test.c             |   99 ++
 scripts/decodecode                           |   29 
 scripts/spelling.txt                         |    4 
 tools/testing/nvdimm/dax-dev.c               |   28 
 tools/testing/nvdimm/test/iomap.c            |    2 
 tools/testing/selftests/vm/Makefile          |   17 
 tools/testing/selftests/vm/compaction_test.c |   11 
 tools/testing/selftests/vm/gup_benchmark.c   |   14 
 tools/testing/selftests/vm/hmm-tests.c       |    4 
 194 files changed, 4273 insertions(+), 2777 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-10-11  6:15 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-10-11  6:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

5 patches, based on da690031a5d6d50a361e3f19f3eeabd086a6f20d.

Subsystems affected by this patch series:

  MAINTAINERS
  mm/pagemap
  mm/swap
  mm/hugetlb

Subsystem: MAINTAINERS

    Kees Cook <keescook@chromium.org>:
      MAINTAINERS: change hardening mailing list

    Antoine Tenart <atenart@kernel.org>:
      MAINTAINERS: Antoine Tenart's email address

Subsystem: mm/pagemap

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: mmap: Fix general protection fault in unlink_file_vma()

Subsystem: mm/swap

    Minchan Kim <minchan@kernel.org>:
      mm: validate inode in mapping_set_error()

Subsystem: mm/hugetlb

    Vijay Balakrishna <vijayb@linux.microsoft.com>:
      mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged

 .mailmap                   |    4 +++-
 MAINTAINERS                |    8 ++++----
 include/linux/khugepaged.h |    5 +++++
 include/linux/pagemap.h    |    3 ++-
 mm/khugepaged.c            |   13 +++++++++++--
 mm/mmap.c                  |    6 +++++-
 mm/page_alloc.c            |    3 +++
 7 files changed, 33 insertions(+), 9 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-10-03  5:20 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-10-03  5:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

3 patches, based on d3d45f8220d60a0b2aaaacf8fb2be4e6ffd9008e.

Subsystems affected by this patch series:

  mm/slub
  mm/cma
  scripts

Subsystem: mm/slub

    Eric Farman <farman@linux.ibm.com>:
      mm, slub: restore initial kmem_cache flags

Subsystem: mm/cma

    Joonsoo Kim <iamjoonsoo.kim@lge.com>:
      mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs

Subsystem: scripts

    Eric Biggers <ebiggers@google.com>:
      scripts/spelling.txt: fix malformed entry

 mm/page_alloc.c      |   19 ++++++++++++++++---
 mm/slub.c            |    6 +-----
 scripts/spelling.txt |    2 +-
 3 files changed, 18 insertions(+), 9 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-09-26  4:17 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-09-26  4:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

9 patches, based on 7c7ec3226f5f33f9c050d85ec20f18419c622ad6.

Subsystems affected by this patch series:

  mm/thp
  mm/memcg
  mm/gup
  mm/migration
  lib
  x86
  mm/memory-hotplug

Subsystem: mm/thp

    Gao Xiang <hsiangkao@redhat.com>:
      mm, THP, swap: fix allocating cluster for swapfile by mistake

Subsystem: mm/memcg

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcontrol: fix missing suffix of workingset_restore

Subsystem: mm/gup

    Vasily Gorbik <gor@linux.ibm.com>:
      mm/gup: fix gup_fast with dynamic page table folding

Subsystem: mm/migration

    Zi Yan <ziy@nvidia.com>:
      mm/migrate: correct thp migration stats

Subsystem: lib

    Nick Desaulniers <ndesaulniers@google.com>:
      lib/string.c: implement stpcpy

    Jason Yan <yanaijie@huawei.com>:
      lib/memregion.c: include memregion.h

Subsystem: x86

    Mikulas Patocka <mpatocka@redhat.com>:
      arch/x86/lib/usercopy_64.c: fix  __copy_user_flushcache() cache writeback

Subsystem: mm/memory-hotplug

    Laurent Dufour <ldufour@linux.ibm.com>:
    Patch series "mm: fix memory to node bad links in sysfs", v3:
      mm: replace memmap_context by meminit_context
      mm: don't rely on system state to detect hot-plug operations

 Documentation/admin-guide/cgroup-v2.rst |   25 ++++++---
 arch/ia64/mm/init.c                     |    6 +-
 arch/s390/include/asm/pgtable.h         |   42 +++++++++++----
 arch/x86/lib/usercopy_64.c              |    2 
 drivers/base/node.c                     |   85 ++++++++++++++++++++------------
 include/linux/mm.h                      |    2 
 include/linux/mmzone.h                  |   11 +++-
 include/linux/node.h                    |   11 ++--
 include/linux/pgtable.h                 |   10 +++
 lib/memregion.c                         |    1 
 lib/string.c                            |   24 +++++++++
 mm/gup.c                                |   18 +++---
 mm/memcontrol.c                         |    4 -
 mm/memory_hotplug.c                     |    5 +
 mm/migrate.c                            |    7 +-
 mm/page_alloc.c                         |   10 +--
 mm/swapfile.c                           |    2 
 17 files changed, 181 insertions(+), 84 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-09-19  4:19 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-09-19  4:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

15 patches, based on 92ab97adeefccf375de7ebaad9d5b75d4125fe8b.

Subsystems affected by this patch series:

  mailmap
  mm/hotfixes
  mm/thp
  mm/memory-hotplug
  misc
  kcsan

Subsystem: mailmap

    Kees Cook <keescook@chromium.org>:
      mailmap: add older email addresses for Kees Cook

Subsystem: mm/hotfixes

    Hugh Dickins <hughd@google.com>:
    Patch series "mm: fixes to past from future testing":
      ksm: reinstate memcg charge on copied pages
      mm: migration of hugetlbfs page skip memcg
      shmem: shmem_writepage() split unlikely i915 THP
      mm: fix check_move_unevictable_pages() on THP
      mlock: fix unevictable_pgs event counts on THP

    Byron Stanoszek <gandalf@winds.org>:
      tmpfs: restore functionality of nr_inodes=0

    Muchun Song <songmuchun@bytedance.com>:
      kprobes: fix kill kprobe which has been marked as gone

Subsystem: mm/thp

    Ralph Campbell <rcampbell@nvidia.com>:
      mm/thp: fix __split_huge_pmd_locked() for migration PMD

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      selftests/vm: fix display of page size in map_hugetlb

Subsystem: mm/memory-hotplug

    Pavel Tatashin <pasha.tatashin@soleen.com>:
      mm/memory_hotplug: drain per-cpu pages again during memory offline

Subsystem: misc

    Tobias Klauser <tklauser@distanz.ch>:
      ftrace: let ftrace_enable_sysctl take a kernel pointer buffer
      stackleak: let stack_erasing_sysctl take a kernel pointer buffer
      fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype

Subsystem: kcsan

    Changbin Du <changbin.du@gmail.com>:
      kcsan: kconfig: move to menu 'Generic Kernel Debugging Instruments'

 .mailmap                                 |    4 ++
 fs/fs-writeback.c                        |    2 -
 include/linux/ftrace.h                   |    3 --
 include/linux/stackleak.h                |    2 -
 kernel/kprobes.c                         |    9 +++++-
 kernel/stackleak.c                       |    2 -
 kernel/trace/ftrace.c                    |    3 --
 lib/Kconfig.debug                        |    4 --
 mm/huge_memory.c                         |   42 ++++++++++++++++---------------
 mm/ksm.c                                 |    4 ++
 mm/memory_hotplug.c                      |   14 ++++++++++
 mm/migrate.c                             |    3 +-
 mm/mlock.c                               |   24 +++++++++++------
 mm/page_isolation.c                      |    8 +++++
 mm/shmem.c                               |   20 +++++++++++---
 mm/swap.c                                |    6 ++--
 mm/vmscan.c                              |   10 +++++--
 tools/testing/selftests/vm/map_hugetlb.c |    2 -
 18 files changed, 111 insertions(+), 51 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-09-04 23:34 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-09-04 23:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

19 patches, based on 59126901f200f5fc907153468b03c64e0081b6e6.

Subsystems affected by this patch series:

  mm/memcg
  mm/slub
  MAINTAINERS
  mm/pagemap
  ipc
  fork
  checkpatch
  mm/madvise
  mm/migration
  mm/hugetlb
  lib

Subsystem: mm/memcg

    Michal Hocko <mhocko@suse.com>:
      memcg: fix use-after-free in uncharge_batch

    Xunlei Pang <xlpang@linux.alibaba.com>:
      mm: memcg: fix memcg reclaim soft lockup

Subsystem: mm/slub

    Eugeniu Rosca <erosca@de.adit-jv.com>:
      mm: slub: fix conversion of freelist_corrupted()

Subsystem: MAINTAINERS

    Robert Richter <rric@kernel.org>:
      MAINTAINERS: update Cavium/Marvell entries

    Nick Desaulniers <ndesaulniers@google.com>:
      MAINTAINERS: add LLVM maintainers

    Randy Dunlap <rdunlap@infradead.org>:
      MAINTAINERS: IA64: mark Status as Odd Fixes only

Subsystem: mm/pagemap

    Joerg Roedel <jroedel@suse.de>:
      mm: track page table modifications in __apply_to_page_range()

Subsystem: ipc

    Tobias Klauser <tklauser@distanz.ch>:
      ipc: adjust proc_ipc_sem_dointvec definition to match prototype

Subsystem: fork

    Tobias Klauser <tklauser@distanz.ch>:
      fork: adjust sysctl_max_threads definition to match prototype

Subsystem: checkpatch

    Mrinal Pandey <mrinalmni@gmail.com>:
      checkpatch: fix the usage of capture group ( ... )

Subsystem: mm/madvise

    Yang Shi <shy828301@gmail.com>:
      mm: madvise: fix vma user-after-free

Subsystem: mm/migration

    Alistair Popple <alistair@popple.id.au>:
      mm/migrate: fixup setting UFFD_WP flag
      mm/rmap: fixup copying of soft dirty and uffd ptes

    Ralph Campbell <rcampbell@nvidia.com>:
    Patch series "mm/migrate: preserve soft dirty in remove_migration_pte()":
      mm/migrate: remove unnecessary is_zone_device_page() check
      mm/migrate: preserve soft dirty in remove_migration_pte()

Subsystem: mm/hugetlb

    Li Xinhai <lixinhai.lxh@gmail.com>:
      mm/hugetlb: try preferred node first when alloc gigantic page from cma

    Muchun Song <songmuchun@bytedance.com>:
      mm/hugetlb: fix a race between hugetlb sysctl handlers

    David Howells <dhowells@redhat.com>:
      mm/khugepaged.c: fix khugepaged's request size in collapse_file

Subsystem: lib

    Jason Gunthorpe <jgg@nvidia.com>:
      include/linux/log2.h: add missing () around n in roundup_pow_of_two()

 MAINTAINERS           |   32 ++++++++++++++++----------------
 include/linux/log2.h  |    2 +-
 ipc/ipc_sysctl.c      |    2 +-
 kernel/fork.c         |    2 +-
 mm/hugetlb.c          |   49 +++++++++++++++++++++++++++++++++++++------------
 mm/khugepaged.c       |    2 +-
 mm/madvise.c          |    2 +-
 mm/memcontrol.c       |    6 ++++++
 mm/memory.c           |   37 ++++++++++++++++++++++++-------------
 mm/migrate.c          |   31 +++++++++++++++++++------------
 mm/rmap.c             |    9 +++++++--
 mm/slub.c             |   12 ++++++------
 mm/vmscan.c           |    8 ++++++++
 scripts/checkpatch.pl |    4 ++--
 14 files changed, 130 insertions(+), 68 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-08-21  0:41 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-08-21  0:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

11 patches, based on 7eac66d0456fe12a462e5c14c68e97c7460989da.

Subsystems affected by this patch series:

  misc
  mm/hugetlb
  mm/vmalloc
  mm/misc
  romfs
  relay
  uprobes
  squashfs
  mm/cma
  mm/pagealloc

Subsystem: misc

    Nick Desaulniers <ndesaulniers@google.com>:
      mailmap: add Andi Kleen

Subsystem: mm/hugetlb

    Xu Wang <vulab@iscas.ac.cn>:
      hugetlb_cgroup: convert comma to semicolon

    Hugh Dickins <hughd@google.com>:
      khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()

Subsystem: mm/vmalloc

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      mm/vunmap: add cond_resched() in vunmap_pmd_range

Subsystem: mm/misc

    Leon Romanovsky <leonro@nvidia.com>:
      mm/rodata_test.c: fix missing function declaration

Subsystem: romfs

    Jann Horn <jannh@google.com>:
      romfs: fix uninitialized memory leak in romfs_dev_read()

Subsystem: relay

    Wei Yongjun <weiyongjun1@huawei.com>:
      kernel/relay.c: fix memleak on destroy relay channel

Subsystem: uprobes

    Hugh Dickins <hughd@google.com>:
      uprobes: __replace_page() avoid BUG in munlock_vma_page()

Subsystem: squashfs

    Phillip Lougher <phillip@squashfs.org.uk>:
      squashfs: avoid bio_alloc() failure with 1Mbyte blocks

Subsystem: mm/cma

    Doug Berger <opendmb@gmail.com>:
      mm: include CMA pages in lowmem_reserve at boot

Subsystem: mm/pagealloc

    Charan Teja Reddy <charante@codeaurora.org>:
      mm, page_alloc: fix core hung in free_pcppages_bulk()

 .mailmap                |    1 +
 fs/romfs/storage.c      |    4 +---
 fs/squashfs/block.c     |    6 +++++-
 kernel/events/uprobes.c |    2 +-
 kernel/relay.c          |    1 +
 mm/hugetlb_cgroup.c     |    4 ++--
 mm/khugepaged.c         |    2 +-
 mm/page_alloc.c         |    7 ++++++-
 mm/rodata_test.c        |    1 +
 mm/vmalloc.c            |    2 ++
 10 files changed, 21 insertions(+), 9 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-08-15  0:29 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-08-15  0:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


39 patches, based on b923f1247b72fc100b87792fd2129d026bb10e66.

Subsystems affected by this patch series:

  mm/hotfixes
  lz4
  exec
  mailmap
  mm/thp
  autofs
  mm/madvise
  sysctl
  mm/kmemleak
  mm/misc
  lib

Subsystem: mm/hotfixes

    Mike Rapoport <rppt@linux.ibm.com>:
      asm-generic: pgalloc.h: use correct #ifdef to enable pud_alloc_one()

    Baoquan He <bhe@redhat.com>:
      Revert "mm/vmstat.c: do not show lowmem reserve protection information of empty zone"

Subsystem: lz4

    Nick Terrell <terrelln@fb.com>:
      lz4: fix kernel decompression speed

Subsystem: exec

    Kees Cook <keescook@chromium.org>:
    Patch series "Fix S_ISDIR execve() errno":
      exec: restore EACCES of S_ISDIR execve()
      selftests/exec: add file type errno tests

Subsystem: mailmap

    Greg Kurz <groug@kaod.org>:
      mailmap: add entry for Greg Kurz

Subsystem: mm/thp

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "THP prep patches":
      mm: store compound_nr as well as compound_order
      mm: move page-flags include to top of file
      mm: add thp_order
      mm: add thp_size
      mm: replace hpage_nr_pages with thp_nr_pages
      mm: add thp_head
      mm: introduce offset_in_thp

Subsystem: autofs

    Randy Dunlap <rdunlap@infradead.org>:
      fs: autofs: delete repeated words in comments

Subsystem: mm/madvise

    Minchan Kim <minchan@kernel.org>:
    Patch series "introduce memory hinting API for external process", v8:
      mm/madvise: pass task and mm to do_madvise
      pid: move pidfd_get_pid() to pid.c
      mm/madvise: introduce process_madvise() syscall: an external memory hinting API
      mm/madvise: check fatal signal pending of target process

Subsystem: sysctl

    Xiaoming Ni <nixiaoming@huawei.com>:
      all arch: remove system call sys_sysctl

Subsystem: mm/kmemleak

    Qian Cai <cai@lca.pw>:
      mm/kmemleak: silence KCSAN splats in checksum

Subsystem: mm/misc

    Qian Cai <cai@lca.pw>:
      mm/frontswap: mark various intentional data races
      mm/page_io: mark various intentional data races
      mm/swap_state: mark various intentional data races

    Kirill A. Shutemov <kirill@shutemov.name>:
      mm/filemap.c: fix a data race in filemap_fault()

    Qian Cai <cai@lca.pw>:
      mm/swapfile: fix and annotate various data races
      mm/page_counter: fix various data races at memsw
      mm/memcontrol: fix a data race in scan count
      mm/list_lru: fix a data race in list_lru_count_one
      mm/mempool: fix a data race in mempool_free()
      mm/rmap: annotate a data race at tlb_flush_batched
      mm/swap.c: annotate data races for lru_rotate_pvecs
      mm: annotate a data race in page_zonenum()

    Romain Naour <romain.naour@gmail.com>:
      include/asm-generic/vmlinux.lds.h: align ro_after_init

    Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>:
      sh: clkfwk: remove r8/r16/r32
      sh: use generic strncpy()

Subsystem: lib

    Krzysztof Kozlowski <krzk@kernel.org>:
    Patch series "iomap: Constify ioreadX() iomem argument", v3:
      iomap: constify ioreadX() iomem argument (as in generic implementation)
      rtl818x: constify ioreadX() iomem argument (as in generic implementation)
      ntb: intel: constify ioreadX() iomem argument (as in generic implementation)
      virtio: pci: constify ioreadX() iomem argument (as in generic implementation)

 .mailmap                                               |    1 
 arch/alpha/include/asm/core_apecs.h                    |    6 
 arch/alpha/include/asm/core_cia.h                      |    6 
 arch/alpha/include/asm/core_lca.h                      |    6 
 arch/alpha/include/asm/core_marvel.h                   |    4 
 arch/alpha/include/asm/core_mcpcia.h                   |    6 
 arch/alpha/include/asm/core_t2.h                       |    2 
 arch/alpha/include/asm/io.h                            |   12 -
 arch/alpha/include/asm/io_trivial.h                    |   16 -
 arch/alpha/include/asm/jensen.h                        |    2 
 arch/alpha/include/asm/machvec.h                       |    6 
 arch/alpha/kernel/core_marvel.c                        |    2 
 arch/alpha/kernel/io.c                                 |   12 -
 arch/alpha/kernel/syscalls/syscall.tbl                 |    3 
 arch/arm/configs/am200epdkit_defconfig                 |    1 
 arch/arm/tools/syscall.tbl                             |    3 
 arch/arm64/include/asm/unistd.h                        |    2 
 arch/arm64/include/asm/unistd32.h                      |    6 
 arch/ia64/kernel/syscalls/syscall.tbl                  |    3 
 arch/m68k/kernel/syscalls/syscall.tbl                  |    3 
 arch/microblaze/kernel/syscalls/syscall.tbl            |    3 
 arch/mips/configs/cu1000-neo_defconfig                 |    1 
 arch/mips/kernel/syscalls/syscall_n32.tbl              |    3 
 arch/mips/kernel/syscalls/syscall_n64.tbl              |    3 
 arch/mips/kernel/syscalls/syscall_o32.tbl              |    3 
 arch/parisc/include/asm/io.h                           |    4 
 arch/parisc/kernel/syscalls/syscall.tbl                |    3 
 arch/parisc/lib/iomap.c                                |   72 +++---
 arch/powerpc/kernel/iomap.c                            |   28 +-
 arch/powerpc/kernel/syscalls/syscall.tbl               |    3 
 arch/s390/kernel/syscalls/syscall.tbl                  |    3 
 arch/sh/configs/dreamcast_defconfig                    |    1 
 arch/sh/configs/espt_defconfig                         |    1 
 arch/sh/configs/hp6xx_defconfig                        |    1 
 arch/sh/configs/landisk_defconfig                      |    1 
 arch/sh/configs/lboxre2_defconfig                      |    1 
 arch/sh/configs/microdev_defconfig                     |    1 
 arch/sh/configs/migor_defconfig                        |    1 
 arch/sh/configs/r7780mp_defconfig                      |    1 
 arch/sh/configs/r7785rp_defconfig                      |    1 
 arch/sh/configs/rts7751r2d1_defconfig                  |    1 
 arch/sh/configs/rts7751r2dplus_defconfig               |    1 
 arch/sh/configs/se7206_defconfig                       |    1 
 arch/sh/configs/se7343_defconfig                       |    1 
 arch/sh/configs/se7619_defconfig                       |    1 
 arch/sh/configs/se7705_defconfig                       |    1 
 arch/sh/configs/se7750_defconfig                       |    1 
 arch/sh/configs/se7751_defconfig                       |    1 
 arch/sh/configs/secureedge5410_defconfig               |    1 
 arch/sh/configs/sh03_defconfig                         |    1 
 arch/sh/configs/sh7710voipgw_defconfig                 |    1 
 arch/sh/configs/sh7757lcr_defconfig                    |    1 
 arch/sh/configs/sh7763rdp_defconfig                    |    1 
 arch/sh/configs/shmin_defconfig                        |    1 
 arch/sh/configs/titan_defconfig                        |    1 
 arch/sh/include/asm/string_32.h                        |   26 --
 arch/sh/kernel/iomap.c                                 |   22 -
 arch/sh/kernel/syscalls/syscall.tbl                    |    3 
 arch/sparc/kernel/syscalls/syscall.tbl                 |    3 
 arch/x86/entry/syscalls/syscall_32.tbl                 |    3 
 arch/x86/entry/syscalls/syscall_64.tbl                 |    4 
 arch/xtensa/kernel/syscalls/syscall.tbl                |    3 
 drivers/mailbox/bcm-pdc-mailbox.c                      |    2 
 drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h |    6 
 drivers/ntb/hw/intel/ntb_hw_gen1.c                     |    2 
 drivers/ntb/hw/intel/ntb_hw_gen3.h                     |    2 
 drivers/ntb/hw/intel/ntb_hw_intel.h                    |    2 
 drivers/nvdimm/btt.c                                   |    4 
 drivers/nvdimm/pmem.c                                  |    6 
 drivers/sh/clk/cpg.c                                   |   25 --
 drivers/virtio/virtio_pci_modern.c                     |    6 
 fs/autofs/dev-ioctl.c                                  |    4 
 fs/io_uring.c                                          |    2 
 fs/namei.c                                             |    4 
 include/asm-generic/iomap.h                            |   28 +-
 include/asm-generic/pgalloc.h                          |    2 
 include/asm-generic/vmlinux.lds.h                      |    1 
 include/linux/compat.h                                 |    5 
 include/linux/huge_mm.h                                |   58 ++++-
 include/linux/io-64-nonatomic-hi-lo.h                  |    4 
 include/linux/io-64-nonatomic-lo-hi.h                  |    4 
 include/linux/memcontrol.h                             |    2 
 include/linux/mm.h                                     |   16 -
 include/linux/mm_inline.h                              |    6 
 include/linux/mm_types.h                               |    1 
 include/linux/pagemap.h                                |    6 
 include/linux/pid.h                                    |    1 
 include/linux/syscalls.h                               |    4 
 include/linux/sysctl.h                                 |    6 
 include/uapi/asm-generic/unistd.h                      |    4 
 kernel/Makefile                                        |    2 
 kernel/exit.c                                          |   17 -
 kernel/pid.c                                           |   17 +
 kernel/sys_ni.c                                        |    3 
 kernel/sysctl_binary.c                                 |  171 --------------
 lib/iomap.c                                            |   30 +-
 lib/lz4/lz4_compress.c                                 |    4 
 lib/lz4/lz4_decompress.c                               |   18 -
 lib/lz4/lz4defs.h                                      |   10 
 lib/lz4/lz4hc_compress.c                               |    2 
 mm/compaction.c                                        |    2 
 mm/filemap.c                                           |   22 +
 mm/frontswap.c                                         |    8 
 mm/gup.c                                               |    2 
 mm/internal.h                                          |    4 
 mm/kmemleak.c                                          |    2 
 mm/list_lru.c                                          |    2 
 mm/madvise.c                                           |  190 ++++++++++++++--
 mm/memcontrol.c                                        |   10 
 mm/memory.c                                            |    4 
 mm/memory_hotplug.c                                    |    7 
 mm/mempolicy.c                                         |    2 
 mm/mempool.c                                           |    2 
 mm/migrate.c                                           |   18 -
 mm/mlock.c                                             |    9 
 mm/page_alloc.c                                        |    5 
 mm/page_counter.c                                      |   13 -
 mm/page_io.c                                           |   12 -
 mm/page_vma_mapped.c                                   |    6 
 mm/rmap.c                                              |   10 
 mm/swap.c                                              |   21 -
 mm/swap_state.c                                        |   10 
 mm/swapfile.c                                          |   33 +-
 mm/vmscan.c                                            |    6 
 mm/vmstat.c                                            |   12 -
 mm/workingset.c                                        |    6 
 tools/perf/arch/powerpc/entry/syscalls/syscall.tbl     |    2 
 tools/perf/arch/s390/entry/syscalls/syscall.tbl        |    2 
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl      |    2 
 tools/testing/selftests/exec/.gitignore                |    1 
 tools/testing/selftests/exec/Makefile                  |    5 
 tools/testing/selftests/exec/non-regular.c             |  196 +++++++++++++++++
 132 files changed, 815 insertions(+), 614 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-08-12  1:29 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-08-12  1:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- Most of the rest of MM

- various other subsystems


165 patches, based on 00e4db51259a5f936fec1424b884f029479d3981.

Subsystems affected by this patch series:

  mm/memcg
  mm/hugetlb
  mm/vmscan
  mm/proc
  mm/compaction
  mm/mempolicy
  mm/oom-kill
  mm/hugetlbfs
  mm/migration
  mm/thp
  mm/cma
  mm/util
  mm/memory-hotplug
  mm/cleanups
  mm/uaccess
  alpha
  misc
  sparse
  bitmap
  lib
  lz4
  bitops
  checkpatch
  autofs
  minix
  nilfs
  ufs
  fat
  signals
  kmod
  coredump
  exec
  kdump
  rapidio
  panic
  kcov
  kgdb
  ipc
  mm/migration
  mm/gup
  mm/pagemap

Subsystem: mm/memcg

    Roman Gushchin <guro@fb.com>:
    Patch series "mm: memcg accounting of percpu memory", v3:
      percpu: return number of released bytes from pcpu_free_area()
      mm: memcg/percpu: account percpu memory to memory cgroups
      mm: memcg/percpu: per-memcg percpu memory statistics
      mm: memcg: charge memcg percpu memory to the parent cgroup
      kselftests: cgroup: add perpcu memory accounting test

Subsystem: mm/hugetlb

    Muchun Song <songmuchun@bytedance.com>:
      mm/hugetlb: add mempolicy check in the reservation routine

Subsystem: mm/vmscan

    Joonsoo Kim <iamjoonsoo.kim@lge.com>:
    Patch series "workingset protection/detection on the anonymous LRU list", v7:
      mm/vmscan: make active/inactive ratio as 1:1 for anon lru
      mm/vmscan: protect the workingset on anonymous LRU
      mm/workingset: prepare the workingset detection infrastructure for anon LRU
      mm/swapcache: support to handle the shadow entries
      mm/swap: implement workingset detection for anonymous LRU
      mm/vmscan: restore active/inactive ratio for anonymous LRU

Subsystem: mm/proc

    Michal Koutný <mkoutny@suse.com>:
      /proc/PID/smaps: consistent whitespace output format

Subsystem: mm/compaction

    Nitin Gupta <nigupta@nvidia.com>:
      mm: proactive compaction
      mm: fix compile error due to COMPACTION_HPAGE_ORDER
      mm: use unsigned types for fragmentation score

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/compaction: correct the comments of compact_defer_shift

Subsystem: mm/mempolicy

    Krzysztof Kozlowski <krzk@kernel.org>:
      mm: mempolicy: fix kerneldoc of numa_map_to_online_node()

    Wenchao Hao <haowenchao22@gmail.com>:
      mm/mempolicy.c: check parameters first in kernel_get_mempolicy

    Yanfei Xu <yanfei.xu@windriver.com>:
      include/linux/mempolicy.h: fix typo

Subsystem: mm/oom-kill

    Yafang Shao <laoar.shao@gmail.com>:
      mm, oom: make the calculation of oom badness more accurate

    Michal Hocko <mhocko@suse.com>:
      doc, mm: sync up oom_score_adj documentation
      doc, mm: clarify /proc/<pid>/oom_score value range

    Yafang Shao <laoar.shao@gmail.com>:
      mm, oom: show process exiting information in __oom_kill_process()

Subsystem: mm/hugetlbfs

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlbfs: prevent filesystem stacking of hugetlbfs
      hugetlbfs: remove call to huge_pte_alloc without i_mmap_rwsem

Subsystem: mm/migration

    Ralph Campbell <rcampbell@nvidia.com>:
    Patch series "mm/migrate: optimize migrate_vma_setup() for holes":
      mm/migrate: optimize migrate_vma_setup() for holes
      mm/migrate: add migrate-shared test for migrate_vma_*()

Subsystem: mm/thp

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: thp: remove debug_cow switch

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/vmstat: add events for THP migration without split

Subsystem: mm/cma

    Jianqun Xu <jay.xu@rock-chips.com>:
      mm/cma.c: fix NULL pointer dereference when cma could not be activated

    Barry Song <song.bao.hua@hisilicon.com>:
    Patch series "mm: fix the names of general cma and hugetlb cma", v2:
      mm: cma: fix the name of CMA areas
      mm: hugetlb: fix the name of hugetlb CMA

    Mike Kravetz <mike.kravetz@oracle.com>:
      cma: don't quit at first error when activating reserved areas

Subsystem: mm/util

    Waiman Long <longman@redhat.com>:
      include/linux/sched/mm.h: optimize current_gfp_context()

    Krzysztof Kozlowski <krzk@kernel.org>:
      mm: mmu_notifier: fix and extend kerneldoc

Subsystem: mm/memory-hotplug

    Daniel Jordan <daniel.m.jordan@oracle.com>:
      x86/mm: use max memory block size on bare metal

    Jia He <justin.he@arm.com>:
      mm/memory_hotplug: introduce default dummy memory_add_physaddr_to_nid()
      mm/memory_hotplug: fix unpaired mem_hotplug_begin/done

    Charan Teja Reddy <charante@codeaurora.org>:
      mm, memory_hotplug: update pcp lists everytime onlining a memory block

Subsystem: mm/cleanups

    Randy Dunlap <rdunlap@infradead.org>:
      mm: drop duplicated words in <linux/pgtable.h>
      mm: drop duplicated words in <linux/mm.h>
      include/linux/highmem.h: fix duplicated words in a comment
      include/linux/frontswap.h:  drop duplicated word in a comment
      include/linux/memcontrol.h: drop duplicate word and fix spello

    Arvind Sankar <nivedita@alum.mit.edu>:
      sh/mm: drop unused MAX_PHYSADDR_BITS
      sparc: drop unused MAX_PHYSADDR_BITS

    Randy Dunlap <rdunlap@infradead.org>:
      mm/compaction.c: delete duplicated word
      mm/filemap.c: delete duplicated word
      mm/hmm.c: delete duplicated word
      mm/hugetlb.c: delete duplicated words
      mm/memcontrol.c: delete duplicated words
      mm/memory.c: delete duplicated words
      mm/migrate.c: delete duplicated word
      mm/nommu.c: delete duplicated words
      mm/page_alloc.c: delete or fix duplicated words
      mm/shmem.c: delete duplicated word
      mm/slab_common.c: delete duplicated word
      mm/usercopy.c: delete duplicated word
      mm/vmscan.c: delete or fix duplicated words
      mm/zpool.c: delete duplicated word and fix grammar
      mm/zsmalloc.c: fix duplicated words

Subsystem: mm/uaccess

    Christoph Hellwig <hch@lst.de>:
    Patch series "clean up address limit helpers", v2:
      syscalls: use uaccess_kernel in addr_limit_user_check
      nds32: use uaccess_kernel in show_regs
      riscv: include <asm/pgtable.h> in <asm/uaccess.h>
      uaccess: remove segment_eq
      uaccess: add force_uaccess_{begin,end} helpers
      exec: use force_uaccess_begin during exec and exit

Subsystem: alpha

    Luc Van Oostenryck <luc.vanoostenryck@gmail.com>:
      alpha: fix annotation of io{read,write}{16,32}be()

Subsystem: misc

    Randy Dunlap <rdunlap@infradead.org>:
      include/linux/compiler-clang.h: drop duplicated word in a comment
      include/linux/exportfs.h: drop duplicated word in a comment
      include/linux/async_tx.h: drop duplicated word in a comment
      include/linux/xz.h: drop duplicated word

    Christoph Hellwig <hch@lst.de>:
      kernel: add a kernel_wait helper

    Feng Tang <feng.tang@intel.com>:
      ./Makefile: add debug option to enable function aligned on 32 bytes

    Arvind Sankar <nivedita@alum.mit.edu>:
      kernel.h: remove duplicate include of asm/div64.h

    "Alexander A. Klimov" <grandmaster@al2klimov.de>:
      include/: replace HTTP links with HTTPS ones

    Matthew Wilcox <willy@infradead.org>:
      include/linux/poison.h: remove obsolete comment

Subsystem: sparse

    Luc Van Oostenryck <luc.vanoostenryck@gmail.com>:
      sparse: group the defines by functionality

Subsystem: bitmap

    Stefano Brivio <sbrivio@redhat.com>:
    Patch series "lib: Fix bitmap_cut() for overlaps, add test":
      lib/bitmap.c: fix bitmap_cut() for partial overlapping case
      lib/test_bitmap.c: add test for bitmap_cut()

Subsystem: lib

    Luc Van Oostenryck <luc.vanoostenryck@gmail.com>:
      lib/generic-radix-tree.c: remove unneeded __rcu

    Geert Uytterhoeven <geert@linux-m68k.org>:
      lib/test_bitops: do the full test during module init

    Wei Yongjun <weiyongjun1@huawei.com>:
      lib/test_lockup.c: make symbol 'test_works' static

    Tiezhu Yang <yangtiezhu@loongson.cn>:
      lib/Kconfig.debug: make TEST_LOCKUP depend on module
      lib/test_lockup.c: fix return value of test_lockup_init()

    "Alexander A. Klimov" <grandmaster@al2klimov.de>:
      lib/: replace HTTP links with HTTPS ones

    "Kars Mulder" <kerneldev@karsmulder.nl>:
      kstrto*: correct documentation references to simple_strto*()
      kstrto*: do not describe simple_strto*() as obsolete/replaced

Subsystem: lz4

    Nick Terrell <terrelln@fb.com>:
      lz4: fix kernel decompression speed

Subsystem: bitops

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      lib/test_bits.c: add tests of GENMASK

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: add test for possible misuse of IS_ENABLED() without CONFIG_
      checkpatch: add --fix option for ASSIGN_IN_IF

    Quentin Monnet <quentin@isovalent.com>:
      checkpatch: fix CONST_STRUCT when const_structs.checkpatch is missing

    Joe Perches <joe@perches.com>:
      checkpatch: add test for repeated words
      checkpatch: remove missing switch/case break test

Subsystem: autofs

    Randy Dunlap <rdunlap@infradead.org>:
      autofs: fix doubled word

Subsystem: minix

    Eric Biggers <ebiggers@google.com>:
    Patch series "fs/minix: fix syzbot bugs and set s_maxbytes":
      fs/minix: check return value of sb_getblk()
      fs/minix: don't allow getting deleted inodes
      fs/minix: reject too-large maximum file size
      fs/minix: set s_maxbytes correctly
      fs/minix: fix block limit check for V1 filesystems
      fs/minix: remove expected error message in block_to_path()

Subsystem: nilfs

    Eric Biggers <ebiggers@google.com>:
    Patch series "nilfs2 updates":
      nilfs2: only call unlock_new_inode() if I_NEW

    Joe Perches <joe@perches.com>:
      nilfs2: convert __nilfs_msg to integrate the level and format
      nilfs2: use a more common logging style

Subsystem: ufs

    Colin Ian King <colin.king@canonical.com>:
      fs/ufs: avoid potential u32 multiplication overflow

Subsystem: fat

    Yubo Feng <fengyubo3@huawei.com>:
      fatfs: switch write_lock to read_lock in fat_ioctl_get_attributes

    "Alexander A. Klimov" <grandmaster@al2klimov.de>:
      VFAT/FAT/MSDOS FILESYSTEM: replace HTTP links with HTTPS ones

    OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>:
      fat: fix fat_ra_init() for data clusters == 0

Subsystem: signals

    Helge Deller <deller@gmx.de>:
      fs/signalfd.c: fix inconsistent return codes for signalfd4

Subsystem: kmod

    Tiezhu Yang <yangtiezhu@loongson.cn>:
    Patch series "kmod/umh: a few fixes":
      selftests: kmod: use variable NAME in kmod_test_0001()
      kmod: remove redundant "be an" in the comment
      test_kmod: avoid potential double free in trigger_config_run_type()

Subsystem: coredump

    Lepton Wu <ytht.net@gmail.com>:
      coredump: add %f for executable filename

Subsystem: exec

    Kees Cook <keescook@chromium.org>:
    Patch series "Relocate execve() sanity checks", v2:
      exec: change uselib(2) IS_SREG() failure to EACCES
      exec: move S_ISREG() check earlier
      exec: move path_noexec() check earlier

Subsystem: kdump

    Vijay Balakrishna <vijayb@linux.microsoft.com>:
      kdump: append kernel build-id string to VMCOREINFO

Subsystem: rapidio

    "Gustavo A. R. Silva" <gustavoars@kernel.org>:
      drivers/rapidio/devices/rio_mport_cdev.c: use struct_size() helper
      drivers/rapidio/rio-scan.c: use struct_size() helper
      rapidio/rio_mport_cdev: use array_size() helper in copy_{from,to}_user()

Subsystem: panic

    Tiezhu Yang <yangtiezhu@loongson.cn>:
      kernel/panic.c: make oops_may_print() return bool
      lib/Kconfig.debug: fix typo in the help text of CONFIG_PANIC_TIMEOUT

    Yue Hu <huyue2@yulong.com>:
      panic: make print_oops_end_marker() static

Subsystem: kcov

    Marco Elver <elver@google.com>:
      kcov: unconditionally add -fno-stack-protector to compiler options

    Wei Yongjun <weiyongjun1@huawei.com>:
      kcov: make some symbols static

Subsystem: kgdb

    Nick Desaulniers <ndesaulniers@google.com>:
      scripts/gdb: fix python 3.8 SyntaxWarning

Subsystem: ipc

    Alexey Dobriyan <adobriyan@gmail.com>:
      ipc: uninline functions

    Liao Pingfang <liao.pingfang@zte.com.cn>:
      ipc/shm.c: remove the superfluous break

Subsystem: mm/migration

    Joonsoo Kim <iamjoonsoo.kim@lge.com>:
    Patch series "clean-up the migration target allocation functions", v5:
      mm/page_isolation: prefer the node of the source page
      mm/migrate: move migration helper from .h to .c
      mm/hugetlb: unify migration callbacks
      mm/migrate: clear __GFP_RECLAIM to make the migration callback consistent with regular THP allocations
      mm/migrate: introduce a standard migration target allocation function
      mm/mempolicy: use a standard migration target allocation callback
      mm/page_alloc: remove a wrapper for alloc_migration_target()

Subsystem: mm/gup

    Joonsoo Kim <iamjoonsoo.kim@lge.com>:
      mm/gup: restrict CMA region by using allocation scope API
      mm/hugetlb: make hugetlb migration callback CMA aware
      mm/gup: use a standard migration target allocation callback

Subsystem: mm/pagemap

    Peter Xu <peterx@redhat.com>:
    Patch series "mm: Page fault accounting cleanups", v5:
      mm: do page fault accounting in handle_mm_fault
      mm/alpha: use general page fault accounting
      mm/arc: use general page fault accounting
      mm/arm: use general page fault accounting
      mm/arm64: use general page fault accounting
      mm/csky: use general page fault accounting
      mm/hexagon: use general page fault accounting
      mm/ia64: use general page fault accounting
      mm/m68k: use general page fault accounting
      mm/microblaze: use general page fault accounting
      mm/mips: use general page fault accounting
      mm/nds32: use general page fault accounting
      mm/nios2: use general page fault accounting
      mm/openrisc: use general page fault accounting
      mm/parisc: use general page fault accounting
      mm/powerpc: use general page fault accounting
      mm/riscv: use general page fault accounting
      mm/s390: use general page fault accounting
      mm/sh: use general page fault accounting
      mm/sparc32: use general page fault accounting
      mm/sparc64: use general page fault accounting
      mm/x86: use general page fault accounting
      mm/xtensa: use general page fault accounting
      mm: clean up the last pieces of page fault accountings
      mm/gup: remove task_struct pointer for all gup code

 Documentation/admin-guide/cgroup-v2.rst         |    4 
 Documentation/admin-guide/sysctl/kernel.rst     |    3 
 Documentation/admin-guide/sysctl/vm.rst         |   15 +
 Documentation/filesystems/proc.rst              |   11 -
 Documentation/vm/page_migration.rst             |   27 +++
 Makefile                                        |    4 
 arch/alpha/include/asm/io.h                     |    8 
 arch/alpha/include/asm/uaccess.h                |    2 
 arch/alpha/mm/fault.c                           |   10 -
 arch/arc/include/asm/segment.h                  |    3 
 arch/arc/kernel/process.c                       |    2 
 arch/arc/mm/fault.c                             |   20 --
 arch/arm/include/asm/uaccess.h                  |    4 
 arch/arm/kernel/signal.c                        |    2 
 arch/arm/mm/fault.c                             |   27 ---
 arch/arm64/include/asm/uaccess.h                |    2 
 arch/arm64/kernel/sdei.c                        |    2 
 arch/arm64/mm/fault.c                           |   31 ---
 arch/arm64/mm/numa.c                            |   10 -
 arch/csky/include/asm/segment.h                 |    2 
 arch/csky/mm/fault.c                            |   15 -
 arch/h8300/include/asm/segment.h                |    2 
 arch/hexagon/mm/vm_fault.c                      |   11 -
 arch/ia64/include/asm/uaccess.h                 |    2 
 arch/ia64/mm/fault.c                            |   11 -
 arch/ia64/mm/numa.c                             |    2 
 arch/m68k/include/asm/segment.h                 |    2 
 arch/m68k/include/asm/tlbflush.h                |    6 
 arch/m68k/mm/fault.c                            |   16 -
 arch/microblaze/include/asm/uaccess.h           |    2 
 arch/microblaze/mm/fault.c                      |   11 -
 arch/mips/include/asm/uaccess.h                 |    2 
 arch/mips/kernel/unaligned.c                    |   27 +--
 arch/mips/mm/fault.c                            |   16 -
 arch/nds32/include/asm/uaccess.h                |    2 
 arch/nds32/kernel/process.c                     |    2 
 arch/nds32/mm/alignment.c                       |    7 
 arch/nds32/mm/fault.c                           |   21 --
 arch/nios2/include/asm/uaccess.h                |    2 
 arch/nios2/mm/fault.c                           |   16 -
 arch/openrisc/include/asm/uaccess.h             |    2 
 arch/openrisc/mm/fault.c                        |   11 -
 arch/parisc/include/asm/uaccess.h               |    2 
 arch/parisc/mm/fault.c                          |   10 -
 arch/powerpc/include/asm/uaccess.h              |    3 
 arch/powerpc/mm/copro_fault.c                   |    7 
 arch/powerpc/mm/fault.c                         |   13 -
 arch/riscv/include/asm/uaccess.h                |    6 
 arch/riscv/mm/fault.c                           |   18 --
 arch/s390/include/asm/uaccess.h                 |    2 
 arch/s390/kvm/interrupt.c                       |    2 
 arch/s390/kvm/kvm-s390.c                        |    2 
 arch/s390/kvm/priv.c                            |    8 
 arch/s390/mm/fault.c                            |   18 --
 arch/s390/mm/gmap.c                             |    4 
 arch/sh/include/asm/segment.h                   |    3 
 arch/sh/include/asm/sparsemem.h                 |    4 
 arch/sh/kernel/traps_32.c                       |   12 -
 arch/sh/mm/fault.c                              |   13 -
 arch/sh/mm/init.c                               |    9 -
 arch/sparc/include/asm/sparsemem.h              |    1 
 arch/sparc/include/asm/uaccess_32.h             |    2 
 arch/sparc/include/asm/uaccess_64.h             |    2 
 arch/sparc/mm/fault_32.c                        |   15 -
 arch/sparc/mm/fault_64.c                        |   13 -
 arch/um/kernel/trap.c                           |    6 
 arch/x86/include/asm/uaccess.h                  |    2 
 arch/x86/mm/fault.c                             |   19 --
 arch/x86/mm/init_64.c                           |    9 +
 arch/x86/mm/numa.c                              |    1 
 arch/xtensa/include/asm/uaccess.h               |    2 
 arch/xtensa/mm/fault.c                          |   17 -
 drivers/firmware/arm_sdei.c                     |    5 
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c     |    2 
 drivers/infiniband/core/umem_odp.c              |    2 
 drivers/iommu/amd/iommu_v2.c                    |    2 
 drivers/iommu/intel/svm.c                       |    3 
 drivers/rapidio/devices/rio_mport_cdev.c        |    7 
 drivers/rapidio/rio-scan.c                      |    8 
 drivers/vfio/vfio_iommu_type1.c                 |    4 
 fs/coredump.c                                   |   17 +
 fs/exec.c                                       |   38 ++--
 fs/fat/Kconfig                                  |    2 
 fs/fat/fatent.c                                 |    3 
 fs/fat/file.c                                   |    4 
 fs/hugetlbfs/inode.c                            |    6 
 fs/minix/inode.c                                |   48 ++++-
 fs/minix/itree_common.c                         |    8 
 fs/minix/itree_v1.c                             |   16 -
 fs/minix/itree_v2.c                             |   15 -
 fs/minix/minix.h                                |    1 
 fs/namei.c                                      |   10 -
 fs/nilfs2/alloc.c                               |   38 ++--
 fs/nilfs2/btree.c                               |   42 ++--
 fs/nilfs2/cpfile.c                              |   10 -
 fs/nilfs2/dat.c                                 |   14 -
 fs/nilfs2/direct.c                              |   14 -
 fs/nilfs2/gcinode.c                             |    2 
 fs/nilfs2/ifile.c                               |    4 
 fs/nilfs2/inode.c                               |   32 +--
 fs/nilfs2/ioctl.c                               |   37 ++--
 fs/nilfs2/mdt.c                                 |    2 
 fs/nilfs2/namei.c                               |    6 
 fs/nilfs2/nilfs.h                               |   18 +-
 fs/nilfs2/page.c                                |   11 -
 fs/nilfs2/recovery.c                            |   32 +--
 fs/nilfs2/segbuf.c                              |    2 
 fs/nilfs2/segment.c                             |   38 ++--
 fs/nilfs2/sufile.c                              |   29 +--
 fs/nilfs2/super.c                               |   73 ++++----
 fs/nilfs2/sysfs.c                               |   29 +--
 fs/nilfs2/the_nilfs.c                           |   85 ++++-----
 fs/open.c                                       |    6 
 fs/proc/base.c                                  |   11 +
 fs/proc/task_mmu.c                              |    4 
 fs/signalfd.c                                   |   10 -
 fs/ufs/super.c                                  |    2 
 include/asm-generic/uaccess.h                   |    4 
 include/clocksource/timer-ti-dm.h               |    2 
 include/linux/async_tx.h                        |    2 
 include/linux/btree.h                           |    2 
 include/linux/compaction.h                      |    6 
 include/linux/compiler-clang.h                  |    2 
 include/linux/compiler_types.h                  |   44 ++---
 include/linux/crash_core.h                      |    6 
 include/linux/delay.h                           |    2 
 include/linux/dma/k3-psil.h                     |    2 
 include/linux/dma/k3-udma-glue.h                |    2 
 include/linux/dma/ti-cppi5.h                    |    2 
 include/linux/exportfs.h                        |    2 
 include/linux/frontswap.h                       |    2 
 include/linux/fs.h                              |   10 +
 include/linux/generic-radix-tree.h              |    2 
 include/linux/highmem.h                         |    2 
 include/linux/huge_mm.h                         |    7 
 include/linux/hugetlb.h                         |   53 ++++--
 include/linux/irqchip/irq-omap-intc.h           |    2 
 include/linux/jhash.h                           |    2 
 include/linux/kernel.h                          |   12 -
 include/linux/leds-ti-lmu-common.h              |    2 
 include/linux/memcontrol.h                      |   12 +
 include/linux/mempolicy.h                       |   18 +-
 include/linux/migrate.h                         |   42 +---
 include/linux/mm.h                              |   20 +-
 include/linux/mmzone.h                          |   17 +
 include/linux/oom.h                             |    4 
 include/linux/pgtable.h                         |   12 -
 include/linux/platform_data/davinci-cpufreq.h   |    2 
 include/linux/platform_data/davinci_asp.h       |    2 
 include/linux/platform_data/elm.h               |    2 
 include/linux/platform_data/gpio-davinci.h      |    2 
 include/linux/platform_data/gpmc-omap.h         |    2 
 include/linux/platform_data/mtd-davinci-aemif.h |    2 
 include/linux/platform_data/omap-twl4030.h      |    2 
 include/linux/platform_data/uio_pruss.h         |    2 
 include/linux/platform_data/usb-omap.h          |    2 
 include/linux/poison.h                          |    4 
 include/linux/sched/mm.h                        |    8 
 include/linux/sched/task.h                      |    1 
 include/linux/soc/ti/k3-ringacc.h               |    2 
 include/linux/soc/ti/knav_qmss.h                |    2 
 include/linux/soc/ti/ti-msgmgr.h                |    2 
 include/linux/swap.h                            |   25 ++
 include/linux/syscalls.h                        |    2 
 include/linux/uaccess.h                         |   20 ++
 include/linux/vm_event_item.h                   |    3 
 include/linux/wkup_m3_ipc.h                     |    2 
 include/linux/xxhash.h                          |    2 
 include/linux/xz.h                              |    4 
 include/linux/zlib.h                            |    2 
 include/soc/arc/aux.h                           |    2 
 include/trace/events/migrate.h                  |   17 +
 include/uapi/linux/auto_dev-ioctl.h             |    2 
 include/uapi/linux/elf.h                        |    2 
 include/uapi/linux/map_to_7segment.h            |    2 
 include/uapi/linux/types.h                      |    2 
 include/uapi/linux/usb/ch9.h                    |    2 
 ipc/sem.c                                       |    3 
 ipc/shm.c                                       |    4 
 kernel/Makefile                                 |    2 
 kernel/crash_core.c                             |   50 +++++
 kernel/events/callchain.c                       |    5 
 kernel/events/core.c                            |    5 
 kernel/events/uprobes.c                         |    8 
 kernel/exit.c                                   |   18 +-
 kernel/futex.c                                  |    2 
 kernel/kcov.c                                   |    6 
 kernel/kmod.c                                   |    5 
 kernel/kthread.c                                |    5 
 kernel/panic.c                                  |    4 
 kernel/stacktrace.c                             |    5 
 kernel/sysctl.c                                 |   11 +
 kernel/umh.c                                    |   29 ---
 lib/Kconfig.debug                               |   27 ++-
 lib/Makefile                                    |    1 
 lib/bitmap.c                                    |    4 
 lib/crc64.c                                     |    2 
 lib/decompress_bunzip2.c                        |    2 
 lib/decompress_unlzma.c                         |    6 
 lib/kstrtox.c                                   |   20 --
 lib/lz4/lz4_compress.c                          |    4 
 lib/lz4/lz4_decompress.c                        |   18 +-
 lib/lz4/lz4defs.h                               |   10 +
 lib/lz4/lz4hc_compress.c                        |    2 
 lib/math/rational.c                             |    2 
 lib/rbtree.c                                    |    2 
 lib/test_bitmap.c                               |   58 ++++++
 lib/test_bitops.c                               |   18 +-
 lib/test_bits.c                                 |   75 ++++++++
 lib/test_kmod.c                                 |    2 
 lib/test_lockup.c                               |    6 
 lib/ts_bm.c                                     |    2 
 lib/xxhash.c                                    |    2 
 lib/xz/xz_crc32.c                               |    2 
 lib/xz/xz_dec_bcj.c                             |    2 
 lib/xz/xz_dec_lzma2.c                           |    2 
 lib/xz/xz_lzma2.h                               |    2 
 lib/xz/xz_stream.h                              |    2 
 mm/cma.c                                        |   40 +---
 mm/cma.h                                        |    4 
 mm/compaction.c                                 |  207 +++++++++++++++++++++--
 mm/filemap.c                                    |    2 
 mm/gup.c                                        |  195 ++++++----------------
 mm/hmm.c                                        |    5 
 mm/huge_memory.c                                |   23 --
 mm/hugetlb.c                                    |   93 ++++------
 mm/internal.h                                   |    9 -
 mm/khugepaged.c                                 |    2 
 mm/ksm.c                                        |    3 
 mm/maccess.c                                    |   22 +-
 mm/memcontrol.c                                 |   42 +++-
 mm/memory-failure.c                             |    7 
 mm/memory.c                                     |  107 +++++++++---
 mm/memory_hotplug.c                             |   30 ++-
 mm/mempolicy.c                                  |   49 +----
 mm/migrate.c                                    |  151 ++++++++++++++---
 mm/mmu_notifier.c                               |    9 -
 mm/nommu.c                                      |    4 
 mm/oom_kill.c                                   |   24 +-
 mm/page_alloc.c                                 |   14 +
 mm/page_isolation.c                             |   21 --
 mm/percpu-internal.h                            |   55 ++++++
 mm/percpu-km.c                                  |    5 
 mm/percpu-stats.c                               |   36 ++--
 mm/percpu-vm.c                                  |    5 
 mm/percpu.c                                     |  208 +++++++++++++++++++++---
 mm/process_vm_access.c                          |    2 
 mm/rmap.c                                       |    2 
 mm/shmem.c                                      |    5 
 mm/slab_common.c                                |    2 
 mm/swap.c                                       |   13 -
 mm/swap_state.c                                 |   80 +++++++--
 mm/swapfile.c                                   |    4 
 mm/usercopy.c                                   |    2 
 mm/userfaultfd.c                                |    2 
 mm/vmscan.c                                     |   36 ++--
 mm/vmstat.c                                     |   32 +++
 mm/workingset.c                                 |   23 +-
 mm/zpool.c                                      |    8 
 mm/zsmalloc.c                                   |    2 
 scripts/checkpatch.pl                           |  116 +++++++++----
 scripts/gdb/linux/rbtree.py                     |    4 
 security/tomoyo/domain.c                        |    2 
 tools/testing/selftests/cgroup/test_kmem.c      |   70 +++++++-
 tools/testing/selftests/kmod/kmod.sh            |    4 
 tools/testing/selftests/vm/hmm-tests.c          |   35 ++++
 virt/kvm/async_pf.c                             |    2 
 virt/kvm/kvm_main.c                             |    2 
 268 files changed, 2481 insertions(+), 1551 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-08-07  6:16 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-08-07  6:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- A few MM hotfixes

- kthread, tools, scripts, ntfs and ocfs2

- Some of MM



163 patches, based on d6efb3ac3e6c19ab722b28bdb9252bae0b9676b6.

Subsystems affected by this patch series:

  mm/pagemap
  mm/hofixes
  mm/pagealloc
  kthread
  tools
  scripts
  ntfs
  ocfs2
  mm/slab-generic
  mm/slab
  mm/slub
  mm/kcsan
  mm/debug
  mm/pagecache
  mm/gup
  mm/swap
  mm/shmem
  mm/memcg
  mm/pagemap
  mm/mremap
  mm/mincore
  mm/sparsemem
  mm/vmalloc
  mm/kasan
  mm/pagealloc
  mm/hugetlb
  mm/vmscan

Subsystem: mm/pagemap

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm/memory.c: avoid access flag update TLB flush for retried page fault

Subsystem: mm/hofixes

    Ralph Campbell <rcampbell@nvidia.com>:
      mm/migrate: fix migrate_pgmap_owner w/o CONFIG_MMU_NOTIFIER

Subsystem: mm/pagealloc

    David Hildenbrand <david@redhat.com>:
      mm/shuffle: don't move pages between zones and don't read garbage memmaps

Subsystem: kthread

    Peter Zijlstra <peterz@infradead.org>:
      mm: fix kthread_use_mm() vs TLB invalidate

    Ilias Stamatis <stamatis.iliass@gmail.com>:
      kthread: remove incorrect comment in kthread_create_on_cpu()

Subsystem: tools

    "Alexander A. Klimov" <grandmaster@al2klimov.de>:
      tools/: replace HTTP links with HTTPS ones

    Gaurav Singh <gaurav1086@gmail.com>:
      tools/testing/selftests/cgroup/cgroup_util.c: cg_read_strcmp: fix null pointer dereference

Subsystem: scripts

    Jialu Xu <xujialu@vimux.org>:
      scripts/tags.sh: collect compiled source precisely

    Nikolay Borisov <nborisov@suse.com>:
      scripts/bloat-o-meter: Support comparing library archives

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      scripts/decode_stacktrace.sh: skip missing symbols
      scripts/decode_stacktrace.sh: guess basepath if not specified
      scripts/decode_stacktrace.sh: guess path to modules
      scripts/decode_stacktrace.sh: guess path to vmlinux by release name

    Joe Perches <joe@perches.com>:
      const_structs.checkpatch: add regulator_ops

    Colin Ian King <colin.king@canonical.com>:
      scripts/spelling.txt: add more spellings to spelling.txt

Subsystem: ntfs

    Luca Stefani <luca.stefani.ge1@gmail.com>:
      ntfs: fix ntfs_test_inode and ntfs_init_locked_inode function type

Subsystem: ocfs2

    Gang He <ghe@suse.com>:
      ocfs2: fix remounting needed after setfacl command

    Randy Dunlap <rdunlap@infradead.org>:
      ocfs2: suballoc.h: delete a duplicated word

    Junxiao Bi <junxiao.bi@oracle.com>:
      ocfs2: change slot number type s16 to u16

    "Alexander A. Klimov" <grandmaster@al2klimov.de>:
      ocfs2: replace HTTP links with HTTPS ones

    Pavel Machek <pavel@ucw.cz>:
      ocfs2: fix unbalanced locking

Subsystem: mm/slab-generic

    Waiman Long <longman@redhat.com>:
      mm, treewide: rename kzfree() to kfree_sensitive()

    William Kucharski <william.kucharski@oracle.com>:
      mm: ksize() should silently accept a NULL pointer

Subsystem: mm/slab

    Kees Cook <keescook@chromium.org>:
    Patch series "mm: Expand CONFIG_SLAB_FREELIST_HARDENED to include SLAB":
      mm/slab: expand CONFIG_SLAB_FREELIST_HARDENED to include SLAB
      mm/slab: add naive detection of double free

    Long Li <lonuxli.64@gmail.com>:
      mm, slab: check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order

    Xiao Yang <yangx.jy@cn.fujitsu.com>:
      mm/slab.c: update outdated kmem_list3 in a comment

Subsystem: mm/slub

    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "slub_debug fixes and improvements":
      mm, slub: extend slub_debug syntax for multiple blocks
      mm, slub: make some slub_debug related attributes read-only
      mm, slub: remove runtime allocation order changes
      mm, slub: make remaining slub_debug related attributes read-only
      mm, slub: make reclaim_account attribute read-only
      mm, slub: introduce static key for slub_debug()
      mm, slub: introduce kmem_cache_debug_flags()
      mm, slub: extend checks guarded by slub_debug static key
      mm, slab/slub: move and improve cache_from_obj()
      mm, slab/slub: improve error reporting and overhead of cache_from_obj()

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm/slub.c: drop lockdep_assert_held() from put_map()

Subsystem: mm/kcsan

    Marco Elver <elver@google.com>:
      mm, kcsan: instrument SLAB/SLUB free with "ASSERT_EXCLUSIVE_ACCESS"

Subsystem: mm/debug

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm/debug_vm_pgtable: Add some more tests", v5:
      mm/debug_vm_pgtable: add tests validating arch helpers for core MM features
      mm/debug_vm_pgtable: add tests validating advanced arch page table helpers
      mm/debug_vm_pgtable: add debug prints for individual tests
      Documentation/mm: add descriptions for arch page table helpers

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Improvements for dump_page()", v2:
      mm/debug: handle page->mapping better in dump_page
      mm/debug: dump compound page information on a second line
      mm/debug: print head flags in dump_page
      mm/debug: switch dump_page to get_kernel_nofault
      mm/debug: print the inode number in dump_page
      mm/debug: print hashed address of struct page

    John Hubbard <jhubbard@nvidia.com>:
      mm, dump_page: do not crash with bad compound_mapcount()

Subsystem: mm/pagecache

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: filemap: clear idle flag for writes
      mm: filemap: add missing FGP_ flags in kerneldoc comment for pagecache_get_page

Subsystem: mm/gup

    Tang Yizhou <tangyizhou@huawei.com>:
      mm/gup.c: fix the comment of return value for populate_vma_page_range()

Subsystem: mm/swap

    Zhen Lei <thunder.leizhen@huawei.com>:
    Patch series "clean up some functions in mm/swap_slots.c":
      mm/swap_slots.c: simplify alloc_swap_slot_cache()
      mm/swap_slots.c: simplify enable_swap_slots_cache()
      mm/swap_slots.c: remove redundant check for swap_slot_cache_initialized

    Krzysztof Kozlowski <krzk@kernel.org>:
      mm: swap: fix kerneldoc of swap_vma_readahead()

    Xianting Tian <xianting_tian@126.com>:
      mm/page_io.c: use blk_io_schedule() for avoiding task hung in sync io

Subsystem: mm/shmem

    Chris Down <chris@chrisdown.name>:
    Patch series "tmpfs: inode: Reduce risk of inum overflow", v7:
      tmpfs: per-superblock i_ino support
      tmpfs: support 64-bit inums per-sb

Subsystem: mm/memcg

    Roman Gushchin <guro@fb.com>:
      mm: kmem: make memcg_kmem_enabled() irreversible
    Patch series "The new cgroup slab memory controller", v7:
      mm: memcg: factor out memcg- and lruvec-level changes out of __mod_lruvec_state()
      mm: memcg: prepare for byte-sized vmstat items
      mm: memcg: convert vmstat slab counters to bytes
      mm: slub: implement SLUB version of obj_to_index()

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: decouple reference counting from page accounting

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: obj_cgroup API
      mm: memcg/slab: allocate obj_cgroups for non-root slab pages
      mm: memcg/slab: save obj_cgroup for non-root slab objects
      mm: memcg/slab: charge individual slab objects instead of pages
      mm: memcg/slab: deprecate memory.kmem.slabinfo
      mm: memcg/slab: move memcg_kmem_bypass() to memcontrol.h
      mm: memcg/slab: use a single set of kmem_caches for all accounted allocations
      mm: memcg/slab: simplify memcg cache creation
      mm: memcg/slab: remove memcg_kmem_get_cache()
      mm: memcg/slab: deprecate slab_root_caches
      mm: memcg/slab: remove redundant check in memcg_accumulate_slabinfo()
      mm: memcg/slab: use a single set of kmem_caches for all allocations
      kselftests: cgroup: add kernel memory accounting tests
      tools/cgroup: add memcg_slabinfo.py tool

    Shakeel Butt <shakeelb@google.com>:
      mm: memcontrol: account kernel stack per node

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: remove unused argument by charge_slab_page()
      mm: slab: rename (un)charge_slab_page() to (un)account_slab_page()
      mm: kmem: switch to static_branch_likely() in memcg_kmem_enabled()
      mm: memcontrol: avoid workload stalls when lowering memory.high

    Chris Down <chris@chrisdown.name>:
    Patch series "mm, memcg: reclaim harder before high throttling", v2:
      mm, memcg: reclaim more aggressively before high allocator throttling
      mm, memcg: unify reclaim retry limits with page allocator

    Yafang Shao <laoar.shao@gmail.com>:
    Patch series "mm, memcg: memory.{low,min} reclaim fix & cleanup", v4:
      mm, memcg: avoid stale protection values when cgroup is above protection

    Chris Down <chris@chrisdown.name>:
      mm, memcg: decouple e{low,min} state mutations from protection checks

    Yafang Shao <laoar.shao@gmail.com>:
      memcg, oom: check memcg margin for parallel oom

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: restore proper dirty throttling when memory.high changes
      mm: memcontrol: don't count limit-setting reclaim as memory pressure

    Michal Koutný <mkoutny@suse.com>:
      mm/page_counter.c: fix protection usage propagation

Subsystem: mm/pagemap

    Ralph Campbell <rcampbell@nvidia.com>:
      mm: remove redundant check non_swap_entry()

    Alex Zhang <zhangalex@google.com>:
      mm/memory.c: make remap_pfn_range() reject unaligned addr

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "mm: cleanup usage of <asm/pgalloc.h>":
      mm: remove unneeded includes of <asm/pgalloc.h>
      opeinrisc: switch to generic version of pte allocation
      xtensa: switch to generic version of pte allocation
      asm-generic: pgalloc: provide generic pmd_alloc_one() and pmd_free_one()
      asm-generic: pgalloc: provide generic pud_alloc_one() and pud_free_one()
      asm-generic: pgalloc: provide generic pgd_free()
      mm: move lib/ioremap.c to mm/

    Joerg Roedel <jroedel@suse.de>:
      mm: move p?d_alloc_track to separate header file

    Zhen Lei <thunder.leizhen@huawei.com>:
      mm/mmap: optimize a branch judgment in ksys_mmap_pgoff()

    Feng Tang <feng.tang@intel.com>:
    Patch series "make vm_committed_as_batch aware of vm overcommit policy", v6:
      proc/meminfo: avoid open coded reading of vm_committed_as
      mm/util.c: make vm_memory_committed() more accurate
      percpu_counter: add percpu_counter_sync()
      mm: adjust vm_committed_as_batch according to vm overcommit policy

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "arm64: Enable vmemmap mapping from device memory", v4:
      mm/sparsemem: enable vmem_altmap support in vmemmap_populate_basepages()
      mm/sparsemem: enable vmem_altmap support in vmemmap_alloc_block_buf()
      arm64/mm: enable vmem_altmap support for vmemmap mappings

    Miaohe Lin <linmiaohe@huawei.com>:
      mm: mmap: merge vma after call_mmap() if possible

    Peter Collingbourne <pcc@google.com>:
      mm: remove unnecessary wrapper function do_mmap_pgoff()

Subsystem: mm/mremap

    Wei Yang <richard.weiyang@linux.alibaba.com>:
    Patch series "mm/mremap: cleanup move_page_tables() a little", v5:
      mm/mremap: it is sure to have enough space when extent meets requirement
      mm/mremap: calculate extent in one place
      mm/mremap: start addresses are properly aligned

Subsystem: mm/mincore

    Ricardo Cañuelo <ricardo.canuelo@collabora.com>:
      selftests: add mincore() tests

Subsystem: mm/sparsemem

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      mm/sparse: never partially remove memmap for early section
      mm/sparse: only sub-section aligned range would be populated

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/sparse: cleanup the code surrounding memory_present()

Subsystem: mm/vmalloc

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      vmalloc: convert to XArray

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc: simplify merge_or_add_vmap_area()
      mm/vmalloc: simplify augment_tree_propagate_check()
      mm/vmalloc: switch to "propagate()" callback
      mm/vmalloc: update the header about KVA rework

    Mike Rapoport <rppt@linux.ibm.com>:
      mm: vmalloc: remove redundant assignment in unmap_kernel_range_noflush()

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc.c: remove BUG() from the find_va_links()

Subsystem: mm/kasan

    Marco Elver <elver@google.com>:
      kasan: improve and simplify Kconfig.kasan
      kasan: update required compiler versions in documentation

    Walter Wu <walter-zh.wu@mediatek.com>:
    Patch series "kasan: memorize and print call_rcu stack", v8:
      rcu: kasan: record and print call_rcu() call stack
      kasan: record and print the free track
      kasan: add tests for call_rcu stack recording
      kasan: update documentation for generic kasan

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
      kasan: remove kasan_unpoison_stack_above_sp_to()

    Walter Wu <walter-zh.wu@mediatek.com>:
      lib/test_kasan.c: fix KASAN unit tests for tag-based KASAN

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kasan: support stack instrumentation for tag-based mode", v2:
      kasan: don't tag stacks allocated with pagealloc
      efi: provide empty efi_enter_virtual_mode implementation
      kasan, arm64: don't instrument functions that enable kasan
      kasan: allow enabling stack tagging for tag-based mode
      kasan: adjust kasan_stack_oob for tag-based mode

Subsystem: mm/pagealloc

    Vlastimil Babka <vbabka@suse.cz>:
      mm, page_alloc: use unlikely() in task_capc()

    Jaewon Kim <jaewon31.kim@samsung.com>:
      page_alloc: consider highatomic reserve in watermark fast

    Charan Teja Reddy <charante@codeaurora.org>:
      mm, page_alloc: skip ->waternark_boost for atomic order-0 allocations

    David Hildenbrand <david@redhat.com>:
      mm: remove vm_total_pages
      mm/page_alloc: remove nr_free_pagecache_pages()
      mm/memory_hotplug: document why shuffle_zone() is relevant
      mm/shuffle: remove dynamic reconfiguration

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      mm/page_alloc.c: replace the definition of NR_MIGRATETYPE_BITS with PB_migratetype_bits
      mm/page_alloc.c: extract the common part in pfn_to_bitidx()
      mm/page_alloc.c: simplify pageblock bitmap access
      mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()

    Qian Cai <cai@lca.pw>:
      mm/page_alloc: silence a KASAN false positive

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      mm/page_alloc: fallbacks at most has 3 elements

    Muchun Song <songmuchun@bytedance.com>:
      mm/page_alloc.c: skip setting nodemask when we are in interrupt

    Joonsoo Kim <iamjoonsoo.kim@lge.com>:
      mm/page_alloc: fix memalloc_nocma_{save/restore} APIs

Subsystem: mm/hugetlb

    "Alexander A. Klimov" <grandmaster@al2klimov.de>:
      mm: thp: replace HTTP links with HTTPS ones

    Peter Xu <peterx@redhat.com>:
      mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible

    Hugh Dickins <hughd@google.com>:
      khugepaged: collapse_pte_mapped_thp() flush the right range
      khugepaged: collapse_pte_mapped_thp() protect the pmd lock
      khugepaged: retract_page_tables() remember to test exit
      khugepaged: khugepaged_test_exit() check mmget_still_valid()

Subsystem: mm/vmscan

    dylan-meiners <spacct.spacct@gmail.com>:
      mm/vmscan.c: fix typo

    Shakeel Butt <shakeelb@google.com>:
      mm: vmscan: consistent update to pgrefill

 Documentation/admin-guide/kernel-parameters.txt        |    2 
 Documentation/dev-tools/kasan.rst                      |   10 
 Documentation/filesystems/dlmfs.rst                    |    2 
 Documentation/filesystems/ocfs2.rst                    |    2 
 Documentation/filesystems/tmpfs.rst                    |   18 
 Documentation/vm/arch_pgtable_helpers.rst              |  258 +++++
 Documentation/vm/memory-model.rst                      |    9 
 Documentation/vm/slub.rst                              |   51 -
 arch/alpha/include/asm/pgalloc.h                       |   21 
 arch/alpha/include/asm/tlbflush.h                      |    1 
 arch/alpha/kernel/core_irongate.c                      |    1 
 arch/alpha/kernel/core_marvel.c                        |    1 
 arch/alpha/kernel/core_titan.c                         |    1 
 arch/alpha/kernel/machvec_impl.h                       |    2 
 arch/alpha/kernel/smp.c                                |    1 
 arch/alpha/mm/numa.c                                   |    1 
 arch/arc/mm/fault.c                                    |    1 
 arch/arc/mm/init.c                                     |    1 
 arch/arm/include/asm/pgalloc.h                         |   12 
 arch/arm/include/asm/tlb.h                             |    1 
 arch/arm/kernel/machine_kexec.c                        |    1 
 arch/arm/kernel/smp.c                                  |    1 
 arch/arm/kernel/suspend.c                              |    1 
 arch/arm/mach-omap2/omap-mpuss-lowpower.c              |    1 
 arch/arm/mm/hugetlbpage.c                              |    1 
 arch/arm/mm/init.c                                     |    9 
 arch/arm/mm/mmu.c                                      |    1 
 arch/arm64/include/asm/pgalloc.h                       |   39 
 arch/arm64/kernel/setup.c                              |    2 
 arch/arm64/kernel/smp.c                                |    1 
 arch/arm64/mm/hugetlbpage.c                            |    1 
 arch/arm64/mm/init.c                                   |    6 
 arch/arm64/mm/ioremap.c                                |    1 
 arch/arm64/mm/mmu.c                                    |   63 -
 arch/csky/include/asm/pgalloc.h                        |    7 
 arch/csky/kernel/smp.c                                 |    1 
 arch/hexagon/include/asm/pgalloc.h                     |    7 
 arch/ia64/include/asm/pgalloc.h                        |   24 
 arch/ia64/include/asm/tlb.h                            |    1 
 arch/ia64/kernel/process.c                             |    1 
 arch/ia64/kernel/smp.c                                 |    1 
 arch/ia64/kernel/smpboot.c                             |    1 
 arch/ia64/mm/contig.c                                  |    1 
 arch/ia64/mm/discontig.c                               |    4 
 arch/ia64/mm/hugetlbpage.c                             |    1 
 arch/ia64/mm/tlb.c                                     |    1 
 arch/m68k/include/asm/mmu_context.h                    |    2 
 arch/m68k/include/asm/sun3_pgalloc.h                   |    7 
 arch/m68k/kernel/dma.c                                 |    2 
 arch/m68k/kernel/traps.c                               |    3 
 arch/m68k/mm/cache.c                                   |    2 
 arch/m68k/mm/fault.c                                   |    1 
 arch/m68k/mm/kmap.c                                    |    2 
 arch/m68k/mm/mcfmmu.c                                  |    1 
 arch/m68k/mm/memory.c                                  |    1 
 arch/m68k/sun3x/dvma.c                                 |    2 
 arch/microblaze/include/asm/pgalloc.h                  |    6 
 arch/microblaze/include/asm/tlbflush.h                 |    1 
 arch/microblaze/kernel/process.c                       |    1 
 arch/microblaze/kernel/signal.c                        |    1 
 arch/microblaze/mm/init.c                              |    3 
 arch/mips/include/asm/pgalloc.h                        |   19 
 arch/mips/kernel/setup.c                               |    8 
 arch/mips/loongson64/numa.c                            |    1 
 arch/mips/sgi-ip27/ip27-memory.c                       |    2 
 arch/mips/sgi-ip32/ip32-memory.c                       |    1 
 arch/nds32/mm/mm-nds32.c                               |    2 
 arch/nios2/include/asm/pgalloc.h                       |    7 
 arch/openrisc/include/asm/pgalloc.h                    |   33 
 arch/openrisc/include/asm/tlbflush.h                   |    1 
 arch/openrisc/kernel/or32_ksyms.c                      |    1 
 arch/parisc/include/asm/mmu_context.h                  |    1 
 arch/parisc/include/asm/pgalloc.h                      |   12 
 arch/parisc/kernel/cache.c                             |    1 
 arch/parisc/kernel/pci-dma.c                           |    1 
 arch/parisc/kernel/process.c                           |    1 
 arch/parisc/kernel/signal.c                            |    1 
 arch/parisc/kernel/smp.c                               |    1 
 arch/parisc/mm/hugetlbpage.c                           |    1 
 arch/parisc/mm/init.c                                  |    5 
 arch/parisc/mm/ioremap.c                               |    2 
 arch/powerpc/include/asm/tlb.h                         |    1 
 arch/powerpc/mm/book3s64/hash_hugetlbpage.c            |    1 
 arch/powerpc/mm/book3s64/hash_pgtable.c                |    1 
 arch/powerpc/mm/book3s64/hash_tlb.c                    |    1 
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c           |    1 
 arch/powerpc/mm/init_32.c                              |    1 
 arch/powerpc/mm/init_64.c                              |    4 
 arch/powerpc/mm/kasan/8xx.c                            |    1 
 arch/powerpc/mm/kasan/book3s_32.c                      |    1 
 arch/powerpc/mm/mem.c                                  |    3 
 arch/powerpc/mm/nohash/40x.c                           |    1 
 arch/powerpc/mm/nohash/8xx.c                           |    1 
 arch/powerpc/mm/nohash/fsl_booke.c                     |    1 
 arch/powerpc/mm/nohash/kaslr_booke.c                   |    1 
 arch/powerpc/mm/nohash/tlb.c                           |    1 
 arch/powerpc/mm/numa.c                                 |    1 
 arch/powerpc/mm/pgtable.c                              |    1 
 arch/powerpc/mm/pgtable_64.c                           |    1 
 arch/powerpc/mm/ptdump/hashpagetable.c                 |    2 
 arch/powerpc/mm/ptdump/ptdump.c                        |    1 
 arch/powerpc/platforms/pseries/cmm.c                   |    1 
 arch/riscv/include/asm/pgalloc.h                       |   18 
 arch/riscv/mm/fault.c                                  |    1 
 arch/riscv/mm/init.c                                   |    3 
 arch/s390/crypto/prng.c                                |    4 
 arch/s390/include/asm/tlb.h                            |    1 
 arch/s390/include/asm/tlbflush.h                       |    1 
 arch/s390/kernel/machine_kexec.c                       |    1 
 arch/s390/kernel/ptrace.c                              |    1 
 arch/s390/kvm/diag.c                                   |    1 
 arch/s390/kvm/priv.c                                   |    1 
 arch/s390/kvm/pv.c                                     |    1 
 arch/s390/mm/cmm.c                                     |    1 
 arch/s390/mm/init.c                                    |    1 
 arch/s390/mm/mmap.c                                    |    1 
 arch/s390/mm/pgtable.c                                 |    1 
 arch/sh/include/asm/pgalloc.h                          |    4 
 arch/sh/kernel/idle.c                                  |    1 
 arch/sh/kernel/machine_kexec.c                         |    1 
 arch/sh/mm/cache-sh3.c                                 |    1 
 arch/sh/mm/cache-sh7705.c                              |    1 
 arch/sh/mm/hugetlbpage.c                               |    1 
 arch/sh/mm/init.c                                      |    7 
 arch/sh/mm/ioremap_fixed.c                             |    1 
 arch/sh/mm/numa.c                                      |    3 
 arch/sh/mm/tlb-sh3.c                                   |    1 
 arch/sparc/include/asm/ide.h                           |    1 
 arch/sparc/include/asm/tlb_64.h                        |    1 
 arch/sparc/kernel/leon_smp.c                           |    1 
 arch/sparc/kernel/process_32.c                         |    1 
 arch/sparc/kernel/signal_32.c                          |    1 
 arch/sparc/kernel/smp_32.c                             |    1 
 arch/sparc/kernel/smp_64.c                             |    1 
 arch/sparc/kernel/sun4m_irq.c                          |    1 
 arch/sparc/mm/highmem.c                                |    1 
 arch/sparc/mm/init_64.c                                |    1 
 arch/sparc/mm/io-unit.c                                |    1 
 arch/sparc/mm/iommu.c                                  |    1 
 arch/sparc/mm/tlb.c                                    |    1 
 arch/um/include/asm/pgalloc.h                          |    9 
 arch/um/include/asm/pgtable-3level.h                   |    3 
 arch/um/kernel/mem.c                                   |   17 
 arch/x86/ia32/ia32_aout.c                              |    1 
 arch/x86/include/asm/mmu_context.h                     |    1 
 arch/x86/include/asm/pgalloc.h                         |   42 
 arch/x86/kernel/alternative.c                          |    1 
 arch/x86/kernel/apic/apic.c                            |    1 
 arch/x86/kernel/mpparse.c                              |    1 
 arch/x86/kernel/traps.c                                |    1 
 arch/x86/mm/fault.c                                    |    1 
 arch/x86/mm/hugetlbpage.c                              |    1 
 arch/x86/mm/init_32.c                                  |    2 
 arch/x86/mm/init_64.c                                  |   12 
 arch/x86/mm/kaslr.c                                    |    1 
 arch/x86/mm/pgtable_32.c                               |    1 
 arch/x86/mm/pti.c                                      |    1 
 arch/x86/platform/uv/bios_uv.c                         |    1 
 arch/x86/power/hibernate.c                             |    2 
 arch/xtensa/include/asm/pgalloc.h                      |   46 
 arch/xtensa/kernel/xtensa_ksyms.c                      |    1 
 arch/xtensa/mm/cache.c                                 |    1 
 arch/xtensa/mm/fault.c                                 |    1 
 crypto/adiantum.c                                      |    2 
 crypto/ahash.c                                         |    4 
 crypto/api.c                                           |    2 
 crypto/asymmetric_keys/verify_pefile.c                 |    4 
 crypto/deflate.c                                       |    2 
 crypto/drbg.c                                          |   10 
 crypto/ecc.c                                           |    8 
 crypto/ecdh.c                                          |    2 
 crypto/gcm.c                                           |    2 
 crypto/gf128mul.c                                      |    4 
 crypto/jitterentropy-kcapi.c                           |    2 
 crypto/rng.c                                           |    2 
 crypto/rsa-pkcs1pad.c                                  |    6 
 crypto/seqiv.c                                         |    2 
 crypto/shash.c                                         |    2 
 crypto/skcipher.c                                      |    2 
 crypto/testmgr.c                                       |    6 
 crypto/zstd.c                                          |    2 
 drivers/base/node.c                                    |   10 
 drivers/block/xen-blkback/common.h                     |    1 
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c    |    2 
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c    |    2 
 drivers/crypto/amlogic/amlogic-gxl-cipher.c            |    4 
 drivers/crypto/atmel-ecc.c                             |    2 
 drivers/crypto/caam/caampkc.c                          |   28 
 drivers/crypto/cavium/cpt/cptvf_main.c                 |    6 
 drivers/crypto/cavium/cpt/cptvf_reqmanager.c           |   12 
 drivers/crypto/cavium/nitrox/nitrox_lib.c              |    4 
 drivers/crypto/cavium/zip/zip_crypto.c                 |    6 
 drivers/crypto/ccp/ccp-crypto-rsa.c                    |    6 
 drivers/crypto/ccree/cc_aead.c                         |    4 
 drivers/crypto/ccree/cc_buffer_mgr.c                   |    4 
 drivers/crypto/ccree/cc_cipher.c                       |    6 
 drivers/crypto/ccree/cc_hash.c                         |    8 
 drivers/crypto/ccree/cc_request_mgr.c                  |    2 
 drivers/crypto/marvell/cesa/hash.c                     |    2 
 drivers/crypto/marvell/octeontx/otx_cptvf_main.c       |    6 
 drivers/crypto/marvell/octeontx/otx_cptvf_reqmgr.h     |    2 
 drivers/crypto/nx/nx.c                                 |    4 
 drivers/crypto/virtio/virtio_crypto_algs.c             |   12 
 drivers/crypto/virtio/virtio_crypto_core.c             |    2 
 drivers/iommu/ipmmu-vmsa.c                             |    1 
 drivers/md/dm-crypt.c                                  |   32 
 drivers/md/dm-integrity.c                              |    6 
 drivers/misc/ibmvmc.c                                  |    6 
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c |    2 
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c         |    6 
 drivers/net/ppp/ppp_mppe.c                             |    6 
 drivers/net/wireguard/noise.c                          |    4 
 drivers/net/wireguard/peer.c                           |    2 
 drivers/net/wireless/intel/iwlwifi/pcie/rx.c           |    2 
 drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c      |    6 
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c           |    6 
 drivers/net/wireless/intersil/orinoco/wext.c           |    4 
 drivers/s390/crypto/ap_bus.h                           |    4 
 drivers/staging/ks7010/ks_hostif.c                     |    2 
 drivers/staging/rtl8723bs/core/rtw_security.c          |    2 
 drivers/staging/wlan-ng/p80211netdev.c                 |    2 
 drivers/target/iscsi/iscsi_target_auth.c               |    2 
 drivers/xen/balloon.c                                  |    1 
 drivers/xen/privcmd.c                                  |    1 
 fs/Kconfig                                             |   21 
 fs/aio.c                                               |    6 
 fs/binfmt_elf_fdpic.c                                  |    1 
 fs/cifs/cifsencrypt.c                                  |    2 
 fs/cifs/connect.c                                      |   10 
 fs/cifs/dfs_cache.c                                    |    2 
 fs/cifs/misc.c                                         |    8 
 fs/crypto/inline_crypt.c                               |    5 
 fs/crypto/keyring.c                                    |    6 
 fs/crypto/keysetup_v1.c                                |    4 
 fs/ecryptfs/keystore.c                                 |    4 
 fs/ecryptfs/messaging.c                                |    2 
 fs/hugetlbfs/inode.c                                   |    2 
 fs/ntfs/dir.c                                          |    2 
 fs/ntfs/inode.c                                        |   27 
 fs/ntfs/inode.h                                        |    4 
 fs/ntfs/mft.c                                          |    4 
 fs/ocfs2/Kconfig                                       |    6 
 fs/ocfs2/acl.c                                         |    2 
 fs/ocfs2/blockcheck.c                                  |    2 
 fs/ocfs2/dlmglue.c                                     |    8 
 fs/ocfs2/ocfs2.h                                       |    4 
 fs/ocfs2/suballoc.c                                    |    4 
 fs/ocfs2/suballoc.h                                    |    2 
 fs/ocfs2/super.c                                       |    4 
 fs/proc/meminfo.c                                      |   10 
 include/asm-generic/pgalloc.h                          |   80 +
 include/asm-generic/tlb.h                              |    1 
 include/crypto/aead.h                                  |    2 
 include/crypto/akcipher.h                              |    2 
 include/crypto/gf128mul.h                              |    2 
 include/crypto/hash.h                                  |    2 
 include/crypto/internal/acompress.h                    |    2 
 include/crypto/kpp.h                                   |    2 
 include/crypto/skcipher.h                              |    2 
 include/linux/efi.h                                    |    4 
 include/linux/fs.h                                     |   17 
 include/linux/huge_mm.h                                |    2 
 include/linux/kasan.h                                  |    4 
 include/linux/memcontrol.h                             |  209 +++-
 include/linux/mm.h                                     |   86 -
 include/linux/mm_types.h                               |    5 
 include/linux/mman.h                                   |    4 
 include/linux/mmu_notifier.h                           |   13 
 include/linux/mmzone.h                                 |   54 -
 include/linux/pageblock-flags.h                        |   30 
 include/linux/percpu_counter.h                         |    4 
 include/linux/sched/mm.h                               |    8 
 include/linux/shmem_fs.h                               |    3 
 include/linux/slab.h                                   |   11 
 include/linux/slab_def.h                               |    9 
 include/linux/slub_def.h                               |   31 
 include/linux/swap.h                                   |    2 
 include/linux/vmstat.h                                 |   14 
 init/Kconfig                                           |    9 
 init/main.c                                            |    2 
 ipc/shm.c                                              |    2 
 kernel/fork.c                                          |   54 -
 kernel/kthread.c                                       |    8 
 kernel/power/snapshot.c                                |    2 
 kernel/rcu/tree.c                                      |    2 
 kernel/scs.c                                           |    2 
 kernel/sysctl.c                                        |    2 
 lib/Kconfig.kasan                                      |   39 
 lib/Makefile                                           |    1 
 lib/ioremap.c                                          |  287 -----
 lib/mpi/mpiutil.c                                      |    6 
 lib/percpu_counter.c                                   |   19 
 lib/test_kasan.c                                       |   87 +
 mm/Kconfig                                             |    6 
 mm/Makefile                                            |    2 
 mm/debug.c                                             |  103 +-
 mm/debug_vm_pgtable.c                                  |  666 +++++++++++++
 mm/filemap.c                                           |    9 
 mm/gup.c                                               |    3 
 mm/huge_memory.c                                       |   14 
 mm/hugetlb.c                                           |   25 
 mm/ioremap.c                                           |  289 +++++
 mm/kasan/common.c                                      |   41 
 mm/kasan/generic.c                                     |   43 
 mm/kasan/generic_report.c                              |    1 
 mm/kasan/kasan.h                                       |   25 
 mm/kasan/quarantine.c                                  |    1 
 mm/kasan/report.c                                      |   54 -
 mm/kasan/tags.c                                        |   37 
 mm/khugepaged.c                                        |   75 -
 mm/memcontrol.c                                        |  832 ++++++++++-------
 mm/memory.c                                            |   15 
 mm/memory_hotplug.c                                    |   11 
 mm/migrate.c                                           |    6 
 mm/mm_init.c                                           |   20 
 mm/mmap.c                                              |   45 
 mm/mremap.c                                            |   19 
 mm/nommu.c                                             |    6 
 mm/oom_kill.c                                          |    2 
 mm/page-writeback.c                                    |    6 
 mm/page_alloc.c                                        |  226 ++--
 mm/page_counter.c                                      |    6 
 mm/page_io.c                                           |    2 
 mm/pgalloc-track.h                                     |   51 +
 mm/shmem.c                                             |  133 ++
 mm/shuffle.c                                           |   46 
 mm/shuffle.h                                           |   17 
 mm/slab.c                                              |  129 +-
 mm/slab.h                                              |  755 ++++++---------
 mm/slab_common.c                                       |  829 ++--------------
 mm/slob.c                                              |   12 
 mm/slub.c                                              |  680 ++++---------
 mm/sparse-vmemmap.c                                    |   62 -
 mm/sparse.c                                            |   31 
 mm/swap_slots.c                                        |   45 
 mm/swap_state.c                                        |    2 
 mm/util.c                                              |   52 +
 mm/vmalloc.c                                           |  176 +--
 mm/vmscan.c                                            |   39 
 mm/vmstat.c                                            |   38 
 mm/workingset.c                                        |    6 
 net/atm/mpoa_caches.c                                  |    4 
 net/bluetooth/ecdh_helper.c                            |    6 
 net/bluetooth/smp.c                                    |   24 
 net/core/sock.c                                        |    2 
 net/ipv4/tcp_fastopen.c                                |    2 
 net/mac80211/aead_api.c                                |    4 
 net/mac80211/aes_gmac.c                                |    2 
 net/mac80211/key.c                                     |    2 
 net/mac802154/llsec.c                                  |   20 
 net/sctp/auth.c                                        |    2 
 net/sunrpc/auth_gss/gss_krb5_crypto.c                  |    4 
 net/sunrpc/auth_gss/gss_krb5_keys.c                    |    6 
 net/sunrpc/auth_gss/gss_krb5_mech.c                    |    2 
 net/tipc/crypto.c                                      |   10 
 net/wireless/core.c                                    |    2 
 net/wireless/ibss.c                                    |    4 
 net/wireless/lib80211_crypt_tkip.c                     |    2 
 net/wireless/lib80211_crypt_wep.c                      |    2 
 net/wireless/nl80211.c                                 |   24 
 net/wireless/sme.c                                     |    6 
 net/wireless/util.c                                    |    2 
 net/wireless/wext-sme.c                                |    2 
 scripts/Makefile.kasan                                 |    3 
 scripts/bloat-o-meter                                  |    2 
 scripts/coccinelle/free/devm_free.cocci                |    4 
 scripts/coccinelle/free/ifnullfree.cocci               |    4 
 scripts/coccinelle/free/kfree.cocci                    |    6 
 scripts/coccinelle/free/kfreeaddr.cocci                |    2 
 scripts/const_structs.checkpatch                       |    1 
 scripts/decode_stacktrace.sh                           |   85 +
 scripts/spelling.txt                                   |   19 
 scripts/tags.sh                                        |   18 
 security/apparmor/domain.c                             |    4 
 security/apparmor/include/file.h                       |    2 
 security/apparmor/policy.c                             |   24 
 security/apparmor/policy_ns.c                          |    6 
 security/apparmor/policy_unpack.c                      |   14 
 security/keys/big_key.c                                |    6 
 security/keys/dh.c                                     |   14 
 security/keys/encrypted-keys/encrypted.c               |   14 
 security/keys/trusted-keys/trusted_tpm1.c              |   34 
 security/keys/user_defined.c                           |    6 
 tools/cgroup/memcg_slabinfo.py                         |  226 ++++
 tools/include/linux/jhash.h                            |    2 
 tools/lib/rbtree.c                                     |    2 
 tools/lib/traceevent/event-parse.h                     |    2 
 tools/testing/ktest/examples/README                    |    2 
 tools/testing/ktest/examples/crosstests.conf           |    2 
 tools/testing/selftests/Makefile                       |    1 
 tools/testing/selftests/cgroup/.gitignore              |    1 
 tools/testing/selftests/cgroup/Makefile                |    2 
 tools/testing/selftests/cgroup/cgroup_util.c           |    2 
 tools/testing/selftests/cgroup/test_kmem.c             |  382 +++++++
 tools/testing/selftests/mincore/.gitignore             |    2 
 tools/testing/selftests/mincore/Makefile               |    6 
 tools/testing/selftests/mincore/mincore_selftest.c     |  361 +++++++
 397 files changed, 5547 insertions(+), 4072 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-07-24  4:14 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-07-24  4:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


15 patches, based on f37e99aca03f63aa3f2bd13ceaf769455d12c4b0.

Subsystems affected by this patch series:

  mm/pagemap
  mm/shmem
  mm/hotfixes
  mm/memcg
  mm/hugetlb
  mailmap
  squashfs
  scripts
  io-mapping
  MAINTAINERS
  gdb

Subsystem: mm/pagemap

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm/memory.c: avoid access flag update TLB flush for retried page fault

    "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>:
      mm/mmap.c: close race between munmap() and expand_upwards()/downwards()

Subsystem: mm/shmem

    Chengguang Xu <cgxu519@mykernel.net>:
      vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way

Subsystem: mm/hotfixes

    Tom Rix <trix@redhat.com>:
      mm: initialize return of vm_insert_pages

    Bhupesh Sharma <bhsharma@redhat.com>:
      mm/memcontrol: fix OOPS inside mem_cgroup_get_nr_swap_pages()

Subsystem: mm/memcg

    Hugh Dickins <hughd@google.com>:
      mm/memcg: fix refcount error while moving and swapping

    Muchun Song <songmuchun@bytedance.com>:
      mm: memcg/slab: fix memory leak at non-root kmem_cache destroy

Subsystem: mm/hugetlb

    Barry Song <song.bao.hua@hisilicon.com>:
      mm/hugetlb: avoid hardcoding while checking if cma is enabled

    "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>:
      khugepaged: fix null-pointer dereference due to race

Subsystem: mailmap

    Mike Rapoport <rppt@linux.ibm.com>:
      mailmap: add entry for Mike Rapoport

Subsystem: squashfs

    Phillip Lougher <phillip@squashfs.org.uk>:
      squashfs: fix length field overlap check in metadata reading

Subsystem: scripts

    Pi-Hsun Shih <pihsun@chromium.org>:
      scripts/decode_stacktrace: strip basepath from all paths

Subsystem: io-mapping

    "Michael J. Ruhl" <michael.j.ruhl@intel.com>:
      io-mapping: indicate mapping failure

Subsystem: MAINTAINERS

    Andrey Konovalov <andreyknvl@google.com>:
      MAINTAINERS: add KCOV section

Subsystem: gdb

    Stefano Garzarella <sgarzare@redhat.com>:
      scripts/gdb: fix lx-symbols 'gdb.error' while loading modules

 .mailmap                     |    3 +++
 MAINTAINERS                  |   11 +++++++++++
 fs/squashfs/block.c          |    2 +-
 include/linux/io-mapping.h   |    5 ++++-
 include/linux/xattr.h        |    3 ++-
 mm/hugetlb.c                 |   15 ++++++++++-----
 mm/khugepaged.c              |    3 +++
 mm/memcontrol.c              |   13 ++++++++++---
 mm/memory.c                  |    9 +++++++--
 mm/mmap.c                    |   16 ++++++++++++++--
 mm/shmem.c                   |    2 +-
 mm/slab_common.c             |   35 ++++++++++++++++++++++++++++-------
 scripts/decode_stacktrace.sh |    4 ++--
 scripts/gdb/linux/symbols.py |    2 +-
 14 files changed, 97 insertions(+), 26 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-07-03 22:14 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-07-03 22:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

5 patches, based on cdd3bb54332f82295ed90cd0c09c78cd0c0ee822.

Subsystems affected by this patch series:

  mm/hugetlb
  samples
  mm/cma
  mm/vmalloc
  mm/pagealloc

Subsystem: mm/hugetlb

    Mike Kravetz <mike.kravetz@oracle.com>:
      mm/hugetlb.c: fix pages per hugetlb calculation

Subsystem: samples

    Kees Cook <keescook@chromium.org>:
      samples/vfs: avoid warning in statx override

Subsystem: mm/cma

    Barry Song <song.bao.hua@hisilicon.com>:
      mm/cma.c: use exact_nid true to fix possible per-numa cma leak

Subsystem: mm/vmalloc

    Christoph Hellwig <hch@lst.de>:
      vmalloc: fix the owner argument for the new __vmalloc_node_range callers

Subsystem: mm/pagealloc

    Joel Savitz <jsavitz@redhat.com>:
      mm/page_alloc: fix documentation error

 arch/arm64/kernel/probes/kprobes.c |    2 +-
 arch/x86/hyperv/hv_init.c          |    3 ++-
 kernel/module.c                    |    2 +-
 mm/cma.c                           |    4 ++--
 mm/hugetlb.c                       |    2 +-
 mm/page_alloc.c                    |    2 +-
 samples/vfs/test-statx.c           |    2 ++
 7 files changed, 10 insertions(+), 7 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-06-26 17:39   ` incoming Konstantin Ryabitsev
@ 2020-06-26 17:40     ` Konstantin Ryabitsev
  0 siblings, 0 replies; 485+ messages in thread
From: Konstantin Ryabitsev @ 2020-06-26 17:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Linux-MM, mm-commits

On Fri, 26 Jun 2020 at 13:39, Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
> > Konstantin, maybe mm-commits could be on lore too and then they'd have
> > been caught that way?
>
> Yes, I already have a request from Kees for linux-mm addition, so that
> should show up in archives before long.

correction: mm-commits, that is

-K


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-06-26  6:51 ` incoming Linus Torvalds
  2020-06-26  7:31   ` incoming Linus Torvalds
@ 2020-06-26 17:39   ` Konstantin Ryabitsev
  2020-06-26 17:40     ` incoming Konstantin Ryabitsev
  1 sibling, 1 reply; 485+ messages in thread
From: Konstantin Ryabitsev @ 2020-06-26 17:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Linux-MM, mm-commits

On Thu, Jun 25, 2020 at 11:51:06PM -0700, Linus Torvalds wrote:
> On Thu, Jun 25, 2020 at 8:28 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > 32 patches, based on 908f7d12d3ba51dfe0449b9723199b423f97ca9a.
> 
> You didn't cc lkml, so now none of the nice 'b4' automation seems to
> work for this series..
> 
> Yes, this cover-letter went to linux-mm (which is on lore), but the
> individual patches didn't.
> 
> Konstantin, maybe mm-commits could be on lore too and then they'd have
> been caught that way?

Yes, I already have a request from Kees for linux-mm addition, so that 
should show up in archives before long.

-K


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-06-26  6:51 ` incoming Linus Torvalds
@ 2020-06-26  7:31   ` Linus Torvalds
  2020-06-26 17:39   ` incoming Konstantin Ryabitsev
  1 sibling, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-06-26  7:31 UTC (permalink / raw)
  To: Andrew Morton, Konstantin Ryabitsev; +Cc: Linux-MM, mm-commits

On Thu, Jun 25, 2020 at 11:51 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> You didn't cc lkml, so now none of the nice 'b4' automation seems to
> work for this series..

Note that I've picked them up the old-fashioned way, so don't re-send them.

So more of a note for "please, next time..."

                Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-06-26  3:28 incoming Andrew Morton
@ 2020-06-26  6:51 ` Linus Torvalds
  2020-06-26  7:31   ` incoming Linus Torvalds
  2020-06-26 17:39   ` incoming Konstantin Ryabitsev
  0 siblings, 2 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-06-26  6:51 UTC (permalink / raw)
  To: Andrew Morton, Konstantin Ryabitsev; +Cc: Linux-MM, mm-commits

On Thu, Jun 25, 2020 at 8:28 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> 32 patches, based on 908f7d12d3ba51dfe0449b9723199b423f97ca9a.

You didn't cc lkml, so now none of the nice 'b4' automation seems to
work for this series..

Yes, this cover-letter went to linux-mm (which is on lore), but the
individual patches didn't.

Konstantin, maybe mm-commits could be on lore too and then they'd have
been caught that way?

                   Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-06-26  3:28 Andrew Morton
  2020-06-26  6:51 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-06-26  3:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

32 patches, based on 908f7d12d3ba51dfe0449b9723199b423f97ca9a.

Subsystems affected by this patch series:

  hotfixes
  mm/pagealloc
  kexec
  ocfs2
  lib
  misc
  mm/slab
  mm/slab
  mm/slub
  mm/swap
  mm/pagemap
  mm/vmalloc
  mm/memcg
  mm/gup
  mm/thp
  mm/vmscan
  x86
  mm/memory-hotplug
  MAINTAINERS

Subsystem: hotfixes

    Stafford Horne <shorne@gmail.com>:
      openrisc: fix boot oops when DEBUG_VM is enabled

    Michal Hocko <mhocko@suse.com>:
      mm: do_swap_page(): fix up the error code

Subsystem: mm/pagealloc

    Vlastimil Babka <vbabka@suse.cz>:
      mm, compaction: make capture control handling safe wrt interrupts

Subsystem: kexec

    Lianbo Jiang <lijiang@redhat.com>:
      kexec: do not verify the signature without the lockdown or mandatory signature

Subsystem: ocfs2

    Junxiao Bi <junxiao.bi@oracle.com>:
    Patch series "ocfs2: fix nfsd over ocfs2 issues", v2:
      ocfs2: avoid inode removal while nfsd is accessing it
      ocfs2: load global_inode_alloc
      ocfs2: fix panic on nfs server over ocfs2
      ocfs2: fix value of OCFS2_INVALID_SLOT

Subsystem: lib

    Randy Dunlap <rdunlap@infradead.org>:
      lib: fix test_hmm.c reference after free

Subsystem: misc

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      linux/bits.h: fix unsigned less than zero warnings

Subsystem: mm/slab

    Waiman Long <longman@redhat.com>:
      mm, slab: fix sign conversion problem in memcg_uncharge_slab()

Subsystem: mm/slab

    Waiman Long <longman@redhat.com>:
      mm/slab: use memzero_explicit() in kzfree()

Subsystem: mm/slub

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      slub: cure list_slab_objects() from double fix

Subsystem: mm/swap

    Hugh Dickins <hughd@google.com>:
      mm: fix swap cache node allocation mask

Subsystem: mm/pagemap

    Arjun Roy <arjunroy@google.com>:
      mm/memory.c: properly pte_offset_map_lock/unlock in vm_insert_pages()

    Christophe Leroy <christophe.leroy@csgroup.eu>:
      mm/debug_vm_pgtable: fix build failure with powerpc 8xx

    Stephen Rothwell <sfr@canb.auug.org.au>:
      make asm-generic/cacheflush.h more standalone

    Nathan Chancellor <natechancellor@gmail.com>:
      media: omap3isp: remove cacheflush.h

Subsystem: mm/vmalloc

    Masanari Iida <standby24x7@gmail.com>:
      mm/vmalloc.c: fix a warning while make xmldocs

Subsystem: mm/memcg

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: handle div0 crash race condition in memory.low

    Muchun Song <songmuchun@bytedance.com>:
      mm/memcontrol.c: add missed css_put()

    Chris Down <chris@chrisdown.name>:
      mm/memcontrol.c: prevent missed memory.low load tears

Subsystem: mm/gup

    Souptick Joarder <jrdr.linux@gmail.com>:
      docs: mm/gup: minor documentation update

Subsystem: mm/thp

    Yang Shi <yang.shi@linux.alibaba.com>:
      doc: THP CoW fault no longer allocate THP

Subsystem: mm/vmscan

    Johannes Weiner <hannes@cmpxchg.org>:
    Patch series "fix for "mm: balance LRU lists based on relative thrashing" patchset":
      mm: workingset: age nonresident information alongside anonymous pages

    Joonsoo Kim <iamjoonsoo.kim@lge.com>:
      mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"
      mm/memory: fix IO cost for anonymous page

Subsystem: x86

    Christoph Hellwig <hch@lst.de>:
    Patch series "fix a hyperv W^X violation and remove vmalloc_exec":
      x86/hyperv: allocate the hypercall page with only read and execute bits
      arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page
      mm: remove vmalloc_exec

Subsystem: mm/memory-hotplug

    Ben Widawsky <ben.widawsky@intel.com>:
      mm/memory_hotplug.c: fix false softlockup during pfn range removal

Subsystem: MAINTAINERS

    Luc Van Oostenryck <luc.vanoostenryck@gmail.com>:
      MAINTAINERS: update info for sparse

 Documentation/admin-guide/cgroup-v2.rst    |    4 +-
 Documentation/admin-guide/mm/transhuge.rst |    3 -
 Documentation/core-api/pin_user_pages.rst  |    2 -
 MAINTAINERS                                |    4 +-
 arch/arm64/kernel/probes/kprobes.c         |   12 +------
 arch/openrisc/kernel/dma.c                 |    5 +++
 arch/x86/hyperv/hv_init.c                  |    4 +-
 arch/x86/include/asm/pgtable_types.h       |    2 +
 drivers/media/platform/omap3isp/isp.c      |    2 -
 drivers/media/platform/omap3isp/ispvideo.c |    1 
 fs/ocfs2/dlmglue.c                         |   17 ++++++++++
 fs/ocfs2/ocfs2.h                           |    1 
 fs/ocfs2/ocfs2_fs.h                        |    4 +-
 fs/ocfs2/suballoc.c                        |    9 +++--
 include/asm-generic/cacheflush.h           |    5 +++
 include/linux/bits.h                       |    3 +
 include/linux/mmzone.h                     |    4 +-
 include/linux/swap.h                       |    1 
 include/linux/vmalloc.h                    |    1 
 kernel/kexec_file.c                        |   36 ++++------------------
 kernel/module.c                            |    4 +-
 lib/test_hmm.c                             |    3 -
 mm/compaction.c                            |   17 ++++++++--
 mm/debug_vm_pgtable.c                      |    4 +-
 mm/memcontrol.c                            |   18 ++++++++---
 mm/memory.c                                |   33 +++++++++++++-------
 mm/memory_hotplug.c                        |   13 ++++++--
 mm/nommu.c                                 |   17 ----------
 mm/slab.h                                  |    4 +-
 mm/slab_common.c                           |    2 -
 mm/slub.c                                  |   19 ++---------
 mm/swap.c                                  |    3 -
 mm/swap_state.c                            |    4 +-
 mm/vmalloc.c                               |   21 -------------
 mm/vmscan.c                                |    3 +
 mm/workingset.c                            |   46 +++++++++++++++++------------
 36 files changed, 168 insertions(+), 163 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-06-12  0:30 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-06-12  0:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


A few fixes and stragglers.


5 patches, based on 623f6dc593eaf98b91916836785278eddddaacf8.

Subsystems affected by this patch series:

  mm/memory-failure
  ocfs2
  lib/lzo
  misc

Subsystem: mm/memory-failure

    Naoya Horiguchi <nao.horiguchi@gmail.com>:
    Patch series "hwpoison: fixes signaling on memory error":
      mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_kill
      mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread

Subsystem: ocfs2

    Tom Seewald <tseewald@gmail.com>:
      ocfs2: fix build failure when TCP/IP is disabled

Subsystem: lib/lzo

    Dave Rodgman <dave.rodgman@arm.com>:
      lib/lzo: fix ambiguous encoding bug in lzo-rle

Subsystem: misc

    Christoph Hellwig <hch@lst.de>:
      amdgpu: a NULL ->mm does not mean a thread is a kthread

 Documentation/lzo.txt                      |    8 ++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h |    2 -
 fs/ocfs2/Kconfig                           |    2 -
 lib/lzo/lzo1x_compress.c                   |   13 ++++++++
 mm/memory-failure.c                        |   43 +++++++++++++++++------------
 5 files changed, 47 insertions(+), 21 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-06-11  1:40 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-06-11  1:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- various hotfixes and minor things

- hch's use_mm/unuse_mm clearnups

- new syscall process_madvise(): perform madvise() on a process other
  than self

25 patches, based on 6f630784cc0d92fb58ea326e2bc01aa056279ecb.

Subsystems affected by this patch series:

  mm/hugetlb
  scripts
  kcov
  lib
  nilfs
  checkpatch
  lib
  mm/debug
  ocfs2
  lib
  misc
  mm/madvise

Subsystem: mm/hugetlb

    Dan Carpenter <dan.carpenter@oracle.com>:
      khugepaged: selftests: fix timeout condition in wait_for_scan()

Subsystem: scripts

    SeongJae Park <sjpark@amazon.de>:
      scripts/spelling: add a few more typos

Subsystem: kcov

    Andrey Konovalov <andreyknvl@google.com>:
      kcov: check kcov_softirq in kcov_remote_stop()

Subsystem: lib

    Joe Perches <joe@perches.com>:
      lib/lz4/lz4_decompress.c: document deliberate use of `&'

Subsystem: nilfs

    Ryusuke Konishi <konishi.ryusuke@gmail.com>:
      nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()

Subsystem: checkpatch

    Tim Froidcoeur <tim.froidcoeur@tessares.net>:
      checkpatch: correct check for kernel parameters doc

Subsystem: lib

    Alexander Gordeev <agordeev@linux.ibm.com>:
      lib: fix bitmap_parse() on 64-bit big endian archs

Subsystem: mm/debug

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      mm/debug_vm_pgtable: fix kernel crash by checking for THP support

Subsystem: ocfs2

    Keyur Patel <iamkeyur96@gmail.com>:
      ocfs2: fix spelling mistake and grammar

    Ben Widawsky <ben.widawsky@intel.com>:
      mm: add comments on pglist_data zones

Subsystem: lib

    Wei Yang <richard.weiyang@gmail.com>:
      lib: test get_count_order/long in test_bitops.c

Subsystem: misc

    Walter Wu <walter-zh.wu@mediatek.com>:
      stacktrace: cleanup inconsistent variable type

    Christoph Hellwig <hch@lst.de>:
    Patch series "improve use_mm / unuse_mm", v2:
      kernel: move use_mm/unuse_mm to kthread.c
      kernel: move use_mm/unuse_mm to kthread.c
      kernel: better document the use_mm/unuse_mm API contract
      kernel: set USER_DS in kthread_use_mm

Subsystem: mm/madvise

    Minchan Kim <minchan@kernel.org>:
    Patch series "introduce memory hinting API for external process", v7:
      mm/madvise: pass task and mm to do_madvise
      mm/madvise: introduce process_madvise() syscall: an external memory hinting API
      mm/madvise: check fatal signal pending of target process
      pid: move pidfd_get_pid() to pid.c
      mm/madvise: support both pid and pidfd for process_madvise

    Oleksandr Natalenko <oleksandr@redhat.com>:
      mm/madvise: allow KSM hints for remote API

    Minchan Kim <minchan@kernel.org>:
      mm: support vector address ranges for process_madvise
      mm: use only pidfd for process_madvise syscall

    YueHaibing <yuehaibing@huawei.com>:
      mm/madvise.c: remove duplicated include

 arch/alpha/kernel/syscalls/syscall.tbl              |    1 
 arch/arm/tools/syscall.tbl                          |    1 
 arch/arm64/include/asm/unistd.h                     |    2 
 arch/arm64/include/asm/unistd32.h                   |    4 
 arch/ia64/kernel/syscalls/syscall.tbl               |    1 
 arch/m68k/kernel/syscalls/syscall.tbl               |    1 
 arch/microblaze/kernel/syscalls/syscall.tbl         |    1 
 arch/mips/kernel/syscalls/syscall_n32.tbl           |    3 
 arch/mips/kernel/syscalls/syscall_n64.tbl           |    1 
 arch/mips/kernel/syscalls/syscall_o32.tbl           |    3 
 arch/parisc/kernel/syscalls/syscall.tbl             |    3 
 arch/powerpc/kernel/syscalls/syscall.tbl            |    3 
 arch/powerpc/platforms/powernv/vas-fault.c          |    4 
 arch/s390/kernel/syscalls/syscall.tbl               |    3 
 arch/sh/kernel/syscalls/syscall.tbl                 |    1 
 arch/sparc/kernel/syscalls/syscall.tbl              |    3 
 arch/x86/entry/syscalls/syscall_32.tbl              |    3 
 arch/x86/entry/syscalls/syscall_64.tbl              |    5 
 arch/xtensa/kernel/syscalls/syscall.tbl             |    1 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h          |    5 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c |    1 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c  |    1 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c   |    2 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c   |    2 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c   |    2 
 drivers/gpu/drm/i915/gvt/kvmgt.c                    |    2 
 drivers/usb/gadget/function/f_fs.c                  |   10 
 drivers/usb/gadget/legacy/inode.c                   |    6 
 drivers/vfio/vfio_iommu_type1.c                     |    6 
 drivers/vhost/vhost.c                               |    8 
 fs/aio.c                                            |    1 
 fs/io-wq.c                                          |   15 -
 fs/io_uring.c                                       |   11 
 fs/nilfs2/segment.c                                 |    2 
 fs/ocfs2/mmap.c                                     |    2 
 include/linux/compat.h                              |   10 
 include/linux/kthread.h                             |    9 
 include/linux/mm.h                                  |    3 
 include/linux/mmu_context.h                         |    5 
 include/linux/mmzone.h                              |   14 
 include/linux/pid.h                                 |    1 
 include/linux/stacktrace.h                          |    2 
 include/linux/syscalls.h                            |   16 -
 include/uapi/asm-generic/unistd.h                   |    7 
 kernel/exit.c                                       |   17 -
 kernel/kcov.c                                       |   26 +
 kernel/kthread.c                                    |   95 +++++-
 kernel/pid.c                                        |   17 +
 kernel/sys_ni.c                                     |    2 
 lib/Kconfig.debug                                   |   10 
 lib/bitmap.c                                        |    9 
 lib/lz4/lz4_decompress.c                            |    3 
 lib/test_bitops.c                                   |   53 +++
 mm/Makefile                                         |    2 
 mm/debug_vm_pgtable.c                               |    6 
 mm/madvise.c                                        |  295 ++++++++++++++------
 mm/mmu_context.c                                    |   64 ----
 mm/oom_kill.c                                       |    6 
 mm/vmacache.c                                       |    4 
 scripts/checkpatch.pl                               |    4 
 scripts/spelling.txt                                |    9 
 tools/testing/selftests/vm/khugepaged.c             |    2 
 62 files changed, 526 insertions(+), 285 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-06-09  4:29 incoming Andrew Morton
@ 2020-06-09 16:58 ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-06-09 16:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Mon, Jun 8, 2020 at 9:29 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
>  942 files changed, 4580 insertions(+), 5662 deletions(-)

If you use proper tools, add a "-M" to your diff script, so that you see

   941 files changed, 2614 insertions(+), 3696 deletions(-)

because a big portion of the lines were due to a rename:

   rename include/{asm-generic => linux}/pgtable.h (91%)

but at some earlier point you mentioned "diffstat", so I guess "proper
tools" isn't an option ;(

                 Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-06-09  4:29 Andrew Morton
  2020-06-09 16:58 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-06-09  4:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- a kernel-wide sweep of show_stack()

- pagetable cleanups

- abstract out accesses to mmap_sem - prep for mmap_sem scalability work

- hch's user acess work


93 patches, based on abfbb29297c27e3f101f348dc9e467b0fe70f919:

Subsystems affected by this patch series:

  debug
  mm/pagemap
  mm/maccess
  mm/documentation

Subsystem: debug

    Dmitry Safonov <dima@arista.com>:
    Patch series "Add log level to show_stack()", v3:
      kallsyms/printk: add loglvl to print_ip_sym()
      alpha: add show_stack_loglvl()
      arc: add show_stack_loglvl()
      arm/asm: add loglvl to c_backtrace()
      arm: add loglvl to unwind_backtrace()
      arm: add loglvl to dump_backtrace()
      arm: wire up dump_backtrace_{entry,stm}
      arm: add show_stack_loglvl()
      arm64: add loglvl to dump_backtrace()
      arm64: add show_stack_loglvl()
      c6x: add show_stack_loglvl()
      csky: add show_stack_loglvl()
      h8300: add show_stack_loglvl()
      hexagon: add show_stack_loglvl()
      ia64: pass log level as arg into ia64_do_show_stack()
      ia64: add show_stack_loglvl()
      m68k: add show_stack_loglvl()
      microblaze: add loglvl to microblaze_unwind_inner()
      microblaze: add loglvl to microblaze_unwind()
      microblaze: add show_stack_loglvl()
      mips: add show_stack_loglvl()
      nds32: add show_stack_loglvl()
      nios2: add show_stack_loglvl()
      openrisc: add show_stack_loglvl()
      parisc: add show_stack_loglvl()
      powerpc: add show_stack_loglvl()
      riscv: add show_stack_loglvl()
      s390: add show_stack_loglvl()
      sh: add loglvl to dump_mem()
      sh: remove needless printk()
      sh: add loglvl to printk_address()
      sh: add loglvl to show_trace()
      sh: add show_stack_loglvl()
      sparc: add show_stack_loglvl()
      um/sysrq: remove needless variable sp
      um: add show_stack_loglvl()
      unicore32: remove unused pmode argument in c_backtrace()
      unicore32: add loglvl to c_backtrace()
      unicore32: add show_stack_loglvl()
      x86: add missing const qualifiers for log_lvl
      x86: add show_stack_loglvl()
      xtensa: add loglvl to show_trace()
      xtensa: add show_stack_loglvl()
      sysrq: use show_stack_loglvl()
      x86/amd_gart: print stacktrace for a leak with KERN_ERR
      power: use show_stack_loglvl()
      kdb: don't play with console_loglevel
      sched: print stack trace with KERN_INFO
      kernel: use show_stack_loglvl()
      kernel: rename show_stack_loglvl() => show_stack()

Subsystem: mm/pagemap

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "mm: consolidate definitions of page table accessors", v2:
      mm: don't include asm/pgtable.h if linux/mm.h is already included
      mm: introduce include/linux/pgtable.h
      mm: reorder includes after introduction of linux/pgtable.h
      csky: replace definitions of __pXd_offset() with pXd_index()
      m68k/mm/motorola: move comment about page table allocation funcitons
      m68k/mm: move {cache,nocahe}_page() definitions close to their user
      x86/mm: simplify init_trampoline() and surrounding logic
      mm: pgtable: add shortcuts for accessing kernel PMD and PTE
      mm: consolidate pte_index() and pte_offset_*() definitions

    Michel Lespinasse <walken@google.com>:
      mmap locking API: initial implementation as rwsem wrappers
      MMU notifier: use the new mmap locking API
      DMA reservations: use the new mmap locking API
      mmap locking API: use coccinelle to convert mmap_sem rwsem call sites
      mmap locking API: convert mmap_sem call sites missed by coccinelle
      mmap locking API: convert nested write lock sites
      mmap locking API: add mmap_read_trylock_non_owner()
      mmap locking API: add MMAP_LOCK_INITIALIZER
      mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked()
      mmap locking API: rename mmap_sem to mmap_lock
      mmap locking API: convert mmap_sem API comments
      mmap locking API: convert mmap_sem comments

Subsystem: mm/maccess

    Christoph Hellwig <hch@lst.de>:
    Patch series "clean up and streamline probe_kernel_* and friends", v4:
      maccess: unexport probe_kernel_write()
      maccess: remove various unused weak aliases
      maccess: remove duplicate kerneldoc comments
      maccess: clarify kerneldoc comments
      maccess: update the top of file comment
      maccess: rename strncpy_from_unsafe_user to strncpy_from_user_nofault
      maccess: rename strncpy_from_unsafe_strict to strncpy_from_kernel_nofault
      maccess: rename strnlen_unsafe_user to strnlen_user_nofault
      maccess: remove probe_read_common and probe_write_common
      maccess: unify the probe kernel arch hooks
      bpf: factor out a bpf_trace_copy_string helper
      bpf: handle the compat string in bpf_trace_copy_string better

    Andrew Morton <akpm@linux-foundation.org>:
      bpf:bpf_seq_printf(): handle potentially unsafe format string better

    Christoph Hellwig <hch@lst.de>:
      bpf: rework the compat kernel probe handling
      tracing/kprobes: handle mixed kernel/userspace probes better
      maccess: remove strncpy_from_unsafe
      maccess: always use strict semantics for probe_kernel_read
      maccess: move user access routines together
      maccess: allow architectures to provide kernel probing directly
      x86: use non-set_fs based maccess routines
      maccess: return -ERANGE when probe_kernel_read() fails

Subsystem: mm/documentation

    Luis Chamberlain <mcgrof@kernel.org>:
      include/linux/cache.h: expand documentation over __read_mostly

 Documentation/admin-guide/mm/numa_memory_policy.rst   |   10 
 Documentation/admin-guide/mm/userfaultfd.rst          |    2 
 Documentation/filesystems/locking.rst                 |    2 
 Documentation/vm/hmm.rst                              |    6 
 Documentation/vm/transhuge.rst                        |    4 
 arch/alpha/boot/bootp.c                               |    1 
 arch/alpha/boot/bootpz.c                              |    1 
 arch/alpha/boot/main.c                                |    1 
 arch/alpha/include/asm/io.h                           |    1 
 arch/alpha/include/asm/pgtable.h                      |   16 
 arch/alpha/kernel/process.c                           |    1 
 arch/alpha/kernel/proto.h                             |    4 
 arch/alpha/kernel/ptrace.c                            |    1 
 arch/alpha/kernel/setup.c                             |    1 
 arch/alpha/kernel/smp.c                               |    1 
 arch/alpha/kernel/sys_alcor.c                         |    1 
 arch/alpha/kernel/sys_cabriolet.c                     |    1 
 arch/alpha/kernel/sys_dp264.c                         |    1 
 arch/alpha/kernel/sys_eb64p.c                         |    1 
 arch/alpha/kernel/sys_eiger.c                         |    1 
 arch/alpha/kernel/sys_jensen.c                        |    1 
 arch/alpha/kernel/sys_marvel.c                        |    1 
 arch/alpha/kernel/sys_miata.c                         |    1 
 arch/alpha/kernel/sys_mikasa.c                        |    1 
 arch/alpha/kernel/sys_nautilus.c                      |    1 
 arch/alpha/kernel/sys_noritake.c                      |    1 
 arch/alpha/kernel/sys_rawhide.c                       |    1 
 arch/alpha/kernel/sys_ruffian.c                       |    1 
 arch/alpha/kernel/sys_rx164.c                         |    1 
 arch/alpha/kernel/sys_sable.c                         |    1 
 arch/alpha/kernel/sys_sio.c                           |    1 
 arch/alpha/kernel/sys_sx164.c                         |    1 
 arch/alpha/kernel/sys_takara.c                        |    1 
 arch/alpha/kernel/sys_titan.c                         |    1 
 arch/alpha/kernel/sys_wildfire.c                      |    1 
 arch/alpha/kernel/traps.c                             |   40 
 arch/alpha/mm/fault.c                                 |   12 
 arch/alpha/mm/init.c                                  |    1 
 arch/arc/include/asm/bug.h                            |    3 
 arch/arc/include/asm/pgtable.h                        |   24 
 arch/arc/kernel/process.c                             |    4 
 arch/arc/kernel/stacktrace.c                          |   29 
 arch/arc/kernel/troubleshoot.c                        |    6 
 arch/arc/mm/fault.c                                   |    6 
 arch/arc/mm/highmem.c                                 |   14 
 arch/arc/mm/tlbex.S                                   |    4 
 arch/arm/include/asm/bug.h                            |    3 
 arch/arm/include/asm/efi.h                            |    3 
 arch/arm/include/asm/fixmap.h                         |    4 
 arch/arm/include/asm/idmap.h                          |    2 
 arch/arm/include/asm/pgtable-2level.h                 |    1 
 arch/arm/include/asm/pgtable-3level.h                 |    7 
 arch/arm/include/asm/pgtable-nommu.h                  |    3 
 arch/arm/include/asm/pgtable.h                        |   25 
 arch/arm/include/asm/traps.h                          |    3 
 arch/arm/include/asm/unwind.h                         |    3 
 arch/arm/kernel/head.S                                |    4 
 arch/arm/kernel/machine_kexec.c                       |    1 
 arch/arm/kernel/module.c                              |    1 
 arch/arm/kernel/process.c                             |    4 
 arch/arm/kernel/ptrace.c                              |    1 
 arch/arm/kernel/smp.c                                 |    1 
 arch/arm/kernel/suspend.c                             |    4 
 arch/arm/kernel/swp_emulate.c                         |    4 
 arch/arm/kernel/traps.c                               |   61 
 arch/arm/kernel/unwind.c                              |    7 
 arch/arm/kernel/vdso.c                                |    2 
 arch/arm/kernel/vmlinux.lds.S                         |    4 
 arch/arm/lib/backtrace-clang.S                        |    9 
 arch/arm/lib/backtrace.S                              |   14 
 arch/arm/lib/uaccess_with_memcpy.c                    |   16 
 arch/arm/mach-ebsa110/core.c                          |    1 
 arch/arm/mach-footbridge/common.c                     |    1 
 arch/arm/mach-imx/mm-imx21.c                          |    1 
 arch/arm/mach-imx/mm-imx27.c                          |    1 
 arch/arm/mach-imx/mm-imx3.c                           |    1 
 arch/arm/mach-integrator/core.c                       |    4 
 arch/arm/mach-iop32x/i2c.c                            |    1 
 arch/arm/mach-iop32x/iq31244.c                        |    1 
 arch/arm/mach-iop32x/iq80321.c                        |    1 
 arch/arm/mach-iop32x/n2100.c                          |    1 
 arch/arm/mach-ixp4xx/common.c                         |    1 
 arch/arm/mach-keystone/platsmp.c                      |    4 
 arch/arm/mach-sa1100/assabet.c                        |    3 
 arch/arm/mach-sa1100/hackkit.c                        |    4 
 arch/arm/mach-tegra/iomap.h                           |    2 
 arch/arm/mach-zynq/common.c                           |    4 
 arch/arm/mm/copypage-v4mc.c                           |    1 
 arch/arm/mm/copypage-v6.c                             |    1 
 arch/arm/mm/copypage-xscale.c                         |    1 
 arch/arm/mm/dump.c                                    |    1 
 arch/arm/mm/fault-armv.c                              |    1 
 arch/arm/mm/fault.c                                   |    9 
 arch/arm/mm/highmem.c                                 |    4 
 arch/arm/mm/idmap.c                                   |    4 
 arch/arm/mm/ioremap.c                                 |   31 
 arch/arm/mm/mm.h                                      |    8 
 arch/arm/mm/mmu.c                                     |    7 
 arch/arm/mm/pageattr.c                                |    1 
 arch/arm/mm/proc-arm1020.S                            |    4 
 arch/arm/mm/proc-arm1020e.S                           |    4 
 arch/arm/mm/proc-arm1022.S                            |    4 
 arch/arm/mm/proc-arm1026.S                            |    4 
 arch/arm/mm/proc-arm720.S                             |    4 
 arch/arm/mm/proc-arm740.S                             |    4 
 arch/arm/mm/proc-arm7tdmi.S                           |    4 
 arch/arm/mm/proc-arm920.S                             |    4 
 arch/arm/mm/proc-arm922.S                             |    4 
 arch/arm/mm/proc-arm925.S                             |    4 
 arch/arm/mm/proc-arm926.S                             |    4 
 arch/arm/mm/proc-arm940.S                             |    4 
 arch/arm/mm/proc-arm946.S                             |    4 
 arch/arm/mm/proc-arm9tdmi.S                           |    4 
 arch/arm/mm/proc-fa526.S                              |    4 
 arch/arm/mm/proc-feroceon.S                           |    4 
 arch/arm/mm/proc-mohawk.S                             |    4 
 arch/arm/mm/proc-sa110.S                              |    4 
 arch/arm/mm/proc-sa1100.S                             |    4 
 arch/arm/mm/proc-v6.S                                 |    4 
 arch/arm/mm/proc-v7.S                                 |    4 
 arch/arm/mm/proc-xsc3.S                               |    4 
 arch/arm/mm/proc-xscale.S                             |    4 
 arch/arm/mm/pv-fixup-asm.S                            |    4 
 arch/arm64/include/asm/io.h                           |    4 
 arch/arm64/include/asm/kernel-pgtable.h               |    2 
 arch/arm64/include/asm/kvm_mmu.h                      |    4 
 arch/arm64/include/asm/mmu_context.h                  |    4 
 arch/arm64/include/asm/pgtable.h                      |   40 
 arch/arm64/include/asm/stacktrace.h                   |    3 
 arch/arm64/include/asm/stage2_pgtable.h               |    2 
 arch/arm64/include/asm/vmap_stack.h                   |    4 
 arch/arm64/kernel/acpi.c                              |    4 
 arch/arm64/kernel/head.S                              |    4 
 arch/arm64/kernel/hibernate.c                         |    5 
 arch/arm64/kernel/kaslr.c                             |    4 
 arch/arm64/kernel/process.c                           |    2 
 arch/arm64/kernel/ptrace.c                            |    1 
 arch/arm64/kernel/smp.c                               |    1 
 arch/arm64/kernel/suspend.c                           |    4 
 arch/arm64/kernel/traps.c                             |   37 
 arch/arm64/kernel/vdso.c                              |    8 
 arch/arm64/kernel/vmlinux.lds.S                       |    3 
 arch/arm64/kvm/mmu.c                                  |   14 
 arch/arm64/mm/dump.c                                  |    1 
 arch/arm64/mm/fault.c                                 |    9 
 arch/arm64/mm/kasan_init.c                            |    3 
 arch/arm64/mm/mmu.c                                   |    8 
 arch/arm64/mm/pageattr.c                              |    1 
 arch/arm64/mm/proc.S                                  |    4 
 arch/c6x/include/asm/pgtable.h                        |    3 
 arch/c6x/kernel/traps.c                               |   28 
 arch/csky/include/asm/io.h                            |    2 
 arch/csky/include/asm/pgtable.h                       |   37 
 arch/csky/kernel/module.c                             |    1 
 arch/csky/kernel/ptrace.c                             |    5 
 arch/csky/kernel/stacktrace.c                         |   20 
 arch/csky/kernel/vdso.c                               |    4 
 arch/csky/mm/fault.c                                  |   10 
 arch/csky/mm/highmem.c                                |    2 
 arch/csky/mm/init.c                                   |    7 
 arch/csky/mm/tlb.c                                    |    1 
 arch/h8300/include/asm/pgtable.h                      |    1 
 arch/h8300/kernel/process.c                           |    1 
 arch/h8300/kernel/setup.c                             |    1 
 arch/h8300/kernel/signal.c                            |    1 
 arch/h8300/kernel/traps.c                             |   26 
 arch/h8300/mm/fault.c                                 |    1 
 arch/h8300/mm/init.c                                  |    1 
 arch/h8300/mm/memory.c                                |    1 
 arch/hexagon/include/asm/fixmap.h                     |    4 
 arch/hexagon/include/asm/pgtable.h                    |   55 
 arch/hexagon/kernel/traps.c                           |   39 
 arch/hexagon/kernel/vdso.c                            |    4 
 arch/hexagon/mm/uaccess.c                             |    2 
 arch/hexagon/mm/vm_fault.c                            |    9 
 arch/ia64/include/asm/pgtable.h                       |   34 
 arch/ia64/include/asm/ptrace.h                        |    1 
 arch/ia64/include/asm/uaccess.h                       |    2 
 arch/ia64/kernel/efi.c                                |    1 
 arch/ia64/kernel/entry.S                              |    4 
 arch/ia64/kernel/head.S                               |    5 
 arch/ia64/kernel/irq_ia64.c                           |    4 
 arch/ia64/kernel/ivt.S                                |    4 
 arch/ia64/kernel/kprobes.c                            |    4 
 arch/ia64/kernel/mca.c                                |    2 
 arch/ia64/kernel/mca_asm.S                            |    4 
 arch/ia64/kernel/perfmon.c                            |    8 
 arch/ia64/kernel/process.c                            |   37 
 arch/ia64/kernel/ptrace.c                             |    1 
 arch/ia64/kernel/relocate_kernel.S                    |    6 
 arch/ia64/kernel/setup.c                              |    4 
 arch/ia64/kernel/smp.c                                |    1 
 arch/ia64/kernel/smpboot.c                            |    1 
 arch/ia64/kernel/uncached.c                           |    4 
 arch/ia64/kernel/vmlinux.lds.S                        |    4 
 arch/ia64/mm/contig.c                                 |    1 
 arch/ia64/mm/fault.c                                  |   17 
 arch/ia64/mm/init.c                                   |   12 
 arch/m68k/68000/m68EZ328.c                            |    2 
 arch/m68k/68000/m68VZ328.c                            |    4 
 arch/m68k/68000/timers.c                              |    1 
 arch/m68k/amiga/config.c                              |    1 
 arch/m68k/apollo/config.c                             |    1 
 arch/m68k/atari/atasound.c                            |    1 
 arch/m68k/atari/stram.c                               |    1 
 arch/m68k/bvme6000/config.c                           |    1 
 arch/m68k/include/asm/mcf_pgtable.h                   |   63 
 arch/m68k/include/asm/motorola_pgalloc.h              |    8 
 arch/m68k/include/asm/motorola_pgtable.h              |   84 -
 arch/m68k/include/asm/pgtable_mm.h                    |    1 
 arch/m68k/include/asm/pgtable_no.h                    |    2 
 arch/m68k/include/asm/sun3_pgtable.h                  |   24 
 arch/m68k/include/asm/sun3xflop.h                     |    4 
 arch/m68k/kernel/head.S                               |    4 
 arch/m68k/kernel/process.c                            |    1 
 arch/m68k/kernel/ptrace.c                             |    1 
 arch/m68k/kernel/setup_no.c                           |    1 
 arch/m68k/kernel/signal.c                             |    1 
 arch/m68k/kernel/sys_m68k.c                           |   14 
 arch/m68k/kernel/traps.c                              |   27 
 arch/m68k/kernel/uboot.c                              |    1 
 arch/m68k/mac/config.c                                |    1 
 arch/m68k/mm/fault.c                                  |   10 
 arch/m68k/mm/init.c                                   |    2 
 arch/m68k/mm/mcfmmu.c                                 |    1 
 arch/m68k/mm/motorola.c                               |   65 
 arch/m68k/mm/sun3kmap.c                               |    1 
 arch/m68k/mm/sun3mmu.c                                |    1 
 arch/m68k/mvme147/config.c                            |    1 
 arch/m68k/mvme16x/config.c                            |    1 
 arch/m68k/q40/config.c                                |    1 
 arch/m68k/sun3/config.c                               |    1 
 arch/m68k/sun3/dvma.c                                 |    1 
 arch/m68k/sun3/mmu_emu.c                              |    1 
 arch/m68k/sun3/sun3dvma.c                             |    1 
 arch/m68k/sun3x/dvma.c                                |    1 
 arch/m68k/sun3x/prom.c                                |    1 
 arch/microblaze/include/asm/pgalloc.h                 |    4 
 arch/microblaze/include/asm/pgtable.h                 |   23 
 arch/microblaze/include/asm/uaccess.h                 |    2 
 arch/microblaze/include/asm/unwind.h                  |    3 
 arch/microblaze/kernel/hw_exception_handler.S         |    4 
 arch/microblaze/kernel/module.c                       |    4 
 arch/microblaze/kernel/setup.c                        |    4 
 arch/microblaze/kernel/signal.c                       |    9 
 arch/microblaze/kernel/stacktrace.c                   |    4 
 arch/microblaze/kernel/traps.c                        |   28 
 arch/microblaze/kernel/unwind.c                       |   46 
 arch/microblaze/mm/fault.c                            |   17 
 arch/microblaze/mm/init.c                             |    9 
 arch/microblaze/mm/pgtable.c                          |    4 
 arch/mips/fw/arc/memory.c                             |    1 
 arch/mips/include/asm/fixmap.h                        |    3 
 arch/mips/include/asm/mach-generic/floppy.h           |    1 
 arch/mips/include/asm/mach-jazz/floppy.h              |    1 
 arch/mips/include/asm/pgtable-32.h                    |   22 
 arch/mips/include/asm/pgtable-64.h                    |   32 
 arch/mips/include/asm/pgtable.h                       |    2 
 arch/mips/jazz/irq.c                                  |    4 
 arch/mips/jazz/jazzdma.c                              |    1 
 arch/mips/jazz/setup.c                                |    4 
 arch/mips/kernel/module.c                             |    1 
 arch/mips/kernel/process.c                            |    1 
 arch/mips/kernel/ptrace.c                             |    1 
 arch/mips/kernel/ptrace32.c                           |    1 
 arch/mips/kernel/smp-bmips.c                          |    1 
 arch/mips/kernel/traps.c                              |   58 
 arch/mips/kernel/vdso.c                               |    4 
 arch/mips/kvm/mips.c                                  |    4 
 arch/mips/kvm/mmu.c                                   |   20 
 arch/mips/kvm/tlb.c                                   |    1 
 arch/mips/kvm/trap_emul.c                             |    2 
 arch/mips/lib/dump_tlb.c                              |    1 
 arch/mips/lib/r3k_dump_tlb.c                          |    1 
 arch/mips/mm/c-octeon.c                               |    1 
 arch/mips/mm/c-r3k.c                                  |   11 
 arch/mips/mm/c-r4k.c                                  |   11 
 arch/mips/mm/c-tx39.c                                 |   11 
 arch/mips/mm/fault.c                                  |   12 
 arch/mips/mm/highmem.c                                |    2 
 arch/mips/mm/init.c                                   |    1 
 arch/mips/mm/page.c                                   |    1 
 arch/mips/mm/pgtable-32.c                             |    1 
 arch/mips/mm/pgtable-64.c                             |    1 
 arch/mips/mm/sc-ip22.c                                |    1 
 arch/mips/mm/sc-mips.c                                |    1 
 arch/mips/mm/sc-r5k.c                                 |    1 
 arch/mips/mm/tlb-r3k.c                                |    1 
 arch/mips/mm/tlb-r4k.c                                |    1 
 arch/mips/mm/tlbex.c                                  |    4 
 arch/mips/sgi-ip27/ip27-init.c                        |    1 
 arch/mips/sgi-ip27/ip27-timer.c                       |    1 
 arch/mips/sgi-ip32/ip32-memory.c                      |    1 
 arch/nds32/include/asm/highmem.h                      |    3 
 arch/nds32/include/asm/pgtable.h                      |   22 
 arch/nds32/kernel/head.S                              |    4 
 arch/nds32/kernel/module.c                            |    2 
 arch/nds32/kernel/traps.c                             |   33 
 arch/nds32/kernel/vdso.c                              |    6 
 arch/nds32/mm/fault.c                                 |   17 
 arch/nds32/mm/init.c                                  |   13 
 arch/nds32/mm/proc.c                                  |    7 
 arch/nios2/include/asm/pgtable.h                      |   24 
 arch/nios2/kernel/module.c                            |    1 
 arch/nios2/kernel/nios2_ksyms.c                       |    4 
 arch/nios2/kernel/traps.c                             |   35 
 arch/nios2/mm/fault.c                                 |   14 
 arch/nios2/mm/init.c                                  |    5 
 arch/nios2/mm/pgtable.c                               |    1 
 arch/nios2/mm/tlb.c                                   |    1 
 arch/openrisc/include/asm/io.h                        |    3 
 arch/openrisc/include/asm/pgtable.h                   |   33 
 arch/openrisc/include/asm/tlbflush.h                  |    1 
 arch/openrisc/kernel/asm-offsets.c                    |    1 
 arch/openrisc/kernel/entry.S                          |    4 
 arch/openrisc/kernel/head.S                           |    4 
 arch/openrisc/kernel/or32_ksyms.c                     |    4 
 arch/openrisc/kernel/process.c                        |    1 
 arch/openrisc/kernel/ptrace.c                         |    1 
 arch/openrisc/kernel/setup.c                          |    1 
 arch/openrisc/kernel/traps.c                          |   27 
 arch/openrisc/mm/fault.c                              |   12 
 arch/openrisc/mm/init.c                               |    1 
 arch/openrisc/mm/ioremap.c                            |    4 
 arch/openrisc/mm/tlb.c                                |    1 
 arch/parisc/include/asm/io.h                          |    2 
 arch/parisc/include/asm/mmu_context.h                 |    1 
 arch/parisc/include/asm/pgtable.h                     |   33 
 arch/parisc/kernel/asm-offsets.c                      |    4 
 arch/parisc/kernel/entry.S                            |    4 
 arch/parisc/kernel/head.S                             |    4 
 arch/parisc/kernel/module.c                           |    1 
 arch/parisc/kernel/pacache.S                          |    4 
 arch/parisc/kernel/pci-dma.c                          |    2 
 arch/parisc/kernel/pdt.c                              |    4 
 arch/parisc/kernel/ptrace.c                           |    1 
 arch/parisc/kernel/smp.c                              |    1 
 arch/parisc/kernel/traps.c                            |   42 
 arch/parisc/lib/memcpy.c                              |   14 
 arch/parisc/mm/fault.c                                |   10 
 arch/parisc/mm/fixmap.c                               |    6 
 arch/parisc/mm/init.c                                 |    1 
 arch/powerpc/include/asm/book3s/32/pgtable.h          |   20 
 arch/powerpc/include/asm/book3s/64/pgtable.h          |   43 
 arch/powerpc/include/asm/fixmap.h                     |    4 
 arch/powerpc/include/asm/io.h                         |    1 
 arch/powerpc/include/asm/kup.h                        |    2 
 arch/powerpc/include/asm/nohash/32/pgtable.h          |   17 
 arch/powerpc/include/asm/nohash/64/pgtable-4k.h       |    4 
 arch/powerpc/include/asm/nohash/64/pgtable.h          |   22 
 arch/powerpc/include/asm/nohash/pgtable.h             |    2 
 arch/powerpc/include/asm/pgtable.h                    |   28 
 arch/powerpc/include/asm/pkeys.h                      |    2 
 arch/powerpc/include/asm/tlb.h                        |    2 
 arch/powerpc/kernel/asm-offsets.c                     |    1 
 arch/powerpc/kernel/btext.c                           |    4 
 arch/powerpc/kernel/fpu.S                             |    3 
 arch/powerpc/kernel/head_32.S                         |    4 
 arch/powerpc/kernel/head_40x.S                        |    4 
 arch/powerpc/kernel/head_44x.S                        |    4 
 arch/powerpc/kernel/head_8xx.S                        |    4 
 arch/powerpc/kernel/head_fsl_booke.S                  |    4 
 arch/powerpc/kernel/io-workarounds.c                  |    4 
 arch/powerpc/kernel/irq.c                             |    4 
 arch/powerpc/kernel/mce_power.c                       |    4 
 arch/powerpc/kernel/paca.c                            |    4 
 arch/powerpc/kernel/process.c                         |   30 
 arch/powerpc/kernel/prom.c                            |    4 
 arch/powerpc/kernel/prom_init.c                       |    4 
 arch/powerpc/kernel/rtas_pci.c                        |    4 
 arch/powerpc/kernel/setup-common.c                    |    4 
 arch/powerpc/kernel/setup_32.c                        |    4 
 arch/powerpc/kernel/setup_64.c                        |    4 
 arch/powerpc/kernel/signal_32.c                       |    1 
 arch/powerpc/kernel/signal_64.c                       |    1 
 arch/powerpc/kernel/smp.c                             |    4 
 arch/powerpc/kernel/stacktrace.c                      |    2 
 arch/powerpc/kernel/traps.c                           |    1 
 arch/powerpc/kernel/vdso.c                            |    7 
 arch/powerpc/kvm/book3s_64_mmu_radix.c                |    4 
 arch/powerpc/kvm/book3s_hv.c                          |    6 
 arch/powerpc/kvm/book3s_hv_nested.c                   |    4 
 arch/powerpc/kvm/book3s_hv_rm_xics.c                  |    4 
 arch/powerpc/kvm/book3s_hv_rm_xive.c                  |    4 
 arch/powerpc/kvm/book3s_hv_uvmem.c                    |   18 
 arch/powerpc/kvm/e500_mmu_host.c                      |    4 
 arch/powerpc/kvm/fpu.S                                |    4 
 arch/powerpc/lib/code-patching.c                      |    1 
 arch/powerpc/mm/book3s32/hash_low.S                   |    4 
 arch/powerpc/mm/book3s32/mmu.c                        |    2 
 arch/powerpc/mm/book3s32/tlb.c                        |    6 
 arch/powerpc/mm/book3s64/hash_hugetlbpage.c           |    1 
 arch/powerpc/mm/book3s64/hash_native.c                |    4 
 arch/powerpc/mm/book3s64/hash_pgtable.c               |    5 
 arch/powerpc/mm/book3s64/hash_utils.c                 |    4 
 arch/powerpc/mm/book3s64/iommu_api.c                  |    4 
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c          |    1 
 arch/powerpc/mm/book3s64/radix_pgtable.c              |    1 
 arch/powerpc/mm/book3s64/slb.c                        |    4 
 arch/powerpc/mm/book3s64/subpage_prot.c               |   16 
 arch/powerpc/mm/copro_fault.c                         |    4 
 arch/powerpc/mm/fault.c                               |   23 
 arch/powerpc/mm/hugetlbpage.c                         |    1 
 arch/powerpc/mm/init-common.c                         |    4 
 arch/powerpc/mm/init_32.c                             |    1 
 arch/powerpc/mm/init_64.c                             |    1 
 arch/powerpc/mm/kasan/8xx.c                           |    4 
 arch/powerpc/mm/kasan/book3s_32.c                     |    2 
 arch/powerpc/mm/kasan/kasan_init_32.c                 |    8 
 arch/powerpc/mm/mem.c                                 |    1 
 arch/powerpc/mm/nohash/40x.c                          |    5 
 arch/powerpc/mm/nohash/8xx.c                          |    2 
 arch/powerpc/mm/nohash/fsl_booke.c                    |    1 
 arch/powerpc/mm/nohash/tlb_low_64e.S                  |    4 
 arch/powerpc/mm/pgtable.c                             |    2 
 arch/powerpc/mm/pgtable_32.c                          |    5 
 arch/powerpc/mm/pgtable_64.c                          |    1 
 arch/powerpc/mm/ptdump/8xx.c                          |    2 
 arch/powerpc/mm/ptdump/bats.c                         |    4 
 arch/powerpc/mm/ptdump/book3s64.c                     |    2 
 arch/powerpc/mm/ptdump/hashpagetable.c                |    1 
 arch/powerpc/mm/ptdump/ptdump.c                       |    1 
 arch/powerpc/mm/ptdump/shared.c                       |    2 
 arch/powerpc/oprofile/cell/spu_task_sync.c            |    6 
 arch/powerpc/perf/callchain.c                         |    1 
 arch/powerpc/perf/callchain_32.c                      |    1 
 arch/powerpc/perf/callchain_64.c                      |    1 
 arch/powerpc/platforms/85xx/corenet_generic.c         |    4 
 arch/powerpc/platforms/85xx/mpc85xx_cds.c             |    4 
 arch/powerpc/platforms/85xx/qemu_e500.c               |    4 
 arch/powerpc/platforms/85xx/sbc8548.c                 |    4 
 arch/powerpc/platforms/85xx/smp.c                     |    4 
 arch/powerpc/platforms/86xx/mpc86xx_smp.c             |    4 
 arch/powerpc/platforms/8xx/cpm1.c                     |    1 
 arch/powerpc/platforms/8xx/micropatch.c               |    1 
 arch/powerpc/platforms/cell/cbe_regs.c                |    4 
 arch/powerpc/platforms/cell/interrupt.c               |    4 
 arch/powerpc/platforms/cell/pervasive.c               |    4 
 arch/powerpc/platforms/cell/setup.c                   |    1 
 arch/powerpc/platforms/cell/smp.c                     |    4 
 arch/powerpc/platforms/cell/spider-pic.c              |    4 
 arch/powerpc/platforms/cell/spufs/file.c              |   10 
 arch/powerpc/platforms/chrp/pci.c                     |    4 
 arch/powerpc/platforms/chrp/setup.c                   |    1 
 arch/powerpc/platforms/chrp/smp.c                     |    4 
 arch/powerpc/platforms/maple/setup.c                  |    1 
 arch/powerpc/platforms/maple/time.c                   |    1 
 arch/powerpc/platforms/powermac/setup.c               |    1 
 arch/powerpc/platforms/powermac/smp.c                 |    4 
 arch/powerpc/platforms/powermac/time.c                |    1 
 arch/powerpc/platforms/pseries/lpar.c                 |    4 
 arch/powerpc/platforms/pseries/setup.c                |    1 
 arch/powerpc/platforms/pseries/smp.c                  |    4 
 arch/powerpc/sysdev/cpm2.c                            |    1 
 arch/powerpc/sysdev/fsl_85xx_cache_sram.c             |    2 
 arch/powerpc/sysdev/mpic.c                            |    4 
 arch/powerpc/xmon/xmon.c                              |    1 
 arch/riscv/include/asm/fixmap.h                       |    4 
 arch/riscv/include/asm/io.h                           |    4 
 arch/riscv/include/asm/kasan.h                        |    4 
 arch/riscv/include/asm/pgtable-64.h                   |    7 
 arch/riscv/include/asm/pgtable.h                      |   22 
 arch/riscv/kernel/module.c                            |    2 
 arch/riscv/kernel/setup.c                             |    1 
 arch/riscv/kernel/soc.c                               |    2 
 arch/riscv/kernel/stacktrace.c                        |   23 
 arch/riscv/kernel/vdso.c                              |    4 
 arch/riscv/mm/cacheflush.c                            |    3 
 arch/riscv/mm/fault.c                                 |   14 
 arch/riscv/mm/init.c                                  |   31 
 arch/riscv/mm/kasan_init.c                            |    4 
 arch/riscv/mm/pageattr.c                              |    6 
 arch/riscv/mm/ptdump.c                                |    2 
 arch/s390/boot/ipl_parm.c                             |    4 
 arch/s390/boot/kaslr.c                                |    4 
 arch/s390/include/asm/hugetlb.h                       |    4 
 arch/s390/include/asm/kasan.h                         |    4 
 arch/s390/include/asm/pgtable.h                       |   15 
 arch/s390/include/asm/tlbflush.h                      |    1 
 arch/s390/kernel/asm-offsets.c                        |    4 
 arch/s390/kernel/dumpstack.c                          |   25 
 arch/s390/kernel/machine_kexec.c                      |    1 
 arch/s390/kernel/ptrace.c                             |    1 
 arch/s390/kernel/uv.c                                 |    4 
 arch/s390/kernel/vdso.c                               |    5 
 arch/s390/kvm/gaccess.c                               |    8 
 arch/s390/kvm/interrupt.c                             |    4 
 arch/s390/kvm/kvm-s390.c                              |   32 
 arch/s390/kvm/priv.c                                  |   38 
 arch/s390/mm/dump_pagetables.c                        |    1 
 arch/s390/mm/extmem.c                                 |    4 
 arch/s390/mm/fault.c                                  |   17 
 arch/s390/mm/gmap.c                                   |   80 
 arch/s390/mm/init.c                                   |    1 
 arch/s390/mm/kasan_init.c                             |    4 
 arch/s390/mm/pageattr.c                               |   13 
 arch/s390/mm/pgalloc.c                                |    2 
 arch/s390/mm/pgtable.c                                |    1 
 arch/s390/mm/vmem.c                                   |    1 
 arch/s390/pci/pci_mmio.c                              |    4 
 arch/sh/include/asm/io.h                              |    2 
 arch/sh/include/asm/kdebug.h                          |    6 
 arch/sh/include/asm/pgtable-3level.h                  |    7 
 arch/sh/include/asm/pgtable.h                         |    2 
 arch/sh/include/asm/pgtable_32.h                      |   25 
 arch/sh/include/asm/processor_32.h                    |    2 
 arch/sh/kernel/dumpstack.c                            |   54 
 arch/sh/kernel/machine_kexec.c                        |    1 
 arch/sh/kernel/process_32.c                           |    2 
 arch/sh/kernel/ptrace_32.c                            |    1 
 arch/sh/kernel/signal_32.c                            |    1 
 arch/sh/kernel/sys_sh.c                               |    6 
 arch/sh/kernel/traps.c                                |    4 
 arch/sh/kernel/vsyscall/vsyscall.c                    |    4 
 arch/sh/mm/cache-sh3.c                                |    1 
 arch/sh/mm/cache-sh4.c                                |   11 
 arch/sh/mm/cache-sh7705.c                             |    1 
 arch/sh/mm/fault.c                                    |   16 
 arch/sh/mm/kmap.c                                     |    5 
 arch/sh/mm/nommu.c                                    |    1 
 arch/sh/mm/pmb.c                                      |    4 
 arch/sparc/include/asm/floppy_32.h                    |    4 
 arch/sparc/include/asm/highmem.h                      |    4 
 arch/sparc/include/asm/ide.h                          |    2 
 arch/sparc/include/asm/io-unit.h                      |    4 
 arch/sparc/include/asm/pgalloc_32.h                   |    4 
 arch/sparc/include/asm/pgalloc_64.h                   |    2 
 arch/sparc/include/asm/pgtable_32.h                   |   34 
 arch/sparc/include/asm/pgtable_64.h                   |   32 
 arch/sparc/kernel/cpu.c                               |    4 
 arch/sparc/kernel/entry.S                             |    4 
 arch/sparc/kernel/head_64.S                           |    4 
 arch/sparc/kernel/ktlb.S                              |    4 
 arch/sparc/kernel/leon_smp.c                          |    1 
 arch/sparc/kernel/pci.c                               |    4 
 arch/sparc/kernel/process_32.c                        |   29 
 arch/sparc/kernel/process_64.c                        |    3 
 arch/sparc/kernel/ptrace_32.c                         |    1 
 arch/sparc/kernel/ptrace_64.c                         |    1 
 arch/sparc/kernel/setup_32.c                          |    1 
 arch/sparc/kernel/setup_64.c                          |    1 
 arch/sparc/kernel/signal32.c                          |    1 
 arch/sparc/kernel/signal_32.c                         |    1 
 arch/sparc/kernel/signal_64.c                         |    1 
 arch/sparc/kernel/smp_32.c                            |    1 
 arch/sparc/kernel/smp_64.c                            |    1 
 arch/sparc/kernel/sun4m_irq.c                         |    4 
 arch/sparc/kernel/trampoline_64.S                     |    4 
 arch/sparc/kernel/traps_32.c                          |    4 
 arch/sparc/kernel/traps_64.c                          |   24 
 arch/sparc/lib/clear_page.S                           |    4 
 arch/sparc/lib/copy_page.S                            |    2 
 arch/sparc/mm/fault_32.c                              |   21 
 arch/sparc/mm/fault_64.c                              |   17 
 arch/sparc/mm/highmem.c                               |   12 
 arch/sparc/mm/hugetlbpage.c                           |    1 
 arch/sparc/mm/init_32.c                               |    1 
 arch/sparc/mm/init_64.c                               |    7 
 arch/sparc/mm/io-unit.c                               |   11 
 arch/sparc/mm/iommu.c                                 |    9 
 arch/sparc/mm/tlb.c                                   |    1 
 arch/sparc/mm/tsb.c                                   |    4 
 arch/sparc/mm/ultra.S                                 |    4 
 arch/sparc/vdso/vma.c                                 |    4 
 arch/um/drivers/mconsole_kern.c                       |    2 
 arch/um/include/asm/mmu_context.h                     |    5 
 arch/um/include/asm/pgtable-3level.h                  |    4 
 arch/um/include/asm/pgtable.h                         |   69 
 arch/um/kernel/maccess.c                              |   12 
 arch/um/kernel/mem.c                                  |   10 
 arch/um/kernel/process.c                              |    1 
 arch/um/kernel/skas/mmu.c                             |    3 
 arch/um/kernel/skas/uaccess.c                         |    1 
 arch/um/kernel/sysrq.c                                |   35 
 arch/um/kernel/tlb.c                                  |    5 
 arch/um/kernel/trap.c                                 |   15 
 arch/um/kernel/um_arch.c                              |    1 
 arch/unicore32/include/asm/pgtable.h                  |   19 
 arch/unicore32/kernel/hibernate.c                     |    4 
 arch/unicore32/kernel/hibernate_asm.S                 |    4 
 arch/unicore32/kernel/module.c                        |    1 
 arch/unicore32/kernel/setup.h                         |    4 
 arch/unicore32/kernel/traps.c                         |   50 
 arch/unicore32/lib/backtrace.S                        |   24 
 arch/unicore32/mm/alignment.c                         |    4 
 arch/unicore32/mm/fault.c                             |    9 
 arch/unicore32/mm/mm.h                                |   10 
 arch/unicore32/mm/proc-ucv2.S                         |    4 
 arch/x86/boot/compressed/kaslr_64.c                   |    4 
 arch/x86/entry/vdso/vma.c                             |   14 
 arch/x86/events/core.c                                |    4 
 arch/x86/include/asm/agp.h                            |    2 
 arch/x86/include/asm/asm-prototypes.h                 |    4 
 arch/x86/include/asm/efi.h                            |    4 
 arch/x86/include/asm/iomap.h                          |    1 
 arch/x86/include/asm/kaslr.h                          |    2 
 arch/x86/include/asm/mmu.h                            |    2 
 arch/x86/include/asm/pgtable-3level.h                 |    8 
 arch/x86/include/asm/pgtable.h                        |   89 -
 arch/x86/include/asm/pgtable_32.h                     |   11 
 arch/x86/include/asm/pgtable_64.h                     |    4 
 arch/x86/include/asm/setup.h                          |   12 
 arch/x86/include/asm/stacktrace.h                     |    2 
 arch/x86/include/asm/uaccess.h                        |   16 
 arch/x86/include/asm/xen/hypercall.h                  |    4 
 arch/x86/include/asm/xen/page.h                       |    1 
 arch/x86/kernel/acpi/boot.c                           |    4 
 arch/x86/kernel/acpi/sleep.c                          |    4 
 arch/x86/kernel/alternative.c                         |    1 
 arch/x86/kernel/amd_gart_64.c                         |    5 
 arch/x86/kernel/apic/apic_numachip.c                  |    4 
 arch/x86/kernel/cpu/bugs.c                            |    4 
 arch/x86/kernel/cpu/common.c                          |    4 
 arch/x86/kernel/cpu/intel.c                           |    4 
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c             |    6 
 arch/x86/kernel/cpu/resctrl/rdtgroup.c                |    6 
 arch/x86/kernel/crash_core_32.c                       |    4 
 arch/x86/kernel/crash_core_64.c                       |    4 
 arch/x86/kernel/doublefault_32.c                      |    1 
 arch/x86/kernel/dumpstack.c                           |   21 
 arch/x86/kernel/early_printk.c                        |    4 
 arch/x86/kernel/espfix_64.c                           |    2 
 arch/x86/kernel/head64.c                              |    4 
 arch/x86/kernel/head_64.S                             |    4 
 arch/x86/kernel/i8259.c                               |    4 
 arch/x86/kernel/irqinit.c                             |    4 
 arch/x86/kernel/kprobes/core.c                        |    4 
 arch/x86/kernel/kprobes/opt.c                         |    4 
 arch/x86/kernel/ldt.c                                 |    2 
 arch/x86/kernel/machine_kexec_32.c                    |    1 
 arch/x86/kernel/machine_kexec_64.c                    |    1 
 arch/x86/kernel/module.c                              |    1 
 arch/x86/kernel/paravirt.c                            |    4 
 arch/x86/kernel/process_32.c                          |    1 
 arch/x86/kernel/process_64.c                          |    1 
 arch/x86/kernel/ptrace.c                              |    1 
 arch/x86/kernel/reboot.c                              |    4 
 arch/x86/kernel/smpboot.c                             |    4 
 arch/x86/kernel/tboot.c                               |    3 
 arch/x86/kernel/vm86_32.c                             |    4 
 arch/x86/kvm/mmu/paging_tmpl.h                        |    8 
 arch/x86/mm/cpu_entry_area.c                          |    4 
 arch/x86/mm/debug_pagetables.c                        |    2 
 arch/x86/mm/dump_pagetables.c                         |    1 
 arch/x86/mm/fault.c                                   |   22 
 arch/x86/mm/init.c                                    |   22 
 arch/x86/mm/init_32.c                                 |   27 
 arch/x86/mm/init_64.c                                 |    1 
 arch/x86/mm/ioremap.c                                 |    4 
 arch/x86/mm/kasan_init_64.c                           |    1 
 arch/x86/mm/kaslr.c                                   |   37 
 arch/x86/mm/maccess.c                                 |   44 
 arch/x86/mm/mem_encrypt_boot.S                        |    2 
 arch/x86/mm/mmio-mod.c                                |    4 
 arch/x86/mm/pat/cpa-test.c                            |    1 
 arch/x86/mm/pat/memtype.c                             |    1 
 arch/x86/mm/pat/memtype_interval.c                    |    4 
 arch/x86/mm/pgtable.c                                 |    1 
 arch/x86/mm/pgtable_32.c                              |    1 
 arch/x86/mm/pti.c                                     |    1 
 arch/x86/mm/setup_nx.c                                |    4 
 arch/x86/platform/efi/efi_32.c                        |    4 
 arch/x86/platform/efi/efi_64.c                        |    1 
 arch/x86/platform/olpc/olpc_ofw.c                     |    4 
 arch/x86/power/cpu.c                                  |    4 
 arch/x86/power/hibernate.c                            |    4 
 arch/x86/power/hibernate_32.c                         |    4 
 arch/x86/power/hibernate_64.c                         |    4 
 arch/x86/realmode/init.c                              |    4 
 arch/x86/um/vdso/vma.c                                |    4 
 arch/x86/xen/enlighten_pv.c                           |    1 
 arch/x86/xen/grant-table.c                            |    1 
 arch/x86/xen/mmu_pv.c                                 |    4 
 arch/x86/xen/smp_pv.c                                 |    2 
 arch/xtensa/include/asm/fixmap.h                      |   12 
 arch/xtensa/include/asm/highmem.h                     |    4 
 arch/xtensa/include/asm/initialize_mmu.h              |    2 
 arch/xtensa/include/asm/mmu_context.h                 |    4 
 arch/xtensa/include/asm/pgtable.h                     |   20 
 arch/xtensa/kernel/entry.S                            |    4 
 arch/xtensa/kernel/process.c                          |    1 
 arch/xtensa/kernel/ptrace.c                           |    1 
 arch/xtensa/kernel/setup.c                            |    1 
 arch/xtensa/kernel/traps.c                            |   42 
 arch/xtensa/kernel/vectors.S                          |    4 
 arch/xtensa/mm/cache.c                                |    4 
 arch/xtensa/mm/fault.c                                |   12 
 arch/xtensa/mm/highmem.c                              |    2 
 arch/xtensa/mm/ioremap.c                              |    4 
 arch/xtensa/mm/kasan_init.c                           |   10 
 arch/xtensa/mm/misc.S                                 |    4 
 arch/xtensa/mm/mmu.c                                  |    5 
 drivers/acpi/scan.c                                   |    3 
 drivers/android/binder_alloc.c                        |   14 
 drivers/atm/fore200e.c                                |    4 
 drivers/base/power/main.c                             |    4 
 drivers/block/z2ram.c                                 |    4 
 drivers/char/agp/frontend.c                           |    1 
 drivers/char/agp/generic.c                            |    1 
 drivers/char/bsr.c                                    |    1 
 drivers/char/mspec.c                                  |    3 
 drivers/dma-buf/dma-resv.c                            |    5 
 drivers/firmware/efi/arm-runtime.c                    |    4 
 drivers/firmware/efi/efi.c                            |    2 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h            |    2 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c     |    2 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c     |    2 
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c      |    4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c               |   10 
 drivers/gpu/drm/amd/amdkfd/kfd_events.c               |    4 
 drivers/gpu/drm/drm_vm.c                              |    4 
 drivers/gpu/drm/etnaviv/etnaviv_gem.c                 |    2 
 drivers/gpu/drm/i915/gem/i915_gem_mman.c              |    4 
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c           |   14 
 drivers/gpu/drm/i915/i915_mm.c                        |    1 
 drivers/gpu/drm/i915/i915_perf.c                      |    2 
 drivers/gpu/drm/nouveau/nouveau_svm.c                 |   22 
 drivers/gpu/drm/radeon/radeon_cs.c                    |    4 
 drivers/gpu/drm/radeon/radeon_gem.c                   |    6 
 drivers/gpu/drm/ttm/ttm_bo_vm.c                       |   10 
 drivers/infiniband/core/umem_odp.c                    |    4 
 drivers/infiniband/core/uverbs_main.c                 |    6 
 drivers/infiniband/hw/hfi1/mmu_rb.c                   |    2 
 drivers/infiniband/hw/mlx4/mr.c                       |    4 
 drivers/infiniband/hw/qib/qib_file_ops.c              |    4 
 drivers/infiniband/hw/qib/qib_user_pages.c            |    6 
 drivers/infiniband/hw/usnic/usnic_uiom.c              |    4 
 drivers/infiniband/sw/rdmavt/mmap.c                   |    1 
 drivers/infiniband/sw/rxe/rxe_mmap.c                  |    1 
 drivers/infiniband/sw/siw/siw_mem.c                   |    4 
 drivers/iommu/amd_iommu_v2.c                          |    4 
 drivers/iommu/intel-svm.c                             |    4 
 drivers/macintosh/macio-adb.c                         |    4 
 drivers/macintosh/mediabay.c                          |    4 
 drivers/macintosh/via-pmu.c                           |    4 
 drivers/media/pci/bt8xx/bt878.c                       |    4 
 drivers/media/pci/bt8xx/btcx-risc.c                   |    4 
 drivers/media/pci/bt8xx/bttv-risc.c                   |    4 
 drivers/media/platform/davinci/vpbe_display.c         |    1 
 drivers/media/v4l2-core/v4l2-common.c                 |    1 
 drivers/media/v4l2-core/videobuf-core.c               |    4 
 drivers/media/v4l2-core/videobuf-dma-contig.c         |    4 
 drivers/media/v4l2-core/videobuf-dma-sg.c             |   10 
 drivers/media/v4l2-core/videobuf-vmalloc.c            |    4 
 drivers/misc/cxl/cxllib.c                             |    9 
 drivers/misc/cxl/fault.c                              |    4 
 drivers/misc/genwqe/card_utils.c                      |    2 
 drivers/misc/sgi-gru/grufault.c                       |   25 
 drivers/misc/sgi-gru/grufile.c                        |    4 
 drivers/mtd/ubi/ubi.h                                 |    2 
 drivers/net/ethernet/amd/7990.c                       |    4 
 drivers/net/ethernet/amd/hplance.c                    |    4 
 drivers/net/ethernet/amd/mvme147.c                    |    4 
 drivers/net/ethernet/amd/sun3lance.c                  |    4 
 drivers/net/ethernet/amd/sunlance.c                   |    4 
 drivers/net/ethernet/apple/bmac.c                     |    4 
 drivers/net/ethernet/apple/mace.c                     |    4 
 drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c |    4 
 drivers/net/ethernet/freescale/fs_enet/mac-fcc.c      |    4 
 drivers/net/ethernet/freescale/fs_enet/mii-fec.c      |    4 
 drivers/net/ethernet/i825xx/82596.c                   |    4 
 drivers/net/ethernet/korina.c                         |    4 
 drivers/net/ethernet/marvell/pxa168_eth.c             |    4 
 drivers/net/ethernet/natsemi/jazzsonic.c              |    4 
 drivers/net/ethernet/natsemi/macsonic.c               |    4 
 drivers/net/ethernet/natsemi/xtsonic.c                |    4 
 drivers/net/ethernet/sun/sunbmac.c                    |    4 
 drivers/net/ethernet/sun/sunhme.c                     |    1 
 drivers/net/ethernet/sun/sunqe.c                      |    4 
 drivers/oprofile/buffer_sync.c                        |   12 
 drivers/sbus/char/flash.c                             |    1 
 drivers/sbus/char/uctrl.c                             |    1 
 drivers/scsi/53c700.c                                 |    4 
 drivers/scsi/a2091.c                                  |    1 
 drivers/scsi/a3000.c                                  |    1 
 drivers/scsi/arm/cumana_2.c                           |    4 
 drivers/scsi/arm/eesox.c                              |    4 
 drivers/scsi/arm/powertec.c                           |    4 
 drivers/scsi/dpt_i2o.c                                |    4 
 drivers/scsi/gvp11.c                                  |    1 
 drivers/scsi/lasi700.c                                |    1 
 drivers/scsi/mac53c94.c                               |    4 
 drivers/scsi/mesh.c                                   |    4 
 drivers/scsi/mvme147.c                                |    1 
 drivers/scsi/qlogicpti.c                              |    4 
 drivers/scsi/sni_53c710.c                             |    1 
 drivers/scsi/zorro_esp.c                              |    4 
 drivers/staging/android/ashmem.c                      |    4 
 drivers/staging/comedi/comedi_fops.c                  |    2 
 drivers/staging/kpc2000/kpc_dma/fileops.c             |    4 
 drivers/staging/media/atomisp/pci/hmm/hmm_bo.c        |    4 
 drivers/tee/optee/call.c                              |    4 
 drivers/tty/sysrq.c                                   |    4 
 drivers/tty/vt/consolemap.c                           |    2 
 drivers/vfio/pci/vfio_pci.c                           |   22 
 drivers/vfio/vfio_iommu_type1.c                       |    8 
 drivers/vhost/vdpa.c                                  |    4 
 drivers/video/console/newport_con.c                   |    1 
 drivers/video/fbdev/acornfb.c                         |    1 
 drivers/video/fbdev/atafb.c                           |    1 
 drivers/video/fbdev/cirrusfb.c                        |    1 
 drivers/video/fbdev/cyber2000fb.c                     |    1 
 drivers/video/fbdev/fb-puv3.c                         |    1 
 drivers/video/fbdev/hitfb.c                           |    1 
 drivers/video/fbdev/neofb.c                           |    1 
 drivers/video/fbdev/q40fb.c                           |    1 
 drivers/video/fbdev/savage/savagefb_driver.c          |    1 
 drivers/xen/balloon.c                                 |    1 
 drivers/xen/gntdev.c                                  |    6 
 drivers/xen/grant-table.c                             |    1 
 drivers/xen/privcmd.c                                 |   15 
 drivers/xen/xenbus/xenbus_probe.c                     |    1 
 drivers/xen/xenbus/xenbus_probe_backend.c             |    1 
 drivers/xen/xenbus/xenbus_probe_frontend.c            |    1 
 fs/aio.c                                              |    4 
 fs/coredump.c                                         |    8 
 fs/exec.c                                             |   18 
 fs/ext2/file.c                                        |    2 
 fs/ext4/super.c                                       |    6 
 fs/hugetlbfs/inode.c                                  |    2 
 fs/io_uring.c                                         |    4 
 fs/kernfs/file.c                                      |    4 
 fs/proc/array.c                                       |    1 
 fs/proc/base.c                                        |   24 
 fs/proc/meminfo.c                                     |    1 
 fs/proc/nommu.c                                       |    1 
 fs/proc/task_mmu.c                                    |   34 
 fs/proc/task_nommu.c                                  |   18 
 fs/proc/vmcore.c                                      |    1 
 fs/userfaultfd.c                                      |   46 
 fs/xfs/xfs_file.c                                     |    2 
 fs/xfs/xfs_inode.c                                    |   14 
 fs/xfs/xfs_iops.c                                     |    4 
 include/asm-generic/io.h                              |    2 
 include/asm-generic/pgtable-nopmd.h                   |    1 
 include/asm-generic/pgtable-nopud.h                   |    1 
 include/asm-generic/pgtable.h                         | 1322 ----------------
 include/linux/cache.h                                 |   10 
 include/linux/crash_dump.h                            |    3 
 include/linux/dax.h                                   |    1 
 include/linux/dma-noncoherent.h                       |    2 
 include/linux/fs.h                                    |    4 
 include/linux/hmm.h                                   |    2 
 include/linux/huge_mm.h                               |    2 
 include/linux/hugetlb.h                               |    2 
 include/linux/io-mapping.h                            |    4 
 include/linux/kallsyms.h                              |    4 
 include/linux/kasan.h                                 |    4 
 include/linux/mempolicy.h                             |    2 
 include/linux/mm.h                                    |   15 
 include/linux/mm_types.h                              |    4 
 include/linux/mmap_lock.h                             |  128 +
 include/linux/mmu_notifier.h                          |   13 
 include/linux/pagemap.h                               |    2 
 include/linux/pgtable.h                               | 1444 +++++++++++++++++-
 include/linux/rmap.h                                  |    2 
 include/linux/sched/debug.h                           |    7 
 include/linux/sched/mm.h                              |   10 
 include/linux/uaccess.h                               |   62 
 include/xen/arm/page.h                                |    4 
 init/init_task.c                                      |    1 
 ipc/shm.c                                             |    8 
 kernel/acct.c                                         |    6 
 kernel/bpf/stackmap.c                                 |   21 
 kernel/bpf/syscall.c                                  |    2 
 kernel/cgroup/cpuset.c                                |    4 
 kernel/debug/kdb/kdb_bt.c                             |   17 
 kernel/events/core.c                                  |   10 
 kernel/events/uprobes.c                               |   20 
 kernel/exit.c                                         |   11 
 kernel/fork.c                                         |   15 
 kernel/futex.c                                        |    4 
 kernel/locking/lockdep.c                              |    4 
 kernel/locking/rtmutex-debug.c                        |    4 
 kernel/power/snapshot.c                               |    1 
 kernel/relay.c                                        |    2 
 kernel/sched/core.c                                   |   10 
 kernel/sched/fair.c                                   |    4 
 kernel/sys.c                                          |   22 
 kernel/trace/bpf_trace.c                              |  176 +-
 kernel/trace/ftrace.c                                 |    8 
 kernel/trace/trace_kprobe.c                           |   80 
 kernel/trace/trace_output.c                           |    4 
 lib/dump_stack.c                                      |    4 
 lib/ioremap.c                                         |    1 
 lib/test_hmm.c                                        |   14 
 lib/test_lockup.c                                     |   16 
 mm/debug.c                                            |   10 
 mm/debug_vm_pgtable.c                                 |    1 
 mm/filemap.c                                          |   46 
 mm/frame_vector.c                                     |    6 
 mm/gup.c                                              |   73 
 mm/hmm.c                                              |    2 
 mm/huge_memory.c                                      |    8 
 mm/hugetlb.c                                          |    3 
 mm/init-mm.c                                          |    6 
 mm/internal.h                                         |    6 
 mm/khugepaged.c                                       |   72 
 mm/ksm.c                                              |   48 
 mm/maccess.c                                          |  496 +++---
 mm/madvise.c                                          |   40 
 mm/memcontrol.c                                       |   10 
 mm/memory.c                                           |   61 
 mm/mempolicy.c                                        |   36 
 mm/migrate.c                                          |   16 
 mm/mincore.c                                          |    8 
 mm/mlock.c                                            |   22 
 mm/mmap.c                                             |   74 
 mm/mmu_gather.c                                       |    2 
 mm/mmu_notifier.c                                     |   22 
 mm/mprotect.c                                         |   22 
 mm/mremap.c                                           |   14 
 mm/msync.c                                            |    8 
 mm/nommu.c                                            |   22 
 mm/oom_kill.c                                         |   14 
 mm/page_io.c                                          |    1 
 mm/page_reporting.h                                   |    2 
 mm/pagewalk.c                                         |   12 
 mm/pgtable-generic.c                                  |    6 
 mm/process_vm_access.c                                |    4 
 mm/ptdump.c                                           |    4 
 mm/rmap.c                                             |   12 
 mm/shmem.c                                            |    5 
 mm/sparse-vmemmap.c                                   |    1 
 mm/sparse.c                                           |    1 
 mm/swap_state.c                                       |    5 
 mm/swapfile.c                                         |    5 
 mm/userfaultfd.c                                      |   26 
 mm/util.c                                             |   12 
 mm/vmacache.c                                         |    1 
 mm/zsmalloc.c                                         |    4 
 net/ipv4/tcp.c                                        |    8 
 net/xdp/xdp_umem.c                                    |    4 
 security/keys/keyctl.c                                |    2 
 sound/core/oss/pcm_oss.c                              |    2 
 sound/core/sgbuf.c                                    |    1 
 sound/pci/hda/hda_intel.c                             |    4 
 sound/soc/intel/common/sst-firmware.c                 |    4 
 sound/soc/intel/haswell/sst-haswell-pcm.c             |    4 
 tools/include/linux/kallsyms.h                        |    2 
 virt/kvm/async_pf.c                                   |    4 
 virt/kvm/kvm_main.c                                   |    9 
 942 files changed, 4580 insertions(+), 5662 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-06-08  4:35 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-06-08  4:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


Various trees.  Mainly those parts of MM whose linux-next dependents
are now merged.  I'm still sitting on ~160 patches which await merges
from -next.


54 patches, based on 9aa900c8094dba7a60dc805ecec1e9f720744ba1.

Subsystems affected by this patch series:

  mm/proc
  ipc
  dynamic-debug
  panic
  lib
  sysctl
  mm/gup
  mm/pagemap

Subsystem: mm/proc

    SeongJae Park <sjpark@amazon.de>:
      mm/page_idle.c: skip offline pages

Subsystem: ipc

    Jules Irenge <jbi.octave@gmail.com>:
      ipc/msg: add missing annotation for freeque()

    Giuseppe Scrivano <gscrivan@redhat.com>:
      ipc/namespace.c: use a work queue to free_ipc

Subsystem: dynamic-debug

    Orson Zhai <orson.zhai@unisoc.com>:
      dynamic_debug: add an option to enable dynamic debug for modules only

Subsystem: panic

    Rafael Aquini <aquini@redhat.com>:
      kernel: add panic_on_taint

Subsystem: lib

    Manfred Spraul <manfred@colorfullife.com>:
      xarray.h: correct return code documentation for xa_store_{bh,irq}()

Subsystem: sysctl

    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "support setting sysctl parameters from kernel command line", v3:
      kernel/sysctl: support setting sysctl parameters from kernel command line
      kernel/sysctl: support handling command line aliases
      kernel/hung_task convert hung_task_panic boot parameter to sysctl
      tools/testing/selftests/sysctl/sysctl.sh: support CONFIG_TEST_SYSCTL=y
      lib/test_sysctl: support testing of sysctl. boot parameter

    "Guilherme G. Piccoli" <gpiccoli@canonical.com>:
      kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases
      kernel/hung_task.c: introduce sysctl to print all traces when a hung task is detected
      panic: add sysctl to dump all CPUs backtraces on oops event

    Rafael Aquini <aquini@redhat.com>:
      kernel/sysctl.c: ignore out-of-range taint bits introduced via kernel.tainted

Subsystem: mm/gup

    Souptick Joarder <jrdr.linux@gmail.com>:
      mm/gup.c: convert to use get_user_{page|pages}_fast_only()

    John Hubbard <jhubbard@nvidia.com>:
      mm/gup: update pin_user_pages.rst for "case 3" (mmu notifiers)
    Patch series "mm/gup: introduce pin_user_pages_locked(), use it in frame_vector.c", v2:
      mm/gup: introduce pin_user_pages_locked()
      mm/gup: frame_vector: convert get_user_pages() --> pin_user_pages()
      mm/gup: documentation fix for pin_user_pages*() APIs
    Patch series "vhost, docs: convert to pin_user_pages(), new "case 5"":
      docs: mm/gup: pin_user_pages.rst: add a "case 5"
      vhost: convert get_user_pages() --> pin_user_pages()

Subsystem: mm/pagemap

    Alexander Gordeev <agordeev@linux.ibm.com>:
      mm/mmap.c: add more sanity checks to get_unmapped_area()
      mm/mmap.c: do not allow mappings outside of allowed limits

    Christoph Hellwig <hch@lst.de>:
    Patch series "sort out the flush_icache_range mess", v2:
      arm: fix the flush_icache_range arguments in set_fiq_handler
      nds32: unexport flush_icache_page
      powerpc: unexport flush_icache_user_range
      unicore32: remove flush_cache_user_range
      asm-generic: fix the inclusion guards for cacheflush.h
      asm-generic: don't include <linux/mm.h> in cacheflush.h
      asm-generic: improve the flush_dcache_page stub
      alpha: use asm-generic/cacheflush.h
      arm64: use asm-generic/cacheflush.h
      c6x: use asm-generic/cacheflush.h
      hexagon: use asm-generic/cacheflush.h
      ia64: use asm-generic/cacheflush.h
      microblaze: use asm-generic/cacheflush.h
      m68knommu: use asm-generic/cacheflush.h
      openrisc: use asm-generic/cacheflush.h
      powerpc: use asm-generic/cacheflush.h
      riscv: use asm-generic/cacheflush.h
      arm,sparc,unicore32: remove flush_icache_user_range
      mm: rename flush_icache_user_range to flush_icache_user_page
      asm-generic: add a flush_icache_user_range stub
      sh: implement flush_icache_user_range
      xtensa: implement flush_icache_user_range
      arm: rename flush_cache_user_range to flush_icache_user_range
      m68k: implement flush_icache_user_range
      exec: only build read_code when needed
      exec: use flush_icache_user_range in read_code
      binfmt_flat: use flush_icache_user_range
      nommu: use flush_icache_user_range in brk and mmap
      module: move the set_fs hack for flush_icache_range to m68k

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      doc: cgroup: update note about conditions when oom killer is invoked

 Documentation/admin-guide/cgroup-v2.rst           |   17 +-
 Documentation/admin-guide/dynamic-debug-howto.rst |    5 
 Documentation/admin-guide/kdump/kdump.rst         |    8 +
 Documentation/admin-guide/kernel-parameters.txt   |   34 +++-
 Documentation/admin-guide/sysctl/kernel.rst       |   37 ++++
 Documentation/core-api/pin_user_pages.rst         |   47 ++++--
 arch/alpha/include/asm/cacheflush.h               |   38 +----
 arch/alpha/kernel/smp.c                           |    2 
 arch/arm/include/asm/cacheflush.h                 |    7 
 arch/arm/kernel/fiq.c                             |    4 
 arch/arm/kernel/traps.c                           |    2 
 arch/arm64/include/asm/cacheflush.h               |   46 ------
 arch/c6x/include/asm/cacheflush.h                 |   19 --
 arch/hexagon/include/asm/cacheflush.h             |   19 --
 arch/ia64/include/asm/cacheflush.h                |   30 ----
 arch/m68k/include/asm/cacheflush_mm.h             |    6 
 arch/m68k/include/asm/cacheflush_no.h             |   19 --
 arch/m68k/mm/cache.c                              |   13 +
 arch/microblaze/include/asm/cacheflush.h          |   29 ---
 arch/nds32/include/asm/cacheflush.h               |    4 
 arch/nds32/mm/cacheflush.c                        |    3 
 arch/openrisc/include/asm/cacheflush.h            |   33 ----
 arch/powerpc/include/asm/cacheflush.h             |   46 +-----
 arch/powerpc/kvm/book3s_64_mmu_hv.c               |    2 
 arch/powerpc/kvm/book3s_64_mmu_radix.c            |    2 
 arch/powerpc/mm/mem.c                             |    3 
 arch/powerpc/perf/callchain_64.c                  |    4 
 arch/riscv/include/asm/cacheflush.h               |   65 --------
 arch/sh/include/asm/cacheflush.h                  |    1 
 arch/sparc/include/asm/cacheflush_32.h            |    2 
 arch/sparc/include/asm/cacheflush_64.h            |    1 
 arch/um/include/asm/tlb.h                         |    2 
 arch/unicore32/include/asm/cacheflush.h           |   11 -
 arch/x86/include/asm/cacheflush.h                 |    2 
 arch/xtensa/include/asm/cacheflush.h              |    2 
 drivers/media/platform/omap3isp/ispvideo.c        |    2 
 drivers/nvdimm/pmem.c                             |    3 
 drivers/vhost/vhost.c                             |    5 
 fs/binfmt_flat.c                                  |    2 
 fs/exec.c                                         |    5 
 fs/proc/proc_sysctl.c                             |  163 ++++++++++++++++++++--
 include/asm-generic/cacheflush.h                  |   25 +--
 include/linux/dev_printk.h                        |    6 
 include/linux/dynamic_debug.h                     |    2 
 include/linux/ipc_namespace.h                     |    2 
 include/linux/kernel.h                            |    9 +
 include/linux/mm.h                                |   12 +
 include/linux/net.h                               |    3 
 include/linux/netdevice.h                         |    6 
 include/linux/printk.h                            |    9 -
 include/linux/sched/sysctl.h                      |    7 
 include/linux/sysctl.h                            |    4 
 include/linux/xarray.h                            |    4 
 include/rdma/ib_verbs.h                           |    6 
 init/main.c                                       |    2 
 ipc/msg.c                                         |    2 
 ipc/namespace.c                                   |   24 ++-
 kernel/events/core.c                              |    4 
 kernel/events/uprobes.c                           |    2 
 kernel/hung_task.c                                |   30 ++--
 kernel/module.c                                   |    8 -
 kernel/panic.c                                    |   45 ++++++
 kernel/sysctl.c                                   |   38 ++++-
 kernel/watchdog.c                                 |   37 +---
 lib/Kconfig.debug                                 |   12 +
 lib/Makefile                                      |    2 
 lib/dynamic_debug.c                               |    9 -
 lib/test_sysctl.c                                 |   13 +
 mm/frame_vector.c                                 |    7 
 mm/gup.c                                          |   74 +++++++--
 mm/mmap.c                                         |   28 ++-
 mm/nommu.c                                        |    4 
 mm/page_alloc.c                                   |    9 -
 mm/page_idle.c                                    |    7 
 tools/testing/selftests/sysctl/sysctl.sh          |   44 +++++
 virt/kvm/kvm_main.c                               |    8 -
 76 files changed, 732 insertions(+), 517 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-06-04 23:45 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-06-04 23:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


- More MM work.  100ish more to go.  Mike's "mm: remove
  __ARCH_HAS_5LEVEL_HACK" series should fix the current ppc issue.

- Various other little subsystems

127 patches, based on 6929f71e46bdddbf1c4d67c2728648176c67c555.


Subsystems affected by this patch series:

  kcov
  mm/pagemap
  mm/vmalloc
  mm/kmap
  mm/util
  mm/memory-hotplug
  mm/cleanups
  mm/zram
  procfs
  core-kernel
  get_maintainer
  lib
  bitops
  checkpatch
  binfmt
  init
  fat
  seq_file
  exec
  rapidio
  relay
  selftests
  ubsan

Subsystem: kcov

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kcov: collect coverage from usb soft interrupts", v4:
      kcov: cleanup debug messages
      kcov: fix potential use-after-free in kcov_remote_start
      kcov: move t->kcov assignments into kcov_start/stop
      kcov: move t->kcov_sequence assignment
      kcov: use t->kcov_mode as enabled indicator
      kcov: collect coverage from interrupts
      usb: core: kcov: collect coverage from usb complete callback

Subsystem: mm/pagemap

    Feng Tang <feng.tang@intel.com>:
      mm/util.c: remove the VM_WARN_ONCE for vm_committed_as underflow check

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "mm: remove __ARCH_HAS_5LEVEL_HACK", v4:
      h8300: remove usage of __ARCH_USE_5LEVEL_HACK
      arm: add support for folded p4d page tables
      arm64: add support for folded p4d page tables
      hexagon: remove __ARCH_USE_5LEVEL_HACK
      ia64: add support for folded p4d page tables
      nios2: add support for folded p4d page tables
      openrisc: add support for folded p4d page tables
      powerpc: add support for folded p4d page tables

    Geert Uytterhoeven <geert+renesas@glider.be>:
      sh: fault: modernize printing of kernel messages

    Mike Rapoport <rppt@linux.ibm.com>:
      sh: drop __pXd_offset() macros that duplicate pXd_index() ones
      sh: add support for folded p4d page tables
      unicore32: remove __ARCH_USE_5LEVEL_HACK
      asm-generic: remove pgtable-nop4d-hack.h
      mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm/debug: Add tests validating architecture page table:
      x86/mm: define mm_p4d_folded()
      mm/debug: add tests validating architecture page table helpers

Subsystem: mm/vmalloc

    Jeongtae Park <jtp.park@samsung.com>:
      mm/vmalloc: fix a typo in comment

Subsystem: mm/kmap

    Ira Weiny <ira.weiny@intel.com>:
    Patch series "Remove duplicated kmap code", v3:
      arch/kmap: remove BUG_ON()
      arch/xtensa: move kmap build bug out of the way
      arch/kmap: remove redundant arch specific kmaps
      arch/kunmap: remove duplicate kunmap implementations
      {x86,powerpc,microblaze}/kmap: move preempt disable
      arch/kmap_atomic: consolidate duplicate code
      arch/kunmap_atomic: consolidate duplicate code
      arch/kmap: ensure kmap_prot visibility
      arch/kmap: don't hard code kmap_prot values
      arch/kmap: define kmap_atomic_prot() for all arch's
      drm: remove drm specific kmap_atomic code
      kmap: remove kmap_atomic_to_page()
      parisc/kmap: remove duplicate kmap code
      sparc: remove unnecessary includes
      kmap: consolidate kmap_prot definitions

Subsystem: mm/util

    Waiman Long <longman@redhat.com>:
      mm: add kvfree_sensitive() for freeing sensitive data objects

Subsystem: mm/memory-hotplug

    Vishal Verma <vishal.l.verma@intel.com>:
      mm/memory_hotplug: refrain from adding memory into an impossible node

    David Hildenbrand <david@redhat.com>:
      powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
      mm/memory_hotplug: remove is_mem_section_removable()
    Patch series "mm/memory_hotplug: handle memblocks only with:
      mm/memory_hotplug: set node_start_pfn of hotadded pgdat to 0
      mm/memory_hotplug: handle memblocks only with CONFIG_ARCH_KEEP_MEMBLOCK
    Patch series "mm/memory_hotplug: Interface to add driver-managed system:
      mm/memory_hotplug: introduce add_memory_driver_managed()
      kexec_file: don't place kexec images on IORESOURCE_MEM_DRIVER_MANAGED
      device-dax: add memory via add_memory_driver_managed()

    Michal Hocko <mhocko@kernel.org>:
      mm/memory_hotplug: disable the functionality for 32b

Subsystem: mm/cleanups

    chenqiwu <chenqiwu@xiaomi.com>:
      mm: replace zero-length array with flexible-array member

    Ethon Paul <ethp@qq.com>:
      mm/memory_hotplug: fix a typo in comment "recoreded"->"recorded"
      mm: ksm: fix a typo in comment "alreaady"->"already"
      mm: mmap: fix a typo in comment "compatbility"->"compatibility"
      mm/hugetlb: fix a typos in comments
      mm/vmsan: fix some typos in comment
      mm/compaction: fix a typo in comment "pessemistic"->"pessimistic"
      mm/memblock: fix a typo in comment "implict"->"implicit"
      mm/list_lru: fix a typo in comment "numbesr"->"numbers"
      mm/filemap: fix a typo in comment "unneccssary"->"unnecessary"
      mm/frontswap: fix some typos in frontswap.c
      mm, memcg: fix some typos in memcontrol.c
      mm: fix a typo in comment "strucure"->"structure"
      mm/slub: fix a typo in comment "disambiguiation"->"disambiguation"
      mm/sparse: fix a typo in comment "convienence"->"convenience"
      mm/page-writeback: fix a typo in comment "effictive"->"effective"
      mm/memory: fix a typo in comment "attampt"->"attempt"

    Zou Wei <zou_wei@huawei.com>:
      mm: use false for bool variable

    Jason Yan <yanaijie@huawei.com>:
      include/linux/mm.h: return true in cpupid_pid_unset()

Subsystem: mm/zram

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      zcomp: Use ARRAY_SIZE() for backends list

Subsystem: procfs

    Alexey Dobriyan <adobriyan@gmail.com>:
      proc: rename "catch" function argument

Subsystem: core-kernel

    Jason Yan <yanaijie@huawei.com>:
      user.c: make uidhash_table static

Subsystem: get_maintainer

    Joe Perches <joe@perches.com>:
      get_maintainer: add email addresses from .yaml files
      get_maintainer: fix unexpected behavior for path/to//file (double slashes)

Subsystem: lib

    Christophe JAILLET <christophe.jaillet@wanadoo.fr>:
      lib/math: avoid trailing newline hidden in pr_fmt()

    KP Singh <kpsingh@chromium.org>:
      lib: Add might_fault() to strncpy_from_user.

    Jason Yan <yanaijie@huawei.com>:
      lib/test_lockup.c: make test_inode static

    Jann Horn <jannh@google.com>:
      lib/zlib: remove outdated and incorrect pre-increment optimization

    Joe Perches <joe@perches.com>:
      lib/percpu-refcount.c: use a more common logging style

    Tan Hu <tan.hu@zte.com.cn>:
      lib/flex_proportions.c: cleanup __fprop_inc_percpu_max

    Jesse Brandeburg <jesse.brandeburg@intel.com>:
      lib: make a test module with set/clear bit

Subsystem: bitops

    Arnd Bergmann <arnd@arndb.de>:
      include/linux/bitops.h: avoid clang shift-count-overflow warnings

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: additional MAINTAINER section entry ordering checks
      checkpatch: look for c99 comments in ctx_locate_comment
      checkpatch: disallow --git and --file/--fix

    Geert Uytterhoeven <geert+renesas@glider.be>:
      checkpatch: use patch subject when reading from stdin

Subsystem: binfmt

    Anthony Iliopoulos <ailiop@suse.com>:
      fs/binfmt_elf: remove redundant elf_map ifndef

    Nick Desaulniers <ndesaulniers@google.com>:
      elfnote: mark all .note sections SHF_ALLOC

Subsystem: init

    Chris Down <chris@chrisdown.name>:
      init: allow distribution configuration of default init

Subsystem: fat

    OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>:
      fat: don't allow to mount if the FAT length == 0
      fat: improve the readahead for FAT entries

Subsystem: seq_file

    Joe Perches <joe@perches.com>:
      fs/seq_file.c: seq_read: Update pr_info_ratelimited

    Kefeng Wang <wangkefeng.wang@huawei.com>:
    Patch series "seq_file: Introduce DEFINE_SEQ_ATTRIBUTE() helper macro":
      include/linux/seq_file.h: introduce DEFINE_SEQ_ATTRIBUTE() helper macro
      mm/vmstat.c: convert to use DEFINE_SEQ_ATTRIBUTE macro
      kernel/kprobes.c: convert to use DEFINE_SEQ_ATTRIBUTE macro

Subsystem: exec

    Christoph Hellwig <hch@lst.de>:
      exec: simplify the copy_strings_kernel calling convention
      exec: open code copy_string_kernel

Subsystem: rapidio

    Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>:
      rapidio: avoid data race between file operation callbacks and mport_cdev_add().

    John Hubbard <jhubbard@nvidia.com>:
      rapidio: convert get_user_pages() --> pin_user_pages()

Subsystem: relay

    Daniel Axtens <dja@axtens.net>:
      kernel/relay.c: handle alloc_percpu returning NULL in relay_open

    Pengcheng Yang <yangpc@wangsu.com>:
      kernel/relay.c: fix read_pos error when multiple readers

Subsystem: selftests

    Ram Pai <linuxram@us.ibm.com>:
    Patch series "selftests, powerpc, x86: Memory Protection Keys", v19:
      selftests/x86/pkeys: move selftests to arch-neutral directory
      selftests/vm/pkeys: rename all references to pkru to a generic name
      selftests/vm/pkeys: move generic definitions to header file

    Thiago Jung Bauermann <bauerman@linux.ibm.com>:
      selftests/vm/pkeys: move some definitions to arch-specific header
      selftests/vm/pkeys: make gcc check arguments of sigsafe_printf()

    Sandipan Das <sandipan@linux.ibm.com>:
      selftests: vm: pkeys: Use sane types for pkey register
      selftests: vm: pkeys: add helpers for pkey bits

    Ram Pai <linuxram@us.ibm.com>:
      selftests/vm/pkeys: fix pkey_disable_clear()
      selftests/vm/pkeys: fix assertion in pkey_disable_set/clear()
      selftests/vm/pkeys: fix alloc_random_pkey() to make it really random

    Sandipan Das <sandipan@linux.ibm.com>:
      selftests: vm: pkeys: use the correct huge page size

    Ram Pai <linuxram@us.ibm.com>:
      selftests/vm/pkeys: introduce generic pkey abstractions
      selftests/vm/pkeys: introduce powerpc support

    "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>:
      selftests/vm/pkeys: fix number of reserved powerpc pkeys

    Ram Pai <linuxram@us.ibm.com>:
      selftests/vm/pkeys: fix assertion in test_pkey_alloc_exhaust()
      selftests/vm/pkeys: improve checks to determine pkey support
      selftests/vm/pkeys: associate key on a mapped page and detect access violation
      selftests/vm/pkeys: associate key on a mapped page and detect write violation
      selftests/vm/pkeys: detect write violation on a mapped access-denied-key page
      selftests/vm/pkeys: introduce a sub-page allocator
      selftests/vm/pkeys: test correct behaviour of pkey-0
      selftests/vm/pkeys: override access right definitions on powerpc

    Sandipan Das <sandipan@linux.ibm.com>:
      selftests: vm: pkeys: use the correct page size on powerpc
      selftests: vm: pkeys: fix multilib builds for x86

    Jagadeesh Pagadala <jagdsh.linux@gmail.com>:
      tools/testing/selftests/vm: remove duplicate headers

Subsystem: ubsan

    Arnd Bergmann <arnd@arndb.de>:
      lib/ubsan.c: fix gcc-10 warnings

 Documentation/dev-tools/kcov.rst                               |   17 
 Documentation/features/debug/debug-vm-pgtable/arch-support.txt |   34 
 arch/arc/Kconfig                                               |    1 
 arch/arc/include/asm/highmem.h                                 |   20 
 arch/arc/mm/highmem.c                                          |   34 
 arch/arm/include/asm/highmem.h                                 |    9 
 arch/arm/include/asm/pgtable.h                                 |    1 
 arch/arm/lib/uaccess_with_memcpy.c                             |    7 
 arch/arm/mach-sa1100/assabet.c                                 |    2 
 arch/arm/mm/dump.c                                             |   29 
 arch/arm/mm/fault-armv.c                                       |    7 
 arch/arm/mm/fault.c                                            |   22 
 arch/arm/mm/highmem.c                                          |   41 
 arch/arm/mm/idmap.c                                            |    3 
 arch/arm/mm/init.c                                             |    2 
 arch/arm/mm/ioremap.c                                          |   12 
 arch/arm/mm/mm.h                                               |    2 
 arch/arm/mm/mmu.c                                              |   35 
 arch/arm/mm/pgd.c                                              |   40 
 arch/arm64/Kconfig                                             |    1 
 arch/arm64/include/asm/kvm_mmu.h                               |   10 
 arch/arm64/include/asm/pgalloc.h                               |   10 
 arch/arm64/include/asm/pgtable-types.h                         |    5 
 arch/arm64/include/asm/pgtable.h                               |   37 
 arch/arm64/include/asm/stage2_pgtable.h                        |   48 
 arch/arm64/kernel/hibernate.c                                  |   44 
 arch/arm64/kvm/mmu.c                                           |  209 
 arch/arm64/mm/fault.c                                          |    9 
 arch/arm64/mm/hugetlbpage.c                                    |   15 
 arch/arm64/mm/kasan_init.c                                     |   26 
 arch/arm64/mm/mmu.c                                            |   52 
 arch/arm64/mm/pageattr.c                                       |    7 
 arch/csky/include/asm/highmem.h                                |   12 
 arch/csky/mm/highmem.c                                         |   64 
 arch/h8300/include/asm/pgtable.h                               |    1 
 arch/hexagon/include/asm/fixmap.h                              |    4 
 arch/hexagon/include/asm/pgtable.h                             |    1 
 arch/ia64/include/asm/pgalloc.h                                |    4 
 arch/ia64/include/asm/pgtable.h                                |   17 
 arch/ia64/mm/fault.c                                           |    7 
 arch/ia64/mm/hugetlbpage.c                                     |   18 
 arch/ia64/mm/init.c                                            |   28 
 arch/microblaze/include/asm/highmem.h                          |   55 
 arch/microblaze/mm/highmem.c                                   |   21 
 arch/microblaze/mm/init.c                                      |    3 
 arch/mips/include/asm/highmem.h                                |   11 
 arch/mips/mm/cache.c                                           |    6 
 arch/mips/mm/highmem.c                                         |   62 
 arch/nds32/include/asm/highmem.h                               |    9 
 arch/nds32/mm/highmem.c                                        |   49 
 arch/nios2/include/asm/pgtable.h                               |    3 
 arch/nios2/mm/fault.c                                          |    9 
 arch/nios2/mm/ioremap.c                                        |    6 
 arch/openrisc/include/asm/pgtable.h                            |    1 
 arch/openrisc/mm/fault.c                                       |   10 
 arch/openrisc/mm/init.c                                        |    4 
 arch/parisc/include/asm/cacheflush.h                           |   32 
 arch/powerpc/Kconfig                                           |    1 
 arch/powerpc/include/asm/book3s/32/pgtable.h                   |    1 
 arch/powerpc/include/asm/book3s/64/hash.h                      |    4 
 arch/powerpc/include/asm/book3s/64/pgalloc.h                   |    4 
 arch/powerpc/include/asm/book3s/64/pgtable.h                   |   60 
 arch/powerpc/include/asm/book3s/64/radix.h                     |    6 
 arch/powerpc/include/asm/highmem.h                             |   56 
 arch/powerpc/include/asm/nohash/32/pgtable.h                   |    1 
 arch/powerpc/include/asm/nohash/64/pgalloc.h                   |    2 
 arch/powerpc/include/asm/nohash/64/pgtable-4k.h                |   32 
 arch/powerpc/include/asm/nohash/64/pgtable.h                   |    6 
 arch/powerpc/include/asm/pgtable.h                             |   10 
 arch/powerpc/kvm/book3s_64_mmu_radix.c                         |   32 
 arch/powerpc/lib/code-patching.c                               |    7 
 arch/powerpc/mm/book3s64/hash_pgtable.c                        |    4 
 arch/powerpc/mm/book3s64/radix_pgtable.c                       |   26 
 arch/powerpc/mm/book3s64/subpage_prot.c                        |    6 
 arch/powerpc/mm/highmem.c                                      |   26 
 arch/powerpc/mm/hugetlbpage.c                                  |   28 
 arch/powerpc/mm/kasan/kasan_init_32.c                          |    2 
 arch/powerpc/mm/mem.c                                          |    3 
 arch/powerpc/mm/nohash/book3e_pgtable.c                        |   15 
 arch/powerpc/mm/pgtable.c                                      |   30 
 arch/powerpc/mm/pgtable_64.c                                   |   10 
 arch/powerpc/mm/ptdump/hashpagetable.c                         |   20 
 arch/powerpc/mm/ptdump/ptdump.c                                |   12 
 arch/powerpc/platforms/pseries/hotplug-memory.c                |   26 
 arch/powerpc/xmon/xmon.c                                       |   27 
 arch/s390/Kconfig                                              |    1 
 arch/sh/include/asm/pgtable-2level.h                           |    1 
 arch/sh/include/asm/pgtable-3level.h                           |    1 
 arch/sh/include/asm/pgtable_32.h                               |    5 
 arch/sh/include/asm/pgtable_64.h                               |    5 
 arch/sh/kernel/io_trapped.c                                    |    7 
 arch/sh/mm/cache-sh4.c                                         |    4 
 arch/sh/mm/cache-sh5.c                                         |    7 
 arch/sh/mm/fault.c                                             |   64 
 arch/sh/mm/hugetlbpage.c                                       |   28 
 arch/sh/mm/init.c                                              |   15 
 arch/sh/mm/kmap.c                                              |    2 
 arch/sh/mm/tlbex_32.c                                          |    6 
 arch/sh/mm/tlbex_64.c                                          |    7 
 arch/sparc/include/asm/highmem.h                               |   29 
 arch/sparc/mm/highmem.c                                        |   31 
 arch/sparc/mm/io-unit.c                                        |    1 
 arch/sparc/mm/iommu.c                                          |    1 
 arch/unicore32/include/asm/pgtable.h                           |    1 
 arch/unicore32/kernel/hibernate.c                              |    4 
 arch/x86/Kconfig                                               |    1 
 arch/x86/include/asm/fixmap.h                                  |    1 
 arch/x86/include/asm/highmem.h                                 |   37 
 arch/x86/include/asm/pgtable_64.h                              |    6 
 arch/x86/mm/highmem_32.c                                       |   52 
 arch/xtensa/include/asm/highmem.h                              |   31 
 arch/xtensa/mm/highmem.c                                       |   28 
 drivers/block/zram/zcomp.c                                     |    7 
 drivers/dax/dax-private.h                                      |    1 
 drivers/dax/kmem.c                                             |   28 
 drivers/gpu/drm/ttm/ttm_bo_util.c                              |   56 
 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c                           |   17 
 drivers/rapidio/devices/rio_mport_cdev.c                       |   27 
 drivers/usb/core/hcd.c                                         |    3 
 fs/binfmt_elf.c                                                |    4 
 fs/binfmt_em86.c                                               |    6 
 fs/binfmt_misc.c                                               |    4 
 fs/binfmt_script.c                                             |    6 
 fs/exec.c                                                      |   58 
 fs/fat/fatent.c                                                |  103 
 fs/fat/inode.c                                                 |    6 
 fs/proc/array.c                                                |    8 
 fs/seq_file.c                                                  |    7 
 include/asm-generic/5level-fixup.h                             |   59 
 include/asm-generic/pgtable-nop4d-hack.h                       |   64 
 include/asm-generic/pgtable-nopud.h                            |    4 
 include/drm/ttm/ttm_bo_api.h                                   |    4 
 include/linux/binfmts.h                                        |    3 
 include/linux/bitops.h                                         |    2 
 include/linux/elfnote.h                                        |    2 
 include/linux/highmem.h                                        |   89 
 include/linux/ioport.h                                         |    1 
 include/linux/memory_hotplug.h                                 |    9 
 include/linux/mm.h                                             |   12 
 include/linux/sched.h                                          |    3 
 include/linux/seq_file.h                                       |   19 
 init/Kconfig                                                   |   10 
 init/main.c                                                    |   10 
 kernel/kcov.c                                                  |  282 -
 kernel/kexec_file.c                                            |    5 
 kernel/kprobes.c                                               |   34 
 kernel/relay.c                                                 |   22 
 kernel/user.c                                                  |    2 
 lib/Kconfig.debug                                              |   44 
 lib/Makefile                                                   |    2 
 lib/flex_proportions.c                                         |    7 
 lib/math/prime_numbers.c                                       |   10 
 lib/percpu-refcount.c                                          |    6 
 lib/strncpy_from_user.c                                        |    1 
 lib/test_bitops.c                                              |   60 
 lib/test_lockup.c                                              |    2 
 lib/ubsan.c                                                    |   33 
 lib/zlib_inflate/inffast.c                                     |   91 
 mm/Kconfig                                                     |    4 
 mm/Makefile                                                    |    1 
 mm/compaction.c                                                |    2 
 mm/debug_vm_pgtable.c                                          |  382 +
 mm/filemap.c                                                   |    2 
 mm/frontswap.c                                                 |    6 
 mm/huge_memory.c                                               |    2 
 mm/hugetlb.c                                                   |   16 
 mm/internal.h                                                  |    2 
 mm/kasan/init.c                                                |   11 
 mm/ksm.c                                                       |   10 
 mm/list_lru.c                                                  |    2 
 mm/memblock.c                                                  |    2 
 mm/memcontrol.c                                                |    4 
 mm/memory.c                                                    |   10 
 mm/memory_hotplug.c                                            |  179 
 mm/mmap.c                                                      |    2 
 mm/mremap.c                                                    |    2 
 mm/page-writeback.c                                            |    2 
 mm/slub.c                                                      |    2 
 mm/sparse.c                                                    |    2 
 mm/util.c                                                      |   22 
 mm/vmalloc.c                                                   |    2 
 mm/vmscan.c                                                    |    6 
 mm/vmstat.c                                                    |   32 
 mm/zbud.c                                                      |    2 
 scripts/checkpatch.pl                                          |   62 
 scripts/get_maintainer.pl                                      |   46 
 security/keys/internal.h                                       |   11 
 security/keys/keyctl.c                                         |   16 
 tools/testing/selftests/lib/config                             |    1 
 tools/testing/selftests/vm/.gitignore                          |    1 
 tools/testing/selftests/vm/Makefile                            |   75 
 tools/testing/selftests/vm/mremap_dontunmap.c                  |    1 
 tools/testing/selftests/vm/pkey-helpers.h                      |  557 +-
 tools/testing/selftests/vm/pkey-powerpc.h                      |  153 
 tools/testing/selftests/vm/pkey-x86.h                          |  191 
 tools/testing/selftests/vm/protection_keys.c                   | 2370 ++++++++--
 tools/testing/selftests/x86/.gitignore                         |    1 
 tools/testing/selftests/x86/Makefile                           |    2 
 tools/testing/selftests/x86/pkey-helpers.h                     |  219 
 tools/testing/selftests/x86/protection_keys.c                  | 1506 ------
 200 files changed, 5182 insertions(+), 4033 deletions(-)




^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-06-03 22:55 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-06-03 22:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


More mm/ work, plenty more to come.

131 patches, based on d6f9469a03d832dcd17041ed67774ffb5f3e73b3.

Subsystems affected by this patch series:

  mm/slub
  mm/memcg
  mm/gup
  mm/kasan
  mm/pagealloc
  mm/hugetlb
  mm/vmscan
  mm/tools
  mm/mempolicy
  mm/memblock
  mm/hugetlbfs
  mm/thp
  mm/mmap
  mm/kconfig

Subsystem: mm/slub

    Wang Hai <wanghai38@huawei.com>:
      mm/slub: fix a memory leak in sysfs_slab_add()

Subsystem: mm/memcg

    Shakeel Butt <shakeelb@google.com>:
      mm/memcg: optimize memory.numa_stat like memory.stat

Subsystem: mm/gup

    John Hubbard <jhubbard@nvidia.com>:
    Patch series "mm/gup, drm/i915: refactor gup_fast, convert to pin_user_pages()", v2:
      mm/gup: move __get_user_pages_fast() down a few lines in gup.c
      mm/gup: refactor and de-duplicate gup_fast() code
      mm/gup: introduce pin_user_pages_fast_only()
      drm/i915: convert get_user_pages() --> pin_user_pages()
      mm/gup: might_lock_read(mmap_sem) in get_user_pages_fast()

Subsystem: mm/kasan

    Daniel Axtens <dja@axtens.net>:
    Patch series "Fix some incompatibilites between KASAN and FORTIFY_SOURCE", v4:
      kasan: stop tests being eliminated as dead code with FORTIFY_SOURCE
      string.h: fix incompatibility between FORTIFY_SOURCE and KASAN

Subsystem: mm/pagealloc

    Michal Hocko <mhocko@suse.com>:
      mm: clarify __GFP_MEMALLOC usage

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "mm: rework free_area_init*() funcitons":
      mm: memblock: replace dereferences of memblock_region.nid with API calls
      mm: make early_pfn_to_nid() and related defintions close to each other
      mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option
      mm: free_area_init: use maximal zone PFNs rather than zone sizes
      mm: use free_area_init() instead of free_area_init_nodes()
      alpha: simplify detection of memory zone boundaries
      arm: simplify detection of memory zone boundaries
      arm64: simplify detection of memory zone boundaries for UMA configs
      csky: simplify detection of memory zone boundaries
      m68k: mm: simplify detection of memory zone boundaries
      parisc: simplify detection of memory zone boundaries
      sparc32: simplify detection of memory zone boundaries
      unicore32: simplify detection of memory zone boundaries
      xtensa: simplify detection of memory zone boundaries

    Baoquan He <bhe@redhat.com>:
      mm: memmap_init: iterate over memblock regions rather that check each PFN

    Mike Rapoport <rppt@linux.ibm.com>:
      mm: remove early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES
      mm: free_area_init: allow defining max_zone_pfn in descending order
      mm: rename free_area_init_node() to free_area_init_memoryless_node()
      mm: clean up free_area_init_node() and its helpers
      mm: simplify find_min_pfn_with_active_regions()
      docs/vm: update memory-models documentation

    Wei Yang <richard.weiyang@gmail.com>:
    Patch series "mm/page_alloc.c: cleanup on check page", v3:
      mm/page_alloc.c: bad_[reason|flags] is not necessary when PageHWPoison
      mm/page_alloc.c: bad_flags is not necessary for bad_page()
      mm/page_alloc.c: rename free_pages_check_bad() to check_free_page_bad()
      mm/page_alloc.c: rename free_pages_check() to check_free_page()
      mm/page_alloc.c: extract check_[new|free]_page_bad() common part to page_bad_reason()

    Roman Gushchin <guro@fb.com>:
      mm,page_alloc,cma: conditionally prefer cma pageblocks for movable allocations

    Baoquan He <bhe@redhat.com>:
      mm/page_alloc.c: remove unused free_bootmem_with_active_regions
    Patch series "improvements about lowmem_reserve and /proc/zoneinfo", v2:
      mm/page_alloc.c: only tune sysctl_lowmem_reserve_ratio value once when changing it
      mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
      mm/vmstat.c: do not show lowmem reserve protection information of empty zone

    Joonsoo Kim <iamjoonsoo.kim@lge.com>:
    Patch series "integrate classzone_idx and high_zoneidx", v5:
      mm/page_alloc: use ac->high_zoneidx for classzone_idx
      mm/page_alloc: integrate classzone_idx and high_zoneidx

    Wei Yang <richard.weiyang@gmail.com>:
      mm/page_alloc.c: use NODE_MASK_NONE in build_zonelists()
      mm: rename gfpflags_to_migratetype to gfp_migratetype for same convention

    Sandipan Das <sandipan@linux.ibm.com>:
      mm/page_alloc.c: reset numa stats for boot pagesets

    Charan Teja Reddy <charante@codeaurora.org>:
      mm, page_alloc: reset the zone->watermark_boost early

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/page_alloc: restrict and formalize compound_page_dtors[]

    Daniel Jordan <daniel.m.jordan@oracle.com>:
    Patch series "initialize deferred pages with interrupts enabled", v4:
      mm/pagealloc.c: call touch_nmi_watchdog() on max order boundaries in deferred init

    Pavel Tatashin <pasha.tatashin@soleen.com>:
      mm: initialize deferred pages with interrupts enabled
      mm: call cond_resched() from deferred_init_memmap()

    Daniel Jordan <daniel.m.jordan@oracle.com>:
    Patch series "padata: parallelize deferred page init", v3:
      padata: remove exit routine
      padata: initialize earlier
      padata: allocate work structures for parallel jobs from a pool
      padata: add basic support for multithreaded jobs
      mm: don't track number of pages during deferred initialization
      mm: parallelize deferred_init_memmap()
      mm: make deferred init's max threads arch-specific
      padata: document multithreaded jobs

    Chen Tao <chentao107@huawei.com>:
      mm/page_alloc.c: add missing newline

Subsystem: mm/hugetlb

    "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>:
    Patch series "thp/khugepaged improvements and CoW semantics", v4:
      khugepaged: add self test
      khugepaged: do not stop collapse if less than half PTEs are referenced
      khugepaged: drain all LRU caches before scanning pages
      khugepaged: drain LRU add pagevec after swapin
      khugepaged: allow to collapse a page shared across fork
      khugepaged: allow to collapse PTE-mapped compound pages
      thp: change CoW semantics for anon-THP
      khugepaged: introduce 'max_ptes_shared' tunable

    Mike Kravetz <mike.kravetz@oracle.com>:
    Patch series "Clean up hugetlb boot command line processing", v4:
      hugetlbfs: add arch_hugetlb_valid_size
      hugetlbfs: move hugepagesz= parsing to arch independent code
      hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
      hugetlbfs: clean up command line processing
      hugetlbfs: fix changes to command line processing

    Li Xinhai <lixinhai.lxh@gmail.com>:
      mm/hugetlb: avoid unnecessary check on pud and pmd entry in huge_pte_offset

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm/hugetlb: Add some new generic fallbacks", v3:
      arm64/mm: drop __HAVE_ARCH_HUGE_PTEP_GET
      mm/hugetlb: define a generic fallback for is_hugepage_only_range()
      mm/hugetlb: define a generic fallback for arch_clear_hugepage_flags()

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: simplify calling a compound page destructor

Subsystem: mm/vmscan

    Wei Yang <richard.weiyang@gmail.com>:
      mm/vmscan.c: use update_lru_size() in update_lru_sizes()

    Jaewon Kim <jaewon31.kim@samsung.com>:
      mm/vmscan: count layzfree pages and fix nr_isolated_* mismatch

    Maninder Singh <maninder1.s@samsung.com>:
      mm/vmscan.c: change prototype for shrink_page_list

    Qiwu Chen <qiwuchen55@gmail.com>:
      mm/vmscan: update the comment of should_continue_reclaim()

    Johannes Weiner <hannes@cmpxchg.org>:
    Patch series "mm: memcontrol: charge swapin pages on instantiation", v2:
      mm: fix NUMA node file count error in replace_page_cache()
      mm: memcontrol: fix stat-corrupting race in charge moving
      mm: memcontrol: drop @compound parameter from memcg charging API
      mm: shmem: remove rare optimization when swapin races with hole punching
      mm: memcontrol: move out cgroup swaprate throttling
      mm: memcontrol: convert page cache to a new mem_cgroup_charge() API
      mm: memcontrol: prepare uncharging for removal of private page type counters
      mm: memcontrol: prepare move_account for removal of private page type counters
      mm: memcontrol: prepare cgroup vmstat infrastructure for native anon counters
      mm: memcontrol: switch to native NR_FILE_PAGES and NR_SHMEM counters
      mm: memcontrol: switch to native NR_ANON_MAPPED counter
      mm: memcontrol: switch to native NR_ANON_THPS counter
      mm: memcontrol: convert anon and file-thp to new mem_cgroup_charge() API
      mm: memcontrol: drop unused try/commit/cancel charge API
      mm: memcontrol: prepare swap controller setup for integration
      mm: memcontrol: make swap tracking an integral part of memory control
      mm: memcontrol: charge swapin pages on instantiation

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm: memcontrol: document the new swap control behavior

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: delete unused lrucare handling
      mm: memcontrol: update page->mem_cgroup stability rules
      mm: fix LRU balancing effect of new transparent huge pages
      mm: keep separate anon and file statistics on page reclaim activity
      mm: allow swappiness that prefers reclaiming anon over the file workingset
      mm: fold and remove lru_cache_add_anon() and lru_cache_add_file()
      mm: workingset: let cache workingset challenge anon
      mm: remove use-once cache bias from LRU balancing
      mm: vmscan: drop unnecessary div0 avoidance rounding in get_scan_count()
      mm: base LRU balancing on an explicit cost model
      mm: deactivations shouldn't bias the LRU balance
      mm: only count actual rotations as LRU reclaim cost
      mm: balance LRU lists based on relative thrashing
      mm: vmscan: determine anon/file pressure balance at the reclaim root
      mm: vmscan: reclaim writepage is IO cost
      mm: vmscan: limit the range of LRU type balancing

    Shakeel Butt <shakeelb@google.com>:
      mm: swap: fix vmstats for huge pages
      mm: swap: memcg: fix memcg stats for huge pages

Subsystem: mm/tools

    Changhee Han <ch0.han@lge.com>:
      tools/vm/page_owner_sort.c: filter out unneeded line

Subsystem: mm/mempolicy

    Michal Hocko <mhocko@suse.com>:
      mm, mempolicy: fix up gup usage in lookup_node

Subsystem: mm/memblock

    chenqiwu <chenqiwu@xiaomi.com>:
      include/linux/memblock.h: fix minor typo and unclear comment

    Mike Rapoport <rppt@linux.ibm.com>:
      sparc32: register memory occupied by kernel as memblock.memory

Subsystem: mm/hugetlbfs

    Shijie Hu <hushijie3@huawei.com>:
      hugetlbfs: get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs

Subsystem: mm/thp

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: thp: don't need to drain lru cache when splitting and mlocking THP

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm/thp: Rename pmd_mknotpresent() as pmd_mknotvalid()", v2:
      powerpc/mm: drop platform defined pmd_mknotpresent()
      mm/thp: rename pmd_mknotpresent() as pmd_mkinvalid()

Subsystem: mm/mmap

    Scott Cheloha <cheloha@linux.vnet.ibm.com>:
      drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup

Subsystem: mm/kconfig

    Zong Li <zong.li@sifive.com>:
    Patch series "Extract DEBUG_WX to shared use":
      mm: add DEBUG_WX support
      riscv: support DEBUG_WX
      x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined
      arm64: mm: use ARCH_HAS_DEBUG_WX instead of arch defined

 Documentation/admin-guide/cgroup-v1/memory.rst           |   19 
 Documentation/admin-guide/kernel-parameters.txt          |   40 
 Documentation/admin-guide/mm/hugetlbpage.rst             |   35 
 Documentation/admin-guide/mm/transhuge.rst               |    7 
 Documentation/admin-guide/sysctl/vm.rst                  |   23 
 Documentation/core-api/padata.rst                        |   41 
 Documentation/features/vm/numa-memblock/arch-support.txt |   34 
 Documentation/vm/memory-model.rst                        |    9 
 Documentation/vm/page_owner.rst                          |    3 
 arch/alpha/mm/init.c                                     |   16 
 arch/alpha/mm/numa.c                                     |   22 
 arch/arc/include/asm/hugepage.h                          |    2 
 arch/arc/mm/init.c                                       |   41 
 arch/arm/include/asm/hugetlb.h                           |    7 
 arch/arm/include/asm/pgtable-3level.h                    |    2 
 arch/arm/mm/init.c                                       |   66 
 arch/arm64/Kconfig                                       |    2 
 arch/arm64/Kconfig.debug                                 |   29 
 arch/arm64/include/asm/hugetlb.h                         |   13 
 arch/arm64/include/asm/pgtable.h                         |    2 
 arch/arm64/mm/hugetlbpage.c                              |   48 
 arch/arm64/mm/init.c                                     |   56 
 arch/arm64/mm/numa.c                                     |    9 
 arch/c6x/mm/init.c                                       |    8 
 arch/csky/kernel/setup.c                                 |   26 
 arch/h8300/mm/init.c                                     |    6 
 arch/hexagon/mm/init.c                                   |    6 
 arch/ia64/Kconfig                                        |    1 
 arch/ia64/include/asm/hugetlb.h                          |    5 
 arch/ia64/mm/contig.c                                    |    2 
 arch/ia64/mm/discontig.c                                 |    2 
 arch/m68k/mm/init.c                                      |    6 
 arch/m68k/mm/mcfmmu.c                                    |    9 
 arch/m68k/mm/motorola.c                                  |   15 
 arch/m68k/mm/sun3mmu.c                                   |   10 
 arch/microblaze/Kconfig                                  |    1 
 arch/microblaze/mm/init.c                                |    2 
 arch/mips/Kconfig                                        |    1 
 arch/mips/include/asm/hugetlb.h                          |   11 
 arch/mips/include/asm/pgtable.h                          |    2 
 arch/mips/loongson64/numa.c                              |    2 
 arch/mips/mm/init.c                                      |    2 
 arch/mips/sgi-ip27/ip27-memory.c                         |    2 
 arch/nds32/mm/init.c                                     |   11 
 arch/nios2/mm/init.c                                     |    8 
 arch/openrisc/mm/init.c                                  |    9 
 arch/parisc/include/asm/hugetlb.h                        |   10 
 arch/parisc/mm/init.c                                    |   22 
 arch/powerpc/Kconfig                                     |   10 
 arch/powerpc/include/asm/book3s/64/pgtable.h             |    4 
 arch/powerpc/include/asm/hugetlb.h                       |    5 
 arch/powerpc/mm/hugetlbpage.c                            |   38 
 arch/powerpc/mm/mem.c                                    |    2 
 arch/riscv/Kconfig                                       |    2 
 arch/riscv/include/asm/hugetlb.h                         |   10 
 arch/riscv/include/asm/ptdump.h                          |   11 
 arch/riscv/mm/hugetlbpage.c                              |   44 
 arch/riscv/mm/init.c                                     |    5 
 arch/s390/Kconfig                                        |    1 
 arch/s390/include/asm/hugetlb.h                          |    8 
 arch/s390/mm/hugetlbpage.c                               |   34 
 arch/s390/mm/init.c                                      |    2 
 arch/sh/Kconfig                                          |    1 
 arch/sh/include/asm/hugetlb.h                            |    7 
 arch/sh/mm/init.c                                        |    2 
 arch/sparc/Kconfig                                       |   10 
 arch/sparc/include/asm/hugetlb.h                         |   10 
 arch/sparc/mm/init_32.c                                  |    1 
 arch/sparc/mm/init_64.c                                  |   67 
 arch/sparc/mm/srmmu.c                                    |   21 
 arch/um/kernel/mem.c                                     |   12 
 arch/unicore32/include/asm/memory.h                      |    2 
 arch/unicore32/include/mach/memory.h                     |    6 
 arch/unicore32/kernel/pci.c                              |   14 
 arch/unicore32/mm/init.c                                 |   43 
 arch/x86/Kconfig                                         |   11 
 arch/x86/Kconfig.debug                                   |   27 
 arch/x86/include/asm/hugetlb.h                           |   10 
 arch/x86/include/asm/pgtable.h                           |    2 
 arch/x86/mm/hugetlbpage.c                                |   35 
 arch/x86/mm/init.c                                       |    2 
 arch/x86/mm/init_64.c                                    |   12 
 arch/x86/mm/kmmio.c                                      |    2 
 arch/x86/mm/numa.c                                       |   11 
 arch/xtensa/mm/init.c                                    |    8 
 drivers/base/memory.c                                    |   44 
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c              |   22 
 fs/cifs/file.c                                           |   10 
 fs/fuse/dev.c                                            |    2 
 fs/hugetlbfs/inode.c                                     |   67 
 include/asm-generic/hugetlb.h                            |    2 
 include/linux/compaction.h                               |    9 
 include/linux/gfp.h                                      |    7 
 include/linux/hugetlb.h                                  |   16 
 include/linux/memblock.h                                 |   15 
 include/linux/memcontrol.h                               |  102 -
 include/linux/mm.h                                       |   52 
 include/linux/mmzone.h                                   |   46 
 include/linux/padata.h                                   |   43 
 include/linux/string.h                                   |   60 
 include/linux/swap.h                                     |   17 
 include/linux/vm_event_item.h                            |    4 
 include/linux/vmstat.h                                   |    2 
 include/trace/events/compaction.h                        |   22 
 include/trace/events/huge_memory.h                       |    3 
 include/trace/events/vmscan.h                            |   14 
 init/Kconfig                                             |   17 
 init/main.c                                              |    2 
 kernel/events/uprobes.c                                  |   22 
 kernel/padata.c                                          |  293 +++-
 kernel/sysctl.c                                          |    3 
 lib/test_kasan.c                                         |   29 
 mm/Kconfig                                               |    9 
 mm/Kconfig.debug                                         |   32 
 mm/compaction.c                                          |   70 -
 mm/filemap.c                                             |   55 
 mm/gup.c                                                 |  237 ++-
 mm/huge_memory.c                                         |  282 ----
 mm/hugetlb.c                                             |  260 ++-
 mm/internal.h                                            |   25 
 mm/khugepaged.c                                          |  316 ++--
 mm/memblock.c                                            |   19 
 mm/memcontrol.c                                          |  642 +++------
 mm/memory.c                                              |  103 -
 mm/memory_hotplug.c                                      |   10 
 mm/mempolicy.c                                           |    5 
 mm/migrate.c                                             |   30 
 mm/oom_kill.c                                            |    4 
 mm/page_alloc.c                                          |  735 ++++------
 mm/page_owner.c                                          |    7 
 mm/pgtable-generic.c                                     |    2 
 mm/rmap.c                                                |   53 
 mm/shmem.c                                               |  156 --
 mm/slab.c                                                |    4 
 mm/slub.c                                                |    8 
 mm/swap.c                                                |  199 +-
 mm/swap_cgroup.c                                         |   10 
 mm/swap_state.c                                          |  110 -
 mm/swapfile.c                                            |   39 
 mm/userfaultfd.c                                         |   15 
 mm/vmscan.c                                              |  344 ++--
 mm/vmstat.c                                              |   16 
 mm/workingset.c                                          |   23 
 tools/testing/selftests/vm/.gitignore                    |    1 
 tools/testing/selftests/vm/Makefile                      |    1 
 tools/testing/selftests/vm/khugepaged.c                  | 1035 +++++++++++++++
 tools/vm/page_owner_sort.c                               |    5 
 147 files changed, 3876 insertions(+), 3108 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-06-02 21:38     ` incoming Andrew Morton
@ 2020-06-02 22:18       ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-06-02 22:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Tue, Jun 2, 2020 at 2:38 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 2 Jun 2020 13:45:49 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >
> > Hmm. I have no issues with conflicts, and already took your previous series.
>
> Well that's odd.

I meant "I saw the conflicts and had no issue with them". Nothing odd.

And I actually much prefer seeing conflicts from your series (against
other pulls I've done) over having you delay your patch bombs because
of any fear for them.

              Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-06-02 20:45   ` incoming Linus Torvalds
@ 2020-06-02 21:38     ` Andrew Morton
  2020-06-02 22:18       ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-06-02 21:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, Linux-MM

On Tue, 2 Jun 2020 13:45:49 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Jun 2, 2020 at 1:08 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > The local_lock merge made rather a mess of all of this.  I'm
> > cooking up a full resend of the same material.
> 
> Hmm. I have no issues with conflicts, and already took your previous series.

Well that's odd.

> I've pushed it out now - does my tree match what you expect?

Yup, thanks.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-06-02 20:08 ` incoming Andrew Morton
@ 2020-06-02 20:45   ` Linus Torvalds
  2020-06-02 21:38     ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2020-06-02 20:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Tue, Jun 2, 2020 at 1:08 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> The local_lock merge made rather a mess of all of this.  I'm
> cooking up a full resend of the same material.

Hmm. I have no issues with conflicts, and already took your previous series.

I've pushed it out now - does my tree match what you expect?

            Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-06-02 20:09 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-06-02 20:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


A few little subsystems and a start of a lot of MM patches.

128 patches, based on f359287765c04711ff54fbd11645271d8e5ff763:


Subsystems affected by this patch series:

  squashfs
  ocfs2
  parisc
  vfs
  mm/slab-generic
  mm/slub
  mm/debug
  mm/pagecache
  mm/gup
  mm/swap
  mm/memcg
  mm/pagemap
  mm/memory-failure
  mm/vmalloc
  mm/kasan

Subsystem: squashfs

    Philippe Liard <pliard@google.com>:
      squashfs: migrate from ll_rw_block usage to BIO

Subsystem: ocfs2

    Jules Irenge <jbi.octave@gmail.com>:
      ocfs2: add missing annotation for dlm_empty_lockres()

    Gang He <ghe@suse.com>:
      ocfs2: mount shared volume without ha stack

Subsystem: parisc

    Andrew Morton <akpm@linux-foundation.org>:
      arch/parisc/include/asm/pgtable.h: remove unused `old_pte'

Subsystem: vfs

    Jeff Layton <jlayton@redhat.com>:
    Patch series "vfs: have syncfs() return error when there are writeback:
      vfs: track per-sb writeback errors and report them to syncfs
      fs/buffer.c: record blockdev write errors in super_block that it backs

Subsystem: mm/slab-generic

    Vlastimil Babka <vbabka@suse.cz>:
      usercopy: mark dma-kmalloc caches as usercopy caches

Subsystem: mm/slub

    Dongli Zhang <dongli.zhang@oracle.com>:
      mm/slub.c: fix corrupted freechain in deactivate_slab()

    Christoph Lameter <cl@linux.com>:
      slub: Remove userspace notifier for cache add/remove

    Christopher Lameter <cl@linux.com>:
      slub: remove kmalloc under list_lock from list_slab_objects() V2

    Qian Cai <cai@lca.pw>:
      mm/slub: fix stack overruns with SLUB_STATS

    Andrew Morton <akpm@linux-foundation.org>:
      Documentation/vm/slub.rst: s/Toggle/Enable/

Subsystem: mm/debug

    Vlastimil Babka <vbabka@suse.cz>:
      mm, dump_page(): do not crash with invalid mapping pointer

Subsystem: mm/pagecache

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Change readahead API", v11:
      mm: move readahead prototypes from mm.h
      mm: return void from various readahead functions
      mm: ignore return value of ->readpages
      mm: move readahead nr_pages check into read_pages
      mm: add new readahead_control API
      mm: use readahead_control to pass arguments
      mm: rename various 'offset' parameters to 'index'
      mm: rename readahead loop variable to 'i'
      mm: remove 'page_offset' from readahead loop
      mm: put readahead pages in cache earlier
      mm: add readahead address space operation
      mm: move end_index check out of readahead loop
      mm: add page_cache_readahead_unbounded
      mm: document why we don't set PageReadahead
      mm: use memalloc_nofs_save in readahead path
      fs: convert mpage_readpages to mpage_readahead
      btrfs: convert from readpages to readahead
      erofs: convert uncompressed files from readpages to readahead
      erofs: convert compressed files from readpages to readahead
      ext4: convert from readpages to readahead
      ext4: pass the inode to ext4_mpage_readpages
      f2fs: convert from readpages to readahead
      f2fs: pass the inode to f2fs_mpage_readpages
      fuse: convert from readpages to readahead
      iomap: convert from readpages to readahead

    Guoqing Jiang <guoqing.jiang@cloud.ionos.com>:
    Patch series "Introduce attach/detach_page_private to cleanup code":
      include/linux/pagemap.h: introduce attach/detach_page_private
      md: remove __clear_page_buffers and use attach/detach_page_private
      btrfs: use attach/detach_page_private
      fs/buffer.c: use attach/detach_page_private
      f2fs: use attach/detach_page_private
      iomap: use attach/detach_page_private
      ntfs: replace attach_page_buffers with attach_page_private
      orangefs: use attach/detach_page_private
      buffer_head.h: remove attach_page_buffers
      mm/migrate.c: call detach_page_private to cleanup code
      mm_types.h: change set_page_private to inline function

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/filemap.c: remove misleading comment

    Chao Yu <yuchao0@huawei.com>:
      mm/page-writeback.c: remove unused variable

    NeilBrown <neilb@suse.de>:
      mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE
      mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead

Subsystem: mm/gup

    Souptick Joarder <jrdr.linux@gmail.com>:
      mm/gup.c: update the documentation

    John Hubbard <jhubbard@nvidia.com>:
      mm/gup: introduce pin_user_pages_unlocked
      ivtv: convert get_user_pages() --> pin_user_pages()

    Miles Chen <miles.chen@mediatek.com>:
      mm/gup.c: further document vma_permits_fault()

Subsystem: mm/swap

    chenqiwu <chenqiwu@xiaomi.com>:
      mm/swapfile: use list_{prev,next}_entry() instead of open-coding

    Qian Cai <cai@lca.pw>:
      mm/swap_state: fix a data race in swapin_nr_pages

    Andrea Righi <andrea.righi@canonical.com>:
      mm: swap: properly update readahead statistics in unuse_pte_range()

    Wei Yang <richard.weiyang@gmail.com>:
      mm/swapfile.c: offset is only used when there is more slots
      mm/swapfile.c: explicitly show ssd/non-ssd is handled mutually exclusive
      mm/swapfile.c: remove the unnecessary goto for SSD case
      mm/swapfile.c: simplify the calculation of n_goal
      mm/swapfile.c: remove the extra check in scan_swap_map_slots()
      mm/swapfile.c: found_free could be represented by (tmp < max)
      mm/swapfile.c: tmp is always smaller than max
      mm/swapfile.c: omit a duplicate code by compare tmp and max first

    Huang Ying <ying.huang@intel.com>:
      swap: try to scan more free slots even when fragmented

    Wei Yang <richard.weiyang@gmail.com>:
      mm/swapfile.c: classify SWAP_MAP_XXX to make it more readable
      mm/swapfile.c: __swap_entry_free() always free 1 entry

    Huang Ying <ying.huang@intel.com>:
      mm/swapfile.c: use prandom_u32_max()
      swap: reduce lock contention on swap cache from swap slots allocation

    Randy Dunlap <rdunlap@infradead.org>:
      mm: swapfile: fix /proc/swaps heading and Size/Used/Priority alignment

    Miaohe Lin <linmiaohe@huawei.com>:
      include/linux/swap.h: delete meaningless __add_to_swap_cache() declaration

Subsystem: mm/memcg

    Yafang Shao <laoar.shao@gmail.com>:
      mm, memcg: add workingset_restore in memory.stat

    Kaixu Xia <kaixuxia@tencent.com>:
      mm: memcontrol: simplify value comparison between count and limit

    Shakeel Butt <shakeelb@google.com>:
      memcg: expose root cgroup's memory.stat

    Jakub Kicinski <kuba@kernel.org>:
    Patch series "memcg: Slow down swap allocation as the available space gets:
      mm/memcg: prepare for swap over-high accounting and penalty calculation
      mm/memcg: move penalty delay clamping out of calculate_high_delay()
      mm/memcg: move cgroup high memory limit setting into struct page_counter
      mm/memcg: automatically penalize tasks with high swap use

    Zefan Li <lizefan@huawei.com>:
      memcg: fix memcg_kmem_bypass() for remote memcg charging

Subsystem: mm/pagemap

    Steven Price <steven.price@arm.com>:
    Patch series "Fix W+X debug feature on x86":
      x86: mm: ptdump: calculate effective permissions correctly
      mm: ptdump: expand type of 'val' in note_page()

    Huang Ying <ying.huang@intel.com>:
      /proc/PID/smaps: Add PMD migration entry parsing

    chenqiwu <chenqiwu@xiaomi.com>:
      mm/memory: remove unnecessary pte_devmap case in copy_one_pte()

Subsystem: mm/memory-failure

    Wetp Zhang <wetp.zy@linux.alibaba.com>:
      mm, memory_failure: don't send BUS_MCEERR_AO for action required error

Subsystem: mm/vmalloc

    Christoph Hellwig <hch@lst.de>:
    Patch series "decruft the vmalloc API", v2:
      x86/hyperv: use vmalloc_exec for the hypercall page
      x86: fix vmap arguments in map_irq_stack
      staging: android: ion: use vmap instead of vm_map_ram
      staging: media: ipu3: use vmap instead of reimplementing it
      dma-mapping: use vmap insted of reimplementing it
      powerpc: add an ioremap_phb helper
      powerpc: remove __ioremap_at and __iounmap_at
      mm: remove __get_vm_area
      mm: unexport unmap_kernel_range_noflush
      mm: rename CONFIG_PGTABLE_MAPPING to CONFIG_ZSMALLOC_PGTABLE_MAPPING
      mm: only allow page table mappings for built-in zsmalloc
      mm: pass addr as unsigned long to vb_free
      mm: remove vmap_page_range_noflush and vunmap_page_range
      mm: rename vmap_page_range to map_kernel_range
      mm: don't return the number of pages from map_kernel_range{,_noflush}
      mm: remove map_vm_range
      mm: remove unmap_vmap_area
      mm: remove the prot argument from vm_map_ram
      mm: enforce that vmap can't map pages executable
      gpu/drm: remove the powerpc hack in drm_legacy_sg_alloc
      mm: remove the pgprot argument to __vmalloc
      mm: remove the prot argument to __vmalloc_node
      mm: remove both instances of __vmalloc_node_flags
      mm: remove __vmalloc_node_flags_caller
      mm: switch the test_vmalloc module to use __vmalloc_node
      mm: remove vmalloc_user_node_flags
      arm64: use __vmalloc_node in arch_alloc_vmap_stack
      powerpc: use __vmalloc_node in alloc_vm_stack
      s390: use __vmalloc_node in stack_alloc

    Joerg Roedel <jroedel@suse.de>:
    Patch series "mm: Get rid of vmalloc_sync_(un)mappings()", v3:
      mm: add functions to track page directory modifications
      mm/vmalloc: track which page-table levels were modified
      mm/ioremap: track which page-table levels were modified
      x86/mm/64: implement arch_sync_kernel_mappings()
      x86/mm/32: implement arch_sync_kernel_mappings()
      mm: remove vmalloc_sync_(un)mappings()
      x86/mm: remove vmalloc faulting

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: fix clang compilation warning due to stack protector

    Kees Cook <keescook@chromium.org>:
      ubsan: entirely disable alignment checks under UBSAN_TRAP

    Jing Xia <jing.xia@unisoc.com>:
      mm/mm_init.c: report kasan-tag information stored in page->flags

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: move kasan_report() into report.c

 Documentation/admin-guide/cgroup-v2.rst            |   24 +
 Documentation/core-api/cachetlb.rst                |    2 
 Documentation/filesystems/locking.rst              |    6 
 Documentation/filesystems/proc.rst                 |    4 
 Documentation/filesystems/vfs.rst                  |   15 
 Documentation/vm/slub.rst                          |    2 
 arch/arm/configs/omap2plus_defconfig               |    2 
 arch/arm64/include/asm/pgtable.h                   |    3 
 arch/arm64/include/asm/vmap_stack.h                |    6 
 arch/arm64/mm/dump.c                               |    2 
 arch/parisc/include/asm/pgtable.h                  |    2 
 arch/powerpc/include/asm/io.h                      |   10 
 arch/powerpc/include/asm/pci-bridge.h              |    2 
 arch/powerpc/kernel/irq.c                          |    5 
 arch/powerpc/kernel/isa-bridge.c                   |   28 +
 arch/powerpc/kernel/pci_64.c                       |   56 +-
 arch/powerpc/mm/ioremap_64.c                       |   50 --
 arch/riscv/include/asm/pgtable.h                   |    4 
 arch/riscv/mm/ptdump.c                             |    2 
 arch/s390/kernel/setup.c                           |    9 
 arch/sh/kernel/cpu/sh4/sq.c                        |    3 
 arch/x86/hyperv/hv_init.c                          |    5 
 arch/x86/include/asm/kvm_host.h                    |    3 
 arch/x86/include/asm/pgtable-2level_types.h        |    2 
 arch/x86/include/asm/pgtable-3level_types.h        |    2 
 arch/x86/include/asm/pgtable_64_types.h            |    2 
 arch/x86/include/asm/pgtable_types.h               |    8 
 arch/x86/include/asm/switch_to.h                   |   23 -
 arch/x86/kernel/irq_64.c                           |    2 
 arch/x86/kernel/setup_percpu.c                     |    6 
 arch/x86/kvm/svm/sev.c                             |    3 
 arch/x86/mm/dump_pagetables.c                      |   35 +
 arch/x86/mm/fault.c                                |  196 ----------
 arch/x86/mm/init_64.c                              |    5 
 arch/x86/mm/pti.c                                  |    8 
 arch/x86/mm/tlb.c                                  |   37 -
 block/blk-core.c                                   |    1 
 drivers/acpi/apei/ghes.c                           |    6 
 drivers/base/node.c                                |    2 
 drivers/block/drbd/drbd_bitmap.c                   |    4 
 drivers/block/loop.c                               |    2 
 drivers/dax/device.c                               |    1 
 drivers/gpu/drm/drm_scatter.c                      |   11 
 drivers/gpu/drm/etnaviv/etnaviv_dump.c             |    4 
 drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c   |    2 
 drivers/lightnvm/pblk-init.c                       |    5 
 drivers/md/dm-bufio.c                              |    4 
 drivers/md/md-bitmap.c                             |   12 
 drivers/media/common/videobuf2/videobuf2-dma-sg.c  |    3 
 drivers/media/common/videobuf2/videobuf2-vmalloc.c |    3 
 drivers/media/pci/ivtv/ivtv-udma.c                 |   19 -
 drivers/media/pci/ivtv/ivtv-yuv.c                  |   17 
 drivers/media/pci/ivtv/ivtvfb.c                    |    4 
 drivers/mtd/ubi/io.c                               |    4 
 drivers/pcmcia/electra_cf.c                        |   45 --
 drivers/scsi/sd_zbc.c                              |    3 
 drivers/staging/android/ion/ion_heap.c             |    4 
 drivers/staging/media/ipu3/ipu3-css-pool.h         |    4 
 drivers/staging/media/ipu3/ipu3-dmamap.c           |   30 -
 fs/block_dev.c                                     |    7 
 fs/btrfs/disk-io.c                                 |    4 
 fs/btrfs/extent_io.c                               |   64 ---
 fs/btrfs/extent_io.h                               |    3 
 fs/btrfs/inode.c                                   |   39 --
 fs/buffer.c                                        |   23 -
 fs/erofs/data.c                                    |   41 --
 fs/erofs/decompressor.c                            |    2 
 fs/erofs/zdata.c                                   |   31 -
 fs/exfat/inode.c                                   |    7 
 fs/ext2/inode.c                                    |   10 
 fs/ext4/ext4.h                                     |    5 
 fs/ext4/inode.c                                    |   25 -
 fs/ext4/readpage.c                                 |   25 -
 fs/ext4/verity.c                                   |   35 -
 fs/f2fs/data.c                                     |   56 +-
 fs/f2fs/f2fs.h                                     |   14 
 fs/f2fs/verity.c                                   |   35 -
 fs/fat/inode.c                                     |    7 
 fs/file_table.c                                    |    1 
 fs/fs-writeback.c                                  |    1 
 fs/fuse/file.c                                     |  100 +----
 fs/gfs2/aops.c                                     |   23 -
 fs/gfs2/dir.c                                      |    9 
 fs/gfs2/quota.c                                    |    2 
 fs/hpfs/file.c                                     |    7 
 fs/iomap/buffered-io.c                             |  113 +----
 fs/iomap/trace.h                                   |    2 
 fs/isofs/inode.c                                   |    7 
 fs/jfs/inode.c                                     |    7 
 fs/mpage.c                                         |   38 --
 fs/nfs/blocklayout/extent_tree.c                   |    2 
 fs/nfs/internal.h                                  |   10 
 fs/nfs/write.c                                     |    4 
 fs/nfsd/vfs.c                                      |    9 
 fs/nilfs2/inode.c                                  |   15 
 fs/ntfs/aops.c                                     |    2 
 fs/ntfs/malloc.h                                   |    2 
 fs/ntfs/mft.c                                      |    2 
 fs/ocfs2/aops.c                                    |   34 -
 fs/ocfs2/dlm/dlmmaster.c                           |    1 
 fs/ocfs2/ocfs2.h                                   |    4 
 fs/ocfs2/slot_map.c                                |   46 +-
 fs/ocfs2/super.c                                   |   21 +
 fs/omfs/file.c                                     |    7 
 fs/open.c                                          |    3 
 fs/orangefs/inode.c                                |   32 -
 fs/proc/meminfo.c                                  |    3 
 fs/proc/task_mmu.c                                 |   16 
 fs/qnx6/inode.c                                    |    7 
 fs/reiserfs/inode.c                                |    8 
 fs/squashfs/block.c                                |  273 +++++++-------
 fs/squashfs/decompressor.h                         |    5 
 fs/squashfs/decompressor_multi.c                   |    9 
 fs/squashfs/decompressor_multi_percpu.c            |   17 
 fs/squashfs/decompressor_single.c                  |    9 
 fs/squashfs/lz4_wrapper.c                          |   17 
 fs/squashfs/lzo_wrapper.c                          |   17 
 fs/squashfs/squashfs.h                             |    4 
 fs/squashfs/xz_wrapper.c                           |   51 +-
 fs/squashfs/zlib_wrapper.c                         |   63 +--
 fs/squashfs/zstd_wrapper.c                         |   62 +--
 fs/sync.c                                          |    6 
 fs/ubifs/debug.c                                   |    2 
 fs/ubifs/lprops.c                                  |    2 
 fs/ubifs/lpt_commit.c                              |    4 
 fs/ubifs/orphan.c                                  |    2 
 fs/udf/inode.c                                     |    7 
 fs/xfs/kmem.c                                      |    2 
 fs/xfs/xfs_aops.c                                  |   13 
 fs/xfs/xfs_buf.c                                   |    2 
 fs/zonefs/super.c                                  |    7 
 include/asm-generic/5level-fixup.h                 |    5 
 include/asm-generic/pgtable.h                      |   27 +
 include/linux/buffer_head.h                        |    8 
 include/linux/fs.h                                 |   18 
 include/linux/iomap.h                              |    3 
 include/linux/memcontrol.h                         |    4 
 include/linux/mm.h                                 |   67 ++-
 include/linux/mm_types.h                           |    6 
 include/linux/mmzone.h                             |    1 
 include/linux/mpage.h                              |    4 
 include/linux/page_counter.h                       |    8 
 include/linux/pagemap.h                            |  193 ++++++++++
 include/linux/ptdump.h                             |    3 
 include/linux/sched.h                              |    3 
 include/linux/swap.h                               |   17 
 include/linux/vmalloc.h                            |   49 +-
 include/linux/zsmalloc.h                           |    2 
 include/trace/events/erofs.h                       |    6 
 include/trace/events/f2fs.h                        |    6 
 include/trace/events/writeback.h                   |    5 
 kernel/bpf/core.c                                  |    6 
 kernel/bpf/syscall.c                               |   29 -
 kernel/dma/remap.c                                 |   48 --
 kernel/groups.c                                    |    2 
 kernel/module.c                                    |    3 
 kernel/notifier.c                                  |    1 
 kernel/sys.c                                       |    2 
 kernel/trace/trace.c                               |   12 
 lib/Kconfig.ubsan                                  |    2 
 lib/ioremap.c                                      |   46 +-
 lib/test_vmalloc.c                                 |   26 -
 mm/Kconfig                                         |    4 
 mm/debug.c                                         |   56 ++
 mm/fadvise.c                                       |    6 
 mm/filemap.c                                       |    1 
 mm/gup.c                                           |   77 +++-
 mm/internal.h                                      |   14 
 mm/kasan/Makefile                                  |   21 -
 mm/kasan/common.c                                  |   19 -
 mm/kasan/report.c                                  |   22 +
 mm/memcontrol.c                                    |  198 +++++++---
 mm/memory-failure.c                                |   15 
 mm/memory.c                                        |    2 
 mm/migrate.c                                       |    9 
 mm/mm_init.c                                       |   16 
 mm/nommu.c                                         |   52 +-
 mm/page-writeback.c                                |   62 ++-
 mm/page_alloc.c                                    |    7 
 mm/percpu.c                                        |    2 
 mm/ptdump.c                                        |   17 
 mm/readahead.c                                     |  349 ++++++++++--------
 mm/slab_common.c                                   |    3 
 mm/slub.c                                          |   67 ++-
 mm/swap_state.c                                    |    5 
 mm/swapfile.c                                      |  194 ++++++----
 mm/util.c                                          |    2 
 mm/vmalloc.c                                       |  399 ++++++++-------------
 mm/vmscan.c                                        |    4 
 mm/vmstat.c                                        |   11 
 mm/zsmalloc.c                                      |   12 
 net/bridge/netfilter/ebtables.c                    |    6 
 net/ceph/ceph_common.c                             |    3 
 sound/core/memalloc.c                              |    2 
 sound/core/pcm_memory.c                            |    2 
 195 files changed, 2292 insertions(+), 2288 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-06-02  4:44 incoming Andrew Morton
@ 2020-06-02 20:08 ` Andrew Morton
  2020-06-02 20:45   ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-06-02 20:08 UTC (permalink / raw)
  To: Linus Torvalds, mm-commits, linux-mm

The local_lock merge made rather a mess of all of this.  I'm 
cooking up a full resend of the same material.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-06-02  4:44 Andrew Morton
  2020-06-02 20:08 ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-06-02  4:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


A few little subsystems and a start of a lot of MM patches.

128 patches, based on 9bf9511e3d9f328c03f6f79bfb741c3d18f2f2c0:

Subsystems affected by this patch series:

  squashfs
  ocfs2
  parisc
  vfs
  mm/slab-generic
  mm/slub
  mm/debug
  mm/pagecache
  mm/gup
  mm/swap
  mm/memcg
  mm/pagemap
  mm/memory-failure
  mm/vmalloc
  mm/kasan

Subsystem: squashfs

    Philippe Liard <pliard@google.com>:
      squashfs: migrate from ll_rw_block usage to BIO

Subsystem: ocfs2

    Jules Irenge <jbi.octave@gmail.com>:
      ocfs2: add missing annotation for dlm_empty_lockres()

    Gang He <ghe@suse.com>:
      ocfs2: mount shared volume without ha stack

Subsystem: parisc

    Andrew Morton <akpm@linux-foundation.org>:
      arch/parisc/include/asm/pgtable.h: remove unused `old_pte'

Subsystem: vfs

    Jeff Layton <jlayton@redhat.com>:
    Patch series "vfs: have syncfs() return error when there are writeback:
      vfs: track per-sb writeback errors and report them to syncfs
      fs/buffer.c: record blockdev write errors in super_block that it backs

Subsystem: mm/slab-generic

    Vlastimil Babka <vbabka@suse.cz>:
      usercopy: mark dma-kmalloc caches as usercopy caches

Subsystem: mm/slub

    Dongli Zhang <dongli.zhang@oracle.com>:
      mm/slub.c: fix corrupted freechain in deactivate_slab()

    Christoph Lameter <cl@linux.com>:
      slub: Remove userspace notifier for cache add/remove

    Christopher Lameter <cl@linux.com>:
      slub: remove kmalloc under list_lock from list_slab_objects() V2

    Qian Cai <cai@lca.pw>:
      mm/slub: fix stack overruns with SLUB_STATS

    Andrew Morton <akpm@linux-foundation.org>:
      Documentation/vm/slub.rst: s/Toggle/Enable/

Subsystem: mm/debug

    Vlastimil Babka <vbabka@suse.cz>:
      mm, dump_page(): do not crash with invalid mapping pointer

Subsystem: mm/pagecache

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
    Patch series "Change readahead API", v11:
      mm: move readahead prototypes from mm.h
      mm: return void from various readahead functions
      mm: ignore return value of ->readpages
      mm: move readahead nr_pages check into read_pages
      mm: add new readahead_control API
      mm: use readahead_control to pass arguments
      mm: rename various 'offset' parameters to 'index'
      mm: rename readahead loop variable to 'i'
      mm: remove 'page_offset' from readahead loop
      mm: put readahead pages in cache earlier
      mm: add readahead address space operation
      mm: move end_index check out of readahead loop
      mm: add page_cache_readahead_unbounded
      mm: document why we don't set PageReadahead
      mm: use memalloc_nofs_save in readahead path
      fs: convert mpage_readpages to mpage_readahead
      btrfs: convert from readpages to readahead
      erofs: convert uncompressed files from readpages to readahead
      erofs: convert compressed files from readpages to readahead
      ext4: convert from readpages to readahead
      ext4: pass the inode to ext4_mpage_readpages
      f2fs: convert from readpages to readahead
      f2fs: pass the inode to f2fs_mpage_readpages
      fuse: convert from readpages to readahead
      iomap: convert from readpages to readahead

    Guoqing Jiang <guoqing.jiang@cloud.ionos.com>:
    Patch series "Introduce attach/detach_page_private to cleanup code":
      include/linux/pagemap.h: introduce attach/detach_page_private
      md: remove __clear_page_buffers and use attach/detach_page_private
      btrfs: use attach/detach_page_private
      fs/buffer.c: use attach/detach_page_private
      f2fs: use attach/detach_page_private
      iomap: use attach/detach_page_private
      ntfs: replace attach_page_buffers with attach_page_private
      orangefs: use attach/detach_page_private
      buffer_head.h: remove attach_page_buffers
      mm/migrate.c: call detach_page_private to cleanup code
      mm_types.h: change set_page_private to inline function

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/filemap.c: remove misleading comment

    Chao Yu <yuchao0@huawei.com>:
      mm/page-writeback.c: remove unused variable

    NeilBrown <neilb@suse.de>:
      mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE
      mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead

Subsystem: mm/gup

    Souptick Joarder <jrdr.linux@gmail.com>:
      mm/gup.c: update the documentation

    John Hubbard <jhubbard@nvidia.com>:
      mm/gup: introduce pin_user_pages_unlocked
      ivtv: convert get_user_pages() --> pin_user_pages()

    Miles Chen <miles.chen@mediatek.com>:
      mm/gup.c: further document vma_permits_fault()

Subsystem: mm/swap

    chenqiwu <chenqiwu@xiaomi.com>:
      mm/swapfile: use list_{prev,next}_entry() instead of open-coding

    Qian Cai <cai@lca.pw>:
      mm/swap_state: fix a data race in swapin_nr_pages

    Andrea Righi <andrea.righi@canonical.com>:
      mm: swap: properly update readahead statistics in unuse_pte_range()

    Wei Yang <richard.weiyang@gmail.com>:
      mm/swapfile.c: offset is only used when there is more slots
      mm/swapfile.c: explicitly show ssd/non-ssd is handled mutually exclusive
      mm/swapfile.c: remove the unnecessary goto for SSD case
      mm/swapfile.c: simplify the calculation of n_goal
      mm/swapfile.c: remove the extra check in scan_swap_map_slots()
      mm/swapfile.c: found_free could be represented by (tmp < max)
      mm/swapfile.c: tmp is always smaller than max
      mm/swapfile.c: omit a duplicate code by compare tmp and max first

    Huang Ying <ying.huang@intel.com>:
      swap: try to scan more free slots even when fragmented

    Wei Yang <richard.weiyang@gmail.com>:
      mm/swapfile.c: classify SWAP_MAP_XXX to make it more readable
      mm/swapfile.c: __swap_entry_free() always free 1 entry

    Huang Ying <ying.huang@intel.com>:
      mm/swapfile.c: use prandom_u32_max()
      swap: reduce lock contention on swap cache from swap slots allocation

    Randy Dunlap <rdunlap@infradead.org>:
      mm: swapfile: fix /proc/swaps heading and Size/Used/Priority alignment

    Miaohe Lin <linmiaohe@huawei.com>:
      include/linux/swap.h: delete meaningless __add_to_swap_cache() declaration

Subsystem: mm/memcg

    Yafang Shao <laoar.shao@gmail.com>:
      mm, memcg: add workingset_restore in memory.stat

    Kaixu Xia <kaixuxia@tencent.com>:
      mm: memcontrol: simplify value comparison between count and limit

    Shakeel Butt <shakeelb@google.com>:
      memcg: expose root cgroup's memory.stat

    Jakub Kicinski <kuba@kernel.org>:
    Patch series "memcg: Slow down swap allocation as the available space gets:
      mm/memcg: prepare for swap over-high accounting and penalty calculation
      mm/memcg: move penalty delay clamping out of calculate_high_delay()
      mm/memcg: move cgroup high memory limit setting into struct page_counter
      mm/memcg: automatically penalize tasks with high swap use

    Zefan Li <lizefan@huawei.com>:
      memcg: fix memcg_kmem_bypass() for remote memcg charging

Subsystem: mm/pagemap

    Steven Price <steven.price@arm.com>:
    Patch series "Fix W+X debug feature on x86":
      x86: mm: ptdump: calculate effective permissions correctly
      mm: ptdump: expand type of 'val' in note_page()

    Huang Ying <ying.huang@intel.com>:
      /proc/PID/smaps: Add PMD migration entry parsing

    chenqiwu <chenqiwu@xiaomi.com>:
      mm/memory: remove unnecessary pte_devmap case in copy_one_pte()

Subsystem: mm/memory-failure

    Wetp Zhang <wetp.zy@linux.alibaba.com>:
      mm, memory_failure: don't send BUS_MCEERR_AO for action required error

Subsystem: mm/vmalloc

    Christoph Hellwig <hch@lst.de>:
    Patch series "decruft the vmalloc API", v2:
      x86/hyperv: use vmalloc_exec for the hypercall page
      x86: fix vmap arguments in map_irq_stack
      staging: android: ion: use vmap instead of vm_map_ram
      staging: media: ipu3: use vmap instead of reimplementing it
      dma-mapping: use vmap insted of reimplementing it
      powerpc: add an ioremap_phb helper
      powerpc: remove __ioremap_at and __iounmap_at
      mm: remove __get_vm_area
      mm: unexport unmap_kernel_range_noflush
      mm: rename CONFIG_PGTABLE_MAPPING to CONFIG_ZSMALLOC_PGTABLE_MAPPING
      mm: only allow page table mappings for built-in zsmalloc
      mm: pass addr as unsigned long to vb_free
      mm: remove vmap_page_range_noflush and vunmap_page_range
      mm: rename vmap_page_range to map_kernel_range
      mm: don't return the number of pages from map_kernel_range{,_noflush}
      mm: remove map_vm_range
      mm: remove unmap_vmap_area
      mm: remove the prot argument from vm_map_ram
      mm: enforce that vmap can't map pages executable
      gpu/drm: remove the powerpc hack in drm_legacy_sg_alloc
      mm: remove the pgprot argument to __vmalloc
      mm: remove the prot argument to __vmalloc_node
      mm: remove both instances of __vmalloc_node_flags
      mm: remove __vmalloc_node_flags_caller
      mm: switch the test_vmalloc module to use __vmalloc_node
      mm: remove vmalloc_user_node_flags
      arm64: use __vmalloc_node in arch_alloc_vmap_stack
      powerpc: use __vmalloc_node in alloc_vm_stack
      s390: use __vmalloc_node in stack_alloc

    Joerg Roedel <jroedel@suse.de>:
    Patch series "mm: Get rid of vmalloc_sync_(un)mappings()", v3:
      mm: add functions to track page directory modifications
      mm/vmalloc: track which page-table levels were modified
      mm/ioremap: track which page-table levels were modified
      x86/mm/64: implement arch_sync_kernel_mappings()
      x86/mm/32: implement arch_sync_kernel_mappings()
      mm: remove vmalloc_sync_(un)mappings()
      x86/mm: remove vmalloc faulting

Subsystem: mm/kasan

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: fix clang compilation warning due to stack protector

    Kees Cook <keescook@chromium.org>:
      ubsan: entirely disable alignment checks under UBSAN_TRAP

    Jing Xia <jing.xia@unisoc.com>:
      mm/mm_init.c: report kasan-tag information stored in page->flags

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: move kasan_report() into report.c

 Documentation/admin-guide/cgroup-v2.rst            |   24 +
 Documentation/core-api/cachetlb.rst                |    2 
 Documentation/filesystems/locking.rst              |    6 
 Documentation/filesystems/proc.rst                 |    4 
 Documentation/filesystems/vfs.rst                  |   15 
 Documentation/vm/slub.rst                          |    2 
 arch/arm/configs/omap2plus_defconfig               |    2 
 arch/arm64/include/asm/pgtable.h                   |    3 
 arch/arm64/include/asm/vmap_stack.h                |    6 
 arch/arm64/mm/dump.c                               |    2 
 arch/parisc/include/asm/pgtable.h                  |    2 
 arch/powerpc/include/asm/io.h                      |   10 
 arch/powerpc/include/asm/pci-bridge.h              |    2 
 arch/powerpc/kernel/irq.c                          |    5 
 arch/powerpc/kernel/isa-bridge.c                   |   28 +
 arch/powerpc/kernel/pci_64.c                       |   56 +-
 arch/powerpc/mm/ioremap_64.c                       |   50 --
 arch/riscv/include/asm/pgtable.h                   |    4 
 arch/riscv/mm/ptdump.c                             |    2 
 arch/s390/kernel/setup.c                           |    9 
 arch/sh/kernel/cpu/sh4/sq.c                        |    3 
 arch/x86/hyperv/hv_init.c                          |    5 
 arch/x86/include/asm/kvm_host.h                    |    3 
 arch/x86/include/asm/pgtable-2level_types.h        |    2 
 arch/x86/include/asm/pgtable-3level_types.h        |    2 
 arch/x86/include/asm/pgtable_64_types.h            |    2 
 arch/x86/include/asm/pgtable_types.h               |    8 
 arch/x86/include/asm/switch_to.h                   |   23 -
 arch/x86/kernel/irq_64.c                           |    2 
 arch/x86/kernel/setup_percpu.c                     |    6 
 arch/x86/kvm/svm/sev.c                             |    3 
 arch/x86/mm/dump_pagetables.c                      |   35 +
 arch/x86/mm/fault.c                                |  196 ----------
 arch/x86/mm/init_64.c                              |    5 
 arch/x86/mm/pti.c                                  |    8 
 arch/x86/mm/tlb.c                                  |   37 -
 block/blk-core.c                                   |    1 
 drivers/acpi/apei/ghes.c                           |    6 
 drivers/base/node.c                                |    2 
 drivers/block/drbd/drbd_bitmap.c                   |    4 
 drivers/block/loop.c                               |    2 
 drivers/dax/device.c                               |    1 
 drivers/gpu/drm/drm_scatter.c                      |   11 
 drivers/gpu/drm/etnaviv/etnaviv_dump.c             |    4 
 drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c   |    2 
 drivers/lightnvm/pblk-init.c                       |    5 
 drivers/md/dm-bufio.c                              |    4 
 drivers/md/md-bitmap.c                             |   12 
 drivers/media/common/videobuf2/videobuf2-dma-sg.c  |    3 
 drivers/media/common/videobuf2/videobuf2-vmalloc.c |    3 
 drivers/media/pci/ivtv/ivtv-udma.c                 |   19 -
 drivers/media/pci/ivtv/ivtv-yuv.c                  |   17 
 drivers/media/pci/ivtv/ivtvfb.c                    |    4 
 drivers/mtd/ubi/io.c                               |    4 
 drivers/pcmcia/electra_cf.c                        |   45 --
 drivers/scsi/sd_zbc.c                              |    3 
 drivers/staging/android/ion/ion_heap.c             |    4 
 drivers/staging/media/ipu3/ipu3-css-pool.h         |    4 
 drivers/staging/media/ipu3/ipu3-dmamap.c           |   30 -
 fs/block_dev.c                                     |    7 
 fs/btrfs/disk-io.c                                 |    4 
 fs/btrfs/extent_io.c                               |   64 ---
 fs/btrfs/extent_io.h                               |    3 
 fs/btrfs/inode.c                                   |   39 --
 fs/buffer.c                                        |   23 -
 fs/erofs/data.c                                    |   41 --
 fs/erofs/decompressor.c                            |    2 
 fs/erofs/zdata.c                                   |   31 -
 fs/exfat/inode.c                                   |    7 
 fs/ext2/inode.c                                    |   10 
 fs/ext4/ext4.h                                     |    5 
 fs/ext4/inode.c                                    |   25 -
 fs/ext4/readpage.c                                 |   25 -
 fs/ext4/verity.c                                   |   35 -
 fs/f2fs/data.c                                     |   56 +-
 fs/f2fs/f2fs.h                                     |   14 
 fs/f2fs/verity.c                                   |   35 -
 fs/fat/inode.c                                     |    7 
 fs/file_table.c                                    |    1 
 fs/fs-writeback.c                                  |    1 
 fs/fuse/file.c                                     |  100 +----
 fs/gfs2/aops.c                                     |   23 -
 fs/gfs2/dir.c                                      |    9 
 fs/gfs2/quota.c                                    |    2 
 fs/hpfs/file.c                                     |    7 
 fs/iomap/buffered-io.c                             |  113 +----
 fs/iomap/trace.h                                   |    2 
 fs/isofs/inode.c                                   |    7 
 fs/jfs/inode.c                                     |    7 
 fs/mpage.c                                         |   38 --
 fs/nfs/blocklayout/extent_tree.c                   |    2 
 fs/nfs/internal.h                                  |   10 
 fs/nfs/write.c                                     |    4 
 fs/nfsd/vfs.c                                      |    9 
 fs/nilfs2/inode.c                                  |   15 
 fs/ntfs/aops.c                                     |    2 
 fs/ntfs/malloc.h                                   |    2 
 fs/ntfs/mft.c                                      |    2 
 fs/ocfs2/aops.c                                    |   34 -
 fs/ocfs2/dlm/dlmmaster.c                           |    1 
 fs/ocfs2/ocfs2.h                                   |    4 
 fs/ocfs2/slot_map.c                                |   46 +-
 fs/ocfs2/super.c                                   |   21 +
 fs/omfs/file.c                                     |    7 
 fs/open.c                                          |    3 
 fs/orangefs/inode.c                                |   32 -
 fs/proc/meminfo.c                                  |    3 
 fs/proc/task_mmu.c                                 |   16 
 fs/qnx6/inode.c                                    |    7 
 fs/reiserfs/inode.c                                |    8 
 fs/squashfs/block.c                                |  273 +++++++-------
 fs/squashfs/decompressor.h                         |    5 
 fs/squashfs/decompressor_multi.c                   |    9 
 fs/squashfs/decompressor_multi_percpu.c            |   17 
 fs/squashfs/decompressor_single.c                  |    9 
 fs/squashfs/lz4_wrapper.c                          |   17 
 fs/squashfs/lzo_wrapper.c                          |   17 
 fs/squashfs/squashfs.h                             |    4 
 fs/squashfs/xz_wrapper.c                           |   51 +-
 fs/squashfs/zlib_wrapper.c                         |   63 +--
 fs/squashfs/zstd_wrapper.c                         |   62 +--
 fs/sync.c                                          |    6 
 fs/ubifs/debug.c                                   |    2 
 fs/ubifs/lprops.c                                  |    2 
 fs/ubifs/lpt_commit.c                              |    4 
 fs/ubifs/orphan.c                                  |    2 
 fs/udf/inode.c                                     |    7 
 fs/xfs/kmem.c                                      |    2 
 fs/xfs/xfs_aops.c                                  |   13 
 fs/xfs/xfs_buf.c                                   |    2 
 fs/zonefs/super.c                                  |    7 
 include/asm-generic/5level-fixup.h                 |    5 
 include/asm-generic/pgtable.h                      |   27 +
 include/linux/buffer_head.h                        |    8 
 include/linux/fs.h                                 |   18 
 include/linux/iomap.h                              |    3 
 include/linux/memcontrol.h                         |    4 
 include/linux/mm.h                                 |   67 ++-
 include/linux/mm_types.h                           |    6 
 include/linux/mmzone.h                             |    1 
 include/linux/mpage.h                              |    4 
 include/linux/page_counter.h                       |    8 
 include/linux/pagemap.h                            |  193 ++++++++++
 include/linux/ptdump.h                             |    3 
 include/linux/sched.h                              |    3 
 include/linux/swap.h                               |   17 
 include/linux/vmalloc.h                            |   49 +-
 include/linux/zsmalloc.h                           |    2 
 include/trace/events/erofs.h                       |    6 
 include/trace/events/f2fs.h                        |    6 
 include/trace/events/writeback.h                   |    5 
 kernel/bpf/core.c                                  |    6 
 kernel/bpf/syscall.c                               |   29 -
 kernel/dma/remap.c                                 |   48 --
 kernel/groups.c                                    |    2 
 kernel/module.c                                    |    3 
 kernel/notifier.c                                  |    1 
 kernel/sys.c                                       |    2 
 kernel/trace/trace.c                               |   12 
 lib/Kconfig.ubsan                                  |    2 
 lib/ioremap.c                                      |   46 +-
 lib/test_vmalloc.c                                 |   26 -
 mm/Kconfig                                         |    4 
 mm/debug.c                                         |   56 ++
 mm/fadvise.c                                       |    6 
 mm/filemap.c                                       |    1 
 mm/gup.c                                           |   77 +++-
 mm/internal.h                                      |   14 
 mm/kasan/Makefile                                  |   21 -
 mm/kasan/common.c                                  |   19 -
 mm/kasan/report.c                                  |   22 +
 mm/memcontrol.c                                    |  198 +++++++---
 mm/memory-failure.c                                |   15 
 mm/memory.c                                        |    2 
 mm/migrate.c                                       |    9 
 mm/mm_init.c                                       |   16 
 mm/nommu.c                                         |   52 +-
 mm/page-writeback.c                                |   62 ++-
 mm/page_alloc.c                                    |    7 
 mm/percpu.c                                        |    2 
 mm/ptdump.c                                        |   17 
 mm/readahead.c                                     |  349 ++++++++++--------
 mm/slab_common.c                                   |    3 
 mm/slub.c                                          |   67 ++-
 mm/swap_state.c                                    |    5 
 mm/swapfile.c                                      |  194 ++++++----
 mm/util.c                                          |    2 
 mm/vmalloc.c                                       |  399 ++++++++-------------
 mm/vmscan.c                                        |    4 
 mm/vmstat.c                                        |   11 
 mm/zsmalloc.c                                      |   12 
 net/bridge/netfilter/ebtables.c                    |    6 
 net/ceph/ceph_common.c                             |    3 
 sound/core/memalloc.c                              |    2 
 sound/core/pcm_memory.c                            |    2 
 195 files changed, 2292 insertions(+), 2288 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-05-29 21:12       ` incoming Andrew Morton
@ 2020-05-29 21:20         ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-05-29 21:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Fri, May 29, 2020 at 2:12 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> Stupid diffstat.  Means that basically all my diffstats are very wrong.

I'm actually used to diffstats not matching 100%/

Usually it's not due to this issue - a "git diff --stat" *will* give
the stat from the actual combined diff result - but with git diffstats
the issue is that I might have gotten a patch from another source.

So the diffstat I see after-the-merge is possibly different from the
pre-merge diffstat simply due to merge issues.

So then I usually take a look at "ok, why did that diffstat differ"
and go "Ahh".

In your case, when I looked at the diffstat, I couldn't for the life
of me see how you would have gotten the diffstat you did, since I only
saw a single patch with no merge issues.

> Thanks for spotting it.
>
> I can fix that...

I can also just live with it, knowing what your workflow is. The
diffstat matching exactly just isn't that important - in fact,
different versions of "diff" can give slightly different output anyway
depending on diff algorithms even when they are looking at the exact
same before/after state. There's not necessarily always only one way
to generate a valid diff.

So to me, the diffstat is more of a guide than a hard thing, and I
want to see the rough outline,

In fact, one reason I want to see it in pull requests is actually just
that I want to get a feel for what changes even before I do the pull
or merge, so it's not just a "match against what I get" thing.

                      Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-05-29 20:38     ` incoming Linus Torvalds
@ 2020-05-29 21:12       ` Andrew Morton
  2020-05-29 21:20         ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-05-29 21:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, Linux-MM

On Fri, 29 May 2020 13:38:35 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, May 29, 2020 at 1:31 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > Bah.  I got lazy (didn't want to interrupt an ongoing build) so I
> > generated the diffstat prior to folding two patches into a single one.
> > Evidently diffstat isn't as smart as I had assumed!
> 
> Ahh. Yes - given two patches, diffstat just adds up the line number
> counts for the individual diffs, it doesn't count some kind of
> "combined diff result" line counts.

Stupid diffstat.  Means that basically all my diffstats are very wrong.
Thanks for spotting it.

I can fix that...


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-05-29 20:31   ` incoming Andrew Morton
@ 2020-05-29 20:38     ` Linus Torvalds
  2020-05-29 21:12       ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2020-05-29 20:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Fri, May 29, 2020 at 1:31 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> Bah.  I got lazy (didn't want to interrupt an ongoing build) so I
> generated the diffstat prior to folding two patches into a single one.
> Evidently diffstat isn't as smart as I had assumed!

Ahh. Yes - given two patches, diffstat just adds up the line number
counts for the individual diffs, it doesn't count some kind of
"combined diff result" line counts.

               Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-05-28 20:10 ` incoming Linus Torvalds
@ 2020-05-29 20:31   ` Andrew Morton
  2020-05-29 20:38     ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-05-29 20:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, Linux-MM

On Thu, 28 May 2020 13:10:18 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Hmm..
> 
> On Wed, May 27, 2020 at 10:20 PM Andrew Morton
> <akpm@linux-foundation.org> wrote:
> >
> >  fs/binfmt_elf.c                |    2 +-
> >  include/asm-generic/topology.h |    2 +-
> >  include/linux/mm.h             |   19 +++++++++++++++----
> >  mm/khugepaged.c                |    1 +
> >  mm/z3fold.c                    |    3 +++
> >  5 files changed, 21 insertions(+), 6 deletions(-)
> 
> I wonder how you generate that diffstat.
> 
> The change to <linux/mm.h> simply doesn't match what you sent me.  The
> patch you sent me that changed mm.h had this:
> 
>  include/linux/mm.h |   15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> (note 15 lines changed: it's +13 and -2) but now suddenly in your
> overall diffstat you have that
> 
>   include/linux/mm.h             |   19 +++++++++++++++----
> 
> with +15/-4.
> 
> So your diffstat simply doesn't match what you are sending. What's going on?
> 

Bah.  I got lazy (didn't want to interrupt an ongoing build) so I
generated the diffstat prior to folding two patches into a single one. 
Evidently diffstat isn't as smart as I had assumed!


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-05-28  5:20 incoming Andrew Morton
@ 2020-05-28 20:10 ` Linus Torvalds
  2020-05-29 20:31   ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2020-05-28 20:10 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

Hmm..

On Wed, May 27, 2020 at 10:20 PM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
>  fs/binfmt_elf.c                |    2 +-
>  include/asm-generic/topology.h |    2 +-
>  include/linux/mm.h             |   19 +++++++++++++++----
>  mm/khugepaged.c                |    1 +
>  mm/z3fold.c                    |    3 +++
>  5 files changed, 21 insertions(+), 6 deletions(-)

I wonder how you generate that diffstat.

The change to <linux/mm.h> simply doesn't match what you sent me.  The
patch you sent me that changed mm.h had this:

 include/linux/mm.h |   15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

(note 15 lines changed: it's +13 and -2) but now suddenly in your
overall diffstat you have that

  include/linux/mm.h             |   19 +++++++++++++++----

with +15/-4.

So your diffstat simply doesn't match what you are sending. What's going on?

               Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-05-28  5:20 Andrew Morton
  2020-05-28 20:10 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-05-28  5:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


5 fixes, based on 444fc5cde64330661bf59944c43844e7d4c2ccd8:


    Qian Cai <cai@lca.pw>:
      mm/z3fold: silence kmemleak false positives of slots

    Hugh Dickins <hughd@google.com>:
      mm,thp: stop leaking unreleased file pages

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      mm: remove VM_BUG_ON(PageSlab()) from page_mapcount()

    Alexander Potapenko <glider@google.com>:
      fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info()

    Arnd Bergmann <arnd@arndb.de>:
      include/asm-generic/topology.h: guard cpumask_of_node() macro argument

 fs/binfmt_elf.c                |    2 +-
 include/asm-generic/topology.h |    2 +-
 include/linux/mm.h             |   19 +++++++++++++++----
 mm/khugepaged.c                |    1 +
 mm/z3fold.c                    |    3 +++
 5 files changed, 21 insertions(+), 6 deletions(-)




^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-05-23  5:22 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-05-23  5:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

11 fixes, based on 444565650a5fe9c63ddf153e6198e31705dedeb2:


    David Hildenbrand <david@redhat.com>:
      device-dax: don't leak kernel memory to user space after unloading kmem

    Nick Desaulniers <ndesaulniers@google.com>:
      x86: bitops: fix build regression

    John Hubbard <jhubbard@nvidia.com>:
      rapidio: fix an error in get_user_pages_fast() error handling
      selftests/vm/.gitignore: add mremap_dontunmap
      selftests/vm/write_to_hugetlbfs.c: fix unused variable warning

    Marco Elver <elver@google.com>:
      kasan: disable branch tracing for core runtime

    Arnd Bergmann <arnd@arndb.de>:
      sh: include linux/time_types.h for sockios

    Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>:
      MAINTAINERS: update email address for Naoya Horiguchi

    Mike Rapoport <rppt@linux.ibm.com>:
      sparc32: use PUD rather than PGD to get PMD in srmmu_nocache_init()

    Uladzislau Rezki <uladzislau.rezki@sony.com>:
      z3fold: fix use-after-free when freeing handles

    Baoquan He <bhe@redhat.com>:
      MAINTAINERS: add files related to kdump

 MAINTAINERS                                     |    7 ++++++-
 arch/sh/include/uapi/asm/sockios.h              |    2 ++
 arch/sparc/mm/srmmu.c                           |    2 +-
 arch/x86/include/asm/bitops.h                   |   12 ++++++------
 drivers/dax/kmem.c                              |   14 +++++++++++---
 drivers/rapidio/devices/rio_mport_cdev.c        |    5 +++++
 mm/kasan/Makefile                               |   16 ++++++++--------
 mm/kasan/generic.c                              |    1 -
 mm/kasan/tags.c                                 |    1 -
 mm/z3fold.c                                     |   11 ++++++-----
 tools/testing/selftests/vm/.gitignore           |    1 +
 tools/testing/selftests/vm/write_to_hugetlbfs.c |    2 --
 12 files changed, 46 insertions(+), 28 deletions(-)





^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-05-14  0:50 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-05-14  0:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

7 fixes, based on 24085f70a6e1b0cb647ec92623284641d8270637:

    Yafang Shao <laoar.shao@gmail.com>:
      mm, memcg: fix inconsistent oom event behavior

    Roman Penyaev <rpenyaev@suse.de>:
      epoll: call final ep_events_available() check under the lock

    Peter Xu <peterx@redhat.com>:
      mm/gup: fix fixup_user_fault() on multiple retries

    Brian Geffon <bgeffon@google.com>:
      userfaultfd: fix remap event with MREMAP_DONTUNMAP

    Vasily Averin <vvs@virtuozzo.com>:
      ipc/util.c: sysvipc_find_ipc() incorrectly updates position index

    Andrey Konovalov <andreyknvl@google.com>:
      kasan: consistently disable debugging features
      kasan: add missing functions declarations to kasan.h

 fs/eventpoll.c             |   48 ++++++++++++++++++++++++++-------------------
 include/linux/memcontrol.h |    2 +
 ipc/util.c                 |   12 +++++------
 mm/gup.c                   |   12 ++++++-----
 mm/kasan/Makefile          |   15 +++++++++-----
 mm/kasan/kasan.h           |   34 ++++++++++++++++++++++++++++++-
 mm/mremap.c                |    2 -
 7 files changed, 86 insertions(+), 39 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-05-08  1:35 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-05-08  1:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


14 fixes and one selftest to verify the ipc fixes herein.


15 patches, based on a811c1fa0a02c062555b54651065899437bacdbe:


    Oleg Nesterov <oleg@redhat.com>:
      ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()

    Yafang Shao <laoar.shao@gmail.com>:
      mm, memcg: fix error return value of mem_cgroup_css_alloc()

    David Hildenbrand <david@redhat.com>:
      mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous()

    Maciej Grochowski <maciej.grochowski@pm.me>:
      kernel/kcov.c: fix typos in kcov_remote_start documentation

    Ivan Delalande <colona@arista.com>:
      scripts/decodecode: fix trapping instruction formatting

    Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>:
      arch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory()

    Khazhismel Kumykov <khazhy@google.com>:
      eventpoll: fix missing wakeup for ovflist in ep_poll_callback

    Aymeric Agon-Rambosson <aymeric.agon@yandex.com>:
      scripts/gdb: repair rb_first() and rb_last()

    Waiman Long <longman@redhat.com>:
      mm/slub: fix incorrect interpretation of s->offset

    Filipe Manana <fdmanana@suse.com>:
      percpu: make pcpu_alloc() aware of current gfp context

    Roman Penyaev <rpenyaev@suse.de>:
      kselftests: introduce new epoll60 testcase for catching lost wakeups
      epoll: atomically remove wait entry on wake up

    Qiwu Chen <qiwuchen55@gmail.com>:
      mm/vmscan: remove unnecessary argument description of isolate_lru_pages()

    Kees Cook <keescook@chromium.org>:
      ubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST

    Henry Willard <henry.willard@oracle.com>:
      mm: limit boost_watermark on small zones

 arch/x86/kvm/svm/sev.c                                        |    2 
 fs/eventpoll.c                                                |   61 ++--
 ipc/mqueue.c                                                  |   34 +-
 kernel/kcov.c                                                 |    4 
 lib/Kconfig.ubsan                                             |   15 -
 mm/memcontrol.c                                               |   15 -
 mm/page_alloc.c                                               |    9 
 mm/percpu.c                                                   |   14 
 mm/slub.c                                                     |   45 ++-
 mm/vmscan.c                                                   |    1 
 scripts/decodecode                                            |    2 
 scripts/gdb/linux/rbtree.py                                   |    4 
 tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c |  146 ++++++++++
 tools/testing/selftests/wireguard/qemu/debug.config           |    1 
 14 files changed, 275 insertions(+), 78 deletions(-)





^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-04-21  1:13 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-04-21  1:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


15 fixes, based on ae83d0b416db002fe95601e7f97f64b59514d936:


    Masahiro Yamada <masahiroy@kernel.org>:
      sh: fix build error in mm/init.c

    Kees Cook <keescook@chromium.org>:
      slub: avoid redzone when choosing freepointer location

    Peter Xu <peterx@redhat.com>:
      mm/userfaultfd: disable userfaultfd-wp on x86_32

    Bartosz Golaszewski <bgolaszewski@baylibre.com>:
      MAINTAINERS: add an entry for kfifo

    Longpeng <longpeng2@huawei.com>:
      mm/hugetlb: fix a addressing exception caused by huge_pte_offset

    Michal Hocko <mhocko@suse.com>:
      mm, gup: return EINTR when gup is interrupted by fatal signals

    Christophe JAILLET <christophe.jaillet@wanadoo.fr>:
      checkpatch: fix a typo in the regex for $allocFunctions

    George Burgess IV <gbiv@google.com>:
      tools/build: tweak unused value workaround

    Muchun Song <songmuchun@bytedance.com>:
      mm/ksm: fix NULL pointer dereference when KSM zero page is enabled

    Hugh Dickins <hughd@google.com>:
      mm/shmem: fix build without THP

    Jann Horn <jannh@google.com>:
      vmalloc: fix remap_vmalloc_range() bounds checks

    Hugh Dickins <hughd@google.com>:
      shmem: fix possible deadlocks on shmlock_user_lock

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path

    Sudip Mukherjee <sudipm.mukherjee@gmail.com>:
      coredump: fix null pointer dereference on coredump

    Lucas Stach <l.stach@pengutronix.de>:
      tools/vm: fix cross-compile build

 MAINTAINERS                                      |    7 +++++++
 arch/sh/mm/init.c                                |    2 +-
 arch/x86/Kconfig                                 |    2 +-
 fs/coredump.c                                    |    2 ++
 fs/proc/vmcore.c                                 |    5 +++--
 include/linux/vmalloc.h                          |    2 +-
 mm/gup.c                                         |    2 +-
 mm/hugetlb.c                                     |   14 ++++++++------
 mm/ksm.c                                         |   12 ++++++++++--
 mm/shmem.c                                       |   13 ++++++++-----
 mm/slub.c                                        |   12 ++++++++++--
 mm/vmalloc.c                                     |   16 +++++++++++++---
 samples/vfio-mdev/mdpy.c                         |    2 +-
 scripts/checkpatch.pl                            |    2 +-
 tools/build/feature/test-sync-compare-and-swap.c |    2 +-
 tools/vm/Makefile                                |    2 ++
 16 files changed, 70 insertions(+), 27 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-04-12  7:41 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-04-12  7:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


A straggler.  This patch caused a lot of build errors on a lot of
architectures for a long time, but Anshuman believes it's all fixed up
now.

1 patch, based on GIT b032227c62939b5481bcd45442b36dfa263f4a7c.

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/debug: add tests validating architecture page table helpers

 Documentation/features/debug/debug-vm-pgtable/arch-support.txt |   34 
 arch/arc/Kconfig                                               |    1 
 arch/arm64/Kconfig                                             |    1 
 arch/powerpc/Kconfig                                           |    1 
 arch/s390/Kconfig                                              |    1 
 arch/x86/Kconfig                                               |    1 
 arch/x86/include/asm/pgtable_64.h                              |    6 
 include/linux/mmdebug.h                                        |    5 
 init/main.c                                                    |    2 
 lib/Kconfig.debug                                              |   26 
 mm/Makefile                                                    |    1 
 mm/debug_vm_pgtable.c                                          |  392 ++++++++++
 12 files changed, 471 insertions(+)




^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-04-10 21:30 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-04-10 21:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


Almost all of the rest of MM.  Various other things.

35 patches, based on c0cc271173b2e1c2d8d0ceaef14e4dfa79eefc0d.

Subsystems affected by this patch series:

  hfs
  mm/memcg
  mm/slab-generic
  mm/slab
  mm/pagealloc
  mm/gup
  ocfs2
  mm/hugetlb
  mm/pagemap
  mm/memremap
  kmod
  misc
  seqfile

Subsystem: hfs

    Simon Gander <simon@tuxera.com>:
      hfsplus: fix crash and filesystem corruption when deleting files

Subsystem: mm/memcg

    Jakub Kicinski <kuba@kernel.org>:
      mm, memcg: do not high throttle allocators based on wraparound

Subsystem: mm/slab-generic

    Qiujun Huang <hqjagain@gmail.com>:
      mm, slab_common: fix a typo in comment "eariler"->"earlier"

Subsystem: mm/slab

    Mauro Carvalho Chehab <mchehab+huawei@kernel.org>:
      docs: mm: slab.h: fix a broken cross-reference

Subsystem: mm/pagealloc

    Randy Dunlap <rdunlap@infradead.org>:
      mm/page_alloc.c: fix kernel-doc warning

    Jason Yan <yanaijie@huawei.com>:
      mm/page_alloc: make pcpu_drain_mutex and pcpu_drain static

Subsystem: mm/gup

    Miles Chen <miles.chen@mediatek.com>:
      mm/gup: fix null pointer dereference detected by coverity

Subsystem: ocfs2

    Changwei Ge <chge@linux.alibaba.com>:
      ocfs2: no need try to truncate file beyond i_size

Subsystem: mm/hugetlb

    Aslan Bakirov <aslan@fb.com>:
      mm: cma: NUMA node interface

    Roman Gushchin <guro@fb.com>:
      mm: hugetlb: optionally allocate gigantic hugepages using cma

Subsystem: mm/pagemap

    Jaewon Kim <jaewon31.kim@samsung.com>:
      mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area

    Arjun Roy <arjunroy@google.com>:
      mm/memory.c: refactor insert_page to prepare for batched-lock insert
      mm: bring sparc pte_index() semantics inline with other platforms
      mm: define pte_index as macro for x86
      mm/memory.c: add vm_insert_pages()

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
      mm/vma: introduce VM_ACCESS_FLAGS
      mm/special: create generic fallbacks for pte_special() and pte_mkspecial()

Subsystem: mm/memremap

    Logan Gunthorpe <logang@deltatee.com>:
    Patch series "Allow setting caching mode in arch_add_memory() for P2PDMA", v4:
      mm/memory_hotplug: drop the flags field from struct mhp_restrictions
      mm/memory_hotplug: rename mhp_restrictions to mhp_params
      x86/mm: thread pgprot_t through init_memory_mapping()
      x86/mm: introduce __set_memory_prot()
      powerpc/mm: thread pgprot_t through create_section_mapping()
      mm/memory_hotplug: add pgprot_t to mhp_params
      mm/memremap: set caching mode for PCI P2PDMA memory to WC

Subsystem: kmod

    Eric Biggers <ebiggers@google.com>:
    Patch series "module autoloading fixes and cleanups", v5:
      kmod: make request_module() return an error when autoloading is disabled
      fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
      docs: admin-guide: document the kernel.modprobe sysctl
      selftests: kmod: fix handling test numbers above 9
      selftests: kmod: test disabling module autoloading

Subsystem: misc

    Pali Rohár <pali@kernel.org>:
      change email address for Pali Rohár

    kbuild test robot <lkp@intel.com>:
      drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings

Subsystem: seqfile

    Vasily Averin <vvs@virtuozzo.com>:
    Patch series "seq_file .next functions should increase position index":
      fs/seq_file.c: seq_read(): add info message about buggy .next functions
      kernel/gcov/fs.c: gcov_seq_next() should increase position index
      ipc/util.c: sysvipc_find_ipc() should increase position index

 Documentation/ABI/testing/sysfs-platform-dell-laptop |    8 
 Documentation/admin-guide/kernel-parameters.txt      |    8 
 Documentation/admin-guide/sysctl/kernel.rst          |   21 ++
 MAINTAINERS                                          |   16 -
 arch/alpha/include/asm/page.h                        |    3 
 arch/alpha/include/asm/pgtable.h                     |    2 
 arch/arc/include/asm/page.h                          |    2 
 arch/arm/include/asm/page.h                          |    4 
 arch/arm/include/asm/pgtable-2level.h                |    2 
 arch/arm/include/asm/pgtable.h                       |   15 -
 arch/arm/mach-omap2/omap-secure.c                    |    2 
 arch/arm/mach-omap2/omap-secure.h                    |    2 
 arch/arm/mach-omap2/omap-smc.S                       |    2 
 arch/arm/mm/fault.c                                  |    2 
 arch/arm/mm/mmu.c                                    |   14 +
 arch/arm64/include/asm/page.h                        |    4 
 arch/arm64/mm/fault.c                                |    2 
 arch/arm64/mm/init.c                                 |    6 
 arch/arm64/mm/mmu.c                                  |    7 
 arch/c6x/include/asm/page.h                          |    5 
 arch/csky/include/asm/page.h                         |    3 
 arch/csky/include/asm/pgtable.h                      |    3 
 arch/h8300/include/asm/page.h                        |    2 
 arch/hexagon/include/asm/page.h                      |    3 
 arch/hexagon/include/asm/pgtable.h                   |    2 
 arch/ia64/include/asm/page.h                         |    5 
 arch/ia64/include/asm/pgtable.h                      |    2 
 arch/ia64/mm/init.c                                  |    7 
 arch/m68k/include/asm/mcf_pgtable.h                  |   10 -
 arch/m68k/include/asm/motorola_pgtable.h             |    2 
 arch/m68k/include/asm/page.h                         |    3 
 arch/m68k/include/asm/sun3_pgtable.h                 |    2 
 arch/microblaze/include/asm/page.h                   |    2 
 arch/microblaze/include/asm/pgtable.h                |    4 
 arch/mips/include/asm/page.h                         |    5 
 arch/mips/include/asm/pgtable.h                      |   44 +++-
 arch/nds32/include/asm/page.h                        |    3 
 arch/nds32/include/asm/pgtable.h                     |    9 -
 arch/nds32/mm/fault.c                                |    2 
 arch/nios2/include/asm/page.h                        |    3 
 arch/nios2/include/asm/pgtable.h                     |    3 
 arch/openrisc/include/asm/page.h                     |    5 
 arch/openrisc/include/asm/pgtable.h                  |    2 
 arch/parisc/include/asm/page.h                       |    3 
 arch/parisc/include/asm/pgtable.h                    |    2 
 arch/powerpc/include/asm/book3s/64/hash.h            |    3 
 arch/powerpc/include/asm/book3s/64/radix.h           |    3 
 arch/powerpc/include/asm/page.h                      |    9 -
 arch/powerpc/include/asm/page_64.h                   |    7 
 arch/powerpc/include/asm/sparsemem.h                 |    3 
 arch/powerpc/mm/book3s64/hash_utils.c                |    5 
 arch/powerpc/mm/book3s64/pgtable.c                   |    7 
 arch/powerpc/mm/book3s64/pkeys.c                     |    2 
 arch/powerpc/mm/book3s64/radix_pgtable.c             |   18 +-
 arch/powerpc/mm/mem.c                                |   12 -
 arch/riscv/include/asm/page.h                        |    3 
 arch/s390/include/asm/page.h                         |    3 
 arch/s390/mm/fault.c                                 |    2 
 arch/s390/mm/init.c                                  |    9 -
 arch/sh/include/asm/page.h                           |    3 
 arch/sh/mm/init.c                                    |    7 
 arch/sparc/include/asm/page_32.h                     |    3 
 arch/sparc/include/asm/page_64.h                     |    3 
 arch/sparc/include/asm/pgtable_32.h                  |    7 
 arch/sparc/include/asm/pgtable_64.h                  |   10 -
 arch/um/include/asm/pgtable.h                        |   10 -
 arch/unicore32/include/asm/page.h                    |    3 
 arch/unicore32/include/asm/pgtable.h                 |    3 
 arch/unicore32/mm/fault.c                            |    2 
 arch/x86/include/asm/page_types.h                    |    7 
 arch/x86/include/asm/pgtable.h                       |    6 
 arch/x86/include/asm/set_memory.h                    |    1 
 arch/x86/kernel/amd_gart_64.c                        |    3 
 arch/x86/kernel/setup.c                              |    4 
 arch/x86/mm/init.c                                   |    9 -
 arch/x86/mm/init_32.c                                |   19 +-
 arch/x86/mm/init_64.c                                |   42 ++--
 arch/x86/mm/mm_internal.h                            |    3 
 arch/x86/mm/pat/set_memory.c                         |   13 +
 arch/x86/mm/pkeys.c                                  |    2 
 arch/x86/platform/uv/bios_uv.c                       |    3 
 arch/x86/um/asm/vm-flags.h                           |   10 -
 arch/xtensa/include/asm/page.h                       |    3 
 arch/xtensa/include/asm/pgtable.h                    |    3 
 drivers/char/hw_random/omap3-rom-rng.c               |    4 
 drivers/dma/tegra20-apb-dma.c                        |    1 
 drivers/hwmon/dell-smm-hwmon.c                       |    4 
 drivers/platform/x86/dell-laptop.c                   |    4 
 drivers/platform/x86/dell-rbtn.c                     |    4 
 drivers/platform/x86/dell-rbtn.h                     |    2 
 drivers/platform/x86/dell-smbios-base.c              |    4 
 drivers/platform/x86/dell-smbios-smm.c               |    2 
 drivers/platform/x86/dell-smbios.h                   |    2 
 drivers/platform/x86/dell-smo8800.c                  |    2 
 drivers/platform/x86/dell-wmi.c                      |    4 
 drivers/power/supply/bq2415x_charger.c               |    4 
 drivers/power/supply/bq27xxx_battery.c               |    2 
 drivers/power/supply/isp1704_charger.c               |    2 
 drivers/power/supply/rx51_battery.c                  |    4 
 drivers/staging/gasket/gasket_core.c                 |    2 
 fs/filesystems.c                                     |    4 
 fs/hfsplus/attributes.c                              |    4 
 fs/ocfs2/alloc.c                                     |    4 
 fs/seq_file.c                                        |    7 
 fs/udf/ecma_167.h                                    |    2 
 fs/udf/osta_udf.h                                    |    2 
 include/linux/cma.h                                  |   14 +
 include/linux/hugetlb.h                              |   12 +
 include/linux/memblock.h                             |    3 
 include/linux/memory_hotplug.h                       |   21 +-
 include/linux/mm.h                                   |   34 +++
 include/linux/power/bq2415x_charger.h                |    2 
 include/linux/slab.h                                 |    2 
 ipc/util.c                                           |    2 
 kernel/gcov/fs.c                                     |    2 
 kernel/kmod.c                                        |    4 
 mm/cma.c                                             |   16 +
 mm/gup.c                                             |    3 
 mm/hugetlb.c                                         |  109 ++++++++++++
 mm/memblock.c                                        |    2 
 mm/memcontrol.c                                      |    3 
 mm/memory.c                                          |  168 +++++++++++++++++--
 mm/memory_hotplug.c                                  |   13 -
 mm/memremap.c                                        |   17 +
 mm/mmap.c                                            |    4 
 mm/mprotect.c                                        |    4 
 mm/page_alloc.c                                      |    5 
 mm/slab_common.c                                     |    2 
 tools/laptop/freefall/freefall.c                     |    2 
 tools/testing/selftests/kmod/kmod.sh                 |   43 ++++
 130 files changed, 710 insertions(+), 370 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-04-07  3:02 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-04-07  3:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


- a lot more of MM, quite a bit more yet to come.

- various other subsystems



166 patches based on 7e63420847ae5f1036e4f7c42f0b3282e73efbc2.

Subsystems affected by this patch series:

  mm/memcg
  mm/pagemap
  mm/vmalloc
  mm/pagealloc
  mm/migration
  mm/thp
  mm/ksm
  mm/madvise
  mm/virtio
  mm/userfaultfd
  mm/memory-hotplug
  mm/shmem
  mm/rmap
  mm/zswap
  mm/zsmalloc
  mm/cleanups
  procfs
  misc
  MAINTAINERS
  bitops
  lib
  checkpatch
  epoll
  binfmt
  kallsyms
  reiserfs
  kmod
  gcov
  kconfig
  kcov
  ubsan
  fault-injection
  ipc

Subsystem: mm/memcg

    Chris Down <chris@chrisdown.name>:
      mm, memcg: bypass high reclaim iteration for cgroup hierarchy root

Subsystem: mm/pagemap

    Li Xinhai <lixinhai.lxh@gmail.com>:
    Patch series "mm: Fix misuse of parent anon_vma in dup_mmap path":
      mm: don't prepare anon_vma if vma has VM_WIPEONFORK
      Revert "mm/rmap.c: reuse mergeable anon_vma as parent when fork"
      mm: set vm_next and vm_prev to NULL in vm_area_dup()

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm/vma: Use all available wrappers when possible", v2:
      mm/vma: add missing VMA flag readable name for VM_SYNC
      mm/vma: make vma_is_accessible() available for general use
      mm/vma: replace all remaining open encodings with is_vm_hugetlb_page()
      mm/vma: replace all remaining open encodings with vma_is_anonymous()
      mm/vma: append unlikely() while testing VMA access permissions

Subsystem: mm/vmalloc

    Qiujun Huang <hqjagain@gmail.com>:
      mm/vmalloc: fix a typo in comment

Subsystem: mm/pagealloc

    Michal Hocko <mhocko@suse.com>:
      mm: make it clear that gfp reclaim modifiers are valid only for sleepable allocations

Subsystem: mm/migration

    Wei Yang <richardw.yang@linux.intel.com>:
    Patch series "cleanup on do_pages_move()", v5:
      mm/migrate.c: no need to check for i > start in do_pages_move()
      mm/migrate.c: wrap do_move_pages_to_node() and store_status()
      mm/migrate.c: check pagelist in move_pages_and_store_status()
      mm/migrate.c: unify "not queued for migration" handling in do_pages_move()

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm/migrate.c: migrate PG_readahead flag

Subsystem: mm/thp

    David Rientjes <rientjes@google.com>:
      mm, shmem: add vmstat for hugepage fallback
      mm, thp: track fallbacks due to failed memcg charges separately

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      include/linux/pagemap.h: optimise find_subpage for !THP
      mm: remove CONFIG_TRANSPARENT_HUGE_PAGECACHE

Subsystem: mm/ksm

    Li Chen <chenli@uniontech.com>:
      mm/ksm.c: update get_user_pages() argument in comment

Subsystem: mm/madvise

    Huang Ying <ying.huang@intel.com>:
      mm: code cleanup for MADV_FREE

Subsystem: mm/virtio

    Alexander Duyck <alexander.h.duyck@linux.intel.com>:
    Patch series "mm / virtio: Provide support for free page reporting", v17:
      mm: adjust shuffle code to allow for future coalescing
      mm: use zone and order instead of free area in free_list manipulators
      mm: add function __putback_isolated_page
      mm: introduce Reported pages
      virtio-balloon: pull page poisoning config out of free page hinting
      virtio-balloon: add support for providing free page reports to host
      mm/page_reporting: rotate reported pages to the tail of the list
      mm/page_reporting: add budget limit on how many pages can be reported per pass
      mm/page_reporting: add free page reporting documentation

    David Hildenbrand <david@redhat.com>:
      virtio-balloon: switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

Subsystem: mm/userfaultfd

    Shaohua Li <shli@fb.com>:
    Patch series "userfaultfd: write protection support", v6:
      userfaultfd: wp: add helper for writeprotect check

    Andrea Arcangeli <aarcange@redhat.com>:
      userfaultfd: wp: hook userfault handler to write protection fault
      userfaultfd: wp: add WP pagetable tracking to x86
      userfaultfd: wp: userfaultfd_pte/huge_pmd_wp() helpers
      userfaultfd: wp: add UFFDIO_COPY_MODE_WP

    Peter Xu <peterx@redhat.com>:
      mm: merge parameters for change_protection()
      userfaultfd: wp: apply _PAGE_UFFD_WP bit
      userfaultfd: wp: drop _PAGE_UFFD_WP properly when fork
      userfaultfd: wp: add pmd_swp_*uffd_wp() helpers
      userfaultfd: wp: support swap and page migration
      khugepaged: skip collapse if uffd-wp detected

    Shaohua Li <shli@fb.com>:
      userfaultfd: wp: support write protection for userfault vma range

    Andrea Arcangeli <aarcange@redhat.com>:
      userfaultfd: wp: add the writeprotect API to userfaultfd ioctl

    Shaohua Li <shli@fb.com>:
      userfaultfd: wp: enabled write protection in userfaultfd API

    Peter Xu <peterx@redhat.com>:
      userfaultfd: wp: don't wake up when doing write protect

    Martin Cracauer <cracauer@cons.org>:
      userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update

    Peter Xu <peterx@redhat.com>:
      userfaultfd: wp: declare _UFFDIO_WRITEPROTECT conditionally
      userfaultfd: selftests: refactor statistics
      userfaultfd: selftests: add write-protect test

Subsystem: mm/memory-hotplug

    David Hildenbrand <david@redhat.com>:
    Patch series "mm: drop superfluous section checks when onlining/offlining":
      drivers/base/memory.c: drop section_count
      drivers/base/memory.c: drop pages_correctly_probed()
      mm/page_ext.c: drop pfn_present() check when onlining

    Baoquan He <bhe@redhat.com>:
      mm/memory_hotplug.c: only respect mem= parameter during boot stage

    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug.c: simplify calculation of number of pages in __remove_pages()
      mm/memory_hotplug.c: cleanup __add_pages()

    Baoquan He <bhe@redhat.com>:
    Patch series "mm/hotplug: Only use subsection map for VMEMMAP", v4:
      mm/sparse.c: introduce new function fill_subsection_map()
      mm/sparse.c: introduce a new function clear_subsection_map()
      mm/sparse.c: only use subsection map in VMEMMAP case
      mm/sparse.c: add note about only VMEMMAP supporting sub-section hotplug
      mm/sparse.c: move subsection_map related functions together

    David Hildenbrand <david@redhat.com>:
    Patch series "mm/memory_hotplug: allow to specify a default online_type", v3:
      drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE
      drivers/base/memory: map MMOP_OFFLINE to 0
      drivers/base/memory: store mapping between MMOP_* and string in an array
      powernv/memtrace: always online added memory blocks
      hv_balloon: don't check for memhp_auto_online manually
      mm/memory_hotplug: unexport memhp_auto_online
      mm/memory_hotplug: convert memhp_auto_online to store an online_type
      mm/memory_hotplug: allow to specify a default online_type

    chenqiwu <chenqiwu@xiaomi.com>:
      mm/memory_hotplug.c: use __pfn_to_section() instead of open-coding

Subsystem: mm/shmem

    Kees Cook <keescook@chromium.org>:
      mm/shmem.c: distribute switch variables for initialization

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/shmem.c: clean code by removing unnecessary assignment

    Hugh Dickins <hughd@google.com>:
      mm: huge tmpfs: try to split_huge_page() when punching hole

Subsystem: mm/rmap

    Palmer Dabbelt <palmerdabbelt@google.com>:
      mm: prevent a warning when casting void* -> enum

Subsystem: mm/zswap

    "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>:
      mm/zswap: allow setting default status, compressor and allocator in Kconfig

Subsystem: mm/zsmalloc

Subsystem: mm/cleanups

    Jules Irenge <jbi.octave@gmail.com>:
      mm/compaction: add missing annotation for compact_lock_irqsave
      mm/hugetlb: add missing annotation for gather_surplus_pages()
      mm/mempolicy: add missing annotation for queue_pages_pmd()
      mm/slub: add missing annotation for get_map()
      mm/slub: add missing annotation for put_map()
      mm/zsmalloc: add missing annotation for migrate_read_lock()
      mm/zsmalloc: add missing annotation for migrate_read_unlock()
      mm/zsmalloc: add missing annotation for pin_tag()
      mm/zsmalloc: add missing annotation for unpin_tag()

    chenqiwu <chenqiwu@xiaomi.com>:
      mm: fix ambiguous comments for better code readability

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/mm_init.c: clean code. Use BUILD_BUG_ON when comparing compile time constant

    Joe Perches <joe@perches.com>:
      mm: use fallthrough;

    Steven Price <steven.price@arm.com>:
      include/linux/swapops.h: correct guards for non_swap_entry()

    Ira Weiny <ira.weiny@intel.com>:
      include/linux/memremap.h: remove stale comments

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/dmapool.c: micro-optimisation remove unnecessary branch

    Waiman Long <longman@redhat.com>:
      mm: remove dummy struct bootmem_data/bootmem_data_t

Subsystem: procfs

    Jules Irenge <jbi.octave@gmail.com>:
      fs/proc/inode.c: annotate close_pdeo() for sparse

    Alexey Dobriyan <adobriyan@gmail.com>:
      proc: faster open/read/close with "permanent" files
      proc: speed up /proc/*/statm

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      proc: inline vma_stop into m_stop
      proc: remove m_cache_vma
      proc: use ppos instead of m->version
      seq_file: remove m->version
      proc: inline m_next_vma into m_next

Subsystem: misc

    Michal Simek <michal.simek@xilinx.com>:
      asm-generic: fix unistd_32.h generation format

    Nathan Chancellor <natechancellor@gmail.com>:
      kernel/extable.c: use address-of operator on section symbols

    Masahiro Yamada <masahiroy@kernel.org>:
      sparc,x86: vdso: remove meaningless undefining CONFIG_OPTIMIZE_INLINING
      compiler: remove CONFIG_OPTIMIZE_INLINING entirely

    Vegard Nossum <vegard.nossum@oracle.com>:
      compiler.h: fix error in BUILD_BUG_ON() reporting

Subsystem: MAINTAINERS

    Joe Perches <joe@perches.com>:
      MAINTAINERS: list the section entries in the preferred order

Subsystem: bitops

    Josh Poimboeuf <jpoimboe@redhat.com>:
      bitops: always inline sign extension helpers

Subsystem: lib

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      lib/test_lockup: test module to generate lockups

    Colin Ian King <colin.king@canonical.com>:
      lib/test_lockup.c: fix spelling mistake "iteraions" -> "iterations"

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      lib/test_lockup.c: add parameters for locking generic vfs locks

    "Gustavo A. R. Silva" <gustavo@embeddedor.com>:
      lib/bch.c: replace zero-length array with flexible-array member
      lib/ts_bm.c: replace zero-length array with flexible-array member
      lib/ts_fsm.c: replace zero-length array with flexible-array member
      lib/ts_kmp.c: replace zero-length array with flexible-array member

    Geert Uytterhoeven <geert+renesas@glider.be>:
      lib/scatterlist: fix sg_copy_buffer() kerneldoc

    Kees Cook <keescook@chromium.org>:
      lib: test_stackinit.c: XFAIL switch variable init tests

    Alexander Potapenko <glider@google.com>:
      lib/stackdepot.c: check depot_index before accessing the stack slab
lib/stackdepot.c: fix a condition in stack_depot_fetch()
      lib/stackdepot.c: build with -fno-builtin
      kasan: stackdepot: move filter_irq_stacks() to stackdepot.c

    Qian Cai <cai@lca.pw>:
      percpu_counter: fix a data race at vm_committed_as

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      lib/test_bitmap.c: make use of EXP2_IN_BITS

    chenqiwu <chenqiwu@xiaomi.com>:
      lib/rbtree: fix coding style of assignments

    Dan Carpenter <dan.carpenter@oracle.com>:
      lib/test_kmod.c: remove a NULL test

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      linux/bits.h: add compile time sanity check of GENMASK inputs

    Chris Wilson <chris@chris-wilson.co.uk>:
      lib/list: prevent compiler reloads inside 'safe' list iteration

    Nathan Chancellor <natechancellor@gmail.com>:
      lib/dynamic_debug.c: use address-of operator on section symbols

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: remove email address comment from email address comparisons

    Lubomir Rintel <lkundrak@v3.sk>:
      checkpatch: check SPDX tags in YAML files

    John Hubbard <jhubbard@nvidia.com>:
      checkpatch: support "base-commit:" format

    Joe Perches <joe@perches.com>:
      checkpatch: prefer fallthrough; over fallthrough comments

    Antonio Borneo <borneo.antonio@gmail.com>:
      checkpatch: fix minor typo and mixed space+tab in indentation
      checkpatch: fix multiple const * types
      checkpatch: add command-line option for TAB size

    Joe Perches <joe@perches.com>:
      checkpatch: improve Gerrit Change-Id: test

    Lubomir Rintel <lkundrak@v3.sk>:
      checkpatch: check proper licensing of Devicetree bindings

    Joe Perches <joe@perches.com>:
      checkpatch: avoid warning about uninitialized_var()

Subsystem: epoll

    Roman Penyaev <rpenyaev@suse.de>:
      kselftest: introduce new epoll test case

    Jason Baron <jbaron@akamai.com>:
      fs/epoll: make nesting accounting safe for -rt kernel

Subsystem: binfmt

    Alexey Dobriyan <adobriyan@gmail.com>:
      fs/binfmt_elf.c: delete "loc" variable
      fs/binfmt_elf.c: allocate less for static executable
      fs/binfmt_elf.c: don't free interpreter's ELF pheaders on common path

Subsystem: kallsyms

    Will Deacon <will@kernel.org>:
    Patch series "Unexport kallsyms_lookup_name() and kallsyms_on_each_symbol()":
      samples/hw_breakpoint: drop HW_BREAKPOINT_R when reporting writes
      samples/hw_breakpoint: drop use of kallsyms_lookup_name()
      kallsyms: unexport kallsyms_lookup_name() and kallsyms_on_each_symbol()

Subsystem: reiserfs

    Colin Ian King <colin.king@canonical.com>:
      reiserfs: clean up several indentation issues

Subsystem: kmod

    Qiujun Huang <hqjagain@gmail.com>:
      kernel/kmod.c: fix a typo "assuems" -> "assumes"

Subsystem: gcov

    "Gustavo A. R. Silva" <gustavo@embeddedor.com>:
      gcov: gcc_4_7: replace zero-length array with flexible-array member
      gcov: gcc_3_4: replace zero-length array with flexible-array member
      kernel/gcov/fs.c: replace zero-length array with flexible-array member

Subsystem: kconfig

    Krzysztof Kozlowski <krzk@kernel.org>:
      init/Kconfig: clean up ANON_INODES and old IO schedulers options

Subsystem: kcov

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "kcov: collect coverage from usb soft interrupts", v4:
      kcov: cleanup debug messages
      kcov: fix potential use-after-free in kcov_remote_start
      kcov: move t->kcov assignments into kcov_start/stop
      kcov: move t->kcov_sequence assignment
      kcov: use t->kcov_mode as enabled indicator
      kcov: collect coverage from interrupts
      usb: core: kcov: collect coverage from usb complete callback

Subsystem: ubsan

    Kees Cook <keescook@chromium.org>:
    Patch series "ubsan: Split out bounds checker", v5:
      ubsan: add trap instrumentation option
      ubsan: split "bounds" checker from other options
      drivers/misc/lkdtm/bugs.c: add arithmetic overflow and array bounds checks
      ubsan: check panic_on_warn
      kasan: unset panic_on_warn before calling panic()
      ubsan: include bug type in report header

Subsystem: fault-injection

    Qiujun Huang <hqjagain@gmail.com>:
      lib/Kconfig.debug: fix a typo "capabilitiy" -> "capability"

Subsystem: ipc

    Somala Swaraj <somalaswaraj@gmail.com>:
      ipc/mqueue.c: fix a brace coding style issue

    Jason Yan <yanaijie@huawei.com>:
      ipc/shm.c: make compat_ksys_shmctl() static

 Documentation/admin-guide/kernel-parameters.txt               |   13 
 Documentation/admin-guide/mm/transhuge.rst                    |   14 
 Documentation/admin-guide/mm/userfaultfd.rst                  |   51 
 Documentation/dev-tools/kcov.rst                              |   17 
 Documentation/vm/free_page_reporting.rst                      |   41 
 Documentation/vm/zswap.rst                                    |   20 
 MAINTAINERS                                                   |   35 
 arch/alpha/include/asm/mmzone.h                               |    2 
 arch/alpha/kernel/syscalls/syscallhdr.sh                      |    2 
 arch/csky/mm/fault.c                                          |    4 
 arch/ia64/kernel/syscalls/syscallhdr.sh                       |    2 
 arch/ia64/kernel/vmlinux.lds.S                                |    2 
 arch/m68k/mm/fault.c                                          |    4 
 arch/microblaze/kernel/syscalls/syscallhdr.sh                 |    2 
 arch/mips/kernel/syscalls/syscallhdr.sh                       |    3 
 arch/mips/mm/fault.c                                          |    4 
 arch/nds32/kernel/vmlinux.lds.S                               |    1 
 arch/parisc/kernel/syscalls/syscallhdr.sh                     |    2 
 arch/powerpc/kernel/syscalls/syscallhdr.sh                    |    3 
 arch/powerpc/kvm/e500_mmu_host.c                              |    2 
 arch/powerpc/mm/fault.c                                       |    2 
 arch/powerpc/platforms/powernv/memtrace.c                     |   14 
 arch/sh/kernel/syscalls/syscallhdr.sh                         |    2 
 arch/sh/mm/fault.c                                            |    2 
 arch/sparc/kernel/syscalls/syscallhdr.sh                      |    2 
 arch/sparc/vdso/vdso32/vclock_gettime.c                       |    4 
 arch/x86/Kconfig                                              |    1 
 arch/x86/configs/i386_defconfig                               |    1 
 arch/x86/configs/x86_64_defconfig                             |    1 
 arch/x86/entry/vdso/vdso32/vclock_gettime.c                   |    4 
 arch/x86/include/asm/pgtable.h                                |   67 +
 arch/x86/include/asm/pgtable_64.h                             |    8 
 arch/x86/include/asm/pgtable_types.h                          |   12 
 arch/x86/mm/fault.c                                           |    2 
 arch/xtensa/kernel/syscalls/syscallhdr.sh                     |    2 
 drivers/base/memory.c                                         |  138 --
 drivers/hv/hv_balloon.c                                       |   25 
 drivers/misc/lkdtm/bugs.c                                     |   75 +
 drivers/misc/lkdtm/core.c                                     |    3 
 drivers/misc/lkdtm/lkdtm.h                                    |    3 
 drivers/usb/core/hcd.c                                        |    3 
 drivers/virtio/Kconfig                                        |    1 
 drivers/virtio/virtio_balloon.c                               |  190 ++-
 fs/binfmt_elf.c                                               |   56 
 fs/eventpoll.c                                                |   64 -
 fs/proc/array.c                                               |   39 
 fs/proc/cpuinfo.c                                             |    1 
 fs/proc/generic.c                                             |   31 
 fs/proc/inode.c                                               |  188 ++-
 fs/proc/internal.h                                            |    6 
 fs/proc/kmsg.c                                                |    1 
 fs/proc/stat.c                                                |    1 
 fs/proc/task_mmu.c                                            |   97 -
 fs/reiserfs/do_balan.c                                        |    2 
 fs/reiserfs/ioctl.c                                           |   11 
 fs/reiserfs/namei.c                                           |   10 
 fs/seq_file.c                                                 |   28 
 fs/userfaultfd.c                                              |  116 +
 include/asm-generic/pgtable.h                                 |    1 
 include/asm-generic/pgtable_uffd.h                            |   66 +
 include/asm-generic/tlb.h                                     |    3 
 include/linux/bitops.h                                        |    4 
 include/linux/bits.h                                          |   22 
 include/linux/compiler.h                                      |    2 
 include/linux/compiler_types.h                                |   11 
 include/linux/gfp.h                                           |    2 
 include/linux/huge_mm.h                                       |    2 
 include/linux/list.h                                          |   50 
 include/linux/memory.h                                        |    1 
 include/linux/memory_hotplug.h                                |   13 
 include/linux/memremap.h                                      |    2 
 include/linux/mm.h                                            |   25 
 include/linux/mm_inline.h                                     |   15 
 include/linux/mm_types.h                                      |    4 
 include/linux/mmzone.h                                        |   47 
 include/linux/page-flags.h                                    |   16 
 include/linux/page_reporting.h                                |   26 
 include/linux/pagemap.h                                       |    4 
 include/linux/percpu_counter.h                                |    4 
 include/linux/proc_fs.h                                       |   17 
 include/linux/sched.h                                         |    3 
 include/linux/seq_file.h                                      |    1 
 include/linux/shmem_fs.h                                      |   10 
 include/linux/stackdepot.h                                    |    2 
 include/linux/swapops.h                                       |    5 
 include/linux/userfaultfd_k.h                                 |   42 
 include/linux/vm_event_item.h                                 |    5 
 include/trace/events/huge_memory.h                            |    1 
 include/trace/events/mmflags.h                                |    1 
 include/trace/events/vmscan.h                                 |    2 
 include/uapi/linux/userfaultfd.h                              |   40 
 include/uapi/linux/virtio_balloon.h                           |    1 
 init/Kconfig                                                  |    8 
 ipc/mqueue.c                                                  |    5 
 ipc/shm.c                                                     |    2 
 ipc/util.c                                                    |    1 
 kernel/configs/tiny.config                                    |    1 
 kernel/events/core.c                                          |    3 
 kernel/extable.c                                              |    3 
 kernel/fork.c                                                 |   10 
 kernel/gcov/fs.c                                              |    2 
 kernel/gcov/gcc_3_4.c                                         |    6 
 kernel/gcov/gcc_4_7.c                                         |    2 
 kernel/kallsyms.c                                             |    2 
 kernel/kcov.c                                                 |  282 +++-
 kernel/kmod.c                                                 |    2 
 kernel/module.c                                               |    1 
 kernel/sched/fair.c                                           |    2 
 lib/Kconfig.debug                                             |   35 
 lib/Kconfig.ubsan                                             |   51 
 lib/Makefile                                                  |    8 
 lib/bch.c                                                     |    2 
 lib/dynamic_debug.c                                           |    2 
 lib/rbtree.c                                                  |    4 
 lib/scatterlist.c                                             |    2 
 lib/stackdepot.c                                              |   39 
 lib/test_bitmap.c                                             |    2 
 lib/test_kmod.c                                               |    2 
 lib/test_lockup.c                                             |  601 +++++++++-
 lib/test_stackinit.c                                          |   28 
 lib/ts_bm.c                                                   |    2 
 lib/ts_fsm.c                                                  |    2 
 lib/ts_kmp.c                                                  |    2 
 lib/ubsan.c                                                   |   47 
 mm/Kconfig                                                    |  135 ++
 mm/Makefile                                                   |    1 
 mm/compaction.c                                               |    3 
 mm/dmapool.c                                                  |    4 
 mm/filemap.c                                                  |   14 
 mm/gup.c                                                      |    9 
 mm/huge_memory.c                                              |   36 
 mm/hugetlb.c                                                  |    1 
 mm/hugetlb_cgroup.c                                           |    6 
 mm/internal.h                                                 |    2 
 mm/kasan/common.c                                             |   23 
 mm/kasan/report.c                                             |   10 
 mm/khugepaged.c                                               |   39 
 mm/ksm.c                                                      |    5 
 mm/list_lru.c                                                 |    2 
 mm/memcontrol.c                                               |    5 
 mm/memory-failure.c                                           |    2 
 mm/memory.c                                                   |   42 
 mm/memory_hotplug.c                                           |   53 
 mm/mempolicy.c                                                |   11 
 mm/migrate.c                                                  |  122 +-
 mm/mm_init.c                                                  |    2 
 mm/mmap.c                                                     |   10 
 mm/mprotect.c                                                 |   76 -
 mm/page_alloc.c                                               |  174 ++
 mm/page_ext.c                                                 |    5 
 mm/page_isolation.c                                           |    6 
 mm/page_reporting.c                                           |  384 ++++++
 mm/page_reporting.h                                           |   54 
 mm/rmap.c                                                     |   23 
 mm/shmem.c                                                    |  168 +-
 mm/shuffle.c                                                  |   12 
 mm/shuffle.h                                                  |    6 
 mm/slab_common.c                                              |    1 
 mm/slub.c                                                     |    3 
 mm/sparse.c                                                   |  236 ++-
 mm/swap.c                                                     |   20 
 mm/swapfile.c                                                 |    1 
 mm/userfaultfd.c                                              |   98 +
 mm/vmalloc.c                                                  |    2 
 mm/vmscan.c                                                   |   12 
 mm/vmstat.c                                                   |    3 
 mm/zsmalloc.c                                                 |   10 
 mm/zswap.c                                                    |   24 
 samples/hw_breakpoint/data_breakpoint.c                       |   11 
 scripts/Makefile.ubsan                                        |   16 
 scripts/checkpatch.pl                                         |  155 +-
 tools/lib/rbtree.c                                            |    4 
 tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c |   67 +
 tools/testing/selftests/vm/userfaultfd.c                      |  233 +++
 174 files changed, 3990 insertions(+), 1399 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-04-02  4:01 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-04-02  4:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


A large amount of MM, plenty more to come.


155 patches, based on GIT 1a323ea5356edbb3073dc59d51b9e6b86908857d

Subsystems affected by this patch series:

  tools
  kthread
  kbuild
  scripts
  ocfs2
  vfs
  mm/slub
  mm/kmemleak
  mm/pagecache
  mm/gup
  mm/swap
  mm/memcg
  mm/pagemap
  mm/mremap
  mm/sparsemem
  mm/kasan
  mm/pagealloc
  mm/vmscan
  mm/compaction
  mm/mempolicy
  mm/hugetlbfs
  mm/hugetlb

Subsystem: tools

    David Ahern <dsahern@kernel.org>:
      tools/accounting/getdelays.c: fix netlink attribute length

Subsystem: kthread

    Petr Mladek <pmladek@suse.com>:
      kthread: mark timer used by delayed kthread works as IRQ safe

Subsystem: kbuild

    Masahiro Yamada <masahiroy@kernel.org>:
      asm-generic: make more kernel-space headers mandatory

Subsystem: scripts

    Jonathan Neuschäfer <j.neuschaefer@gmx.net>:
      scripts/spelling.txt: add syfs/sysfs pattern

    Colin Ian King <colin.king@canonical.com>:
      scripts/spelling.txt: add more spellings to spelling.txt

Subsystem: ocfs2

    Alex Shi <alex.shi@linux.alibaba.com>:
      ocfs2: remove FS_OCFS2_NM
      ocfs2: remove unused macros
      ocfs2: use OCFS2_SEC_BITS in macro
      ocfs2: remove dlm_lock_is_remote

    wangyan <wangyan122@huawei.com>:
      ocfs2: there is no need to log twice in several functions
      ocfs2: correct annotation from "l_next_rec" to "l_next_free_rec"

    Alex Shi <alex.shi@linux.alibaba.com>:
      ocfs2: remove useless err

    Jules Irenge <jbi.octave@gmail.com>:
      ocfs2: Add missing annotations for ocfs2_refcount_cache_lock() and ocfs2_refcount_cache_unlock()

    "Gustavo A. R. Silva" <gustavo@embeddedor.com>:
      ocfs2: replace zero-length array with flexible-array member
      ocfs2: cluster: replace zero-length array with flexible-array member
      ocfs2: dlm: replace zero-length array with flexible-array member
      ocfs2: ocfs2_fs.h: replace zero-length array with flexible-array member

    wangjian <wangjian161@huawei.com>:
      ocfs2: roll back the reference count modification of the parent directory if an error occurs

    Takashi Iwai <tiwai@suse.de>:
      ocfs2: use scnprintf() for avoiding potential buffer overflow

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      ocfs2: use memalloc_nofs_save instead of memalloc_noio_save

Subsystem: vfs

    Kees Cook <keescook@chromium.org>:
      fs_parse: Remove pr_notice() about each validation

Subsystem: mm/slub

    chenqiwu <chenqiwu@xiaomi.com>:
      mm/slub.c: replace cpu_slab->partial with wrapped APIs
      mm/slub.c: replace kmem_cache->cpu_partial with wrapped APIs

    Kees Cook <keescook@chromium.org>:
      slub: improve bit diffusion for freelist ptr obfuscation
      slub: relocate freelist pointer to middle of object

    Vlastimil Babka <vbabka@suse.cz>:
      Revert "topology: add support for node_to_mem_node() to determine the fallback node"

Subsystem: mm/kmemleak

    Nathan Chancellor <natechancellor@gmail.com>:
      mm/kmemleak.c: use address-of operator on section symbols

    Qian Cai <cai@lca.pw>:
      mm/Makefile: disable KCSAN for kmemleak

Subsystem: mm/pagecache

    Jan Kara <jack@suse.cz>:
      mm/filemap.c: don't bother dropping mmap_sem for zero size readahead

    Mauricio Faria de Oliveira <mfo@canonical.com>:
      mm/page-writeback.c: write_cache_pages(): deduplicate identical checks

    Xianting Tian <xianting_tian@126.com>:
      mm/filemap.c: clear page error before actual read

    Souptick Joarder <jrdr.linux@gmail.com>:
      mm/filemap.c: remove unused argument from shrink_readahead_size_eio()

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm/filemap.c: use vm_fault error code directly
      include/linux/pagemap.h: rename arguments to find_subpage
      mm/page-writeback.c: use VM_BUG_ON_PAGE in clear_page_dirty_for_io
      mm/filemap.c: unexport find_get_entry
      mm/filemap.c: rewrite pagecache_get_page documentation

Subsystem: mm/gup

    John Hubbard <jhubbard@nvidia.com>:
    Patch series "mm/gup: track FOLL_PIN pages", v6:
      mm/gup: split get_user_pages_remote() into two routines
      mm/gup: pass a flags arg to __gup_device_* functions
      mm: introduce page_ref_sub_return()
      mm/gup: pass gup flags to two more routines
      mm/gup: require FOLL_GET for get_user_pages_fast()
      mm/gup: track FOLL_PIN pages
      mm/gup: page->hpage_pinned_refcount: exact pin counts for huge pages
      mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting
      mm/gup_benchmark: support pin_user_pages() and related calls
      selftests/vm: run_vmtests: invoke gup_benchmark with basic FOLL_PIN coverage

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: improve dump_page() for compound pages

    John Hubbard <jhubbard@nvidia.com>:
      mm: dump_page(): additional diagnostics for huge pinned pages

    Claudio Imbrenda <imbrenda@linux.ibm.com>:
      mm/gup/writeback: add callbacks for inaccessible pages

    Pingfan Liu <kernelfans@gmail.com>:
      mm/gup: rename nr as nr_pinned in get_user_pages_fast()
      mm/gup: fix omission of check on FOLL_LONGTERM in gup fast path

Subsystem: mm/swap

    Chen Wandun <chenwandun@huawei.com>:
      mm/swapfile.c: fix comments for swapcache_prepare

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/swap.c: not necessary to export __pagevec_lru_add()

    Qian Cai <cai@lca.pw>:
      mm/swapfile: fix data races in try_to_unuse()

    Wei Yang <richard.weiyang@linux.alibaba.com>:
      mm/swap_slots.c: assign|reset cache slot by value directly

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: swap: make page_evictable() inline
      mm: swap: use smp_mb__after_atomic() to order LRU bit set

    Wei Yang <richard.weiyang@gmail.com>:
      mm/swap_state.c: use the same way to count page in [add_to|delete_from]_swap_cache

Subsystem: mm/memcg

    Yafang Shao <laoar.shao@gmail.com>:
      mm, memcg: fix build error around the usage of kmem_caches

    Kirill Tkhai <ktkhai@virtuozzo.com>:
      mm/memcontrol.c: allocate shrinker_map on appropriate NUMA node

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: use mem_cgroup_from_obj()
    Patch series "mm: memcg: kmem API cleanup", v2:
      mm: kmem: cleanup (__)memcg_kmem_charge_memcg() arguments
      mm: kmem: cleanup memcg_kmem_uncharge_memcg() arguments
      mm: kmem: rename memcg_kmem_(un)charge() into memcg_kmem_(un)charge_page()
      mm: kmem: switch to nr_pages in (__)memcg_kmem_charge_memcg()
      mm: memcg/slab: cache page number in memcg_(un)charge_slab()
      mm: kmem: rename (__)memcg_kmem_(un)charge_memcg() to __memcg_kmem_(un)charge()

    Johannes Weiner <hannes@cmpxchg.org>:
    Patch series "mm: memcontrol: recursive memory.low protection", v3:
      mm: memcontrol: fix memory.low proportional distribution
      mm: memcontrol: clean up and document effective low/min calculations
      mm: memcontrol: recursive memory.low protection

    Shakeel Butt <shakeelb@google.com>:
      memcg: css_tryget_online cleanups

    Vincenzo Frascino <vincenzo.frascino@arm.com>:
      mm/memcontrol.c: make mem_cgroup_id_get_many() __maybe_unused

    Chris Down <chris@chrisdown.name>:
      mm, memcg: prevent memory.high load/store tearing
      mm, memcg: prevent memory.max load tearing
      mm, memcg: prevent memory.low load/store tearing
      mm, memcg: prevent memory.min load/store tearing
      mm, memcg: prevent memory.swap.max load tearing
      mm, memcg: prevent mem_cgroup_protected store tearing

    Roman Gushchin <guro@fb.com>:
      mm: memcg: make memory.oom.group tolerable to task migration

Subsystem: mm/pagemap

    Thomas Hellstrom <thellstrom@vmware.com>:
      mm/mapping_dirty_helpers: Update huge page-table entry callbacks

    Anshuman Khandual <anshuman.khandual@arm.com>:
    Patch series "mm/vma: some more minor changes", v2:
      mm/vma: move VM_NO_KHUGEPAGED into generic header
      mm/vma: make vma_is_foreign() available for general use
      mm/vma: make is_vma_temporary_stack() available for general use

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: add pagemap.h to the fine documentation

    Peter Xu <peterx@redhat.com>:
    Patch series "mm: Page fault enhancements", v6:
      mm/gup: rename "nonblocking" to "locked" where proper
      mm/gup: fix __get_user_pages() on fault retry of hugetlb
      mm: introduce fault_signal_pending()
      x86/mm: use helper fault_signal_pending()
      arc/mm: use helper fault_signal_pending()
      arm64/mm: use helper fault_signal_pending()
      powerpc/mm: use helper fault_signal_pending()
      sh/mm: use helper fault_signal_pending()
      mm: return faster for non-fatal signals in user mode faults
      userfaultfd: don't retake mmap_sem to emulate NOPAGE
      mm: introduce FAULT_FLAG_DEFAULT
      mm: introduce FAULT_FLAG_INTERRUPTIBLE
      mm: allow VM_FAULT_RETRY for multiple times
      mm/gup: allow VM_FAULT_RETRY for multiple times
      mm/gup: allow to react to fatal signals
      mm/userfaultfd: honor FAULT_FLAG_KILLABLE in fault path

    WANG Wenhu <wenhu.wang@vivo.com>:
      mm: clarify a confusing comment for remap_pfn_range()

    Wang Wenhu <wenhu.wang@vivo.com>:
      mm/memory.c: clarify a confusing comment for vm_iomap_memory

    Jaewon Kim <jaewon31.kim@samsung.com>:
    Patch series "mm: mmap: add mmap trace point", v3:
      mmap: remove inline of vm_unmapped_area
      mm: mmap: add trace point of vm_unmapped_area

Subsystem: mm/mremap

    Brian Geffon <bgeffon@google.com>:
      mm/mremap: add MREMAP_DONTUNMAP to mremap()
      selftests: add MREMAP_DONTUNMAP selftest

Subsystem: mm/sparsemem

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/sparsemem: get address to page struct instead of address to pfn

    Pingfan Liu <kernelfans@gmail.com>:
      mm/sparse: rename pfn_present() to pfn_in_present_section()

    Baoquan He <bhe@redhat.com>:
      mm/sparse.c: use kvmalloc/kvfree to alloc/free memmap for the classic sparse
      mm/sparse.c: allocate memmap preferring the given node

Subsystem: mm/kasan

    Walter Wu <walter-zh.wu@mediatek.com>:
    Patch series "fix the missing underflow in memory operation function", v4:
      kasan: detect negative size in memory operation function
      kasan: add test for invalid size in memmove

Subsystem: mm/pagealloc

    Joel Savitz <jsavitz@redhat.com>:
      mm/page_alloc: increase default min_free_kbytes bound

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm, pagealloc: micro-optimisation: save two branches on hot page allocation path

    chenqiwu <chenqiwu@xiaomi.com>:
      mm/page_alloc.c: use free_area_empty() instead of open-coding

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/page_alloc.c: micro-optimisation Remove unnecessary branch

    chenqiwu <chenqiwu@xiaomi.com>:
      mm/page_alloc: simplify page_is_buddy() for better code readability

Subsystem: mm/vmscan

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: vmpressure: don't need call kfree if kstrndup fails
      mm: vmpressure: use mem_cgroup_is_root API
      mm: vmscan: replace open codings to NUMA_NO_NODE

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/vmscan.c: remove cpu online notification for now

    Qian Cai <cai@lca.pw>:
      mm/vmscan.c: fix data races using kswapd_classzone_idx

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/vmscan.c: Clean code by removing unnecessary assignment

    Kirill Tkhai <ktkhai@virtuozzo.com>:
      mm/vmscan.c: make may_enter_fs bool in shrink_page_list()

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/vmscan.c: do_try_to_free_pages(): clean code by removing unnecessary assignment

    Michal Hocko <mhocko@suse.com>:
      selftests: vm: drop dependencies on page flags from mlock2 tests

Subsystem: mm/compaction

    Rik van Riel <riel@surriel.com>:
    Patch series "fix THP migration for CMA allocations", v2:
      mm,compaction,cma: add alloc_contig flag to compact_control
      mm,thp,compaction,cma: allow THP migration for CMA allocations

    Vlastimil Babka <vbabka@suse.cz>:
      mm, compaction: fully assume capture is not NULL in compact_zone_order()

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm/compaction: really limit compact_unevictable_allowed to 0 and 1
      mm/compaction: Disable compact_unevictable_allowed on RT

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/compaction.c: clean code by removing unnecessary assignment

Subsystem: mm/mempolicy

    Li Xinhai <lixinhai.lxh@gmail.com>:
      mm/mempolicy: support MPOL_MF_STRICT for huge page mapping
      mm/mempolicy: check hugepage migration is supported by arch in vma_migratable()

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: mempolicy: use VM_BUG_ON_VMA in queue_pages_test_walk()

    Randy Dunlap <rdunlap@infradead.org>:
      mm: mempolicy: require at least one nodeid for MPOL_PREFERRED

    Colin Ian King <colin.king@canonical.com>:
      mm/memblock.c: remove redundant assignment to variable max_addr

Subsystem: mm/hugetlbfs

    Mike Kravetz <mike.kravetz@oracle.com>:
    Patch series "hugetlbfs: use i_mmap_rwsem for more synchronization", v2:
      hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
      hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race

Subsystem: mm/hugetlb

    Mina Almasry <almasrymina@google.com>:
      hugetlb_cgroup: add hugetlb_cgroup reservation counter
      hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations
      mm/hugetlb_cgroup: fix hugetlb_cgroup migration
      hugetlb_cgroup: add reservation accounting for private mappings
      hugetlb: disable region_add file_region coalescing
      hugetlb_cgroup: add accounting for shared mappings
      hugetlb_cgroup: support noreserve mappings
      hugetlb: support file_region coalescing again
      hugetlb_cgroup: add hugetlb_cgroup reservation tests
      hugetlb_cgroup: add hugetlb_cgroup reservation docs

    Mateusz Nosek <mateusznosek0@gmail.com>:
      mm/hugetlb.c: clean code by removing unnecessary initialization

    Vlastimil Babka <vbabka@suse.cz>:
      mm/hugetlb: remove unnecessary memory fetch in PageHeadHuge()

    Christophe Leroy <christophe.leroy@c-s.fr>:
      selftests/vm: fix map_hugetlb length used for testing read and write
      mm/hugetlb: fix build failure with HUGETLB_PAGE but not HUGEBTLBFS

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      include/linux/huge_mm.h: check PageTail in hpage_nr_pages even when !THP

 Documentation/admin-guide/cgroup-v1/hugetlb.rst        |  103 +-
 Documentation/admin-guide/cgroup-v2.rst                |   11 
 Documentation/admin-guide/sysctl/vm.rst                |    3 
 Documentation/core-api/mm-api.rst                      |    3 
 Documentation/core-api/pin_user_pages.rst              |   86 +
 arch/alpha/include/asm/Kbuild                          |   11 
 arch/alpha/mm/fault.c                                  |    6 
 arch/arc/include/asm/Kbuild                            |   21 
 arch/arc/mm/fault.c                                    |   37 
 arch/arm/include/asm/Kbuild                            |   12 
 arch/arm/mm/fault.c                                    |    7 
 arch/arm64/include/asm/Kbuild                          |   18 
 arch/arm64/mm/fault.c                                  |   26 
 arch/c6x/include/asm/Kbuild                            |   37 
 arch/csky/include/asm/Kbuild                           |   36 
 arch/h8300/include/asm/Kbuild                          |   46 
 arch/hexagon/include/asm/Kbuild                        |   33 
 arch/hexagon/mm/vm_fault.c                             |    5 
 arch/ia64/include/asm/Kbuild                           |    7 
 arch/ia64/mm/fault.c                                   |    5 
 arch/m68k/include/asm/Kbuild                           |   24 
 arch/m68k/mm/fault.c                                   |    7 
 arch/microblaze/include/asm/Kbuild                     |   29 
 arch/microblaze/mm/fault.c                             |    5 
 arch/mips/include/asm/Kbuild                           |   13 
 arch/mips/mm/fault.c                                   |    5 
 arch/nds32/include/asm/Kbuild                          |   37 
 arch/nds32/mm/fault.c                                  |    5 
 arch/nios2/include/asm/Kbuild                          |   38 
 arch/nios2/mm/fault.c                                  |    7 
 arch/openrisc/include/asm/Kbuild                       |   36 
 arch/openrisc/mm/fault.c                               |    5 
 arch/parisc/include/asm/Kbuild                         |   18 
 arch/parisc/mm/fault.c                                 |    8 
 arch/powerpc/include/asm/Kbuild                        |    4 
 arch/powerpc/mm/book3s64/pkeys.c                       |   12 
 arch/powerpc/mm/fault.c                                |   20 
 arch/powerpc/platforms/pseries/hotplug-memory.c        |    2 
 arch/riscv/include/asm/Kbuild                          |   28 
 arch/riscv/mm/fault.c                                  |    9 
 arch/s390/include/asm/Kbuild                           |   15 
 arch/s390/mm/fault.c                                   |   10 
 arch/sh/include/asm/Kbuild                             |   16 
 arch/sh/mm/fault.c                                     |   13 
 arch/sparc/include/asm/Kbuild                          |   14 
 arch/sparc/mm/fault_32.c                               |    5 
 arch/sparc/mm/fault_64.c                               |    5 
 arch/um/kernel/trap.c                                  |    3 
 arch/unicore32/include/asm/Kbuild                      |   34 
 arch/unicore32/mm/fault.c                              |    8 
 arch/x86/include/asm/Kbuild                            |    2 
 arch/x86/include/asm/mmu_context.h                     |   15 
 arch/x86/mm/fault.c                                    |   32 
 arch/xtensa/include/asm/Kbuild                         |   26 
 arch/xtensa/mm/fault.c                                 |    5 
 drivers/base/node.c                                    |    2 
 drivers/gpu/drm/ttm/ttm_bo_vm.c                        |   12 
 fs/fs_parser.c                                         |    2 
 fs/hugetlbfs/inode.c                                   |   30 
 fs/ocfs2/alloc.c                                       |    3 
 fs/ocfs2/cluster/heartbeat.c                           |   12 
 fs/ocfs2/cluster/netdebug.c                            |    4 
 fs/ocfs2/cluster/tcp.c                                 |   27 
 fs/ocfs2/cluster/tcp.h                                 |    2 
 fs/ocfs2/dir.c                                         |    4 
 fs/ocfs2/dlm/dlmcommon.h                               |    8 
 fs/ocfs2/dlm/dlmdebug.c                                |  100 -
 fs/ocfs2/dlm/dlmmaster.c                               |    2 
 fs/ocfs2/dlm/dlmthread.c                               |    3 
 fs/ocfs2/dlmglue.c                                     |    2 
 fs/ocfs2/journal.c                                     |    2 
 fs/ocfs2/namei.c                                       |   15 
 fs/ocfs2/ocfs2_fs.h                                    |   18 
 fs/ocfs2/refcounttree.c                                |    2 
 fs/ocfs2/reservations.c                                |    3 
 fs/ocfs2/stackglue.c                                   |    2 
 fs/ocfs2/suballoc.c                                    |    5 
 fs/ocfs2/super.c                                       |   46 
 fs/pipe.c                                              |    2 
 fs/userfaultfd.c                                       |   64 -
 include/asm-generic/Kbuild                             |   52 +
 include/linux/cgroup-defs.h                            |    5 
 include/linux/fs.h                                     |    5 
 include/linux/gfp.h                                    |    6 
 include/linux/huge_mm.h                                |   10 
 include/linux/hugetlb.h                                |   76 +
 include/linux/hugetlb_cgroup.h                         |  175 +++
 include/linux/kasan.h                                  |    2 
 include/linux/kthread.h                                |    3 
 include/linux/memcontrol.h                             |   66 -
 include/linux/mempolicy.h                              |   29 
 include/linux/mm.h                                     |  243 +++-
 include/linux/mm_types.h                               |    7 
 include/linux/mmzone.h                                 |    6 
 include/linux/page_ref.h                               |    9 
 include/linux/pagemap.h                                |   29 
 include/linux/sched/signal.h                           |   18 
 include/linux/swap.h                                   |    1 
 include/linux/topology.h                               |   17 
 include/trace/events/mmap.h                            |   48 
 include/uapi/linux/mman.h                              |    5 
 kernel/cgroup/cgroup.c                                 |   17 
 kernel/fork.c                                          |    9 
 kernel/sysctl.c                                        |   31 
 lib/test_kasan.c                                       |   19 
 mm/Makefile                                            |    1 
 mm/compaction.c                                        |   31 
 mm/debug.c                                             |   54 -
 mm/filemap.c                                           |   77 -
 mm/gup.c                                               |  682 ++++++++++---
 mm/gup_benchmark.c                                     |   71 +
 mm/huge_memory.c                                       |   29 
 mm/hugetlb.c                                           |  866 ++++++++++++-----
 mm/hugetlb_cgroup.c                                    |  347 +++++-
 mm/internal.h                                          |   32 
 mm/kasan/common.c                                      |   26 
 mm/kasan/generic.c                                     |    9 
 mm/kasan/generic_report.c                              |   11 
 mm/kasan/kasan.h                                       |    2 
 mm/kasan/report.c                                      |    5 
 mm/kasan/tags.c                                        |    9 
 mm/kasan/tags_report.c                                 |   11 
 mm/khugepaged.c                                        |    4 
 mm/kmemleak.c                                          |    2 
 mm/list_lru.c                                          |   12 
 mm/mapping_dirty_helpers.c                             |   42 
 mm/memblock.c                                          |    2 
 mm/memcontrol.c                                        |  378 ++++---
 mm/memory-failure.c                                    |   29 
 mm/memory.c                                            |    4 
 mm/mempolicy.c                                         |   73 +
 mm/migrate.c                                           |   25 
 mm/mmap.c                                              |   32 
 mm/mremap.c                                            |   92 +
 mm/page-writeback.c                                    |   19 
 mm/page_alloc.c                                        |   82 -
 mm/page_counter.c                                      |   29 
 mm/page_ext.c                                          |    2 
 mm/rmap.c                                              |   39 
 mm/shuffle.c                                           |    2 
 mm/slab.h                                              |   32 
 mm/slab_common.c                                       |    2 
 mm/slub.c                                              |   27 
 mm/sparse.c                                            |   33 
 mm/swap.c                                              |    5 
 mm/swap_slots.c                                        |   12 
 mm/swap_state.c                                        |    2 
 mm/swapfile.c                                          |   10 
 mm/userfaultfd.c                                       |   11 
 mm/vmpressure.c                                        |    8 
 mm/vmscan.c                                            |  111 --
 mm/vmstat.c                                            |    2 
 scripts/spelling.txt                                   |   21 
 tools/accounting/getdelays.c                           |    2 
 tools/testing/selftests/vm/.gitignore                  |    1 
 tools/testing/selftests/vm/Makefile                    |    2 
 tools/testing/selftests/vm/charge_reserved_hugetlb.sh  |  575 +++++++++++
 tools/testing/selftests/vm/gup_benchmark.c             |   15 
 tools/testing/selftests/vm/hugetlb_reparenting_test.sh |  244 ++++
 tools/testing/selftests/vm/map_hugetlb.c               |   14 
 tools/testing/selftests/vm/mlock2-tests.c              |  233 ----
 tools/testing/selftests/vm/mremap_dontunmap.c          |  313 ++++++
 tools/testing/selftests/vm/run_vmtests                 |   37 
 tools/testing/selftests/vm/write_hugetlb_memory.sh     |   23 
 tools/testing/selftests/vm/write_to_hugetlbfs.c        |  242 ++++
 165 files changed, 5020 insertions(+), 2376 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-03-29  2:14 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-03-29  2:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

5 fixes, based on 83fd69c93340177dcd66fd26ce6441fb581c1dbf:


    Naohiro Aota <naohiro.aota@wdc.com>:
      mm/swapfile.c: move inode_lock out of claim_swapfile

    David Hildenbrand <david@redhat.com>:
      drivers/base/memory.c: indicate all memory blocks as removable

    Mina Almasry <almasrymina@google.com>:
      hugetlb_cgroup: fix illegal access to memory

    Roman Gushchin <guro@fb.com>:
      mm: fork: fix kernel_stack memcg stats for various stack implementations

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
      mm/sparse: fix kernel crash with pfn_section_valid check

 drivers/base/memory.c      |   23 +++--------------------
 include/linux/memcontrol.h |   12 ++++++++++++
 kernel/fork.c              |    4 ++--
 mm/hugetlb_cgroup.c        |    3 +--
 mm/memcontrol.c            |   38 ++++++++++++++++++++++++++++++++++++++
 mm/sparse.c                |    6 ++++++
 mm/swapfile.c              |   41 ++++++++++++++++++++---------------------
 7 files changed, 82 insertions(+), 45 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-03-22  1:19 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-03-22  1:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

10 fixes, based on c63c50fc2ec9afc4de21ef9ead2eac64b178cce1:


    Chunguang Xu <brookxu@tencent.com>:
      memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event

    Baoquan He <bhe@redhat.com>:
      mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case

    Qian Cai <cai@lca.pw>:
      page-flags: fix a crash at SetPageError(THP_SWAP)

    Chris Down <chris@chrisdown.name>:
      mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
      mm, memcg: throttle allocators based on ancestral memory.high

    Michal Hocko <mhocko@suse.com>:
      mm: do not allow MADV_PAGEOUT for CoW pages

    Roman Penyaev <rpenyaev@suse.de>:
      epoll: fix possible lost wakeup on epoll_ctl() path

    Qian Cai <cai@lca.pw>:
      mm/mmu_notifier: silence PROVE_RCU_LIST warnings

    Vlastimil Babka <vbabka@suse.cz>:
      mm, slub: prevent kmalloc_node crashes and memory leaks

    Joerg Roedel <jroedel@suse.de>:
      x86/mm: split vmalloc_sync_all()

 arch/x86/mm/fault.c        |   26 ++++++++++-
 drivers/acpi/apei/ghes.c   |    2 
 fs/eventpoll.c             |    8 +--
 include/linux/page-flags.h |    2 
 include/linux/vmalloc.h    |    5 +-
 kernel/notifier.c          |    2 
 mm/madvise.c               |   12 +++--
 mm/memcontrol.c            |  105 ++++++++++++++++++++++++++++-----------------
 mm/mmu_notifier.c          |   27 +++++++----
 mm/nommu.c                 |   10 +++-
 mm/slub.c                  |   26 +++++++----
 mm/sparse.c                |    8 ++-
 mm/vmalloc.c               |   11 +++-
 13 files changed, 165 insertions(+), 79 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-03-06  6:27 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-03-06  6:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

7 fixes, based on 9f65ed5fe41ce08ed1cb1f6a950f9ec694c142ad:

    Mel Gorman <mgorman@techsingularity.net>:
      mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa

    Huang Ying <ying.huang@intel.com>:
      mm: fix possible PMD dirty bit lost in set_pmd_migration_entry()

    "Kirill A. Shutemov" <kirill@shutemov.name>:
      mm: avoid data corruption on CoW fault into PFN-mapped VMA

    OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>:
      fat: fix uninit-memory access for partial initialized inode

    Sebastian Andrzej Siewior <bigeasy@linutronix.de>:
      mm/z3fold.c: do not include rwlock.h directly

    Vlastimil Babka <vbabka@suse.cz>:
      mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled

    Miroslav Benes <mbenes@suse.cz>:
      arch/Kconfig: update HAVE_RELIABLE_STACKTRACE description

 arch/Kconfig        |    5 +++--
 fs/fat/inode.c      |   19 +++++++------------
 include/linux/mm.h  |    4 ++++
 mm/huge_memory.c    |    3 +--
 mm/memory.c         |   35 +++++++++++++++++++++++++++--------
 mm/memory_hotplug.c |    8 +++++++-
 mm/mprotect.c       |   38 ++++++++++++++++++++++++++++++++++++--
 mm/z3fold.c         |    1 -
 8 files changed, 85 insertions(+), 28 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-02-21 18:32   ` incoming Konstantin Ryabitsev
@ 2020-02-27  9:59     ` Vlastimil Babka
  0 siblings, 0 replies; 485+ messages in thread
From: Vlastimil Babka @ 2020-02-27  9:59 UTC (permalink / raw)
  To: Konstantin Ryabitsev, Linus Torvalds; +Cc: Andrew Morton, Linux-MM, mm-commits

On 2/21/20 7:32 PM, Konstantin Ryabitsev wrote:
> On Fri, Feb 21, 2020 at 10:21:19AM -0800, Linus Torvalds wrote:
>> On Thu, Feb 20, 2020 at 8:00 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>> >
>> > - A few y2038 fixes which missed the merge window whiole dependencies
>> >   in NFS were being sorted out.
>> >
>> > - A bunch of fixes.  Some minor, some not.
>> 
>> Hmm. Konstantin's nice lore script _used_ to pick up your patches, but
>> now they don't.
>> 
>> I'm not sure what changed. It worked with your big series of 118 patches.
>> 
>> It doesn't work with this smaller series of fixes.
>>
>> I think the difference is that you've done something bad to your patch
>> sending. That big series was properly threaded with each of the
>> patches being a reply to the 'incoming' message.
>> 
>> This series is not.
> 
> This is correct -- each patch is posted without an in-reply-to, so 
> public-inbox doesn't group them into a thread.
> 
> E.g.:
> https://lore.kernel.org/linux-mm/20200221040350.84HaG%25akpm@linux-foundation.org/
> 
>> 
>> Please, Andrew, can you make your email flow more consistent so that I
>> can actually use the nice new tool to download a patch series?
> 
> Andrew, I'll be happy to provide you with a helper tool if you can 
> describe me your workflow. E.g. if you have a quilt directory of patches 
> plus a series file, it could easily be a tiny wrapper like:
> 
> send-patches --base-commit 1234abcd --cover cover.txt patchdir/series

Once/if there is such tool, could it perhaps instead of mass e-mailing create
git commits, push them to korg repo and send a pull request?

Thanks,
Vlastimil

> -K
> 



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-02-21 18:21 ` incoming Linus Torvalds
  2020-02-21 18:32   ` incoming Konstantin Ryabitsev
@ 2020-02-21 19:33   ` Linus Torvalds
  1 sibling, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-02-21 19:33 UTC (permalink / raw)
  To: Andrew Morton, Konstantin Ryabitsev; +Cc: Linux-MM, mm-commits

Side note: I've obviously picked it up the old-fashioned way, but I
had been looking forward to seeing if I could just automate this more.

            Linus

On Fri, Feb 21, 2020 at 10:21 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Please, Andrew, can you make your email flow more consistent so that I
> can actually use the nice new tool to download a patch series?
>
>             Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-02-21 18:21 ` incoming Linus Torvalds
@ 2020-02-21 18:32   ` Konstantin Ryabitsev
  2020-02-27  9:59     ` incoming Vlastimil Babka
  2020-02-21 19:33   ` incoming Linus Torvalds
  1 sibling, 1 reply; 485+ messages in thread
From: Konstantin Ryabitsev @ 2020-02-21 18:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, Linux-MM, mm-commits

On Fri, Feb 21, 2020 at 10:21:19AM -0800, Linus Torvalds wrote:
> On Thu, Feb 20, 2020 at 8:00 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > - A few y2038 fixes which missed the merge window whiole dependencies
> >   in NFS were being sorted out.
> >
> > - A bunch of fixes.  Some minor, some not.
> 
> Hmm. Konstantin's nice lore script _used_ to pick up your patches, but
> now they don't.
> 
> I'm not sure what changed. It worked with your big series of 118 patches.
> 
> It doesn't work with this smaller series of fixes.
>
> I think the difference is that you've done something bad to your patch
> sending. That big series was properly threaded with each of the
> patches being a reply to the 'incoming' message.
> 
> This series is not.

This is correct -- each patch is posted without an in-reply-to, so 
public-inbox doesn't group them into a thread.

E.g.:
https://lore.kernel.org/linux-mm/20200221040350.84HaG%25akpm@linux-foundation.org/

> 
> Please, Andrew, can you make your email flow more consistent so that I
> can actually use the nice new tool to download a patch series?

Andrew, I'll be happy to provide you with a helper tool if you can 
describe me your workflow. E.g. if you have a quilt directory of patches 
plus a series file, it could easily be a tiny wrapper like:

send-patches --base-commit 1234abcd --cover cover.txt patchdir/series

-K


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-02-21  4:00 incoming Andrew Morton
  2020-02-21  4:03 ` incoming Andrew Morton
@ 2020-02-21 18:21 ` Linus Torvalds
  2020-02-21 18:32   ` incoming Konstantin Ryabitsev
  2020-02-21 19:33   ` incoming Linus Torvalds
  1 sibling, 2 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-02-21 18:21 UTC (permalink / raw)
  To: Andrew Morton, Konstantin Ryabitsev; +Cc: Linux-MM, mm-commits

On Thu, Feb 20, 2020 at 8:00 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> - A few y2038 fixes which missed the merge window whiole dependencies
>   in NFS were being sorted out.
>
> - A bunch of fixes.  Some minor, some not.

Hmm. Konstantin's nice lore script _used_ to pick up your patches, but
now they don't.

I'm not sure what changed. It worked with your big series of 118 patches.

It doesn't work with this smaller series of fixes.

I think the difference is that you've done something bad to your patch
sending. That big series was properly threaded with each of the
patches being a reply to the 'incoming' message.

This series is not.

Please, Andrew, can you make your email flow more consistent so that I
can actually use the nice new tool to download a patch series?

            Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-02-21  4:00 incoming Andrew Morton
@ 2020-02-21  4:03 ` Andrew Morton
  2020-02-21 18:21 ` incoming Linus Torvalds
  1 sibling, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-02-21  4:03 UTC (permalink / raw)
  To: Linus Torvalds, linux-mm, mm-commits

On Thu, 20 Feb 2020 20:00:30 -0800 Andrew Morton <akpm@linux-foundation.org> wrote:

> - A few y2038 fixes which missed the merge window whiole dependencies
>   in NFS were being sorted out.
> 
> - A bunch of fixes.  Some minor, some not.

15 patches, based on ca7e1fd1026c5af6a533b4b5447e1d2f153e28f2



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-02-21  4:00 Andrew Morton
  2020-02-21  4:03 ` incoming Andrew Morton
  2020-02-21 18:21 ` incoming Linus Torvalds
  0 siblings, 2 replies; 485+ messages in thread
From: Andrew Morton @ 2020-02-21  4:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


- A few y2038 fixes which missed the merge window whiole dependencies
  in NFS were being sorted out.

- A bunch of fixes.  Some minor, some not.


Subsystems affected by this patch series:


    Arnd Bergmann <arnd@arndb.de>:
      y2038: remove ktime to/from timespec/timeval conversion
      y2038: remove unused time32 interfaces
      y2038: hide timeval/timespec/itimerval/itimerspec types

    Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com>:
      Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"

    Christian Borntraeger <borntraeger@de.ibm.com>:
      include/uapi/linux/swab.h: fix userspace breakage, use __BITS_PER_LONG for swap

    SeongJae Park <sjpark@amazon.de>:
      selftests/vm: add missed tests in run_vmtests

    Joe Perches <joe@perches.com>:
      get_maintainer: remove uses of P: for maintainer name

    Douglas Anderson <dianders@chromium.org>:
      scripts/get_maintainer.pl: deprioritize old Fixes: addresses

    Christoph Hellwig <hch@lst.de>:
      mm/swapfile.c: fix a comment in sys_swapon()

    Vasily Averin <vvs@virtuozzo.com>:
      mm/memcontrol.c: lost css_put in memcg_expand_shrinker_maps()

    Alexandru Ardelean <alexandru.ardelean@analog.com>:
      lib/string.c: update match_string() doc-strings with correct behavior

    Gavin Shan <gshan@redhat.com>:
      mm/vmscan.c: don't round up scan size for online memory cgroup

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/sparsemem: pfn_to_page is not valid yet on SPARSEMEM

    Alexander Potapenko <glider@google.com>:
      lib/stackdepot.c: fix global out-of-bounds in stack_slabs

    Randy Dunlap <rdunlap@infradead.org>:
      MAINTAINERS: use tabs for SAFESETID

 MAINTAINERS                            |    8 -
 include/linux/compat.h                 |   29 ------
 include/linux/ktime.h                  |   37 -------
 include/linux/time32.h                 |  154 ---------------------------------
 include/linux/timekeeping32.h          |   32 ------
 include/linux/types.h                  |    5 -
 include/uapi/asm-generic/posix_types.h |    2 
 include/uapi/linux/swab.h              |    4 
 include/uapi/linux/time.h              |   22 ++--
 ipc/sem.c                              |    6 -
 kernel/compat.c                        |   64 -------------
 kernel/time/time.c                     |   43 ---------
 lib/stackdepot.c                       |    8 +
 lib/string.c                           |   16 +++
 mm/memcontrol.c                        |    4 
 mm/sparse.c                            |    2 
 mm/swapfile.c                          |    2 
 mm/vmscan.c                            |    9 +
 scripts/get_maintainer.pl              |   32 ------
 tools/testing/selftests/vm/run_vmtests |   33 +++++++
 20 files changed, 93 insertions(+), 419 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-02-04  2:46   ` incoming Andrew Morton
@ 2020-02-04  3:11     ` Linus Torvalds
  0 siblings, 0 replies; 485+ messages in thread
From: Linus Torvalds @ 2020-02-04  3:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Tue, Feb 4, 2020 at 2:46 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 4 Feb 2020 02:27:48 +0000 Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > What's the base? You've changed your scripts or something, and that
> > information is no longer in your cover letter..
>
> Crap, sorry, geriatric.
>
> d4e9056daedca3891414fe3c91de3449a5dad0f2

Ok, I've tentatively applied it with the MIME decoding fixes I found,
and I'll guess I'll let it build and sit for a while before merging it
into my tree.

I didn't find anything else odd in there. But...

              Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-02-04  2:27 ` incoming Linus Torvalds
@ 2020-02-04  2:46   ` Andrew Morton
  2020-02-04  3:11     ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-02-04  2:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, Linux-MM

On Tue, 4 Feb 2020 02:27:48 +0000 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Feb 4, 2020 at 1:33 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > The rest of MM and the rest of everything else.
> 
> What's the base? You've changed your scripts or something, and that
> information is no longer in your cover letter..
> 

Crap, sorry, geriatric.

d4e9056daedca3891414fe3c91de3449a5dad0f2


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2020-02-04  1:33 incoming Andrew Morton
@ 2020-02-04  2:27 ` Linus Torvalds
  2020-02-04  2:46   ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2020-02-04  2:27 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mm-commits, Linux-MM

On Tue, Feb 4, 2020 at 1:33 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> The rest of MM and the rest of everything else.

What's the base? You've changed your scripts or something, and that
information is no longer in your cover letter..

               Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-02-04  1:33 Andrew Morton
  2020-02-04  2:27 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2020-02-04  1:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


The rest of MM and the rest of everything else.


Subsystems affected by this patch series:

  hotfixes
  mm/pagealloc
  mm/memory-hotplug
  ipc
  misc
  mm/cleanups
  mm/pagemap
  procfs
  lib
  cleanups
  arm

Subsystem: hotfixes

    Gang He <GHe@suse.com>:
      ocfs2: fix oops when writing cloned file

    David Hildenbrand <david@redhat.com>:
    Patch series "mm: fix max_pfn not falling on section boundary", v2:
      mm/page_alloc.c: fix uninitialized memmaps on a partially populated last section
      fs/proc/page.c: allow inspection of last section and fix end detection
      mm/page_alloc.c: initialize memmap of unavailable memory directly

Subsystem: mm/pagealloc

    David Hildenbrand <david@redhat.com>:
      mm/page_alloc: fix and rework pfn handling in memmap_init_zone()
      mm: factor out next_present_section_nr()

Subsystem: mm/memory-hotplug

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
    Patch series "mm/memory_hotplug: Shrink zones before removing memory", v6:
      mm/memmap_init: update variable name in memmap_init_zone

    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug: poison memmap in remove_pfn_range_from_zone()
      mm/memory_hotplug: we always have a zone in find_(smallest|biggest)_section_pfn
      mm/memory_hotplug: don't check for "all holes" in shrink_zone_span()
      mm/memory_hotplug: drop local variables in shrink_zone_span()
      mm/memory_hotplug: cleanup __remove_pages()
      mm/memory_hotplug: drop valid_start/valid_end from test_pages_in_a_zone()

Subsystem: ipc

    Manfred Spraul <manfred@colorfullife.com>:
      smp_mb__{before,after}_atomic(): update Documentation

    Davidlohr Bueso <dave@stgolabs.net>:
      ipc/mqueue.c: remove duplicated code

    Manfred Spraul <manfred@colorfullife.com>:
      ipc/mqueue.c: update/document memory barriers
      ipc/msg.c: update and document memory barriers
      ipc/sem.c: document and update memory barriers

    Lu Shuaibing <shuaibinglu@126.com>:
      ipc/msg.c: consolidate all xxxctl_down() functions
drivers/block/null_blk_main.c: fix layout

Subsystem: misc

    Andrew Morton <akpm@linux-foundation.org>:
      drivers/block/null_blk_main.c: fix layout
      drivers/block/null_blk_main.c: fix uninitialized var warnings

    Randy Dunlap <rdunlap@infradead.org>:
      pinctrl: fix pxa2xx.c build warnings

Subsystem: mm/cleanups

    Florian Westphal <fw@strlen.de>:
      mm: remove __krealloc

Subsystem: mm/pagemap

    Steven Price <steven.price@arm.com>:
    Patch series "Generic page walk and ptdump", v17:
      mm: add generic p?d_leaf() macros
      arc: mm: add p?d_leaf() definitions
      arm: mm: add p?d_leaf() definitions
      arm64: mm: add p?d_leaf() definitions
      mips: mm: add p?d_leaf() definitions
      powerpc: mm: add p?d_leaf() definitions
      riscv: mm: add p?d_leaf() definitions
      s390: mm: add p?d_leaf() definitions
      sparc: mm: add p?d_leaf() definitions
      x86: mm: add p?d_leaf() definitions
      mm: pagewalk: add p4d_entry() and pgd_entry()
      mm: pagewalk: allow walking without vma
      mm: pagewalk: don't lock PTEs for walk_page_range_novma()
      mm: pagewalk: fix termination condition in walk_pte_range()
      mm: pagewalk: add 'depth' parameter to pte_hole
      x86: mm: point to struct seq_file from struct pg_state
      x86: mm+efi: convert ptdump_walk_pgd_level() to take a mm_struct
      x86: mm: convert ptdump_walk_pgd_level_debugfs() to take an mm_struct
      mm: add generic ptdump
      x86: mm: convert dump_pagetables to use walk_page_range
      arm64: mm: convert mm/dump.c to use walk_page_range()
      arm64: mm: display non-present entries in ptdump
      mm: ptdump: reduce level numbers by 1 in note_page()
      x86: mm: avoid allocating struct mm_struct on the stack

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
    Patch series "Fixup page directory freeing", v4:
      powerpc/mmu_gather: enable RCU_TABLE_FREE even for !SMP case

    Peter Zijlstra <peterz@infradead.org>:
      mm/mmu_gather: invalidate TLB correctly on batch allocation failure and flush
      asm-generic/tlb: avoid potential double flush
      asm-gemeric/tlb: remove stray function declarations
      asm-generic/tlb: add missing CONFIG symbol
      asm-generic/tlb: rename HAVE_RCU_TABLE_FREE
      asm-generic/tlb: rename HAVE_MMU_GATHER_PAGE_SIZE
      asm-generic/tlb: rename HAVE_MMU_GATHER_NO_GATHER
      asm-generic/tlb: provide MMU_GATHER_TABLE_FREE

Subsystem: procfs

    Alexey Dobriyan <adobriyan@gmail.com>:
      proc: decouple proc from VFS with "struct proc_ops"
      proc: convert everything to "struct proc_ops"

Subsystem: lib

    Yury Norov <yury.norov@gmail.com>:
    Patch series "lib: rework bitmap_parse", v5:
      lib/string: add strnchrnul()
      bitops: more BITS_TO_* macros
      lib: add test for bitmap_parse()
      lib: make bitmap_parse_user a wrapper on bitmap_parse
      lib: rework bitmap_parse()
      lib: new testcases for bitmap_parse{_user}
      include/linux/cpumask.h: don't calculate length of the input string

Subsystem: cleanups

    Masahiro Yamada <masahiroy@kernel.org>:
      treewide: remove redundant IS_ERR() before error code check

Subsystem: arm

    Chen-Yu Tsai <wens@csie.org>:
      ARM: dma-api: fix max_pfn off-by-one error in __dma_supported()

 Documentation/memory-barriers.txt                     |   14 
 arch/Kconfig                                          |   17 
 arch/alpha/kernel/srm_env.c                           |   17 
 arch/arc/include/asm/pgtable.h                        |    1 
 arch/arm/Kconfig                                      |    2 
 arch/arm/include/asm/pgtable-2level.h                 |    1 
 arch/arm/include/asm/pgtable-3level.h                 |    1 
 arch/arm/include/asm/tlb.h                            |    6 
 arch/arm/kernel/atags_proc.c                          |    8 
 arch/arm/mm/alignment.c                               |   14 
 arch/arm/mm/dma-mapping.c                             |    2 
 arch/arm64/Kconfig                                    |    3 
 arch/arm64/Kconfig.debug                              |   19 
 arch/arm64/include/asm/pgtable.h                      |    2 
 arch/arm64/include/asm/ptdump.h                       |    8 
 arch/arm64/mm/Makefile                                |    4 
 arch/arm64/mm/dump.c                                  |  152 ++----
 arch/arm64/mm/mmu.c                                   |    4 
 arch/arm64/mm/ptdump_debugfs.c                        |    2 
 arch/ia64/kernel/salinfo.c                            |   24 -
 arch/m68k/kernel/bootinfo_proc.c                      |    8 
 arch/mips/include/asm/pgtable.h                       |    5 
 arch/mips/lasat/picvue_proc.c                         |   31 -
 arch/powerpc/Kconfig                                  |    7 
 arch/powerpc/include/asm/book3s/32/pgalloc.h          |    8 
 arch/powerpc/include/asm/book3s/64/pgalloc.h          |    2 
 arch/powerpc/include/asm/book3s/64/pgtable.h          |    3 
 arch/powerpc/include/asm/nohash/pgalloc.h             |    8 
 arch/powerpc/include/asm/tlb.h                        |   11 
 arch/powerpc/kernel/proc_powerpc.c                    |   10 
 arch/powerpc/kernel/rtas-proc.c                       |   70 +--
 arch/powerpc/kernel/rtas_flash.c                      |   34 -
 arch/powerpc/kernel/rtasd.c                           |   14 
 arch/powerpc/mm/book3s64/pgtable.c                    |    7 
 arch/powerpc/mm/numa.c                                |   12 
 arch/powerpc/platforms/pseries/lpar.c                 |   24 -
 arch/powerpc/platforms/pseries/lparcfg.c              |   14 
 arch/powerpc/platforms/pseries/reconfig.c             |    8 
 arch/powerpc/platforms/pseries/scanlog.c              |   15 
 arch/riscv/include/asm/pgtable-64.h                   |    7 
 arch/riscv/include/asm/pgtable.h                      |    7 
 arch/s390/Kconfig                                     |    4 
 arch/s390/include/asm/pgtable.h                       |    2 
 arch/sh/mm/alignment.c                                |   17 
 arch/sparc/Kconfig                                    |    3 
 arch/sparc/include/asm/pgtable_64.h                   |    2 
 arch/sparc/include/asm/tlb_64.h                       |   11 
 arch/sparc/kernel/led.c                               |   15 
 arch/um/drivers/mconsole_kern.c                       |    9 
 arch/um/kernel/exitcode.c                             |   15 
 arch/um/kernel/process.c                              |   15 
 arch/x86/Kconfig                                      |    3 
 arch/x86/Kconfig.debug                                |   20 
 arch/x86/include/asm/pgtable.h                        |   10 
 arch/x86/include/asm/tlb.h                            |    4 
 arch/x86/kernel/cpu/mtrr/if.c                         |   21 
 arch/x86/mm/Makefile                                  |    4 
 arch/x86/mm/debug_pagetables.c                        |   18 
 arch/x86/mm/dump_pagetables.c                         |  418 +++++-------------
 arch/x86/platform/efi/efi_32.c                        |    2 
 arch/x86/platform/efi/efi_64.c                        |    4 
 arch/x86/platform/uv/tlb_uv.c                         |   14 
 arch/xtensa/platforms/iss/simdisk.c                   |   10 
 crypto/af_alg.c                                       |    2 
 drivers/acpi/battery.c                                |   15 
 drivers/acpi/proc.c                                   |   15 
 drivers/acpi/scan.c                                   |    2 
 drivers/base/memory.c                                 |    9 
 drivers/block/null_blk_main.c                         |   58 +-
 drivers/char/hw_random/bcm2835-rng.c                  |    2 
 drivers/char/hw_random/omap-rng.c                     |    4 
 drivers/clk/clk.c                                     |    2 
 drivers/dma/mv_xor_v2.c                               |    2 
 drivers/firmware/efi/arm-runtime.c                    |    2 
 drivers/gpio/gpiolib-devres.c                         |    2 
 drivers/gpio/gpiolib-of.c                             |    8 
 drivers/gpio/gpiolib.c                                |    2 
 drivers/hwmon/dell-smm-hwmon.c                        |   15 
 drivers/i2c/busses/i2c-mv64xxx.c                      |    5 
 drivers/i2c/busses/i2c-synquacer.c                    |    2 
 drivers/ide/ide-proc.c                                |   19 
 drivers/input/input.c                                 |   28 -
 drivers/isdn/capi/kcapi_proc.c                        |    6 
 drivers/macintosh/via-pmu.c                           |   17 
 drivers/md/md.c                                       |   15 
 drivers/misc/sgi-gru/gruprocfs.c                      |   42 -
 drivers/mtd/ubi/build.c                               |    2 
 drivers/net/wireless/cisco/airo.c                     |  126 ++---
 drivers/net/wireless/intel/ipw2x00/libipw_module.c    |   15 
 drivers/net/wireless/intersil/hostap/hostap_hw.c      |    4 
 drivers/net/wireless/intersil/hostap/hostap_proc.c    |   14 
 drivers/net/wireless/intersil/hostap/hostap_wlan.h    |    2 
 drivers/net/wireless/ray_cs.c                         |   20 
 drivers/of/device.c                                   |    2 
 drivers/parisc/led.c                                  |   17 
 drivers/pci/controller/pci-tegra.c                    |    2 
 drivers/pci/proc.c                                    |   25 -
 drivers/phy/phy-core.c                                |    4 
 drivers/pinctrl/pxa/pinctrl-pxa2xx.c                  |    1 
 drivers/platform/x86/thinkpad_acpi.c                  |   15 
 drivers/platform/x86/toshiba_acpi.c                   |   60 +-
 drivers/pnp/isapnp/proc.c                             |    9 
 drivers/pnp/pnpbios/proc.c                            |   17 
 drivers/s390/block/dasd_proc.c                        |   15 
 drivers/s390/cio/blacklist.c                          |   14 
 drivers/s390/cio/css.c                                |   11 
 drivers/scsi/esas2r/esas2r_main.c                     |    9 
 drivers/scsi/scsi_devinfo.c                           |   15 
 drivers/scsi/scsi_proc.c                              |   29 -
 drivers/scsi/sg.c                                     |   30 -
 drivers/spi/spi-orion.c                               |    3 
 drivers/staging/rtl8192u/ieee80211/ieee80211_module.c |   14 
 drivers/tty/sysrq.c                                   |    8 
 drivers/usb/gadget/function/rndis.c                   |   17 
 drivers/video/fbdev/imxfb.c                           |    2 
 drivers/video/fbdev/via/viafbdev.c                    |  105 ++--
 drivers/zorro/proc.c                                  |    9 
 fs/cifs/cifs_debug.c                                  |  108 ++--
 fs/cifs/dfs_cache.c                                   |   13 
 fs/cifs/dfs_cache.h                                   |    2 
 fs/ext4/super.c                                       |    2 
 fs/f2fs/node.c                                        |    2 
 fs/fscache/internal.h                                 |    2 
 fs/fscache/object-list.c                              |   11 
 fs/fscache/proc.c                                     |    2 
 fs/jbd2/journal.c                                     |   13 
 fs/jfs/jfs_debug.c                                    |   14 
 fs/lockd/procfs.c                                     |   12 
 fs/nfsd/nfsctl.c                                      |   13 
 fs/nfsd/stats.c                                       |   12 
 fs/ocfs2/file.c                                       |   14 
 fs/ocfs2/suballoc.c                                   |    2 
 fs/proc/cpuinfo.c                                     |   12 
 fs/proc/generic.c                                     |   38 -
 fs/proc/inode.c                                       |   76 +--
 fs/proc/internal.h                                    |    5 
 fs/proc/kcore.c                                       |   13 
 fs/proc/kmsg.c                                        |   14 
 fs/proc/page.c                                        |   54 +-
 fs/proc/proc_net.c                                    |   32 -
 fs/proc/proc_sysctl.c                                 |    2 
 fs/proc/root.c                                        |    2 
 fs/proc/stat.c                                        |   12 
 fs/proc/task_mmu.c                                    |    4 
 fs/proc/vmcore.c                                      |   10 
 fs/sysfs/group.c                                      |    2 
 include/asm-generic/pgtable.h                         |   20 
 include/asm-generic/tlb.h                             |  138 +++--
 include/linux/bitmap.h                                |    8 
 include/linux/bitops.h                                |    4 
 include/linux/cpumask.h                               |    4 
 include/linux/memory_hotplug.h                        |    4 
 include/linux/mm.h                                    |    6 
 include/linux/mmzone.h                                |   10 
 include/linux/pagewalk.h                              |   49 +-
 include/linux/proc_fs.h                               |   23 
 include/linux/ptdump.h                                |   24 -
 include/linux/seq_file.h                              |   13 
 include/linux/slab.h                                  |    1 
 include/linux/string.h                                |    1 
 include/linux/sunrpc/stats.h                          |    4 
 ipc/mqueue.c                                          |  123 ++++-
 ipc/msg.c                                             |   62 +-
 ipc/sem.c                                             |   66 +-
 ipc/util.c                                            |   14 
 kernel/configs.c                                      |    9 
 kernel/irq/proc.c                                     |   42 -
 kernel/kallsyms.c                                     |   12 
 kernel/latencytop.c                                   |   14 
 kernel/locking/lockdep_proc.c                         |   15 
 kernel/module.c                                       |   12 
 kernel/profile.c                                      |   24 -
 kernel/sched/psi.c                                    |   48 +-
 lib/bitmap.c                                          |  195 ++++----
 lib/string.c                                          |   17 
 lib/test_bitmap.c                                     |  105 ++++
 mm/Kconfig.debug                                      |   21 
 mm/Makefile                                           |    1 
 mm/gup.c                                              |    2 
 mm/hmm.c                                              |   66 +-
 mm/memory_hotplug.c                                   |  104 +---
 mm/memremap.c                                         |    2 
 mm/migrate.c                                          |    5 
 mm/mincore.c                                          |    1 
 mm/mmu_gather.c                                       |  158 ++++--
 mm/page_alloc.c                                       |   75 +--
 mm/pagewalk.c                                         |  167 +++++--
 mm/ptdump.c                                           |  159 ++++++
 mm/slab_common.c                                      |   37 -
 mm/sparse.c                                           |   10 
 mm/swapfile.c                                         |   14 
 net/atm/mpoa_proc.c                                   |   17 
 net/atm/proc.c                                        |    8 
 net/core/dev.c                                        |    2 
 net/core/filter.c                                     |    2 
 net/core/pktgen.c                                     |   44 -
 net/ipv4/ipconfig.c                                   |   10 
 net/ipv4/netfilter/ipt_CLUSTERIP.c                    |   16 
 net/ipv4/route.c                                      |   24 -
 net/netfilter/xt_recent.c                             |   17 
 net/sunrpc/auth_gss/svcauth_gss.c                     |   10 
 net/sunrpc/cache.c                                    |   45 -
 net/sunrpc/stats.c                                    |   21 
 net/xfrm/xfrm_policy.c                                |    2 
 samples/kfifo/bytestream-example.c                    |   11 
 samples/kfifo/inttype-example.c                       |   11 
 samples/kfifo/record-example.c                        |   11 
 scripts/coccinelle/free/devm_free.cocci               |    4 
 sound/core/info.c                                     |   34 -
 sound/soc/codecs/ak4104.c                             |    3 
 sound/soc/codecs/cs4270.c                             |    3 
 sound/soc/codecs/tlv320aic32x4.c                      |    6 
 sound/soc/sunxi/sun4i-spdif.c                         |    2 
 tools/include/linux/bitops.h                          |    9 
 214 files changed, 2589 insertions(+), 2227 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-01-31  6:10 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-01-31  6:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


Most of -mm and quite a number of other subsystems.

MM is fairly quiet this time.  Holidays, I assume.



119 patches, based on 39bed42de2e7d74686a2d5a45638d6a5d7e7d473:

Subsystems affected by this patch series:

  hotfixes
  scripts
  ocfs2
  mm/slub
  mm/kmemleak
  mm/debug
  mm/pagecache
  mm/gup
  mm/swap
  mm/memcg
  mm/pagemap
  mm/tracing
  mm/kasan
  mm/initialization
  mm/pagealloc
  mm/vmscan
  mm/tools
  mm/memblock
  mm/oom-kill
  mm/hugetlb
  mm/migration
  mm/mmap
  mm/memory-hotplug
  mm/zswap
  mm/cleanups
  mm/zram
  misc
  lib
  binfmt
  init
  reiserfs
  exec
  dma-mapping
  kcov

Subsystem: hotfixes

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      lib/test_bitmap: correct test data offsets for 32-bit

    "Theodore Ts'o" <tytso@mit.edu>:
      memcg: fix a crash in wb_workfn when a device disappears

    Dan Carpenter <dan.carpenter@oracle.com>:
      mm/mempolicy.c: fix out of bounds write in mpol_parse_str()

    Pingfan Liu <kernelfans@gmail.com>:
      mm/sparse.c: reset section's mem_map when fully deactivated

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/migrate.c: also overwrite error when it is bigger than zero

    Dan Williams <dan.j.williams@intel.com>:
      mm/memory_hotplug: fix remove_memory() lockdep splat

    Wei Yang <richardw.yang@linux.intel.com>:
      mm: thp: don't need care deferred split queue in memcg charge move path

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: move_pages: report the number of non-attempted pages

Subsystem: scripts

    Xiong <xndchn@gmail.com>:
      scripts/spelling.txt: add more spellings to spelling.txt

    Luca Ceresoli <luca@lucaceresoli.net>:
      scripts/spelling.txt: add "issus" typo

Subsystem: ocfs2

    Aditya Pakki <pakki001@umn.edu>:
      fs: ocfs: remove unnecessary assertion in dlm_migrate_lockres

    zhengbin <zhengbin13@huawei.com>:
      ocfs2: remove unneeded semicolons

    Masahiro Yamada <masahiroy@kernel.org>:
      ocfs2: make local header paths relative to C files

    Colin Ian King <colin.king@canonical.com>:
      ocfs2/dlm: remove redundant assignment to ret

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      ocfs2/dlm: move BITS_TO_BYTES() to bitops.h for wider use

    wangyan <wangyan122@huawei.com>:
      ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans()
      ocfs2: use ocfs2_update_inode_fsync_trans() to access t_tid in handle->h_transaction

Subsystem: mm/slub

    Yu Zhao <yuzhao@google.com>:
      mm/slub.c: avoid slub allocation while holding list_lock

Subsystem: mm/kmemleak

    He Zhe <zhe.he@windriver.com>:
      mm/kmemleak: turn kmemleak_lock and object->lock to raw_spinlock_t

Subsystem: mm/debug

    Vlastimil Babka <vbabka@suse.cz>:
      mm/debug.c: always print flags in dump_page()

Subsystem: mm/pagecache

    Ira Weiny <ira.weiny@intel.com>:
      mm/filemap.c: clean up filemap_write_and_wait()

Subsystem: mm/gup

    Qiujun Huang <hqjagain@gmail.com>:
      mm: fix gup_pud_range

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/gup.c: use is_vm_hugetlb_page() to check whether to follow huge

    John Hubbard <jhubbard@nvidia.com>:
    Patch series "mm/gup: prereqs to track dma-pinned pages: FOLL_PIN", v12:
      mm/gup: factor out duplicate code from four routines
      mm/gup: move try_get_compound_head() to top, fix minor issues

    Dan Williams <dan.j.williams@intel.com>:
      mm: Cleanup __put_devmap_managed_page() vs ->page_free()

    John Hubbard <jhubbard@nvidia.com>:
      mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages
      goldish_pipe: rename local pin_user_pages() routine
      mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM
      vfio: fix FOLL_LONGTERM use, simplify get_user_pages_remote() call
      mm/gup: allow FOLL_FORCE for get_user_pages_fast()
      IB/umem: use get_user_pages_fast() to pin DMA pages
      media/v4l2-core: set pages dirty upon releasing DMA buffers
      mm/gup: introduce pin_user_pages*() and FOLL_PIN
      goldish_pipe: convert to pin_user_pages() and put_user_page()
      IB/{core,hw,umem}: set FOLL_PIN via pin_user_pages*(), fix up ODP
      mm/process_vm_access: set FOLL_PIN via pin_user_pages_remote()
      drm/via: set FOLL_PIN via pin_user_pages_fast()
      fs/io_uring: set FOLL_PIN via pin_user_pages()
      net/xdp: set FOLL_PIN via pin_user_pages()
      media/v4l2-core: pin_user_pages (FOLL_PIN) and put_user_page() conversion
      vfio, mm: pin_user_pages (FOLL_PIN) and put_user_page() conversion
      powerpc: book3s64: convert to pin_user_pages() and put_user_page()
      mm/gup_benchmark: use proper FOLL_WRITE flags instead of hard-coding "1"
      mm, tree-wide: rename put_user_page*() to unpin_user_page*()

Subsystem: mm/swap

    Vasily Averin <vvs@virtuozzo.com>:
      mm/swapfile.c: swap_next should increase position index

Subsystem: mm/memcg

    Kaitao Cheng <pilgrimtao@gmail.com>:
      mm/memcontrol.c: cleanup some useless code

Subsystem: mm/pagemap

    Li Xinhai <lixinhai.lxh@gmail.com>:
      mm/page_vma_mapped.c: explicitly compare pfn for normal, hugetlbfs and THP page

Subsystem: mm/tracing

    Junyong Sun <sunjy516@gmail.com>:
      mm, tracing: print symbol name for kmem_alloc_node call_site events

Subsystem: mm/kasan

    "Gustavo A. R. Silva" <gustavo@embeddedor.com>:
      lib/test_kasan.c: fix memory leak in kmalloc_oob_krealloc_more()

Subsystem: mm/initialization

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      mm/early_ioremap.c: use %pa to print resource_size_t variables

Subsystem: mm/pagealloc

    "Kirill A. Shutemov" <kirill@shutemov.name>:
      mm/page_alloc: skip non present sections on zone initialization

    David Hildenbrand <david@redhat.com>:
      mm: remove the memory isolate notifier
      mm: remove "count" parameter from has_unmovable_pages()

Subsystem: mm/vmscan

    Liu Song <liu.song11@zte.com.cn>:
      mm/vmscan.c: remove unused return value of shrink_node

    Alex Shi <alex.shi@linux.alibaba.com>:
      mm/vmscan: remove prefetch_prev_lru_page
      mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE

Subsystem: mm/tools

    Daniel Wagner <dwagner@suse.de>:
      tools/vm/slabinfo: fix sanity checks enabling

Subsystem: mm/memblock

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/memblock: define memblock_physmem_add()
      memblock: Use __func__ in remaining memblock_dbg() call sites

Subsystem: mm/oom-kill

    David Rientjes <rientjes@google.com>:
      mm, oom: dump stack of victim when reaping failed

Subsystem: mm/hugetlb

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/huge_memory.c: use head to check huge zero page
      mm/huge_memory.c: use head to emphasize the purpose of page
      mm/huge_memory.c: reduce critical section protected by split_queue_lock

Subsystem: mm/migration

    Ralph Campbell <rcampbell@nvidia.com>:
      mm/migrate: remove useless mask of start address
      mm/migrate: clean up some minor coding style
      mm/migrate: add stable check in migrate_vma_insert_page()

    David Rientjes <rientjes@google.com>:
      mm, thp: fix defrag setting if newline is not used

Subsystem: mm/mmap

    Miaohe Lin <linmiaohe@huawei.com>:
      mm/mmap.c: get rid of odd jump labels in find_mergeable_anon_vma()

Subsystem: mm/memory-hotplug

    David Hildenbrand <david@redhat.com>:
    Patch series "mm/memory_hotplug: pass in nid to online_pages()":
      mm/memory_hotplug: pass in nid to online_pages()

    Qian Cai <cai@lca.pw>:
      mm/hotplug: silence a lockdep splat with printk()
      mm/page_isolation: fix potential warning from user

Subsystem: mm/zswap

    Vitaly Wool <vitaly.wool@konsulko.com>:
      mm/zswap.c: add allocation hysteresis if pool limit is hit

    Dan Carpenter <dan.carpenter@oracle.com>:
      zswap: potential NULL dereference on error in init_zswap()

Subsystem: mm/cleanups

    Yu Zhao <yuzhao@google.com>:
      include/linux/mm.h: clean up obsolete check on space in page->flags

    Wei Yang <richardw.yang@linux.intel.com>:
      include/linux/mm.h: remove dead code totalram_pages_set()

    Anshuman Khandual <anshuman.khandual@arm.com>:
      include/linux/memory.h: drop fields 'hw' and 'phys_callback' from struct memory_block

    Hao Lee <haolee.swjtu@gmail.com>:
      mm: fix comments related to node reclaim

Subsystem: mm/zram

    Taejoon Song <taejoon.song@lge.com>:
      zram: try to avoid worst-case scenario on same element pages

    Colin Ian King <colin.king@canonical.com>:
      drivers/block/zram/zram_drv.c: fix error return codes not being returned in writeback_store

Subsystem: misc

    Akinobu Mita <akinobu.mita@gmail.com>:
    Patch series "add header file for kelvin to/from Celsius conversion:
      include/linux/units.h: add helpers for kelvin to/from Celsius conversion
      ACPI: thermal: switch to use <linux/units.h> helpers
      platform/x86: asus-wmi: switch to use <linux/units.h> helpers
      platform/x86: intel_menlow: switch to use <linux/units.h> helpers
      thermal: int340x: switch to use <linux/units.h> helpers
      thermal: intel_pch: switch to use <linux/units.h> helpers
      nvme: hwmon: switch to use <linux/units.h> helpers
      thermal: remove kelvin to/from Celsius conversion helpers from <linux/thermal.h>
      iwlegacy: use <linux/units.h> helpers
      iwlwifi: use <linux/units.h> helpers
      thermal: armada: remove unused TO_MCELSIUS macro
      iio: adc: qcom-vadc-common: use <linux/units.h> helpers

Subsystem: lib

    Mikhail Zaslonko <zaslonko@linux.ibm.com>:
    Patch series "S390 hardware support for kernel zlib", v3:
      lib/zlib: add s390 hardware support for kernel zlib_deflate
      s390/boot: rename HEAP_SIZE due to name collision
      lib/zlib: add s390 hardware support for kernel zlib_inflate
      s390/boot: add dfltcc= kernel command line parameter
      lib/zlib: add zlib_deflate_dfltcc_enabled() function
      btrfs: use larger zlib buffer for s390 hardware compression

    Nathan Chancellor <natechancellor@gmail.com>:
      lib/scatterlist.c: adjust indentation in __sg_alloc_table

    Yury Norov <yury.norov@gmail.com>:
      uapi: rename ext2_swab() to swab() and share globally in swab.h
      lib/find_bit.c: join _find_next_bit{_le}
      lib/find_bit.c: uninline helper _find_next_bit()

Subsystem: binfmt

    Alexey Dobriyan <adobriyan@gmail.com>:
      fs/binfmt_elf.c: smaller code generation around auxv vector fill
      fs/binfmt_elf.c: fix ->start_code calculation
      fs/binfmt_elf.c: don't copy ELF header around
      fs/binfmt_elf.c: better codegen around current->mm
      fs/binfmt_elf.c: make BAD_ADDR() unlikely
      fs/binfmt_elf.c: coredump: allocate core ELF header on stack
      fs/binfmt_elf.c: coredump: delete duplicated overflow check
      fs/binfmt_elf.c: coredump: allow process with empty address space to coredump

Subsystem: init

    Arvind Sankar <nivedita@alum.mit.edu>:
      init/main.c: log arguments and environment passed to init
      init/main.c: remove unnecessary repair_env_string in do_initcall_level
    Patch series "init/main.c: minor cleanup/bugfix of envvar handling", v2:
      init/main.c: fix quoted value handling in unknown_bootoption

    Christophe Leroy <christophe.leroy@c-s.fr>:
      init/main.c: fix misleading "This architecture does not have kernel memory protection" message

Subsystem: reiserfs

    Yunfeng Ye <yeyunfeng@huawei.com>:
      reiserfs: prevent NULL pointer dereference in reiserfs_insert_item()

Subsystem: exec

    Alexey Dobriyan <adobriyan@gmail.com>:
      execve: warn if process starts with executable stack

Subsystem: dma-mapping

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      include/linux/io-mapping.h-mapping: use PHYS_PFN() macro in io_mapping_map_atomic_wc()

Subsystem: kcov

    Dmitry Vyukov <dvyukov@google.com>:
      kcov: ignore fault-inject and stacktrace

 Documentation/admin-guide/kernel-parameters.txt              |   12 
 Documentation/core-api/index.rst                             |    1 
 Documentation/core-api/pin_user_pages.rst                    |  234 +++++
 Documentation/vm/zswap.rst                                   |   13 
 arch/powerpc/mm/book3s64/iommu_api.c                         |   14 
 arch/s390/boot/compressed/decompressor.c                     |    8 
 arch/s390/boot/ipl_parm.c                                    |   14 
 arch/s390/include/asm/setup.h                                |    7 
 arch/s390/kernel/setup.c                                     |   14 
 drivers/acpi/thermal.c                                       |   34 
 drivers/base/memory.c                                        |   25 
 drivers/block/zram/zram_drv.c                                |   10 
 drivers/gpu/drm/via/via_dmablit.c                            |    6 
 drivers/iio/adc/qcom-vadc-common.c                           |    6 
 drivers/iio/adc/qcom-vadc-common.h                           |    1 
 drivers/infiniband/core/umem.c                               |   21 
 drivers/infiniband/core/umem_odp.c                           |   13 
 drivers/infiniband/hw/hfi1/user_pages.c                      |    4 
 drivers/infiniband/hw/mthca/mthca_memfree.c                  |    8 
 drivers/infiniband/hw/qib/qib_user_pages.c                   |    4 
 drivers/infiniband/hw/qib/qib_user_sdma.c                    |    8 
 drivers/infiniband/hw/usnic/usnic_uiom.c                     |    4 
 drivers/infiniband/sw/siw/siw_mem.c                          |    4 
 drivers/media/v4l2-core/videobuf-dma-sg.c                    |   20 
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_init.h             |    1 
 drivers/net/wireless/intel/iwlegacy/4965-mac.c               |    3 
 drivers/net/wireless/intel/iwlegacy/4965.c                   |   17 
 drivers/net/wireless/intel/iwlegacy/common.h                 |    3 
 drivers/net/wireless/intel/iwlwifi/dvm/dev.h                 |    5 
 drivers/net/wireless/intel/iwlwifi/dvm/devices.c             |    6 
 drivers/nvdimm/pmem.c                                        |    6 
 drivers/nvme/host/hwmon.c                                    |   13 
 drivers/platform/goldfish/goldfish_pipe.c                    |   39 
 drivers/platform/x86/asus-wmi.c                              |    7 
 drivers/platform/x86/intel_menlow.c                          |    9 
 drivers/thermal/armada_thermal.c                             |    2 
 drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c |    7 
 drivers/thermal/intel/intel_pch_thermal.c                    |    3 
 drivers/vfio/vfio_iommu_type1.c                              |   39 
 fs/binfmt_elf.c                                              |  154 +--
 fs/btrfs/compression.c                                       |    2 
 fs/btrfs/zlib.c                                              |  135 ++
 fs/exec.c                                                    |    5 
 fs/fs-writeback.c                                            |    2 
 fs/io_uring.c                                                |    6 
 fs/ocfs2/cluster/quorum.c                                    |    2 
 fs/ocfs2/dlm/Makefile                                        |    2 
 fs/ocfs2/dlm/dlmast.c                                        |    8 
 fs/ocfs2/dlm/dlmcommon.h                                     |    4 
 fs/ocfs2/dlm/dlmconvert.c                                    |    8 
 fs/ocfs2/dlm/dlmdebug.c                                      |    8 
 fs/ocfs2/dlm/dlmdomain.c                                     |    8 
 fs/ocfs2/dlm/dlmlock.c                                       |    8 
 fs/ocfs2/dlm/dlmmaster.c                                     |   10 
 fs/ocfs2/dlm/dlmrecovery.c                                   |   10 
 fs/ocfs2/dlm/dlmthread.c                                     |    8 
 fs/ocfs2/dlm/dlmunlock.c                                     |    8 
 fs/ocfs2/dlmfs/Makefile                                      |    2 
 fs/ocfs2/dlmfs/dlmfs.c                                       |    4 
 fs/ocfs2/dlmfs/userdlm.c                                     |    6 
 fs/ocfs2/dlmglue.c                                           |    2 
 fs/ocfs2/journal.h                                           |    8 
 fs/ocfs2/namei.c                                             |    3 
 fs/reiserfs/stree.c                                          |    3 
 include/linux/backing-dev.h                                  |   10 
 include/linux/bitops.h                                       |    1 
 include/linux/fs.h                                           |    6 
 include/linux/io-mapping.h                                   |    5 
 include/linux/memblock.h                                     |    7 
 include/linux/memory.h                                       |   29 
 include/linux/memory_hotplug.h                               |    3 
 include/linux/mm.h                                           |  116 +-
 include/linux/mmzone.h                                       |    2 
 include/linux/page-isolation.h                               |    8 
 include/linux/swab.h                                         |    1 
 include/linux/thermal.h                                      |   11 
 include/linux/units.h                                        |   84 +
 include/linux/zlib.h                                         |    6 
 include/trace/events/kmem.h                                  |    4 
 include/trace/events/writeback.h                             |   37 
 include/uapi/linux/swab.h                                    |   10 
 include/uapi/linux/sysctl.h                                  |    2 
 init/main.c                                                  |   36 
 kernel/Makefile                                              |    1 
 lib/Kconfig                                                  |    7 
 lib/Makefile                                                 |    2 
 lib/decompress_inflate.c                                     |   13 
 lib/find_bit.c                                               |   82 -
 lib/scatterlist.c                                            |    2 
 lib/test_bitmap.c                                            |    9 
 lib/test_kasan.c                                             |    1 
 lib/zlib_deflate/deflate.c                                   |   85 +
 lib/zlib_deflate/deflate_syms.c                              |    1 
 lib/zlib_deflate/deftree.c                                   |   54 -
 lib/zlib_deflate/defutil.h                                   |  134 ++
 lib/zlib_dfltcc/Makefile                                     |   13 
 lib/zlib_dfltcc/dfltcc.c                                     |   57 +
 lib/zlib_dfltcc/dfltcc.h                                     |  155 +++
 lib/zlib_dfltcc/dfltcc_deflate.c                             |  280 ++++++
 lib/zlib_dfltcc/dfltcc_inflate.c                             |  149 +++
 lib/zlib_dfltcc/dfltcc_syms.c                                |   17 
 lib/zlib_dfltcc/dfltcc_util.h                                |  123 ++
 lib/zlib_inflate/inflate.c                                   |   32 
 lib/zlib_inflate/inflate.h                                   |    8 
 lib/zlib_inflate/infutil.h                                   |   18 
 mm/Makefile                                                  |    1 
 mm/backing-dev.c                                             |    1 
 mm/debug.c                                                   |   18 
 mm/early_ioremap.c                                           |    8 
 mm/filemap.c                                                 |   34 
 mm/gup.c                                                     |  503 ++++++-----
 mm/gup_benchmark.c                                           |    9 
 mm/huge_memory.c                                             |   44 
 mm/kmemleak.c                                                |  112 +-
 mm/memblock.c                                                |   22 
 mm/memcontrol.c                                              |   25 
 mm/memory_hotplug.c                                          |   24 
 mm/mempolicy.c                                               |    6 
 mm/memremap.c                                                |   95 --
 mm/migrate.c                                                 |   77 +
 mm/mmap.c                                                    |   30 
 mm/oom_kill.c                                                |    2 
 mm/page_alloc.c                                              |   83 +
 mm/page_isolation.c                                          |   69 -
 mm/page_vma_mapped.c                                         |   12 
 mm/process_vm_access.c                                       |   32 
 mm/slub.c                                                    |   88 +
 mm/sparse.c                                                  |    2 
 mm/swap.c                                                    |   27 
 mm/swapfile.c                                                |    2 
 mm/vmscan.c                                                  |   24 
 mm/zswap.c                                                   |   88 +
 net/xdp/xdp_umem.c                                           |    4 
 scripts/spelling.txt                                         |   14 
 tools/testing/selftests/vm/gup_benchmark.c                   |    6 
 tools/vm/slabinfo.c                                          |    4 
 136 files changed, 2790 insertions(+), 1358 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-01-14  0:28 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-01-14  0:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits


11 MM fixes, based on b3a987b0264d3ddbb24293ebff10eddfc472f653:


    Vlastimil Babka <vbabka@suse.cz>:
      mm, thp: tweak reclaim/compaction effort of local-only and all-node allocations

    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug: don't free usage map when removing a re-added early section

    "Kirill A. Shutemov" <kirill@shutemov.name>:
    Patch series "Fix two above-47bit hint address vs. THP bugs":
      mm/huge_memory.c: thp: fix conflict of above-47bit hint address and PMD alignment
      mm/shmem.c: thp, shmem: fix conflict of above-47bit hint address and PMD alignment

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: fix percpu slab vmstats flushing

    Vlastimil Babka <vbabka@suse.cz>:
      mm, debug_pagealloc: don't rely on static keys too early

    Wen Yang <wenyang@linux.alibaba.com>:
    Patch series "use div64_ul() instead of div_u64() if the divisor is:
      mm/page-writeback.c: avoid potential division by zero in wb_min_max_ratio()
      mm/page-writeback.c: use div64_ul() for u64-by-unsigned-long divide
      mm/page-writeback.c: improve arithmetic divisions

    Adrian Huang <ahuang12@lenovo.com>:
      mm: memcg/slab: call flush_memcg_workqueue() only if memcg workqueue is valid

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: khugepaged: add trace status description for SCAN_PAGE_HAS_PRIVATE

 include/linux/mm.h                 |   18 +++++++++-
 include/linux/mmzone.h             |    5 +--
 include/trace/events/huge_memory.h |    3 +
 init/main.c                        |    1 
 mm/huge_memory.c                   |   38 ++++++++++++++---------
 mm/memcontrol.c                    |   37 +++++-----------------
 mm/mempolicy.c                     |   10 ++++--
 mm/page-writeback.c                |   10 +++---
 mm/page_alloc.c                    |   61 ++++++++++---------------------------
 mm/shmem.c                         |    7 ++--
 mm/slab.c                          |    4 +-
 mm/slab_common.c                   |    3 +
 mm/slub.c                          |    2 -
 mm/sparse.c                        |    9 ++++-
 mm/vmalloc.c                       |    4 +-
 15 files changed, 102 insertions(+), 110 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2020-01-04 20:55 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2020-01-04 20:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


17 fixes, base on 5613970af3f5f8372c596b138bd64f3918513515:


    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug: shrink zones when offlining memory

    Chanho Min <chanho.min@lge.com>:
      mm/zsmalloc.c: fix the migrated zspage statistics.

    Andrey Konovalov <andreyknvl@google.com>:
      kcov: fix struct layout for kcov_remote_arg

    Shakeel Butt <shakeelb@google.com>:
      memcg: account security cred as well to kmemcg

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: move_pages: return valid node id in status if the page is already on the target node

    Eric Biggers <ebiggers@google.com>:
      fs/direct-io.c: include fs/internal.h for missing prototype
      fs/nsfs.c: include headers for missing declarations
      fs/namespace.c: make to_mnt_ns() static

    Nick Desaulniers <ndesaulniers@google.com>:
      hexagon: parenthesize registers in asm predicates
      hexagon: work around compiler crash

    Randy Dunlap <rdunlap@infradead.org>:
      fs/posix_acl.c: fix kernel-doc warnings

    Ilya Dryomov <idryomov@gmail.com>:
      mm/oom: fix pgtables units mismatch in Killed process message

    Navid Emamdoost <navid.emamdoost@gmail.com>:
      mm/gup: fix memory leak in __gup_benchmark_ioctl

    Waiman Long <longman@redhat.com>:
      mm/hugetlb: defer freeing of huge pages if in non-task context

    Kai Li <li.kai4@h3c.com>:
      ocfs2: call journal flush to mark journal as empty after journal recovery when mount

    Gang He <GHe@suse.com>:
      ocfs2: fix the crash due to call ocfs2_get_dlm_debug once less

    Nick Desaulniers <ndesaulniers@google.com>:
      hexagon: define ioremap_uc

 Documentation/dev-tools/kcov.rst    |   10 +++----
 arch/arm64/mm/mmu.c                 |    4 --
 arch/hexagon/include/asm/atomic.h   |    8 ++---
 arch/hexagon/include/asm/bitops.h   |    8 ++---
 arch/hexagon/include/asm/cmpxchg.h  |    2 -
 arch/hexagon/include/asm/futex.h    |    6 ++--
 arch/hexagon/include/asm/io.h       |    1 
 arch/hexagon/include/asm/spinlock.h |   20 +++++++-------
 arch/hexagon/kernel/stacktrace.c    |    4 --
 arch/hexagon/kernel/vm_entry.S      |    2 -
 arch/ia64/mm/init.c                 |    4 --
 arch/powerpc/mm/mem.c               |    3 --
 arch/s390/mm/init.c                 |    4 --
 arch/sh/mm/init.c                   |    4 --
 arch/x86/mm/init_32.c               |    4 --
 arch/x86/mm/init_64.c               |    4 --
 fs/direct-io.c                      |    2 +
 fs/namespace.c                      |    2 -
 fs/nsfs.c                           |    3 ++
 fs/ocfs2/dlmglue.c                  |    1 
 fs/ocfs2/journal.c                  |    8 +++++
 fs/posix_acl.c                      |    7 +++-
 include/linux/memory_hotplug.h      |    7 +++-
 include/uapi/linux/kcov.h           |   10 +++----
 kernel/cred.c                       |    6 ++--
 mm/gup_benchmark.c                  |    8 ++++-
 mm/hugetlb.c                        |   51 +++++++++++++++++++++++++++++++++++-
 mm/memory_hotplug.c                 |   31 +++++++++++----------
 mm/memremap.c                       |    2 -
 mm/migrate.c                        |   23 ++++++++++++----
 mm/oom_kill.c                       |    2 -
 mm/zsmalloc.c                       |    5 +++
 32 files changed, 166 insertions(+), 90 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-12-18  4:50 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-12-18  4:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-mm, mm-commits

6 fixes based on 2187f215ebaac73ddbd814696d7c7fa34f0c3de0:

    Andrey Ryabinin <aryabinin@virtuozzo.com>:
      kasan: fix crashes on access to memory mapped by vm_map_ram()

    Daniel Axtens <dja@axtens.net>:
      mm/memory.c: add apply_to_existing_page_range() helper
      kasan: use apply_to_existing_page_range() for releasing vmalloc shadow
      kasan: don't assume percpu shadow allocations will succeed

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: vmscan: protect shrinker idr replace with CONFIG_MEMCG

    Changbin Du <changbin.du@gmail.com>:
      lib/Kconfig.debug: fix some messed up configurations

 include/linux/kasan.h |   15 +++--
 include/linux/mm.h    |    3 +
 lib/Kconfig.debug     |  100 ++++++++++++++++++------------------
 mm/kasan/common.c     |   36 ++++++++-----
 mm/memory.c           |  136 ++++++++++++++++++++++++++++++++++----------------
 mm/vmalloc.c          |  133 ++++++++++++++++++++++++++++--------------------
 mm/vmscan.c           |    2 
 7 files changed, 260 insertions(+), 165 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-12-05  0:48 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-12-05  0:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


Most of the rest of MM and various other things.  Some Kconfig rework
still awaits merges of dependent trees from linux-next.


86 patches, based on 63de37476ebd1e9bab6a9e17186dc5aa1da9ea99.

Subsystems affected by this patch series:

  mm/hotfixes
  mm/memcg
  mm/vmstat
  mm/thp
  procfs
  sysctl
  misc
  notifiers
  core-kernel
  bitops
  lib
  checkpatch
  epoll
  binfmt
  init
  rapidio
  uaccess
  kcov
  ubsan
  ipc
  bitmap
  mm/pagemap

Subsystem: mm/hotfixes

    zhong jiang <zhongjiang@huawei.com>:
      mm/kasan/common.c: fix compile error

Subsystem: mm/memcg

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: wait for !root kmem_cache refcnt killing on root kmem_cache destruction

Subsystem: mm/vmstat

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      mm/vmstat: add helpers to get vmstat item names for each enum type
      mm/memcontrol: use vmstat names for printing statistics

Subsystem: mm/thp

    Yu Zhao <yuzhao@google.com>:
      mm/memory.c: replace is_zero_pfn with is_huge_zero_pmd for thp

Subsystem: procfs

    Alexey Dobriyan <adobriyan@gmail.com>:
      proc: change ->nlink under proc_subdir_lock
      fs/proc/generic.c: delete useless "len" variable
      fs/proc/internal.h: shuffle "struct pde_opener"

    Miaohe Lin <linmiaohe@huawei.com>:
      include/linux/proc_fs.h: fix confusing macro arg name

    Krzysztof Kozlowski <krzk@kernel.org>:
      fs/proc/Kconfig: fix indentation

Subsystem: sysctl

    Alessio Balsini <balsini@android.com>:
      include/linux/sysctl.h: inline braces for ctl_table and ctl_table_header

Subsystem: misc

    Stephen Boyd <swboyd@chromium.org>:
      .gitattributes: use 'dts' diff driver for dts files

    Rikard Falkeborn <rikard.falkeborn@gmail.com>:
      linux/build_bug.h: change type to int

    Masahiro Yamada <yamada.masahiro@socionext.com>:
      linux/scc.h: make uapi linux/scc.h self-contained

    Krzysztof Kozlowski <krzk@kernel.org>:
      arch/Kconfig: fix indentation

    Joe Perches <joe@perches.com>:
      scripts/get_maintainer.pl: add signatures from Fixes: <badcommit> lines in commit message

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
      kernel.h: update comment about simple_strto<foo>() functions
      auxdisplay: charlcd: deduplicate simple_strtoul()

Subsystem: notifiers

    Xiaoming Ni <nixiaoming@huawei.com>:
      kernel/notifier.c: intercept duplicate registrations to avoid infinite loops
      kernel/notifier.c: remove notifier_chain_cond_register()
      kernel/notifier.c: remove blocking_notifier_chain_cond_register()

Subsystem: core-kernel

    Nathan Chancellor <natechancellor@gmail.com>:
      kernel/profile.c: use cpumask_available to check for NULL cpumask

    Joe Perches <joe@perches.com>:
      kernel/sys.c: avoid copying possible padding bytes in copy_to_user

Subsystem: bitops

    William Breathitt Gray <vilhelm.gray@gmail.com>:
      bitops: introduce the for_each_set_clump8 macro
      lib/test_bitmap.c: add for_each_set_clump8 test cases
      gpio: 104-dio-48e: utilize for_each_set_clump8 macro
      gpio: 104-idi-48: utilize for_each_set_clump8 macro
      gpio: gpio-mm: utilize for_each_set_clump8 macro
      gpio: ws16c48: utilize for_each_set_clump8 macro
      gpio: pci-idio-16: utilize for_each_set_clump8 macro
      gpio: pcie-idio-24: utilize for_each_set_clump8 macro
      gpio: uniphier: utilize for_each_set_clump8 macro
      gpio: 74x164: utilize the for_each_set_clump8 macro
      thermal: intel: intel_soc_dts_iosf: Utilize for_each_set_clump8 macro
      gpio: pisosr: utilize the for_each_set_clump8 macro
      gpio: max3191x: utilize the for_each_set_clump8 macro
      gpio: pca953x: utilize the for_each_set_clump8 macro

Subsystem: lib

    Wei Yang <richardw.yang@linux.intel.com>:
      lib/rbtree: set successor's parent unconditionally
      lib/rbtree: get successor's color directly

    Laura Abbott <labbott@redhat.com>:
      lib/test_meminit.c: add bulk alloc/free tests

    Trent Piepho <tpiepho@gmail.com>:
      lib/math/rational.c: fix possible incorrect result from rational fractions helper

    Huang Shijie <sjhuang@iluvatar.ai>:
      lib/genalloc.c: export symbol addr_in_gen_pool
      lib/genalloc.c: rename addr_in_gen_pool to gen_pool_has_addr

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: improve ignoring CamelCase SI style variants like mA
      checkpatch: reduce is_maintained_obsolete lookup runtime

Subsystem: epoll

    Jason Baron <jbaron@akamai.com>:
      epoll: simplify ep_poll_safewake() for CONFIG_DEBUG_LOCK_ALLOC

    Heiher <r@hev.cc>:
      fs/epoll: remove unnecessary wakeups of nested epoll
      selftests: add epoll selftests

Subsystem: binfmt

    Alexey Dobriyan <adobriyan@gmail.com>:
      fs/binfmt_elf.c: delete unused "interp_map_addr" argument
      fs/binfmt_elf.c: extract elf_read() function

Subsystem: init

    Krzysztof Kozlowski <krzk@kernel.org>:
      init/Kconfig: fix indentation

Subsystem: rapidio

    "Ben Dooks (Codethink)" <ben.dooks@codethink.co.uk>:
      drivers/rapidio/rio-driver.c: fix missing include of  <linux/rio_drv.h>
      drivers/rapidio/rio-access.c: fix missing include of <linux/rio_drv.h>

Subsystem: uaccess

    Daniel Vetter <daniel.vetter@ffwll.ch>:
      drm: limit to INT_MAX in create_blob ioctl

    Kees Cook <keescook@chromium.org>:
      uaccess: disallow > INT_MAX copy sizes

Subsystem: kcov

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series " kcov: collect coverage from usb and vhost", v3:
      kcov: remote coverage support
      usb, kcov: collect coverage from hub_event
      vhost, kcov: collect coverage from vhost_worker

Subsystem: ubsan

    Julien Grall <julien.grall@arm.com>:
      lib/ubsan: don't serialize UBSAN report

Subsystem: ipc

    Masahiro Yamada <yamada.masahiro@socionext.com>:
      arch: ipcbuf.h: make uapi asm/ipcbuf.h self-contained
      arch: msgbuf.h: make uapi asm/msgbuf.h self-contained
      arch: sembuf.h: make uapi asm/sembuf.h self-contained

Subsystem: bitmap

    Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
    Patch series "gpio: pca953x: Convert to bitmap (extended) API", v2:
      lib/test_bitmap: force argument of bitmap_parselist_user() to proper address space
      lib/test_bitmap: undefine macros after use
      lib/test_bitmap: name EXP_BYTES properly
      lib/test_bitmap: rename exp to exp1 to avoid ambiguous name
      lib/test_bitmap: move exp1 and exp2 upper for others to use
      lib/test_bitmap: fix comment about this file
      lib/bitmap: introduce bitmap_replace() helper
      gpio: pca953x: remove redundant variable and check in IRQ handler
      gpio: pca953x: use input from regs structure in pca953x_irq_pending()
      gpio: pca953x: convert to use bitmap API
      gpio: pca953x: tighten up indentation

Subsystem: mm/pagemap

    Mike Rapoport <rppt@linux.ibm.com>:
    Patch series "mm: remove __ARCH_HAS_4LEVEL_HACK", v13:
      alpha: use pgtable-nopud instead of 4level-fixup
      arm: nommu: use pgtable-nopud instead of 4level-fixup
      c6x: use pgtable-nopud instead of 4level-fixup
      m68k: nommu: use pgtable-nopud instead of 4level-fixup
      m68k: mm: use pgtable-nopXd instead of 4level-fixup
      microblaze: use pgtable-nopmd instead of 4level-fixup
      nds32: use pgtable-nopmd instead of 4level-fixup
      parisc: use pgtable-nopXd instead of 4level-fixup

    Helge Deller <deller@gmx.de>:
      parisc/hugetlb: use pgtable-nopXd instead of 4level-fixup

    Mike Rapoport <rppt@linux.ibm.com>:
      sparc32: use pgtable-nopud instead of 4level-fixup
      um: remove unused pxx_offset_proc() and addr_pte() functions
      um: add support for folded p4d page tables
      mm: remove __ARCH_HAS_4LEVEL_HACK and include/asm-generic/4level-fixup.h

 .gitattributes                                                |    2 
 Documentation/core-api/genalloc.rst                           |    2 
 Documentation/dev-tools/kcov.rst                              |  129 
 arch/Kconfig                                                  |   22 
 arch/alpha/include/asm/mmzone.h                               |    1 
 arch/alpha/include/asm/pgalloc.h                              |    4 
 arch/alpha/include/asm/pgtable.h                              |   24 
 arch/alpha/mm/init.c                                          |   12 
 arch/arm/include/asm/pgtable.h                                |    2 
 arch/arm/mm/dma-mapping.c                                     |    2 
 arch/c6x/include/asm/pgtable.h                                |    2 
 arch/m68k/include/asm/mcf_pgalloc.h                           |    7 
 arch/m68k/include/asm/mcf_pgtable.h                           |   28 
 arch/m68k/include/asm/mmu_context.h                           |   12 
 arch/m68k/include/asm/motorola_pgalloc.h                      |    4 
 arch/m68k/include/asm/motorola_pgtable.h                      |   32 
 arch/m68k/include/asm/page.h                                  |    9 
 arch/m68k/include/asm/pgtable_mm.h                            |   11 
 arch/m68k/include/asm/pgtable_no.h                            |    2 
 arch/m68k/include/asm/sun3_pgalloc.h                          |    5 
 arch/m68k/include/asm/sun3_pgtable.h                          |   18 
 arch/m68k/kernel/sys_m68k.c                                   |   10 
 arch/m68k/mm/init.c                                           |    6 
 arch/m68k/mm/kmap.c                                           |   39 
 arch/m68k/mm/mcfmmu.c                                         |   16 
 arch/m68k/mm/motorola.c                                       |   17 
 arch/m68k/sun3x/dvma.c                                        |    7 
 arch/microblaze/include/asm/page.h                            |    3 
 arch/microblaze/include/asm/pgalloc.h                         |   16 
 arch/microblaze/include/asm/pgtable.h                         |   32 
 arch/microblaze/kernel/signal.c                               |   10 
 arch/microblaze/mm/init.c                                     |    7 
 arch/microblaze/mm/pgtable.c                                  |   13 
 arch/mips/include/uapi/asm/msgbuf.h                           |    1 
 arch/mips/include/uapi/asm/sembuf.h                           |    2 
 arch/nds32/include/asm/page.h                                 |    3 
 arch/nds32/include/asm/pgalloc.h                              |    3 
 arch/nds32/include/asm/pgtable.h                              |   12 
 arch/nds32/include/asm/tlb.h                                  |    1 
 arch/nds32/kernel/pm.c                                        |    4 
 arch/nds32/mm/fault.c                                         |   16 
 arch/nds32/mm/init.c                                          |   11 
 arch/nds32/mm/mm-nds32.c                                      |    6 
 arch/nds32/mm/proc.c                                          |   26 
 arch/parisc/include/asm/page.h                                |   30 
 arch/parisc/include/asm/pgalloc.h                             |   41 
 arch/parisc/include/asm/pgtable.h                             |   52 
 arch/parisc/include/asm/tlb.h                                 |    2 
 arch/parisc/include/uapi/asm/msgbuf.h                         |    1 
 arch/parisc/include/uapi/asm/sembuf.h                         |    1 
 arch/parisc/kernel/cache.c                                    |   13 
 arch/parisc/kernel/pci-dma.c                                  |    9 
 arch/parisc/mm/fixmap.c                                       |   10 
 arch/parisc/mm/hugetlbpage.c                                  |   18 
 arch/powerpc/include/uapi/asm/msgbuf.h                        |    2 
 arch/powerpc/include/uapi/asm/sembuf.h                        |    2 
 arch/s390/include/uapi/asm/ipcbuf.h                           |    2 
 arch/sparc/include/asm/pgalloc_32.h                           |    6 
 arch/sparc/include/asm/pgtable_32.h                           |   28 
 arch/sparc/include/uapi/asm/ipcbuf.h                          |    2 
 arch/sparc/include/uapi/asm/msgbuf.h                          |    2 
 arch/sparc/include/uapi/asm/sembuf.h                          |    2 
 arch/sparc/mm/fault_32.c                                      |   11 
 arch/sparc/mm/highmem.c                                       |    6 
 arch/sparc/mm/io-unit.c                                       |    6 
 arch/sparc/mm/iommu.c                                         |    6 
 arch/sparc/mm/srmmu.c                                         |   51 
 arch/um/include/asm/pgtable-2level.h                          |    1 
 arch/um/include/asm/pgtable-3level.h                          |    1 
 arch/um/include/asm/pgtable.h                                 |    3 
 arch/um/kernel/mem.c                                          |    8 
 arch/um/kernel/skas/mmu.c                                     |   12 
 arch/um/kernel/skas/uaccess.c                                 |    7 
 arch/um/kernel/tlb.c                                          |   85 
 arch/um/kernel/trap.c                                         |    4 
 arch/x86/include/uapi/asm/msgbuf.h                            |    3 
 arch/x86/include/uapi/asm/sembuf.h                            |    2 
 arch/xtensa/include/uapi/asm/ipcbuf.h                         |    2 
 arch/xtensa/include/uapi/asm/msgbuf.h                         |    2 
 arch/xtensa/include/uapi/asm/sembuf.h                         |    1 
 drivers/auxdisplay/charlcd.c                                  |   34 
 drivers/base/node.c                                           |    9 
 drivers/gpio/gpio-104-dio-48e.c                               |   75 
 drivers/gpio/gpio-104-idi-48.c                                |   36 
 drivers/gpio/gpio-74x164.c                                    |   19 
 drivers/gpio/gpio-gpio-mm.c                                   |   75 
 drivers/gpio/gpio-max3191x.c                                  |   19 
 drivers/gpio/gpio-pca953x.c                                   |  209 
 drivers/gpio/gpio-pci-idio-16.c                               |   75 
 drivers/gpio/gpio-pcie-idio-24.c                              |  111 
 drivers/gpio/gpio-pisosr.c                                    |   12 
 drivers/gpio/gpio-uniphier.c                                  |   13 
 drivers/gpio/gpio-ws16c48.c                                   |   73 
 drivers/gpu/drm/drm_property.c                                |    2 
 drivers/misc/sram-exec.c                                      |    2 
 drivers/rapidio/rio-access.c                                  |    2 
 drivers/rapidio/rio-driver.c                                  |    1 
 drivers/thermal/intel/intel_soc_dts_iosf.c                    |   31 
 drivers/thermal/intel/intel_soc_dts_iosf.h                    |    2 
 drivers/usb/core/hub.c                                        |    5 
 drivers/vhost/vhost.c                                         |    6 
 drivers/vhost/vhost.h                                         |    1 
 fs/binfmt_elf.c                                               |   56 
 fs/eventpoll.c                                                |   52 
 fs/proc/Kconfig                                               |    8 
 fs/proc/generic.c                                             |   37 
 fs/proc/internal.h                                            |    2 
 include/asm-generic/4level-fixup.h                            |   39 
 include/asm-generic/bitops/find.h                             |   17 
 include/linux/bitmap.h                                        |   51 
 include/linux/bitops.h                                        |   12 
 include/linux/build_bug.h                                     |    4 
 include/linux/genalloc.h                                      |    2 
 include/linux/kcov.h                                          |   23 
 include/linux/kernel.h                                        |   19 
 include/linux/mm.h                                            |   10 
 include/linux/notifier.h                                      |    4 
 include/linux/proc_fs.h                                       |    4 
 include/linux/rbtree_augmented.h                              |    6 
 include/linux/sched.h                                         |    8 
 include/linux/sysctl.h                                        |    6 
 include/linux/thread_info.h                                   |    2 
 include/linux/vmstat.h                                        |   54 
 include/uapi/asm-generic/ipcbuf.h                             |    2 
 include/uapi/asm-generic/msgbuf.h                             |    2 
 include/uapi/asm-generic/sembuf.h                             |    1 
 include/uapi/linux/kcov.h                                     |   28 
 include/uapi/linux/scc.h                                      |    1 
 init/Kconfig                                                  |   78 
 kernel/dma/remap.c                                            |    2 
 kernel/kcov.c                                                 |  547 +
 kernel/notifier.c                                             |   45 
 kernel/profile.c                                              |    6 
 kernel/sys.c                                                  |    4 
 lib/bitmap.c                                                  |   12 
 lib/find_bit.c                                                |   14 
 lib/genalloc.c                                                |    7 
 lib/math/rational.c                                           |   63 
 lib/test_bitmap.c                                             |  206 
 lib/test_meminit.c                                            |   20 
 lib/ubsan.c                                                   |   64 
 mm/kasan/common.c                                             |    1 
 mm/memcontrol.c                                               |   52 
 mm/memory.c                                                   |   10 
 mm/slab_common.c                                              |   12 
 mm/vmstat.c                                                   |   60 
 net/sunrpc/rpc_pipe.c                                         |    2 
 scripts/checkpatch.pl                                         |   13 
 scripts/get_maintainer.pl                                     |   38 
 tools/testing/selftests/Makefile                              |    1 
 tools/testing/selftests/filesystems/epoll/.gitignore          |    1 
 tools/testing/selftests/filesystems/epoll/Makefile            |    7 
 tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c | 3074 ++++++++++
 usr/include/Makefile                                          |    4 
 154 files changed, 5270 insertions(+), 1360 deletions(-)





^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-12-01 21:07 ` incoming Linus Torvalds
@ 2019-12-02  8:21   ` Steven Price
  0 siblings, 0 replies; 485+ messages in thread
From: Steven Price @ 2019-12-02  8:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, mm-commits, Linux-MM

On Sun, Dec 01, 2019 at 09:07:47PM +0000, Linus Torvalds wrote:
> On Sat, Nov 30, 2019 at 5:47 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> >     Steven Price <steven.price@arm.com>:
> >     Patch series "Generic page walk and ptdump", v15:
> >       mm: add generic p?d_leaf() macros
> >       arc: mm: add p?d_leaf() definitions
> >       arm: mm: add p?d_leaf() definitions
> >       arm64: mm: add p?d_leaf() definitions
> >       mips: mm: add p?d_leaf() definitions
> >       powerpc: mm: add p?d_leaf() definitions
> >       riscv: mm: add p?d_leaf() definitions
> >       s390: mm: add p?d_leaf() definitions
> >       sparc: mm: add p?d_leaf() definitions
> >       x86: mm: add p?d_leaf() definitions
> >       mm: pagewalk: add p4d_entry() and pgd_entry()
> >       mm: pagewalk: allow walking without vma
> >       mm: pagewalk: add test_p?d callbacks
> >       mm: pagewalk: add 'depth' parameter to pte_hole
> >       x86: mm: point to struct seq_file from struct pg_state
> >       x86: mm+efi: convert ptdump_walk_pgd_level() to take a mm_struct
> >       x86: mm: convert ptdump_walk_pgd_level_debugfs() to take an mm_struct
> >       x86: mm: convert ptdump_walk_pgd_level_core() to take an mm_struct
> >       mm: add generic ptdump
> >       x86: mm: convert dump_pagetables to use walk_page_range
> >       arm64: mm: convert mm/dump.c to use walk_page_range()
> >       arm64: mm: display non-present entries in ptdump
> >       mm: ptdump: reduce level numbers by 1 in note_page()
> 
> I've dropped these, and since they clearly weren't ready I don't want
> to see them re-sent for 5.5.

Sorry about this, I'll try to track down the cause of this and hopefully
resubmit for 5.6.

Thanks,

Steve

> If somebody figures out the bug, trying again for 5.6 sounds fine.
> 
>                Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-12-01  1:47 incoming Andrew Morton
  2019-12-01  5:17 ` incoming James Bottomley
@ 2019-12-01 21:07 ` Linus Torvalds
  2019-12-02  8:21   ` incoming Steven Price
  1 sibling, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2019-12-01 21:07 UTC (permalink / raw)
  To: Andrew Morton, Steven Price; +Cc: mm-commits, Linux-MM

On Sat, Nov 30, 2019 at 5:47 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
>     Steven Price <steven.price@arm.com>:
>     Patch series "Generic page walk and ptdump", v15:
>       mm: add generic p?d_leaf() macros
>       arc: mm: add p?d_leaf() definitions
>       arm: mm: add p?d_leaf() definitions
>       arm64: mm: add p?d_leaf() definitions
>       mips: mm: add p?d_leaf() definitions
>       powerpc: mm: add p?d_leaf() definitions
>       riscv: mm: add p?d_leaf() definitions
>       s390: mm: add p?d_leaf() definitions
>       sparc: mm: add p?d_leaf() definitions
>       x86: mm: add p?d_leaf() definitions
>       mm: pagewalk: add p4d_entry() and pgd_entry()
>       mm: pagewalk: allow walking without vma
>       mm: pagewalk: add test_p?d callbacks
>       mm: pagewalk: add 'depth' parameter to pte_hole
>       x86: mm: point to struct seq_file from struct pg_state
>       x86: mm+efi: convert ptdump_walk_pgd_level() to take a mm_struct
>       x86: mm: convert ptdump_walk_pgd_level_debugfs() to take an mm_struct
>       x86: mm: convert ptdump_walk_pgd_level_core() to take an mm_struct
>       mm: add generic ptdump
>       x86: mm: convert dump_pagetables to use walk_page_range
>       arm64: mm: convert mm/dump.c to use walk_page_range()
>       arm64: mm: display non-present entries in ptdump
>       mm: ptdump: reduce level numbers by 1 in note_page()

I've dropped these, and since they clearly weren't ready I don't want
to see them re-sent for 5.5.

If somebody figures out the bug, trying again for 5.6 sounds fine.

               Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-12-01  1:47 incoming Andrew Morton
@ 2019-12-01  5:17 ` James Bottomley
  2019-12-01 21:07 ` incoming Linus Torvalds
  1 sibling, 0 replies; 485+ messages in thread
From: James Bottomley @ 2019-12-01  5:17 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds; +Cc: mm-commits, linux-mm

On Sat, 2019-11-30 at 17:47 -0800, Andrew Morton wrote:
> - a small number of updates to scripts/, ocfs2 and fs/buffer.c
> 
> - most of MM.  I still have quite a lot of material (mostly not MM)
>   staged after linux-next due to -next dependencies.  I'll send thos
>   across next week as the preprequisites get merged up.
> 
> 158 patches, based on 32ef9553635ab1236c33951a8bd9b5af1c3b1646.

Hey, Andrew, would it be at all possible for you to thread these
patches under something like this incoming message?  The selfish reason
I'm asking is so I can mark the thread as read instead of having to do
it individually for 158 messages ... my thumb would thank you for this.

Regards,

James



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-12-01  1:47 Andrew Morton
  2019-12-01  5:17 ` incoming James Bottomley
  2019-12-01 21:07 ` incoming Linus Torvalds
  0 siblings, 2 replies; 485+ messages in thread
From: Andrew Morton @ 2019-12-01  1:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- a small number of updates to scripts/, ocfs2 and fs/buffer.c

- most of MM.  I still have quite a lot of material (mostly not MM)
  staged after linux-next due to -next dependencies.  I'll send thos
  across next week as the preprequisites get merged up.

158 patches, based on 32ef9553635ab1236c33951a8bd9b5af1c3b1646.

Subsystems affected by this patch series:

  scripts
  ocfs2
  vfs
  mm/slab
  mm/slub
  mm/pagecache
  mm/gup
  mm/swap
  mm/memcg
  mm/pagemap
  mm/memfd
  mm/memory-failure
  mm/memory-hotplug
  mm/sparsemem
  mm/vmalloc
  mm/kasan
  mm/pagealloc
  mm/vmscan
  mm/proc
  mm/z3fold
  mm/mempolicy
  mm/memblock
  mm/hugetlbfs
  mm/hugetlb
  mm/migration
  mm/thp
  mm/cma
  mm/autonuma
  mm/page-poison
  mm/mmap
  mm/madvise
  mm/userfaultfd
  mm/shmem
  mm/cleanups
  mm/support

Subsystem: scripts

    Colin Ian King <colin.king@canonical.com>:
      scripts/spelling.txt: add more spellings to spelling.txt

Subsystem: ocfs2

    Ding Xiang <dingxiang@cmss.chinamobile.com>:
      ocfs2: fix passing zero to 'PTR_ERR' warning

Subsystem: vfs

    Saurav Girepunje <saurav.girepunje@gmail.com>:
      fs/buffer.c: fix use true/false for bool type

    Ben Dooks <ben.dooks@codethink.co.uk>:
      fs/buffer.c: include internal.h for missing declarations

Subsystem: mm/slab

    Pengfei Li <lpf.vector@gmail.com>:
    Patch series "mm, slab: Make kmalloc_info[] contain all types of names", v6:
      mm, slab: make kmalloc_info[] contain all types of names
      mm, slab: remove unused kmalloc_size()
      mm, slab_common: use enum kmalloc_cache_type to iterate over kmalloc caches

Subsystem: mm/slub

    Miles Chen <miles.chen@mediatek.com>:
      mm: slub: print the offset of fault addresses

    Yu Zhao <yuzhao@google.com>:
      mm/slub.c: update comments
      mm/slub.c: clean up validate_slab()

Subsystem: mm/pagecache

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      mm/filemap.c: remove redundant cache invalidation after async direct-io write
      fs/direct-io.c: keep dio_warn_stale_pagecache() when CONFIG_BLOCK=n
      mm/filemap.c: warn if stale pagecache is left after direct write

Subsystem: mm/gup

    zhong jiang <zhongjiang@huawei.com>:
      mm/gup.c: allow CMA migration to propagate errors back to caller

    Liu Xiang <liuxiang_1999@126.com>:
      mm/gup.c: fix comments of __get_user_pages() and get_user_pages_remote()

Subsystem: mm/swap

    Naohiro Aota <naohiro.aota@wdc.com>:
      mm, swap: disallow swapon() on zoned block devices

    Fengguang Wu <fengguang.wu@intel.com>:
      mm/swap.c: trivial mark_page_accessed() cleanup

Subsystem: mm/memcg

    Yafang Shao <laoar.shao@gmail.com>:
      mm, memcg: clean up reclaim iter array

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: remove dead code from memory_max_write()
      mm: memcontrol: try harder to set a new memory.high

    Hao Lee <haolee.swjtu@gmail.com>:
      include/linux/memcontrol.h: fix comments based on per-node memcg

    Shakeel Butt <shakeelb@google.com>:
      mm: vmscan: memcontrol: remove mem_cgroup_select_victim_node()

    Chris Down <chris@chrisdown.name>:
      Documentation/admin-guide/cgroup-v2.rst: document why inactive_X + active_X may not equal X

Subsystem: mm/pagemap

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: drop mmap_sem before calling balance_dirty_pages() in write fault

    "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>:
      shmem: pin the file in shmem_fault() if mmap_sem is dropped

    "Joel Fernandes (Google)" <joel@joelfernandes.org>:
      mm: emit tracepoint when RSS changes
      rss_stat: add support to detect RSS updates of external mm

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/mmap.c: remove a never-triggered warning in __vma_adjust()

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      mm/swap.c: piggyback lru_add_drain_all() calls

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/mmap.c: prev could be retrieved from vma->vm_prev
      mm/mmap.c: __vma_unlink_prev() is not necessary now
      mm/mmap.c: extract __vma_unlink_list() as counterpart for __vma_link_list()
      mm/mmap.c: rb_parent is not necessary in __vma_link_list()
      mm/rmap.c: don't reuse anon_vma if we just want a copy
      mm/rmap.c: reuse mergeable anon_vma as parent when fork

    Gaowei Pu <pugaowei@gmail.com>:
      mm/mmap.c: use IS_ERR_VALUE to check return value of get_unmapped_area

    Vineet Gupta <Vineet.Gupta1@synopsys.com>:
    Patch series "elide extraneous generated code for folded p4d/pud/pmd", v3:
      ARC: mm: remove __ARCH_USE_5LEVEL_HACK
      asm-generic/tlb: stub out pud_free_tlb() if nopud ...
      asm-generic/tlb: stub out p4d_free_tlb() if nop4d ...
      asm-generic/tlb: stub out pmd_free_tlb() if nopmd
      asm-generic/mm: stub out p{4,u}d_clear_bad() if __PAGETABLE_P{4,U}D_FOLDED

    Miles Chen <miles.chen@mediatek.com>:
      mm/rmap.c: fix outdated comment in page_get_anon_vma()

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm/rmap.c: use VM_BUG_ON_PAGE() in __page_check_anon_rmap()

    Thomas Hellstrom <thellstrom@vmware.com>:
      mm: move the backup x_devmap() functions to asm-generic/pgtable.h
      mm/memory.c: fix a huge pud insertion race during faulting

    Steven Price <steven.price@arm.com>:
    Patch series "Generic page walk and ptdump", v15:
      mm: add generic p?d_leaf() macros
      arc: mm: add p?d_leaf() definitions
      arm: mm: add p?d_leaf() definitions
      arm64: mm: add p?d_leaf() definitions
      mips: mm: add p?d_leaf() definitions
      powerpc: mm: add p?d_leaf() definitions
      riscv: mm: add p?d_leaf() definitions
      s390: mm: add p?d_leaf() definitions
      sparc: mm: add p?d_leaf() definitions
      x86: mm: add p?d_leaf() definitions
      mm: pagewalk: add p4d_entry() and pgd_entry()
      mm: pagewalk: allow walking without vma
      mm: pagewalk: add test_p?d callbacks
      mm: pagewalk: add 'depth' parameter to pte_hole
      x86: mm: point to struct seq_file from struct pg_state
      x86: mm+efi: convert ptdump_walk_pgd_level() to take a mm_struct
      x86: mm: convert ptdump_walk_pgd_level_debugfs() to take an mm_struct
      x86: mm: convert ptdump_walk_pgd_level_core() to take an mm_struct
      mm: add generic ptdump
      x86: mm: convert dump_pagetables to use walk_page_range
      arm64: mm: convert mm/dump.c to use walk_page_range()
      arm64: mm: display non-present entries in ptdump
      mm: ptdump: reduce level numbers by 1 in note_page()

Subsystem: mm/memfd

    Nicolas Geoffray <ngeoffray@google.com>:
      mm, memfd: fix COW issue on MAP_PRIVATE and F_SEAL_FUTURE_WRITE mappings

    "Joel Fernandes (Google)" <joel@joelfernandes.org>:
      memfd: add test for COW on MAP_PRIVATE and F_SEAL_FUTURE_WRITE mappings

Subsystem: mm/memory-failure

    Jane Chu <jane.chu@oracle.com>:
      mm/memory-failure.c clean up around tk pre-allocation

    Naoya Horiguchi <nao.horiguchi@gmail.com>:
      mm, soft-offline: convert parameter to pfn

    Yunfeng Ye <yeyunfeng@huawei.com>:
      mm/memory-failure.c: use page_shift() in add_to_kill()

Subsystem: mm/memory-hotplug

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/hotplug: reorder memblock_[free|remove]() calls in try_remove_memory()

    Alastair D'Silva <alastair@d-silva.org>:
      mm/memory_hotplug.c: add a bounds check to __add_pages()

    David Hildenbrand <david@redhat.com>:
    Patch series "mm/memory_hotplug: Export generic_online_page()":
      mm/memory_hotplug: export generic_online_page()
      hv_balloon: use generic_online_page()
      mm/memory_hotplug: remove __online_page_free() and __online_page_increment_counters()
    Patch series "mm: Memory offlining + page isolation cleanups", v2:
      mm/page_alloc.c: don't set pages PageReserved() when offlining
      mm/page_isolation.c: convert SKIP_HWPOISON to MEMORY_OFFLINE

    "Ben Dooks (Codethink)" <ben.dooks@codethink.co.uk>:
      include/linux/memory_hotplug.h: move definitions of {set,clear}_zone_contiguous

    David Hildenbrand <david@redhat.com>:
      drivers/base/memory.c: drop the mem_sysfs_mutex
      mm/memory_hotplug.c: don't allow to online/offline memory blocks with holes

Subsystem: mm/sparsemem

    Vincent Whitchurch <vincent.whitchurch@axis.com>:
      mm/sparse: consistently do not zero memmap

    Ilya Leoshkevich <iii@linux.ibm.com>:
      mm/sparse.c: mark populate_section_memmap as __meminit

    Michal Hocko <mhocko@suse.com>:
      mm/sparse.c: do not waste pre allocated memmap space

Subsystem: mm/vmalloc

    Liu Xiang <liuxiang_1999@126.com>:
      mm/vmalloc.c: remove unnecessary highmem_mask from parameter of gfpflags_allow_blocking()

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc: remove preempt_disable/enable when doing preloading
      mm/vmalloc: respect passed gfp_mask when doing preloading
      mm/vmalloc: add more comments to the adjust_va_to_fit_type()

    Anders Roxell <anders.roxell@linaro.org>:
      selftests: vm: add fragment CONFIG_TEST_VMALLOC

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc: rework vmap_area_lock

Subsystem: mm/kasan

    Daniel Axtens <dja@axtens.net>:
    Patch series "kasan: support backing vmalloc space with real shadow:
      kasan: support backing vmalloc space with real shadow memory
      kasan: add test for vmalloc
      fork: support VMAP_STACK with KASAN_VMALLOC
      x86/kasan: support KASAN_VMALLOC

Subsystem: mm/pagealloc

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/page_alloc: add alloc_contig_pages()

    Mel Gorman <mgorman@techsingularity.net>:
      mm, pcp: share common code between memory hotplug and percpu sysctl handler
      mm, pcpu: make zone pcp updates and reset internal to the mm

    Hao Lee <haolee.swjtu@gmail.com>:
      include/linux/mmzone.h: fix comment for ISOLATE_UNMAPPED macro

    lijiazi <jqqlijiazi@gmail.com>:
      mm/page_alloc.c: print reserved_highatomic info

Subsystem: mm/vmscan

    Andrey Ryabinin <aryabinin@virtuozzo.com>:
      mm/vmscan: remove unused lru_pages argument

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm/vmscan.c: remove unused scan_control parameter from pageout()

    Johannes Weiner <hannes@cmpxchg.org>:
    Patch series "mm: vmscan: cgroup-related cleanups":
      mm: vmscan: simplify lruvec_lru_size()
      mm: clean up and clarify lruvec lookup procedure
      mm: vmscan: move inactive_list_is_low() swap check to the caller
      mm: vmscan: naming fixes: global_reclaim() and sane_reclaim()
      mm: vmscan: replace shrink_node() loop with a retry jump
      mm: vmscan: turn shrink_node_memcg() into shrink_lruvec()
      mm: vmscan: split shrink_node() into node part and memcgs part
      mm: vmscan: harmonize writeback congestion tracking for nodes & memcgs
    Patch series "mm: fix page aging across multiple cgroups":
      mm: vmscan: move file exhaustion detection to the node level
      mm: vmscan: detect file thrashing at the reclaim root
      mm: vmscan: enforce inactive:active ratio at the reclaim root

    Xianting Tian <xianting_tian@126.com>:
      mm/vmscan.c: fix typo in comment

Subsystem: mm/proc

    Johannes Weiner <hannes@cmpxchg.org>:
      kernel: sysctl: make drop_caches write-only

Subsystem: mm/z3fold

    Vitaly Wool <vitaly.wool@konsulko.com>:
      mm/z3fold.c: add inter-page compaction

Subsystem: mm/mempolicy

    Li Xinhai <lixinhai.lxh@gmail.com>:
    Patch series "mm: Fix checking unmapped holes for mbind", v4:
      mm/mempolicy.c: check range first in queue_pages_test_walk
      mm/mempolicy.c: fix checking unmapped holes for mbind

Subsystem: mm/memblock

    Cao jin <caoj.fnst@cn.fujitsu.com>:
      mm/memblock.c: cleanup doc
      mm/memblock: correct doc for function

    Yunfeng Ye <yeyunfeng@huawei.com>:
      mm: support memblock alloc on the exact node for sparse_buffer_init()

Subsystem: mm/hugetlbfs

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlbfs: hugetlb_fault_mutex_hash() cleanup
      mm/hugetlbfs: fix error handling when setting up mounts
    Patch series "hugetlbfs: convert macros to static inline, fix sparse warning":
      powerpc/mm: remove pmd_huge/pud_huge stubs and include hugetlb.h
      hugetlbfs: convert macros to static inline, fix sparse warning

    Piotr Sarna <p.sarna@tlen.pl>:
      hugetlbfs: add O_TMPFILE support

    Waiman Long <longman@redhat.com>:
      hugetlbfs: take read_lock on i_mmap for PMD sharing

Subsystem: mm/hugetlb

    Mina Almasry <almasrymina@google.com>:
      hugetlb: region_chg provides only cache entry
      hugetlb: remove duplicated code

    Wei Yang <richardw.yang@linux.intel.com>:
      hugetlb: remove unused hstate in hugetlb_fault_mutex_hash()

    Zhigang Lu <tonnylu@tencent.com>:
      mm/hugetlb: avoid looping to the same hugepage if !pages and !vmas

    zhong jiang <zhongjiang@huawei.com>:
      mm/huge_memory.c: split_huge_pages_fops should be defined with DEFINE_DEBUGFS_ATTRIBUTE

Subsystem: mm/migration

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm/migrate.c: handle freed page at the first place

Subsystem: mm/thp

    "Kirill A. Shutemov" <kirill@shutemov.name>:
      mm, thp: do not queue fully unmapped pages for deferred split

    Song Liu <songliubraving@fb.com>:
      mm/thp: flush file for !is_shmem PageDirty() case in collapse_file()

Subsystem: mm/cma

    Yunfeng Ye <yeyunfeng@huawei.com>:
      mm/cma.c: switch to bitmap_zalloc() for cma bitmap allocation

    zhong jiang <zhongjiang@huawei.com>:
      mm/cma_debug.c: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops

Subsystem: mm/autonuma

    Huang Ying <ying.huang@intel.com>:
      autonuma: fix watermark checking in migrate_balanced_pgdat()
      autonuma: reduce cache footprint when scanning page tables

Subsystem: mm/page-poison

    zhong jiang <zhongjiang@huawei.com>:
      mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops

Subsystem: mm/mmap

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/mmap.c: make vma_merge() comment more easy to understand

Subsystem: mm/madvise

    Yunfeng Ye <yeyunfeng@huawei.com>:
      mm/madvise.c: replace with page_size() in madvise_inject_error()

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/madvise.c: use PAGE_ALIGN[ED] for range checking

Subsystem: mm/userfaultfd

    Wei Yang <richardw.yang@linux.intel.com>:
      userfaultfd: use vma_pagesize for all huge page size calculation
      userfaultfd: remove unnecessary WARN_ON() in __mcopy_atomic_hugetlb()
      userfaultfd: wrap the common dst_vma check into an inlined function

    Andrea Arcangeli <aarcange@redhat.com>:
      fs/userfaultfd.c: wp: clear VM_UFFD_MISSING or VM_UFFD_WP during userfaultfd_register()

    Mike Rapoport <rppt@linux.ibm.com>:
      userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK

Subsystem: mm/shmem

    Colin Ian King <colin.king@canonical.com>:
      mm/shmem.c: make array 'values' static const, makes object smaller

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: shmem: use proper gfp flags for shmem_writepage()

    Chen Jun <chenjun102@huawei.com>:
      mm/shmem.c: cast the type of unmap_start to u64

Subsystem: mm/cleanups

    Hao Lee <haolee.swjtu@gmail.com>:
      mm: fix struct member name in function comments

    Wei Yang <richardw.yang@linux.intel.com>:
      mm: fix typos in comments when calling __SetPageUptodate()

    Souptick Joarder <jrdr.linux@gmail.com>:
      mm/memory_hotplug.c: remove __online_page_set_limits()

    Krzysztof Kozlowski <krzk@kernel.org>:
      mm/Kconfig: fix indentation

    Randy Dunlap <rdunlap@infradead.org>:
      mm/Kconfig: fix trivial help text punctuation

Subsystem: mm/support

    Minchan Kim <minchan@google.com>:
      mm/page_io.c: annotate refault stalls from swap_readpage

 Documentation/admin-guide/cgroup-v2.rst          |    7 
 Documentation/dev-tools/kasan.rst                |   63 +
 arch/Kconfig                                     |    9 
 arch/arc/include/asm/pgtable.h                   |    2 
 arch/arc/mm/fault.c                              |   10 
 arch/arc/mm/highmem.c                            |    4 
 arch/arm/include/asm/pgtable-2level.h            |    1 
 arch/arm/include/asm/pgtable-3level.h            |    1 
 arch/arm64/Kconfig                               |    1 
 arch/arm64/Kconfig.debug                         |   19 
 arch/arm64/include/asm/pgtable.h                 |    2 
 arch/arm64/include/asm/ptdump.h                  |    8 
 arch/arm64/mm/Makefile                           |    4 
 arch/arm64/mm/dump.c                             |  148 +---
 arch/arm64/mm/mmu.c                              |    4 
 arch/arm64/mm/ptdump_debugfs.c                   |    2 
 arch/mips/include/asm/pgtable.h                  |    5 
 arch/powerpc/include/asm/book3s/64/pgtable-4k.h  |    3 
 arch/powerpc/include/asm/book3s/64/pgtable-64k.h |    3 
 arch/powerpc/include/asm/book3s/64/pgtable.h     |   30 
 arch/powerpc/mm/book3s64/radix_pgtable.c         |    1 
 arch/riscv/include/asm/pgtable-64.h              |    7 
 arch/riscv/include/asm/pgtable.h                 |    7 
 arch/s390/include/asm/pgtable.h                  |    2 
 arch/sparc/include/asm/pgtable_64.h              |    2 
 arch/x86/Kconfig                                 |    2 
 arch/x86/Kconfig.debug                           |   20 
 arch/x86/include/asm/pgtable.h                   |   10 
 arch/x86/mm/Makefile                             |    4 
 arch/x86/mm/debug_pagetables.c                   |    8 
 arch/x86/mm/dump_pagetables.c                    |  431 +++---------
 arch/x86/mm/kasan_init_64.c                      |   61 +
 arch/x86/platform/efi/efi_32.c                   |    2 
 arch/x86/platform/efi/efi_64.c                   |    4 
 drivers/base/memory.c                            |   40 -
 drivers/firmware/efi/arm-runtime.c               |    2 
 drivers/hv/hv_balloon.c                          |    4 
 drivers/xen/balloon.c                            |    1 
 fs/buffer.c                                      |    6 
 fs/direct-io.c                                   |   21 
 fs/hugetlbfs/inode.c                             |   67 +
 fs/ocfs2/acl.c                                   |    4 
 fs/proc/task_mmu.c                               |    4 
 fs/userfaultfd.c                                 |   21 
 include/asm-generic/4level-fixup.h               |    1 
 include/asm-generic/5level-fixup.h               |    1 
 include/asm-generic/pgtable-nop4d.h              |    2 
 include/asm-generic/pgtable-nopmd.h              |    2 
 include/asm-generic/pgtable-nopud.h              |    2 
 include/asm-generic/pgtable.h                    |   71 ++
 include/asm-generic/tlb.h                        |    4 
 include/linux/fs.h                               |    6 
 include/linux/gfp.h                              |    2 
 include/linux/hugetlb.h                          |  142 +++-
 include/linux/kasan.h                            |   31 
 include/linux/memblock.h                         |    3 
 include/linux/memcontrol.h                       |   51 -
 include/linux/memory_hotplug.h                   |   11 
 include/linux/mm.h                               |   42 -
 include/linux/mmzone.h                           |   34 
 include/linux/moduleloader.h                     |    2 
 include/linux/page-isolation.h                   |    4 
 include/linux/pagewalk.h                         |   42 -
 include/linux/ptdump.h                           |   22 
 include/linux/slab.h                             |   20 
 include/linux/string.h                           |    2 
 include/linux/swap.h                             |    2 
 include/linux/vmalloc.h                          |   12 
 include/trace/events/kmem.h                      |   53 +
 kernel/events/uprobes.c                          |    2 
 kernel/fork.c                                    |    4 
 kernel/sysctl.c                                  |    2 
 lib/Kconfig.kasan                                |   16 
 lib/test_kasan.c                                 |   26 
 lib/vsprintf.c                                   |   40 -
 mm/Kconfig                                       |   40 -
 mm/Kconfig.debug                                 |   21 
 mm/Makefile                                      |    1 
 mm/cma.c                                         |    6 
 mm/cma_debug.c                                   |   10 
 mm/filemap.c                                     |   56 -
 mm/gup.c                                         |   40 -
 mm/hmm.c                                         |    8 
 mm/huge_memory.c                                 |    2 
 mm/hugetlb.c                                     |  298 ++------
 mm/hwpoison-inject.c                             |    4 
 mm/internal.h                                    |   27 
 mm/kasan/common.c                                |  233 ++++++
 mm/kasan/generic_report.c                        |    3 
 mm/kasan/kasan.h                                 |    1 
 mm/khugepaged.c                                  |   18 
 mm/madvise.c                                     |   14 
 mm/memblock.c                                    |  113 ++-
 mm/memcontrol.c                                  |  167 ----
 mm/memory-failure.c                              |   61 -
 mm/memory.c                                      |   56 +
 mm/memory_hotplug.c                              |   86 +-
 mm/mempolicy.c                                   |   59 +
 mm/migrate.c                                     |   21 
 mm/mincore.c                                     |    1 
 mm/mmap.c                                        |   75 --
 mm/mprotect.c                                    |    8 
 mm/mremap.c                                      |    4 
 mm/nommu.c                                       |   10 
 mm/page_alloc.c                                  |  137 +++
 mm/page_io.c                                     |   15 
 mm/page_isolation.c                              |   12 
 mm/pagewalk.c                                    |  126 ++-
 mm/pgtable-generic.c                             |    9 
 mm/ptdump.c                                      |  167 ++++
 mm/rmap.c                                        |   65 +
 mm/shmem.c                                       |   29 
 mm/slab.c                                        |    7 
 mm/slab.h                                        |    6 
 mm/slab_common.c                                 |  101 +-
 mm/slub.c                                        |   36 -
 mm/sparse.c                                      |   22 
 mm/swap.c                                        |   29 
 mm/swapfile.c                                    |    7 
 mm/userfaultfd.c                                 |   77 +-
 mm/util.c                                        |   22 
 mm/vmalloc.c                                     |  196 +++--
 mm/vmscan.c                                      |  798 +++++++++++------------
 mm/workingset.c                                  |   75 +-
 mm/z3fold.c                                      |  375 ++++++++--
 scripts/spelling.txt                             |   28 
 tools/testing/selftests/memfd/memfd_test.c       |   36 +
 tools/testing/selftests/vm/config                |    1 
 128 files changed, 3409 insertions(+), 2121 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-11-22  1:53 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-11-22  1:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


4 fixes, based on 81429eb8d9ca40b0c65bb739d29fa856c5d5e958:

    Vincent Whitchurch <vincent.whitchurch@axis.com>:
      mm/sparse: consistently do not zero memmap

    Joseph Qi <joseph.qi@linux.alibaba.com>:
      Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"

    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()

    Andrey Ryabinin <aryabinin@virtuozzo.com>:
      mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()

 fs/ocfs2/xattr.c    |   56 ++++++++++++++++++++++++++++++----------------------
 mm/ksm.c            |   14 ++++++-------
 mm/memory_hotplug.c |   16 ++++++++++++--
 mm/sparse.c         |    2 -
 4 files changed, 54 insertions(+), 34 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-11-16  1:34 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-11-16  1:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


11 fixes, based on 875fef493f21e54d20d71a581687990aaa50268c:

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: mempolicy: fix the wrong return value and potential pages leak of mbind

    zhong jiang <zhongjiang@huawei.com>:
      mm: fix trying to reclaim unevictable lru page when calling madvise_pageout

    Lasse Collin <lasse.collin@tukaani.org>:
      lib/xz: fix XZ_DYNALLOC to avoid useless memory reallocations

    Roman Gushchin <guro@fb.com>:
      mm: memcg: switch to css_tryget() in get_mem_cgroup_from_mm()
      mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup()

    Laura Abbott <labbott@redhat.com>:
      mm: slub: really fix slab walking for init_on_free

    Song Liu <songliubraving@fb.com>:
      mm,thp: recheck each page before collapsing file THP

    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug: fix try_offline_node()

    Vinayak Menon <vinmenon@codeaurora.org>:
      mm/page_io.c: do not free shared swap slots

    Ralph Campbell <rcampbell@nvidia.com>:
      mm/debug.c: __dump_page() prints an extra line
      mm/debug.c: PageAnon() is true for PageKsm() pages

 drivers/base/memory.c  |   36 ++++++++++++++++++++++++++++++++++++
 include/linux/memory.h |    1 +
 lib/xz/xz_dec_lzma2.c  |    1 +
 mm/debug.c             |   33 ++++++++++++++++++---------------
 mm/hugetlb_cgroup.c    |    2 +-
 mm/khugepaged.c        |   28 ++++++++++++++++------------
 mm/madvise.c           |   16 ++++++++++++----
 mm/memcontrol.c        |    2 +-
 mm/memory_hotplug.c    |   47 +++++++++++++++++++++++++++++------------------
 mm/mempolicy.c         |   14 +++++++++-----
 mm/page_io.c           |    6 +++---
 mm/slub.c              |   39 +++++++++------------------------------
 12 files changed, 136 insertions(+), 89 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-11-06  5:16 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-11-06  5:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

17 fixes, based on 26bc672134241a080a83b2ab9aa8abede8d30e1c:

    Shakeel Butt <shakeelb@google.com>:
      mm: memcontrol: fix NULL-ptr deref in percpu stats flush

    John Hubbard <jhubbard@nvidia.com>:
      mm/gup_benchmark: fix MAP_HUGETLB case

    Mel Gorman <mgorman@techsingularity.net>:
      mm, meminit: recalculate pcpu batch and high limits after init completes

    Yang Shi <yang.shi@linux.alibaba.com>:
      mm: thp: handle page cache THP correctly in PageTransCompoundMap

    Shuning Zhang <sunny.s.zhang@oracle.com>:
      ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()

    Jason Gunthorpe <jgg@mellanox.com>:
      mm/mmu_notifiers: use the right return code for WARN_ON

    Michal Hocko <mhocko@suse.com>:
      mm, vmstat: hide /proc/pagetypeinfo from normal users
      mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo

    Ville Syrjälä <ville.syrjala@linux.intel.com>:
      mm/khugepaged: fix might_sleep() warn with CONFIG_HIGHPTE=y

    Johannes Weiner <hannes@cmpxchg.org>:
      mm/page_alloc.c: ratelimit allocation failure warnings more aggressively

    Vitaly Wool <vitaly.wool@konsulko.com>:
      zswap: add Vitaly to the maintainers list

    Kevin Hao <haokexin@gmail.com>:
      dump_stack: avoid the livelock of the dump_lock

    Song Liu <songliubraving@fb.com>:
      MAINTAINERS: update information for "MEMORY MANAGEMENT"

    Roman Gushchin <guro@fb.com>:
      mm: slab: make page_cgroup_ino() to recognize non-compound slab pages properly

    Ilya Leoshkevich <iii@linux.ibm.com>:
      scripts/gdb: fix debugging modules compiled with hot/cold partitioning

    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug: fix updating the node span

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: memcontrol: fix network errors from failing __GFP_ATOMIC charges

 MAINTAINERS                                |    5 +
 fs/ocfs2/file.c                            |  125 ++++++++++++++++++++++-------
 include/linux/mm.h                         |    5 -
 include/linux/mm_types.h                   |    5 +
 include/linux/page-flags.h                 |   20 ++++
 lib/dump_stack.c                           |    7 +
 mm/khugepaged.c                            |    7 -
 mm/memcontrol.c                            |   23 +++--
 mm/memory_hotplug.c                        |    8 +
 mm/mmu_notifier.c                          |    2 
 mm/page_alloc.c                            |   17 ++-
 mm/slab.h                                  |    4 
 mm/vmstat.c                                |   25 ++++-
 scripts/gdb/linux/symbols.py               |    3 
 tools/testing/selftests/vm/gup_benchmark.c |    2 
 15 files changed, 197 insertions(+), 61 deletions(-)




^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-10-19  3:19 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-10-19  3:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


Rather a lot of fixes, almost all affecting mm/.


26 patches, based on b9959c7a347d6adbb558fba7e36e9fef3cba3b07:

    David Hildenbrand <david@redhat.com>:
      drivers/base/memory.c: don't access uninitialized memmaps in soft_offline_page_store()
      fs/proc/page.c: don't access uninitialized memmaps in fs/proc/page.c
      mm/memory-failure.c: don't access uninitialized memmaps in memory_failure()

    Joel Colledge <joel.colledge@linbit.com>:
      scripts/gdb: fix lx-dmesg when CONFIG_PRINTK_CALLER is set

    Qian Cai <cai@lca.pw>:
      mm/page_owner: don't access uninitialized memmaps when reading /proc/pagetypeinfo

    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug: don't access uninitialized memmaps in shrink_pgdat_span()

    "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>:
    Patch series "mm/memory_hotplug: Shrink zones before removing memory", v6:
      mm/memunmap: don't access uninitialized memmap in memunmap_pages()

    Roman Gushchin <guro@fb.com>:
      mm: memcg/slab: fix panic in __free_slab() caused by premature memcg pointer release

    Chengguang Xu <cgxu519@mykernel.net>:
      ocfs2: fix error handling in ocfs2_setattr()

    John Hubbard <jhubbard@nvidia.com>:
      mm/gup_benchmark: add a missing "w" to getopt string
      mm/gup: fix a misnamed "write" argument, and a related bug

    Honglei Wang <honglei.wang@oracle.com>:
      mm: memcg: get number of pages on the LRU list in memcgroup base on lru_zone_size

    Mike Rapoport <rppt@linux.ibm.com>:
      mm: memblock: do not enforce current limit for memblock_phys* family

    David Hildenbrand <david@redhat.com>:
      hugetlbfs: don't access uninitialized memmaps in pfn_range_valid_gigantic()

    Yi Li <yilikernel@gmail.com>:
      ocfs2: fix panic due to ocfs2_wq is null

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      mm/memcontrol: update lruvec counters in mem_cgroup_move_account

    Chenwandun <chenwandun@huawei.com>:
      zram: fix race between backing_dev_show and backing_dev_store

    Ben Dooks <ben.dooks@codethink.co.uk>:
      mm: include <linux/huge_mm.h> for is_vma_temporary_stack
      mm/filemap.c: include <linux/ramfs.h> for generic_file_vm_ops definition

    "Ben Dooks (Codethink)" <ben.dooks@codethink.co.uk>:
      mm/init-mm.c: include <linux/mman.h> for vm_committed_as_batch

    "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>:
    Patch series "Fixes for THP in page cache", v2:
      proc/meminfo: fix output alignment
      mm/thp: fix node page state in split_huge_page_to_list()

    William Kucharski <william.kucharski@oracle.com>:
      mm/vmscan.c: support removing arbitrary sized pages from mapping

    "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>:
      mm/thp: allow dropping THP from page cache

    Song Liu <songliubraving@fb.com>:
      kernel/events/uprobes.c: only do FOLL_SPLIT_PMD for uprobe register

    Ilya Leoshkevich <iii@linux.ibm.com>:
      scripts/gdb: fix debugging modules on s390

 drivers/base/memory.c                      |    3 +
 drivers/block/zram/zram_drv.c              |    5 +
 fs/ocfs2/file.c                            |    2 
 fs/ocfs2/journal.c                         |    3 -
 fs/ocfs2/localalloc.c                      |    3 -
 fs/proc/meminfo.c                          |    4 -
 fs/proc/page.c                             |   28 ++++++----
 kernel/events/uprobes.c                    |   13 ++++-
 mm/filemap.c                               |    1 
 mm/gup.c                                   |   14 +++--
 mm/huge_memory.c                           |    9 ++-
 mm/hugetlb.c                               |    5 -
 mm/init-mm.c                               |    1 
 mm/memblock.c                              |    6 +-
 mm/memcontrol.c                            |   18 ++++---
 mm/memory-failure.c                        |   14 +++--
 mm/memory_hotplug.c                        |   74 ++++++-----------------------
 mm/memremap.c                              |   11 ++--
 mm/page_owner.c                            |    5 +
 mm/rmap.c                                  |    1 
 mm/slab_common.c                           |    9 +--
 mm/truncate.c                              |   12 ++++
 mm/vmscan.c                                |   14 ++---
 scripts/gdb/linux/dmesg.py                 |   16 ++++--
 scripts/gdb/linux/symbols.py               |    8 ++-
 scripts/gdb/linux/utils.py                 |   25 +++++----
 tools/testing/selftests/vm/gup_benchmark.c |    2 
 27 files changed, 166 insertions(+), 140 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-10-14 21:11 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-10-14 21:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


The usual shower of hotfixes and some followups to the recently merged
page_owner enhancements.

16 patches, based on 2abd839aa7e615f2bbc50c8ba7deb9e40d186768.

Subsystems affected by this patch series:


    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "followups to debug_pagealloc improvements through page_owner", v3:
      mm, page_owner: fix off-by-one error in __set_page_owner_handle()
      mm, page_owner: decouple freeing stack trace from debug_pagealloc
      mm, page_owner: rename flag indicating that page is allocated

    Qian Cai <cai@lca.pw>:
      mm/slub: fix a deadlock in show_slab_objects()

    Eric Biggers <ebiggers@google.com>:
      lib/generic-radix-tree.c: add kmemleak annotations

    Alexander Potapenko <glider@google.com>:
      mm/slub.c: init_on_free=1 should wipe freelist ptr for bulk allocations
      lib/test_meminit: add a kmem_cache_alloc_bulk() test

    David Rientjes <rientjes@google.com>:
      mm, hugetlb: allow hugepage allocations to reclaim as needed

    Vlastimil Babka <vbabka@suse.cz>:
      mm, compaction: fix wrong pfn handling in __reset_isolation_pfn()

    Randy Dunlap <rdunlap@infradead.org>:
      fs/direct-io.c: fix kernel-doc warning
      fs/libfs.c: fix kernel-doc warning
      fs/fs-writeback.c: fix kernel-doc warning
      bitmap.h: fix kernel-doc warning and typo
      xarray.h: fix kernel-doc warning
      mm/slab.c: fix kernel-doc warning for __ksize()

    Jane Chu <jane.chu@oracle.com>:
      mm/memory-failure: poison read receives SIGKILL instead of SIGBUS if mmaped more than once

 Documentation/dev-tools/kasan.rst |    3 ++
 fs/direct-io.c                    |    3 --
 fs/fs-writeback.c                 |    2 -
 fs/libfs.c                        |    3 --
 include/linux/bitmap.h            |    3 +-
 include/linux/page_ext.h          |   10 ++++++
 include/linux/xarray.h            |    4 +-
 lib/generic-radix-tree.c          |   32 +++++++++++++++++-----
 lib/test_meminit.c                |   27 ++++++++++++++++++
 mm/compaction.c                   |    7 ++--
 mm/memory-failure.c               |   22 ++++++++-------
 mm/page_alloc.c                   |    6 ++--
 mm/page_ext.c                     |   23 ++++++---------
 mm/page_owner.c                   |   55 +++++++++++++-------------------------
 mm/slab.c                         |    3 ++
 mm/slub.c                         |   35 ++++++++++++++++++------
 16 files changed, 152 insertions(+), 86 deletions(-)







^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-10-07  0:57 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-10-07  0:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


The usual shower of hotfixes.

Chris's memcg patches aren't actually fixes - they're mature but a few
niggling review issues were late to arrive.

The ocfs2 fixes are quite old - those took some time to get
reviewer attention.

18 patches, based on 4ea655343ce4180fe9b2c7ec8cb8ef9884a47901.


Subsystems affected by this patch series:

  ocfs2
  hotfixes
  mm/memcg
  mm/slab-generic

Subsystem: ocfs2

    Jia Guo <guojia12@huawei.com>:
      ocfs2: clear zero in unaligned direct IO

    Jia-Ju Bai <baijiaju1990@gmail.com>:
      fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()
      fs: ocfs2: fix a possible null-pointer dereference in ocfs2_write_end_nolock()
      fs: ocfs2: fix a possible null-pointer dereference in ocfs2_info_scan_inode_alloc()

Subsystem: hotfixes

    Will Deacon <will@kernel.org>:
      panic: ensure preemption is disabled during panic()

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/memremap: drop unused SECTION_SIZE and SECTION_MASK

    Tejun Heo <tj@kernel.org>:
      writeback: fix use-after-free in finish_writeback_work()

    Yi Wang <wang.yi59@zte.com.cn>:
      mm: fix -Wmissing-prototypes warnings

    Baoquan He <bhe@redhat.com>:
      memcg: only record foreign writebacks with dirty pages when memcg is not disabled

    Michal Hocko <mhocko@suse.com>:
      kernel/sysctl.c: do not override max_threads provided by userspace

    Vitaly Wool <vitalywool@gmail.com>:
      mm/z3fold.c: claim page in the beginning of free

    Qian Cai <cai@lca.pw>:
      mm/page_alloc.c: fix a crash in free_pages_prepare()

    Dan Carpenter <dan.carpenter@oracle.com>:
      mm/vmpressure.c: fix a signedness bug in vmpressure_register_event()

Subsystem: mm/memcg

    Chris Down <chris@chrisdown.name>:
      mm, memcg: proportional memory.{low,min} reclaim
      mm, memcg: make memory.emin the baseline for utilisation determination
      mm, memcg: make scan aggression always exclude protection

Subsystem: mm/slab-generic

    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "guarantee natural alignment for kmalloc()", v2:
      mm, sl[ou]b: improve memory accounting
      mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two)

 Documentation/admin-guide/cgroup-v2.rst      |   20 +-
 Documentation/core-api/memory-allocation.rst |    4 
 fs/fs-writeback.c                            |    9 -
 fs/ocfs2/aops.c                              |   25 +++
 fs/ocfs2/ioctl.c                             |    2 
 fs/ocfs2/xattr.c                             |   56 +++----
 include/linux/memcontrol.h                   |   67 ++++++---
 include/linux/slab.h                         |    4 
 kernel/fork.c                                |    4 
 kernel/panic.c                               |    1 
 mm/memcontrol.c                              |    5 
 mm/memremap.c                                |    2 
 mm/page_alloc.c                              |    8 -
 mm/shuffle.c                                 |    2 
 mm/slab_common.c                             |   19 ++
 mm/slob.c                                    |   62 ++++++--
 mm/slub.c                                    |   14 +
 mm/sparse.c                                  |    2 
 mm/vmpressure.c                              |   20 +-
 mm/vmscan.c                                  |  198 +++++++++++++++++----------
 mm/z3fold.c                                  |   10 +
 21 files changed, 363 insertions(+), 171 deletions(-)





^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-09-25 23:45 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-09-25 23:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- almost all of the rest of -mm

- various other subsystems

76 patches, based on 351c8a09b00b5c51c8f58b016fffe51f87e2d820:

Subsystems affected by this patch series:

  memcg
  misc
  core-kernel
  lib
  checkpatch
  reiserfs
  fat
  fork
  cpumask
  kexec
  uaccess
  kconfig
  kgdb
  bug
  ipc
  lzo
  kasan
  madvise
  cleanups
  pagemap

Subsystem: memcg

    Michal Hocko <mhocko@suse.com>:
      memcg, kmem: do not fail __GFP_NOFAIL charges

Subsystem: misc

    Masahiro Yamada <yamada.masahiro@socionext.com>:
      linux/coff.h: add include guard

Subsystem: core-kernel

    Valdis Kletnieks <valdis.kletnieks@vt.edu>:
      kernel/elfcore.c: include proper prototypes

Subsystem: lib

    Michel Lespinasse <walken@google.com>:
      rbtree: avoid generating code twice for the cached versions (tools copy)
    Patch series "make RB_DECLARE_CALLBACKS more generic", v3:
      augmented rbtree: add comments for RB_DECLARE_CALLBACKS macro
      augmented rbtree: add new RB_DECLARE_CALLBACKS_MAX macro
      augmented rbtree: rework the RB_DECLARE_CALLBACKS macro definition

    Joe Perches <joe@perches.com>:
      kernel-doc: core-api: include string.h into core-api

    Qian Cai <cai@lca.pw>:
      include/trace/events/writeback.h: fix -Wstringop-truncation warnings

    Kees Cook <keescook@chromium.org>:
      strscpy: reject buffer sizes larger than INT_MAX

    Valdis Kletnieks <valdis.kletnieks@vt.edu>:
      lib/generic-radix-tree.c: make 2 functions static inline
      lib/extable.c: add missing prototypes

    Stephen Boyd <swboyd@chromium.org>:
      lib/hexdump: make print_hex_dump_bytes() a nop on !DEBUG builds

Subsystem: checkpatch

    Joe Perches <joe@perches.com>:
      checkpatch: don't interpret stack dumps as commit IDs
      checkpatch: improve SPDX license checking

    Matteo Croce <mcroce@redhat.com>:
      checkpatch.pl: warn on invalid commit id

    Brendan Jackman <brendan.jackman@bluwireless.co.uk>:
      checkpatch: exclude sizeof sub-expressions from MACRO_ARG_REUSE

    Joe Perches <joe@perches.com>:
      checkpatch: prefer __section over __attribute__((section(...)))
      checkpatch: allow consecutive close braces

    Sean Christopherson <sean.j.christopherson@intel.com>:
      checkpatch: remove obsolete period from "ambiguous SHA1" query

    Joe Perches <joe@perches.com>:
      checkpatch: make git output use LANGUAGE=en_US.utf8

Subsystem: reiserfs

    Jia-Ju Bai <baijiaju1990@gmail.com>:
      fs: reiserfs: remove unnecessary check of bh in remove_from_transaction()

    zhengbin <zhengbin13@huawei.com>:
      fs/reiserfs/journal.c: remove set but not used variables
      fs/reiserfs/stree.c: remove set but not used variables
      fs/reiserfs/lbalance.c: remove set but not used variables
      fs/reiserfs/objectid.c: remove set but not used variables
      fs/reiserfs/prints.c: remove set but not used variables
      fs/reiserfs/fix_node.c: remove set but not used variables
      fs/reiserfs/do_balan.c: remove set but not used variables

    Jason Yan <yanaijie@huawei.com>:
      fs/reiserfs/journal.c: remove set but not used variable
      fs/reiserfs/do_balan.c: remove set but not used variable

Subsystem: fat

    Markus Elfring <elfring@users.sourceforge.net>:
      fat: delete an unnecessary check before brelse()

Subsystem: fork

    Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>:
      fork: improve error message for corrupted page tables

Subsystem: cpumask

    Alexey Dobriyan <adobriyan@gmail.com>:
      cpumask: nicer for_each_cpumask_and() signature

Subsystem: kexec

    Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>:
      kexec: bail out upon SIGKILL when allocating memory.

    Vasily Gorbik <gor@linux.ibm.com>:
      kexec: restore arch_kexec_kernel_image_probe declaration

Subsystem: uaccess

    Kees Cook <keescook@chromium.org>:
      uaccess: add missing __must_check attributes

Subsystem: kconfig

    Masahiro Yamada <yamada.masahiro@socionext.com>:
      compiler: enable CONFIG_OPTIMIZE_INLINING forcibly

Subsystem: kgdb

    Douglas Anderson <dianders@chromium.org>:
      kgdb: don't use a notifier to enter kgdb at panic; call directly
      scripts/gdb: handle split debug

Subsystem: bug

    Kees Cook <keescook@chromium.org>:
    Patch series "Clean up WARN() "cut here" handling", v2:
      bug: refactor away warn_slowpath_fmt_taint()
      bug: rename __WARN_printf_taint() to __WARN_printf()
      bug: consolidate warn_slowpath_fmt() usage
      bug: lift "cut here" out of __warn()
      bug: clean up helper macros to remove __WARN_TAINT()
      bug: consolidate __WARN_FLAGS usage
      bug: move WARN_ON() "cut here" into exception handler

Subsystem: ipc

    Markus Elfring <elfring@users.sourceforge.net>:
      ipc/mqueue.c: delete an unnecessary check before the macro call dev_kfree_skb()
      ipc/mqueue: improve exception handling in do_mq_notify()

    "Joel Fernandes (Google)" <joel@joelfernandes.org>:
      ipc/sem.c: convert to use built-in RCU list checking

Subsystem: lzo

    Dave Rodgman <dave.rodgman@arm.com>:
      lib/lzo/lzo1x_compress.c: fix alignment bug in lzo-rle

Subsystem: kasan

    Andrey Konovalov <andreyknvl@google.com>:
    Patch series "arm64: untag user pointers passed to the kernel", v19:
      lib: untag user pointers in strn*_user
      mm: untag user pointers passed to memory syscalls
      mm: untag user pointers in mm/gup.c
      mm: untag user pointers in get_vaddr_frames
      fs/namespace: untag user pointers in copy_mount_options
      userfaultfd: untag user pointers
      drm/amdgpu: untag user pointers
      drm/radeon: untag user pointers in radeon_gem_userptr_ioctl
      media/v4l2-core: untag user pointers in videobuf_dma_contig_user_get
      tee/shm: untag user pointers in tee_shm_register
      vfio/type1: untag user pointers in vaddr_get_pfn

    Catalin Marinas <catalin.marinas@arm.com>:
      mm: untag user pointers in mmap/munmap/mremap/brk

Subsystem: madvise

    Minchan Kim <minchan@kernel.org>:
    Patch series "Introduce MADV_COLD and MADV_PAGEOUT", v7:
      mm: introduce MADV_COLD
      mm: change PAGEREF_RECLAIM_CLEAN with PAGE_REFRECLAIM
      mm: introduce MADV_PAGEOUT
      mm: factor out common parts between MADV_COLD and MADV_PAGEOUT

Subsystem: cleanups

    Mike Rapoport <rppt@linux.ibm.com>:
      hexagon: drop empty and unused free_initrd_mem

    Denis Efremov <efremov@linux.com>:
      checkpatch: check for nested (un)?likely() calls
      xen/events: remove unlikely() from WARN() condition
      fs: remove unlikely() from WARN_ON() condition
      wimax/i2400m: remove unlikely() from WARN*() condition
      xfs: remove unlikely() from WARN_ON() condition
      IB/hfi1: remove unlikely() from IS_ERR*() condition
      ntfs: remove (un)?likely() from IS_ERR() conditions

Subsystem: pagemap

    Mark Rutland <mark.rutland@arm.com>:
      mm: treewide: clarify pgtable_page_{ctor,dtor}() naming

 Documentation/core-api/kernel-api.rst            |    3 
 Documentation/vm/split_page_table_lock.rst       |   10 
 arch/alpha/include/uapi/asm/mman.h               |    3 
 arch/arc/include/asm/pgalloc.h                   |    4 
 arch/arm/include/asm/tlb.h                       |    2 
 arch/arm/mm/mmu.c                                |    2 
 arch/arm64/include/asm/tlb.h                     |    2 
 arch/arm64/mm/mmu.c                              |    2 
 arch/csky/include/asm/pgalloc.h                  |    2 
 arch/hexagon/include/asm/pgalloc.h               |    2 
 arch/hexagon/mm/init.c                           |   13 
 arch/m68k/include/asm/mcf_pgalloc.h              |    6 
 arch/m68k/include/asm/motorola_pgalloc.h         |    6 
 arch/m68k/include/asm/sun3_pgalloc.h             |    2 
 arch/mips/include/asm/pgalloc.h                  |    2 
 arch/mips/include/uapi/asm/mman.h                |    3 
 arch/nios2/include/asm/pgalloc.h                 |    2 
 arch/openrisc/include/asm/pgalloc.h              |    6 
 arch/parisc/include/uapi/asm/mman.h              |    3 
 arch/powerpc/mm/pgtable-frag.c                   |    6 
 arch/riscv/include/asm/pgalloc.h                 |    2 
 arch/s390/mm/pgalloc.c                           |    6 
 arch/sh/include/asm/pgalloc.h                    |    2 
 arch/sparc/include/asm/pgtable_64.h              |    5 
 arch/sparc/mm/init_64.c                          |    4 
 arch/sparc/mm/srmmu.c                            |    4 
 arch/um/include/asm/pgalloc.h                    |    2 
 arch/unicore32/include/asm/tlb.h                 |    2 
 arch/x86/mm/pat_rbtree.c                         |   19 
 arch/x86/mm/pgtable.c                            |    2 
 arch/xtensa/include/asm/pgalloc.h                |    4 
 arch/xtensa/include/uapi/asm/mman.h              |    3 
 drivers/block/drbd/drbd_interval.c               |   29 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c |    2 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c          |    2 
 drivers/gpu/drm/radeon/radeon_gem.c              |    2 
 drivers/infiniband/hw/hfi1/verbs.c               |    2 
 drivers/media/v4l2-core/videobuf-dma-contig.c    |    9 
 drivers/net/wimax/i2400m/tx.c                    |    3 
 drivers/tee/tee_shm.c                            |    1 
 drivers/vfio/vfio_iommu_type1.c                  |    2 
 drivers/xen/events/events_base.c                 |    2 
 fs/fat/dir.c                                     |    4 
 fs/namespace.c                                   |    2 
 fs/ntfs/mft.c                                    |   12 
 fs/ntfs/namei.c                                  |    2 
 fs/ntfs/runlist.c                                |    2 
 fs/ntfs/super.c                                  |    2 
 fs/open.c                                        |    2 
 fs/reiserfs/do_balan.c                           |   15 
 fs/reiserfs/fix_node.c                           |    6 
 fs/reiserfs/journal.c                            |   22 
 fs/reiserfs/lbalance.c                           |    3 
 fs/reiserfs/objectid.c                           |    3 
 fs/reiserfs/prints.c                             |    3 
 fs/reiserfs/stree.c                              |    4 
 fs/userfaultfd.c                                 |   22 
 fs/xfs/xfs_buf.c                                 |    4 
 include/asm-generic/bug.h                        |   71 +-
 include/asm-generic/pgalloc.h                    |    8 
 include/linux/cpumask.h                          |   14 
 include/linux/interval_tree_generic.h            |   22 
 include/linux/kexec.h                            |    2 
 include/linux/kgdb.h                             |    2 
 include/linux/mm.h                               |    4 
 include/linux/mm_types_task.h                    |    4 
 include/linux/printk.h                           |   22 
 include/linux/rbtree_augmented.h                 |  114 +++-
 include/linux/string.h                           |    5 
 include/linux/swap.h                             |    2 
 include/linux/thread_info.h                      |    2 
 include/linux/uaccess.h                          |   21 
 include/trace/events/writeback.h                 |   38 -
 include/uapi/asm-generic/mman-common.h           |    3 
 include/uapi/linux/coff.h                        |    5 
 ipc/mqueue.c                                     |   22 
 ipc/sem.c                                        |    3 
 kernel/debug/debug_core.c                        |   31 -
 kernel/elfcore.c                                 |    1 
 kernel/fork.c                                    |   16 
 kernel/kexec_core.c                              |    2 
 kernel/panic.c                                   |   48 -
 lib/Kconfig.debug                                |    4 
 lib/bug.c                                        |   11 
 lib/extable.c                                    |    1 
 lib/generic-radix-tree.c                         |    4 
 lib/hexdump.c                                    |   21 
 lib/lzo/lzo1x_compress.c                         |   14 
 lib/rbtree_test.c                                |   37 -
 lib/string.c                                     |   12 
 lib/strncpy_from_user.c                          |    3 
 lib/strnlen_user.c                               |    3 
 mm/frame_vector.c                                |    2 
 mm/gup.c                                         |    4 
 mm/internal.h                                    |    2 
 mm/madvise.c                                     |  562 ++++++++++++++++-------
 mm/memcontrol.c                                  |   10 
 mm/mempolicy.c                                   |    3 
 mm/migrate.c                                     |    2 
 mm/mincore.c                                     |    2 
 mm/mlock.c                                       |    4 
 mm/mmap.c                                        |   34 -
 mm/mprotect.c                                    |    2 
 mm/mremap.c                                      |   13 
 mm/msync.c                                       |    2 
 mm/oom_kill.c                                    |    2 
 mm/swap.c                                        |   42 +
 mm/vmalloc.c                                     |    5 
 mm/vmscan.c                                      |   62 ++
 scripts/checkpatch.pl                            |   69 ++
 scripts/gdb/linux/symbols.py                     |    4 
 tools/include/linux/rbtree.h                     |   71 +-
 tools/include/linux/rbtree_augmented.h           |  145 +++--
 tools/lib/rbtree.c                               |   37 -
 114 files changed, 1195 insertions(+), 754 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-09-24 15:34       ` incoming Linus Torvalds
@ 2019-09-25  6:36         ` Michal Hocko
  0 siblings, 0 replies; 485+ messages in thread
From: Michal Hocko @ 2019-09-25  6:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, David Rientjes, Vlastimil Babka, Andrea Arcangeli,
	mm-commits, Linux-MM

On Tue 24-09-19 08:34:20, Linus Torvalds wrote:
> On Tue, Sep 24, 2019 at 12:48 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > The patch proposed by David is really non trivial wrt. potential side
> > effects.
> 
> The thing is, that's not an argument when we know that the current
> state is garbage and has a lot of these non-trivial side effects that
> are bad.
> 
> So the patch by David _fixes_ a non-trivial bad side effect.
> 
> You can't then say "there may be other non-trivial side effects that I
> don't even know about" as an argument for saying it's bad. David at
> least has numbers and an argument for his patch.

All I am saying is that I am not able to wrap my head around this patch
to provide a competent Ack. I also believe that the fix is targetting a
wrong layer of the problem as explained in my review feedback. Appart
from reclaim/compaction interaction mentioned by Vlastimil, it seems
that it is an overly eager fallback to a remote node in the fast path
that is causing a large part of the problem as well. Kcompactd is not
eager enough to keep high order allocations ready for the fast path.
This is not specific to THP we have many other high order allocations
which are going to follow the same pattern, likely not visible in any
counters but still having performance implications.

Let's discuss technical details in the respective email thread

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-09-24  7:48     ` incoming Michal Hocko
  2019-09-24 15:34       ` incoming Linus Torvalds
@ 2019-09-24 19:55       ` Vlastimil Babka
  1 sibling, 0 replies; 485+ messages in thread
From: Vlastimil Babka @ 2019-09-24 19:55 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: Linus Torvalds, David Rientjes, Andrea Arcangeli, mm-commits, Linux-MM

On 9/24/19 9:48 AM, Michal Hocko wrote:
> On Mon 23-09-19 21:31:53, Andrew Morton wrote:
>> On Mon, 23 Sep 2019 17:55:24 -0700 Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> 
>>> On Mon, Sep 23, 2019 at 3:31 PM Andrew Morton
>>> <akpm@linux-foundation.org> wrote:
>>>> 
>>>> - almost all of -mm, as below.
>>> 
>>> I was hoping that we could at least test the THP locality thing?
>>> Is it in your queue at all, or am I supposed to just do it
>>> myself?
>>> 
>> 
>> Confused.  I saw a privately emailed patch from David which nobody 
>> seems to have tested yet.  I parked that for consideration after
>> -rc1. Or are you referring to something else?
>> 
>> This thing keeps stalling.  It would be nice to push this along and
>> get something nailed down which we can at least get into 5.4-rc,
>> perhaps with a backport-this tag?
> 
> The patch proposed by David is really non trivial wrt. potential
> side effects. I have provided my review feedback [1] and it didn't
> get any reaction. I really believe that we need to debug this
> properly. A reproducer would be useful for others to work on that.
> 
> There is a more fundamental problem here and we need to address it 
> rather than to duck tape it and whack a mole afterwards.

I believe we found a problem when investigating over-reclaim in this
thread [1] where it seems madvised THP allocation attempt can result in
4MB reclaimed, if there is a small zone such as ZONE_DMA on the node. As
it happens, the patch "[patch 090/134] mm, reclaim: make
should_continue_reclaim perform dryrun detection" in Andrew's pile
should change this 4MB to 32 pages reclaimed (as a side-effect), but
that has to be tested. I'm also working on a patch to not reclaim even
those few pages. Of course there might be more fundamental issues with
reclaim/compaction interaction, but this one seems to become hopefully
clear now.

[1]
https://lore.kernel.org/linux-mm/4b4ba042-3741-7b16-2292-198c569da2aa@profihost.ag/

> [1] http://lkml.kernel.org/r/20190909193020.GD2063@dhcp22.suse.cz
> 



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-09-24  7:48     ` incoming Michal Hocko
@ 2019-09-24 15:34       ` Linus Torvalds
  2019-09-25  6:36         ` incoming Michal Hocko
  2019-09-24 19:55       ` incoming Vlastimil Babka
  1 sibling, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2019-09-24 15:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, David Rientjes, Vlastimil Babka, Andrea Arcangeli,
	mm-commits, Linux-MM

On Tue, Sep 24, 2019 at 12:48 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> The patch proposed by David is really non trivial wrt. potential side
> effects.

The thing is, that's not an argument when we know that the current
state is garbage and has a lot of these non-trivial side effects that
are bad.

So the patch by David _fixes_ a non-trivial bad side effect.

You can't then say "there may be other non-trivial side effects that I
don't even know about" as an argument for saying it's bad. David at
least has numbers and an argument for his patch.

              Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-09-24  4:31   ` incoming Andrew Morton
@ 2019-09-24  7:48     ` Michal Hocko
  2019-09-24 15:34       ` incoming Linus Torvalds
  2019-09-24 19:55       ` incoming Vlastimil Babka
  0 siblings, 2 replies; 485+ messages in thread
From: Michal Hocko @ 2019-09-24  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, David Rientjes, Vlastimil Babka,
	Andrea Arcangeli, mm-commits, Linux-MM

On Mon 23-09-19 21:31:53, Andrew Morton wrote:
> On Mon, 23 Sep 2019 17:55:24 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > On Mon, Sep 23, 2019 at 3:31 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > >
> > > - almost all of -mm, as below.
> > 
> > I was hoping that we could at least test the THP locality thing? Is it
> > in your queue at all, or am I supposed to just do it myself?
> > 
> 
> Confused.  I saw a privately emailed patch from David which nobody
> seems to have tested yet.  I parked that for consideration after -rc1. 
> Or are you referring to something else?
> 
> This thing keeps stalling.  It would be nice to push this along and get
> something nailed down which we can at least get into 5.4-rc, perhaps
> with a backport-this tag?

The patch proposed by David is really non trivial wrt. potential side
effects. I have provided my review feedback [1] and it didn't get
any reaction. I really believe that we need to debug this properly. A
reproducer would be useful for others to work on that.

There is a more fundamental problem here and we need to address it
rather than to duck tape it and whack a mole afterwards.

[1] http://lkml.kernel.org/r/20190909193020.GD2063@dhcp22.suse.cz
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-09-24  0:55 ` incoming Linus Torvalds
@ 2019-09-24  4:31   ` Andrew Morton
  2019-09-24  7:48     ` incoming Michal Hocko
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2019-09-24  4:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Rientjes, Vlastimil Babka, Michal Hocko, Andrea Arcangeli,
	mm-commits, Linux-MM

On Mon, 23 Sep 2019 17:55:24 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, Sep 23, 2019 at 3:31 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > - almost all of -mm, as below.
> 
> I was hoping that we could at least test the THP locality thing? Is it
> in your queue at all, or am I supposed to just do it myself?
> 

Confused.  I saw a privately emailed patch from David which nobody
seems to have tested yet.  I parked that for consideration after -rc1. 
Or are you referring to something else?

This thing keeps stalling.  It would be nice to push this along and get
something nailed down which we can at least get into 5.4-rc, perhaps
with a backport-this tag?


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-09-23 22:31 incoming Andrew Morton
@ 2019-09-24  0:55 ` Linus Torvalds
  2019-09-24  4:31   ` incoming Andrew Morton
  0 siblings, 1 reply; 485+ messages in thread
From: Linus Torvalds @ 2019-09-24  0:55 UTC (permalink / raw)
  To: Andrew Morton, David Rientjes, Vlastimil Babka, Michal Hocko,
	Andrea Arcangeli
  Cc: mm-commits, Linux-MM

On Mon, Sep 23, 2019 at 3:31 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> - almost all of -mm, as below.

I was hoping that we could at least test the THP locality thing? Is it
in your queue at all, or am I supposed to just do it myself?

                    Linus


^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-09-23 22:31 Andrew Morton
  2019-09-24  0:55 ` incoming Linus Torvalds
  0 siblings, 1 reply; 485+ messages in thread
From: Andrew Morton @ 2019-09-23 22:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


- a few hot fixes

- ocfs2 updates

- almost all of -mm, as below.


134 patches, based on 619e17cf75dd58905aa67ccd494a6ba5f19d6cc6:


Subsystems affected by this patch series:

  hotfixes
  ocfs2
  slab-generic
  slab
  slub
  kmemleak
  kasan
  cleanups
  debug
  pagecache
  memcg
  gup
  pagemap
  memory-hotplug
  sparsemem
  vmalloc
  initialization
  z3fold
  compaction
  mempolicy
  oom-kill
  hugetlb
  migration
  thp
  mmap
  madvise
  shmem
  zswap
  zsmalloc

Subsystem: hotfixes

    OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>:
      fat: work around race with userspace's read via blockdev while mounting

    Vitaly Wool <vitalywool@gmail.com>:
      Revert "mm/z3fold.c: fix race between migration and destruction"

    Arnd Bergmann <arnd@arndb.de>:
      mm: add dummy can_do_mlock() helper

    Vitaly Wool <vitalywool@gmail.com>:
      z3fold: fix retry mechanism in page reclaim

    Greg Thelen <gthelen@google.com>:
      kbuild: clean compressed initramfs image

Subsystem: ocfs2

    Joseph Qi <joseph.qi@linux.alibaba.com>:
      ocfs2: use jbd2_inode dirty range scoping
      jbd2: remove jbd2_journal_inode_add_[write|wait]

    Greg Kroah-Hartman <gregkh@linuxfoundation.org>:
      ocfs2: further debugfs cleanups

    Guozhonghua <guozhonghua@h3c.com>:
      ocfs2: remove unused ocfs2_calc_tree_trunc_credits()
      ocfs2: remove unused ocfs2_orphan_scan_exit() declaration

    zhengbin <zhengbin13@huawei.com>:
      fs/ocfs2/namei.c: remove set but not used variables
      fs/ocfs2/file.c: remove set but not used variables
      fs/ocfs2/dir.c: remove set but not used variables

    Markus Elfring <elfring@users.sourceforge.net>:
      ocfs2: delete unnecessary checks before brelse()

    Changwei Ge <gechangwei@live.cn>:
      ocfs2: wait for recovering done after direct unlock request
      ocfs2: checkpoint appending truncate log transaction before flushing

    Colin Ian King <colin.king@canonical.com>:
      ocfs2: fix spelling mistake "ambigous" -> "ambiguous"

Subsystem: slab-generic

    Waiman Long <longman@redhat.com>:
      mm, slab: extend slab/shrink to shrink all memcg caches

Subsystem: slab

    Waiman Long <longman@redhat.com>:
      mm, slab: move memcg_cache_params structure to mm/slab.h

Subsystem: slub

    Qian Cai <cai@lca.pw>:
      mm/slub.c: fix -Wunused-function compiler warnings

Subsystem: kmemleak

    Nicolas Boichat <drinkcat@chromium.org>:
      kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16K

    Catalin Marinas <catalin.marinas@arm.com>:
    Patch series "mm: kmemleak: Use a memory pool for kmemleak object:
      mm: kmemleak: make the tool tolerant to struct scan_area allocation failures
      mm: kmemleak: simple memory allocation pool for kmemleak objects
      mm: kmemleak: use the memory pool for early allocations

    Qian Cai <cai@lca.pw>:
      mm/kmemleak.c: record the current memory pool size
      mm/kmemleak: increase the max mem pool to 1M

Subsystem: kasan

    Walter Wu <walter-zh.wu@mediatek.com>:
      kasan: add memory corruption identification for software tag-based mode

    Mark Rutland <mark.rutland@arm.com>:
      lib/test_kasan.c: add roundtrip tests

Subsystem: cleanups

    Christophe JAILLET <christophe.jaillet@wanadoo.fr>:
      mm/page_poison.c: fix a typo in a comment

    YueHaibing <yuehaibing@huawei.com>:
      mm/rmap.c: remove set but not used variable 'cstart'

    Matthew Wilcox (Oracle) <willy@infradead.org>:
    Patch series "Make working with compound pages easier", v2:
      mm: introduce page_size()

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: introduce page_shift()

    Matthew Wilcox (Oracle) <willy@infradead.org>:
      mm: introduce compound_nr()

    Yu Zhao <yuzhao@google.com>:
      mm: replace list_move_tail() with add_page_to_lru_list_tail()

Subsystem: debug

    Vlastimil Babka <vbabka@suse.cz>:
    Patch series "debug_pagealloc improvements through page_owner", v2:
      mm, page_owner: record page owner for each subpage
      mm, page_owner: keep owner info when freeing the page
      mm, page_owner, debug_pagealloc: save and dump freeing stack trace

Subsystem: pagecache

    Konstantin Khlebnikov <khlebnikov@yandex-team.ru>:
      mm/filemap.c: don't initiate writeback if mapping has no dirty pages
      mm/filemap.c: rewrite mapping_needs_writeback in less fancy manner

    "Matthew Wilcox (Oracle)" <willy@infradead.org>:
      mm: page cache: store only head pages in i_pages

Subsystem: memcg

    Chris Down <chris@chrisdown.name>:
      mm, memcg: throttle allocators when failing reclaim over memory.high

    Roman Gushchin <guro@fb.com>:
      mm: memcontrol: switch to rcu protection in drain_all_stock()

    Johannes Weiner <hannes@cmpxchg.org>:
      mm: vmscan: do not share cgroup iteration between reclaimers

Subsystem: gup

    [11~From: John Hubbard <jhubbard@nvidia.com>:
    Patch series "mm/gup: add make_dirty arg to put_user_pages_dirty_lock()",:
      mm/gup: add make_dirty arg to put_user_pages_dirty_lock()

    John Hubbard <jhubbard@nvidia.com>:
      drivers/gpu/drm/via: convert put_page() to put_user_page*()
      net/xdp: convert put_page() to put_user_page*()

Subsystem: pagemap

    Wei Yang <richardw.yang@linux.intel.com>:
      mm: remove redundant assignment of entry

    Minchan Kim <minchan@kernel.org>:
      mm: release the spinlock on zap_pte_range

    Nicholas Piggin <npiggin@gmail.com>:
    Patch series "mm: remove quicklist page table caches":
      mm: remove quicklist page table caches

    Mike Rapoport <rppt@linux.ibm.com>:
      ia64: switch to generic version of pte allocation
      sh: switch to generic version of pte allocation
      microblaze: switch to generic version of pte allocation
      mm: consolidate pgtable_cache_init() and pgd_cache_init()

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      mm: do not hash address in print_bad_pte()

Subsystem: memory-hotplug

    David Hildenbrand <david@redhat.com>:
      mm/memory_hotplug: remove move_pfn_range()
      drivers/base/node.c: simplify unregister_memory_block_under_nodes()
      drivers/base/memory.c: fixup documentation of removable/phys_index/block_size_bytes
      driver/base/memory.c: validate memory block size early
      drivers/base/memory.c: don't store end_section_nr in memory blocks

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/memory_hotplug.c: prevent memory leak when reusing pgdat

    David Hildenbrand <david@redhat.com>:
    Patch series "mm/memory_hotplug: online_pages() cleanups", v2:
      mm/memory_hotplug.c: use PFN_UP / PFN_DOWN in walk_system_ram_range()
      mm/memory_hotplug: drop PageReserved() check in online_pages_range()
      mm/memory_hotplug: simplify online_pages_range()
      mm/memory_hotplug: make sure the pfn is aligned to the order when onlining
      mm/memory_hotplug: online_pages cannot be 0 in online_pages()

    Alastair D'Silva <alastair@d-silva.org>:
    Patch series "Add bounds check for Hotplugged memory", v3:
      mm/memory_hotplug.c: add a bounds check to check_hotplug_memory_range()
      mm/memremap.c: add a bounds check in devm_memremap_pages()

    Souptick Joarder <jrdr.linux@gmail.com>:
      mm/memory_hotplug.c: s/is/if

Subsystem: sparsemem

    Lecopzer Chen <lecopzer.chen@mediatek.com>:
      mm/sparse.c: fix memory leak of sparsemap_buf in aligned memory
      mm/sparse.c: fix ALIGN() without power of 2 in sparse_buffer_alloc()

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/sparse.c: use __nr_to_section(section_nr) to get mem_section

    Alastair D'Silva <alastair@d-silva.org>:
      mm/sparse.c: don't manually decrement num_poisoned_pages

    "Alastair D'Silva" <alastair@d-silva.org>:
      mm/sparse.c: remove NULL check in clear_hwpoisoned_pages()

Subsystem: vmalloc

    "Uladzislau Rezki (Sony)" <urezki@gmail.com>:
      mm/vmalloc: do not keep unpurged areas in the busy tree

    Pengfei Li <lpf.vector@gmail.com>:
      mm/vmalloc: modify struct vmap_area to reduce its size

    Austin Kim <austindh.kim@gmail.com>:
      mm/vmalloc.c: move 'area->pages' after if statement

Subsystem: initialization

    Mike Rapoport <rppt@linux.ibm.com>:
      mm: use CPU_BITS_NONE to initialize init_mm.cpu_bitmask

    Qian Cai <cai@lca.pw>:
      mm: silence -Woverride-init/initializer-overrides

Subsystem: z3fold

    Vitaly Wool <vitalywool@gmail.com>:
      z3fold: fix memory leak in kmem cache

Subsystem: compaction

    Yafang Shao <laoar.shao@gmail.com>:
      mm/compaction.c: clear total_{migrate,free}_scanned before scanning a new zone

    Pengfei Li <lpf.vector@gmail.com>:
      mm/compaction.c: remove unnecessary zone parameter in isolate_migratepages()

Subsystem: mempolicy

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      mm/mempolicy.c: remove unnecessary nodemask check in kernel_migrate_pages()

Subsystem: oom-kill

    Joel Savitz <jsavitz@redhat.com>:
      mm/oom_kill.c: add task UID to info message on an oom kill

    Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
      memcg, oom: don't require __GFP_FS when invoking memcg OOM killer

    Edward Chron <echron@arista.com>:
      mm/oom: add oom_score_adj and pgtables to Killed process message

    Yi Wang <wang.yi59@zte.com.cn>:
      mm/oom_kill.c: fix oom_cpuset_eligible() comment

    Michal Hocko <mhocko@suse.com>:
      mm, oom: consider present pages for the node size

    Qian Cai <cai@lca.pw>:
      mm/memcontrol.c: fix a -Wunused-function warning

    Michal Hocko <mhocko@suse.com>:
      memcg, kmem: deprecate kmem.limit_in_bytes

Subsystem: hugetlb

    Hillf Danton <hdanton@sina.com>:
    Patch series "address hugetlb page allocation stalls", v2:
      mm, reclaim: make should_continue_reclaim perform dryrun detection

    Vlastimil Babka <vbabka@suse.cz>:
      mm, reclaim: cleanup should_continue_reclaim()
      mm, compaction: raise compaction priority after it withdrawns

    Mike Kravetz <mike.kravetz@oracle.com>:
      hugetlbfs: don't retry when pool page allocations start to fail

Subsystem: migration

    Pingfan Liu <kernelfans@gmail.com>:
      mm/migrate.c: clean up useless code in migrate_vma_collect_pmd()

Subsystem: thp

    Kefeng Wang <wangkefeng.wang@huawei.com>:
      thp: update split_huge_page_pmd() comment

    Song Liu <songliubraving@fb.com>:
    Patch series "Enable THP for text section of non-shmem files", v10;:
      filemap: check compound_head(page)->mapping in filemap_fault()
      filemap: check compound_head(page)->mapping in pagecache_get_page()
      filemap: update offset check in filemap_fault()
      mm,thp: stats for file backed THP
      khugepaged: rename collapse_shmem() and khugepaged_scan_shmem()
      mm,thp: add read-only THP support for (non-shmem) FS
      mm,thp: avoid writes to file with THP in pagecache

    Yang Shi <yang.shi@linux.alibaba.com>:
    Patch series "Make deferred split shrinker memcg aware", v6:
      mm: thp: extract split_queue_* into a struct
      mm: move mem_cgroup_uncharge out of __page_cache_release()
      mm: shrinker: make shrinker not depend on memcg kmem
      mm: thp: make deferred split shrinker memcg aware

    Song Liu <songliubraving@fb.com>:
    Patch series "THP aware uprobe", v13:
      mm: move memcmp_pages() and pages_identical()
      uprobe: use original page when all uprobes are removed
      mm, thp: introduce FOLL_SPLIT_PMD
      uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT
      khugepaged: enable collapse pmd for pte-mapped THP
      uprobe: collapse THP pmd after removing all uprobes

Subsystem: mmap

    Alexandre Ghiti <alex@ghiti.fr>:
    Patch series "Provide generic top-down mmap layout functions", v6:
      mm, fs: move randomize_stack_top from fs to mm
      arm64: make use of is_compat_task instead of hardcoding this test
      arm64: consider stack randomization for mmap base only when necessary
      arm64, mm: move generic mmap layout functions to mm
      arm64, mm: make randomization selected by generic topdown mmap layout
      arm: properly account for stack randomization and stack guard gap
      arm: use STACK_TOP when computing mmap base address
      arm: use generic mmap top-down layout and brk randomization
      mips: properly account for stack randomization and stack guard gap
      mips: use STACK_TOP when computing mmap base address
      mips: adjust brk randomization offset to fit generic version
      mips: replace arch specific way to determine 32bit task with generic version
      mips: use generic mmap top-down layout and brk randomization
      riscv: make mmap allocation top-down by default

    Wei Yang <richardw.yang@linux.intel.com>:
      mm/mmap.c: refine find_vma_prev() with rb_last()

    Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>:
      mm: mmap: increase sockets maximum memory size pgoff for 32bits

Subsystem: madvise

    Mike Rapoport <rppt@linux.ibm.com>:
      mm/madvise: reduce code duplication in error handling paths

Subsystem: shmem

    Miles Chen <miles.chen@mediatek.com>:
      shmem: fix obsolete comment in shmem_getpage_gfp()

Subsystem: zswap

    Hui Zhu <teawaterz@linux.alibaba.com>:
      zpool: add malloc_support_movable to zpool_driver
      zswap: use movable memory if zpool support allocate movable memory

    Vitaly Wool <vitalywool@gmail.com>:
      zswap: do not map same object twice

Subsystem: zsmalloc

    Qian Cai <cai@lca.pw>:
      mm/zsmalloc.c: fix a -Wunused-function warning

 Documentation/ABI/testing/sysfs-kernel-slab     |   13 
 Documentation/admin-guide/cgroup-v1/memory.rst  |    4 
 Documentation/admin-guide/kernel-parameters.txt |    2 
 arch/Kconfig                                    |   11 
 arch/alpha/include/asm/pgalloc.h                |    2 
 arch/alpha/include/asm/pgtable.h                |    5 
 arch/arc/include/asm/pgalloc.h                  |    1 
 arch/arc/include/asm/pgtable.h                  |    5 
 arch/arm/Kconfig                                |    1 
 arch/arm/include/asm/pgalloc.h                  |    2 
 arch/arm/include/asm/pgtable-nommu.h            |    5 
 arch/arm/include/asm/pgtable.h                  |    2 
 arch/arm/include/asm/processor.h                |    2 
 arch/arm/kernel/process.c                       |    5 
 arch/arm/mm/flush.c                             |    7 
 arch/arm/mm/mmap.c                              |   80 -----
 arch/arm64/Kconfig                              |    2 
 arch/arm64/include/asm/pgalloc.h                |    2 
 arch/arm64/include/asm/pgtable.h                |    2 
 arch/arm64/include/asm/processor.h              |    2 
 arch/arm64/kernel/process.c                     |    8 
 arch/arm64/mm/flush.c                           |    3 
 arch/arm64/mm/mmap.c                            |   84 -----
 arch/arm64/mm/pgd.c                             |    2 
 arch/c6x/include/asm/pgtable.h                  |    5 
 arch/csky/include/asm/pgalloc.h                 |    2 
 arch/csky/include/asm/pgtable.h                 |    5 
 arch/h8300/include/asm/pgtable.h                |    6 
 arch/hexagon/include/asm/pgalloc.h              |    2 
 arch/hexagon/include/asm/pgtable.h              |    3 
 arch/hexagon/mm/Makefile                        |    2 
 arch/hexagon/mm/pgalloc.c                       |   10 
 arch/ia64/Kconfig                               |    4 
 arch/ia64/include/asm/pgalloc.h                 |   64 ----
 arch/ia64/include/asm/pgtable.h                 |    5 
 arch/ia64/mm/init.c                             |    2 
 arch/m68k/include/asm/pgtable_mm.h              |    7 
 arch/m68k/include/asm/pgtable_no.h              |    7 
 arch/microblaze/include/asm/pgalloc.h           |  128 --------
 arch/microblaze/include/asm/pgtable.h           |    7 
 arch/microblaze/mm/pgtable.c                    |    4 
 arch/mips/Kconfig                               |    2 
 arch/mips/include/asm/pgalloc.h                 |    2 
 arch/mips/include/asm/pgtable.h                 |    5 
 arch/mips/include/asm/processor.h               |    5 
 arch/mips/mm/mmap.c                             |  124 +-------
 arch/nds32/include/asm/pgalloc.h                |    2 
 arch/nds32/include/asm/pgtable.h                |    2 
 arch/nios2/include/asm/pgalloc.h                |    2 
 arch/nios2/include/asm/pgtable.h                |    2 
 arch/openrisc/include/asm/pgalloc.h             |    2 
 arch/openrisc/include/asm/pgtable.h             |    5 
 arch/parisc/include/asm/pgalloc.h               |    2 
 arch/parisc/include/asm/pgtable.h               |    2 
 arch/powerpc/include/asm/pgalloc.h              |    2 
 arch/powerpc/include/asm/pgtable.h              |    1 
 arch/powerpc/mm/book3s64/hash_utils.c           |    2 
 arch/powerpc/mm/book3s64/iommu_api.c            |    7 
 arch/powerpc/mm/hugetlbpage.c                   |    2 
 arch/riscv/Kconfig                              |   12 
 arch/riscv/include/asm/pgalloc.h                |    4 
 arch/riscv/include/asm/pgtable.h                |    5 
 arch/s390/include/asm/pgtable.h                 |    6 
 arch/sh/include/asm/pgalloc.h                   |   56 ---
 arch/sh/include/asm/pgtable.h                   |    5 
 arch/sh/mm/Kconfig                              |    3 
 arch/sh/mm/nommu.c                              |    4 
 arch/sparc/include/asm/pgalloc_32.h             |    2 
 arch/sparc/include/asm/pgalloc_64.h             |    2 
 arch/sparc/include/asm/pgtable_32.h             |    5 
 arch/sparc/include/asm/pgtable_64.h             |    1 
 arch/sparc/mm/init_32.c                         |    1 
 arch/um/include/asm/pgalloc.h                   |    2 
 arch/um/include/asm/pgtable.h                   |    2 
 arch/unicore32/include/asm/pgalloc.h            |    2 
 arch/unicore32/include/asm/pgtable.h            |    2 
 arch/x86/include/asm/pgtable_32.h               |    2 
 arch/x86/include/asm/pgtable_64.h               |    3 
 arch/x86/mm/pgtable.c                           |    6 
 arch/xtensa/include/asm/pgtable.h               |    1 
 arch/xtensa/include/asm/tlbflush.h              |    3 
 drivers/base/memory.c                           |   44 +-
 drivers/base/node.c                             |   55 +--
 drivers/crypto/chelsio/chtls/chtls_io.c         |    5 
 drivers/gpu/drm/via/via_dmablit.c               |   10 
 drivers/infiniband/core/umem.c                  |    5 
 drivers/infiniband/hw/hfi1/user_pages.c         |    5 
 drivers/infiniband/hw/qib/qib_user_pages.c      |    5 
 drivers/infiniband/hw/usnic/usnic_uiom.c        |    5 
 drivers/infiniband/sw/siw/siw_mem.c             |   10 
 drivers/staging/android/ion/ion_system_heap.c   |    4 
 drivers/target/tcm_fc/tfc_io.c                  |    3 
 drivers/vfio/vfio_iommu_spapr_tce.c             |    8 
 fs/binfmt_elf.c                                 |   20 -
 fs/fat/dir.c                                    |   13 
 fs/fat/fatent.c                                 |    3 
 fs/inode.c                                      |    3 
 fs/io_uring.c                                   |    2 
 fs/jbd2/journal.c                               |    2 
 fs/jbd2/transaction.c                           |   12 
 fs/ocfs2/alloc.c                                |   20 +
 fs/ocfs2/aops.c                                 |   13 
 fs/ocfs2/blockcheck.c                           |   26 -
 fs/ocfs2/cluster/heartbeat.c                    |  109 +------
 fs/ocfs2/dir.c                                  |    3 
 fs/ocfs2/dlm/dlmcommon.h                        |    1 
 fs/ocfs2/dlm/dlmdebug.c                         |   55 ---
 fs/ocfs2/dlm/dlmdebug.h                         |   16 -
 fs/ocfs2/dlm/dlmdomain.c                        |    7 
 fs/ocfs2/dlm/dlmunlock.c                        |   23 +
 fs/ocfs2/dlmglue.c                              |   29 -
 fs/ocfs2/extent_map.c                           |    3 
 fs/ocfs2/file.c                                 |   13 
 fs/ocfs2/inode.c                                |    2 
 fs/ocfs2/journal.h                              |   42 --
 fs/ocfs2/namei.c                                |    2 
 fs/ocfs2/ocfs2.h                                |    3 
 fs/ocfs2/super.c                                |   10 
 fs/open.c                                       |    8 
 fs/proc/meminfo.c                               |    8 
 fs/proc/task_mmu.c                              |    6 
 include/asm-generic/pgalloc.h                   |    5 
 include/asm-generic/pgtable.h                   |    7 
 include/linux/compaction.h                      |   22 +
 include/linux/fs.h                              |   32 ++
 include/linux/huge_mm.h                         |    9 
 include/linux/hugetlb.h                         |    2 
 include/linux/jbd2.h                            |    2 
 include/linux/khugepaged.h                      |   12 
 include/linux/memcontrol.h                      |   23 -
 include/linux/memory.h                          |    7 
 include/linux/memory_hotplug.h                  |    1 
 include/linux/mm.h                              |   37 ++
 include/linux/mm_types.h                        |    1 
 include/linux/mmzone.h                          |   14 
 include/linux/page_ext.h                        |    1 
 include/linux/pagemap.h                         |   10 
 include/linux/quicklist.h                       |   94 ------
 include/linux/shrinker.h                        |    7 
 include/linux/slab.h                            |   62 ----
 include/linux/vmalloc.h                         |   20 -
 include/linux/zpool.h                           |    3 
 init/main.c                                     |    6 
 kernel/events/uprobes.c                         |   81 ++++-
 kernel/resource.c                               |    4 
 kernel/sched/idle.c                             |    1 
 kernel/sysctl.c                                 |    6 
 lib/Kconfig.debug                               |   15 
 lib/Kconfig.kasan                               |    8 
 lib/iov_iter.c                                  |    2 
 lib/show_mem.c                                  |    5 
 lib/test_kasan.c                                |   41 ++
 mm/Kconfig                                      |   16 -
 mm/Kconfig.debug                                |    4 
 mm/Makefile                                     |    4 
 mm/compaction.c                                 |   50 +--
 mm/filemap.c                                    |  168 ++++------
 mm/gup.c                                        |  125 +++-----
 mm/huge_memory.c                                |  129 ++++++--
 mm/hugetlb.c                                    |   89 +++++
 mm/hugetlb_cgroup.c                             |    2 
 mm/init-mm.c                                    |    2 
 mm/kasan/common.c                               |   32 +-
 mm/kasan/kasan.h                                |   14 
 mm/kasan/report.c                               |   44 ++
 mm/kasan/tags_report.c                          |   24 +
 mm/khugepaged.c                                 |  372 ++++++++++++++++++++----
 mm/kmemleak.c                                   |  338 +++++----------------
 mm/ksm.c                                        |   18 -
 mm/madvise.c                                    |   52 +--
 mm/memcontrol.c                                 |  188 ++++++++++--
 mm/memfd.c                                      |    2 
 mm/memory.c                                     |   21 +
 mm/memory_hotplug.c                             |  120 ++++---
 mm/mempolicy.c                                  |    4 
 mm/memremap.c                                   |    5 
 mm/migrate.c                                    |   13 
 mm/mmap.c                                       |   12 
 mm/mmu_gather.c                                 |    2 
 mm/nommu.c                                      |    2 
 mm/oom_kill.c                                   |   30 +
 mm/page_alloc.c                                 |   27 +
 mm/page_owner.c                                 |  127 +++++---
 mm/page_poison.c                                |    2 
 mm/page_vma_mapped.c                            |    3 
 mm/quicklist.c                                  |  103 ------
 mm/rmap.c                                       |   25 -
 mm/shmem.c                                      |   12 
 mm/slab.h                                       |   64 ++++
 mm/slab_common.c                                |   37 ++
 mm/slob.c                                       |    2 
 mm/slub.c                                       |   22 -
 mm/sparse.c                                     |   25 +
 mm/swap.c                                       |   16 -
 mm/swap_state.c                                 |    6 
 mm/util.c                                       |  126 +++++++-
 mm/vmalloc.c                                    |   84 +++--
 mm/vmscan.c                                     |  163 ++++------
 mm/vmstat.c                                     |    2 
 mm/z3fold.c                                     |  154 ++-------
 mm/zpool.c                                      |   16 +
 mm/zsmalloc.c                                   |   23 -
 mm/zswap.c                                      |   15 
 net/xdp/xdp_umem.c                              |    9 
 net/xdp/xsk.c                                   |    2 
 usr/Makefile                                    |    3 
 206 files changed, 2385 insertions(+), 2533 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-08-30 23:04 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-08-30 23:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

7 fixes, based on 846d2db3e00048da3f650e0cfb0b8d67669cec3e:


    Roman Gushchin <guro@fb.com>:
      mm: memcontrol: flush percpu slab vmstats on kmem offlining

    Andrew Morton <akpm@linux-foundation.org>:
      mm/zsmalloc.c: fix build when CONFIG_COMPACTION=n

    Roman Gushchin <guro@fb.com>:
      mm, memcg: partially revert "mm/memcontrol.c: keep local VM counters in sync with the hierarchical ones"

    "Gustavo A. R. Silva" <gustavo@embeddedor.com>:
      mm/z3fold.c: fix lock/unlock imbalance in z3fold_page_isolate

    Dmitry Safonov <dima@arista.com>:
      mailmap: add aliases for Dmitry Safonov

    Michal Hocko <mhocko@suse.com>:
      mm, memcg: do not set reclaim_state on soft limit reclaim

    Shakeel Butt <shakeelb@google.com>:
      mm: memcontrol: fix percpu vmstats and vmevents flush

 .mailmap               |    3 ++
 include/linux/mmzone.h |    5 ++--
 mm/memcontrol.c        |   53 ++++++++++++++++++++++++++++++++-----------------
 mm/vmscan.c            |    5 ++--
 mm/z3fold.c            |    1 
 mm/zsmalloc.c          |    2 +
 6 files changed, 47 insertions(+), 22 deletions(-)



^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2019-08-25  0:54 Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2019-08-25  0:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: mm-commits, linux-mm

11 fixes, based on 361469211f876e67d7ca3d3d29e6d1c3e313d0f1:

    Henry Burns <henryburns@google.com>:
      mm/z3fold.c: fix race between migration and destruction

    David Rientjes <rientjes@google.com>:
      mm, page_alloc: move_freepages should not examine struct page of reserved memory

    Qian Cai <cai@lca.pw>:
      parisc: fix compilation errrors

    Roman Gushchin <guro@fb.com>:
      mm: memcontrol: flush percpu vmstats before releasing memcg
      mm: memcontrol: flush percpu vmevents before releasing memcg

    Jason Xing <kerneljasonxing@linux.alibaba.com>:
      psi: get poll_work to run when calling poll syscall next time

    Oleg Nesterov <oleg@redhat.com>:
      userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx

    Vlastimil Babka <vbabka@suse.cz>:
      mm, page_owner: handle THP splits correctly

    Henry Burns <henryburns@google.com>:
      mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
      mm/zsmalloc.c: fix race condition in zs_destroy_pool

    Andrey Ryabinin <aryabinin@virtuozzo.com>:
      mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-07-17 16:13   ` incoming Linus Torvalds
  2019-07-17 17:09     ` incoming Christian Brauner
@ 2019-07-17 18:13     ` Vlastimil Babka
  1 sibling, 0 replies; 485+ messages in thread
From: Vlastimil Babka @ 2019-07-17 18:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux List Kernel Mailing, linux-mm, Jonathan Corbet, Thorsten Leemhuis

On 7/17/19 6:13 PM, Linus Torvalds wrote:
> On Wed, Jul 17, 2019 at 1:47 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> So I've tried now to provide an example what I had in mind, below.
> 
> I'll take it as a trial. I added one-line notes about coda and the
> PTRACE_GET_SYSCALL_INFO interface too.

Thanks.

> I do hope that eventually I'll just get pull requests,

Very much agree, that was also discussed at length in the LSF/MM mm
process session I've linked.

> and they'll
> have more of a "theme" than this all (*)

I'll check if the first patch bomb would be more amenable to that, as I
plan to fill in the mm part for 5.3 on LinuxChanges wiki, but for a
merge commit it's too late.

>            Linus
> 
> (*) Although in many ways, the theme for Andrew is "falls through the
> cracks otherwise" so I'm not really complaining. This has been working
> for years and years.

Nevermind the misc stuff that much, but I think mm itself is more
important and deserves what other subsystems have.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-07-17 16:13   ` incoming Linus Torvalds
@ 2019-07-17 17:09     ` Christian Brauner
  2019-07-17 18:13     ` incoming Vlastimil Babka
  1 sibling, 0 replies; 485+ messages in thread
From: Christian Brauner @ 2019-07-17 17:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vlastimil Babka, Linux List Kernel Mailing, linux-mm,
	Jonathan Corbet, Thorsten Leemhuis

On Wed, Jul 17, 2019 at 09:13:26AM -0700, Linus Torvalds wrote:
> On Wed, Jul 17, 2019 at 1:47 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> >
> > So I've tried now to provide an example what I had in mind, below.
> 
> I'll take it as a trial. I added one-line notes about coda and the
> PTRACE_GET_SYSCALL_INFO interface too.
> 
> I do hope that eventually I'll just get pull requests, and they'll
> have more of a "theme" than this all (*)
> 
>            Linus
> 
> (*) Although in many ways, the theme for Andrew is "falls through the
> cracks otherwise" so I'm not really complaining. This has been working

I put all pid{fd}/clone{3} which is mostly related to pid.c, exit.c,
fork.c into my tree and try to give it a consistent theme for the prs I
sent. And that at least from my perspective that worked and was pretty
easy to coordinate with Andrew. That should hopefully make it a little
easier to theme the -mm tree overall going forward.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-07-17  8:47 ` incoming Vlastimil Babka
  2019-07-17  8:57   ` incoming Bhaskar Chowdhury
@ 2019-07-17 16:13   ` Linus Torvalds
  2019-07-17 17:09     ` incoming Christian Brauner
  2019-07-17 18:13     ` incoming Vlastimil Babka
  1 sibling, 2 replies; 485+ messages in thread
From: Linus Torvalds @ 2019-07-17 16:13 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Linux List Kernel Mailing, linux-mm, Jonathan Corbet, Thorsten Leemhuis

On Wed, Jul 17, 2019 at 1:47 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> So I've tried now to provide an example what I had in mind, below.

I'll take it as a trial. I added one-line notes about coda and the
PTRACE_GET_SYSCALL_INFO interface too.

I do hope that eventually I'll just get pull requests, and they'll
have more of a "theme" than this all (*)

           Linus

(*) Although in many ways, the theme for Andrew is "falls through the
cracks otherwise" so I'm not really complaining. This has been working
for years and years.


^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2019-07-17  8:47 ` incoming Vlastimil Babka
@ 2019-07-17  8:57   ` Bhaskar Chowdhury
  2019-07-17 16:13   ` incoming Linus Torvalds
  1 sibling, 0 replies; 485+ messages in thread
From: Bhaskar Chowdhury @ 2019-07-17  8:57 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-kernel, Linus Torvalds, linux-mm, Jonathan Corbet,
	Thorsten Leemhuis

[-- Attachment #1: Type: text/plain, Size: 2496 bytes --]



Cool !! 

On 10:47 Wed 17 Jul , Vlastimil Babka wrote:
>On 7/17/19 1:25 AM, Andrew Morton wrote:
>>
>> Most of the rest of MM and just about all of the rest of everything
>> else.
>
>Hi,
>
>as I've mentioned at LSF/MM [1], I think it would be nice if mm pull
>requests had summaries similar to other subsystems. I see they are now
>more structured (thanks!), but they are now probably hitting the limit
>of what scripting can do to produce a high-level summary for human
>readers (unless patch authors themselves provide a blurb that can be
>extracted later?).
>
>So I've tried now to provide an example what I had in mind, below. Maybe
>it's too concise - if there were "larger" features in this pull request,
>they would probably benefit from more details. I'm CCing the known (to
>me) consumers of these mails to judge :) Note I've only covered mm, and
>core stuff that I think will be interesting to wide audience (change in
>LIST_POISON2 value? I'm sure as hell glad to know about that one :)
>
>Feel free to include this in the merge commit, if you find it useful.
>
>Thanks,
>Vlastimil
>
>[1] https://lwn.net/Articles/787705/
>
>-----
>
>- z3fold fixes and enhancements by Henry Burns and Vitaly Wool
>- more accurate reclaimed slab caches calculations by Yafang Shao
>- fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by
>Christoph Hellwig
>- !CONFIG_MMU fixes by Christoph Hellwig
>- new novmcoredd parameter to omit device dumps from vmcore, by Kairui Song
>- new test_meminit module for testing heap and pagealloc initialization,
>by Alexander Potapenko
>- ioremap improvements for huge mappings, by Anshuman Khandual
>- generalize kprobe page fault handling, by Anshuman Khandual
>- device-dax hotplug fixes and improvements, by Pavel Tatashin
>- enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V
>- add pte_devmap() support for arm64, by Robin Murphy
>- unify locked_vm accounting with a helper, by Daniel Jordan
>- several misc fixes
>
>core/lib
>- new typeof_member() macro including some users, by Alexey Dobriyan
>- make BIT() and GENMASK() available in asm, by Masahiro Yamada
>- changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better code
>generation, by Alexey Dobriyan
>- rbtree code size optimizations, by Michel Lespinasse
>- convert struct pid count to refcount_t, by Joel Fernandes
>
>get_maintainer.pl
>- add --no-moderated switch to skip moderated ML's, by Joe Perches
>
>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
       [not found] <20190716162536.bb52b8f34a8ecf5331a86a42@linux-foundation.org>
@ 2019-07-17  8:47 ` Vlastimil Babka
  2019-07-17  8:57   ` incoming Bhaskar Chowdhury
  2019-07-17 16:13   ` incoming Linus Torvalds
  0 siblings, 2 replies; 485+ messages in thread
From: Vlastimil Babka @ 2019-07-17  8:47 UTC (permalink / raw)
  To: linux-kernel, Linus Torvalds
  Cc: linux-mm, Jonathan Corbet, Thorsten Leemhuis, LKML

On 7/17/19 1:25 AM, Andrew Morton wrote:
> 
> Most of the rest of MM and just about all of the rest of everything
> else.

Hi,

as I've mentioned at LSF/MM [1], I think it would be nice if mm pull
requests had summaries similar to other subsystems. I see they are now
more structured (thanks!), but they are now probably hitting the limit
of what scripting can do to produce a high-level summary for human
readers (unless patch authors themselves provide a blurb that can be
extracted later?).

So I've tried now to provide an example what I had in mind, below. Maybe
it's too concise - if there were "larger" features in this pull request,
they would probably benefit from more details. I'm CCing the known (to
me) consumers of these mails to judge :) Note I've only covered mm, and
core stuff that I think will be interesting to wide audience (change in
LIST_POISON2 value? I'm sure as hell glad to know about that one :)

Feel free to include this in the merge commit, if you find it useful.

Thanks,
Vlastimil

[1] https://lwn.net/Articles/787705/

-----

- z3fold fixes and enhancements by Henry Burns and Vitaly Wool
- more accurate reclaimed slab caches calculations by Yafang Shao
- fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by
Christoph Hellwig
- !CONFIG_MMU fixes by Christoph Hellwig
- new novmcoredd parameter to omit device dumps from vmcore, by Kairui Song
- new test_meminit module for testing heap and pagealloc initialization,
by Alexander Potapenko
- ioremap improvements for huge mappings, by Anshuman Khandual
- generalize kprobe page fault handling, by Anshuman Khandual
- device-dax hotplug fixes and improvements, by Pavel Tatashin
- enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V
- add pte_devmap() support for arm64, by Robin Murphy
- unify locked_vm accounting with a helper, by Daniel Jordan
- several misc fixes

core/lib
- new typeof_member() macro including some users, by Alexey Dobriyan
- make BIT() and GENMASK() available in asm, by Masahiro Yamada
- changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better code
generation, by Alexey Dobriyan
- rbtree code size optimizations, by Michel Lespinasse
- convert struct pid count to refcount_t, by Joel Fernandes

get_maintainer.pl
- add --no-moderated switch to skip moderated ML's, by Joe Perches



^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2007-05-04 19:24       ` incoming Greg KH
@ 2007-05-04 19:29         ` Roland McGrath
  0 siblings, 0 replies; 485+ messages in thread
From: Roland McGrath @ 2007-05-04 19:29 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, Linus Torvalds, Hugh Dickins, Christoph Lameter,
	David S. Miller, Andi Kleen, Luck, Tony, Rik van Riel,
	Benjamin Herrenschmidt, linux-kernel, linux-mm, Stephen Smalley

> ABI changes are not a problem for -stable, so don't let that stop anyone
> :)

In fact this is the harmless sort (changes only the error code of a
failure case) that might actually go in if there were any important
reason.  But the smiley stands.


Thanks,
Roland

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2007-05-04 18:57     ` incoming Roland McGrath
@ 2007-05-04 19:24       ` Greg KH
  2007-05-04 19:29         ` incoming Roland McGrath
  0 siblings, 1 reply; 485+ messages in thread
From: Greg KH @ 2007-05-04 19:24 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Andrew Morton, Linus Torvalds, Hugh Dickins, Christoph Lameter,
	David S. Miller, Andi Kleen, Luck, Tony, Rik van Riel,
	Benjamin Herrenschmidt, linux-kernel, linux-mm, Stephen Smalley

On Fri, May 04, 2007 at 11:57:21AM -0700, Roland McGrath wrote:
> > Ah.  The patch affects security code, but it doesn't actually address any
> > insecurity.  I didn't think it was needed for -stable?
> 
> I would not recommend it for -stable.  
> It is an ABI change for the case of a security refusal.

ABI changes are not a problem for -stable, so don't let that stop anyone
:)

thanks,

greg k-h

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2007-05-04 16:14   ` incoming Andrew Morton
  2007-05-04 17:02     ` incoming Greg KH
@ 2007-05-04 18:57     ` Roland McGrath
  2007-05-04 19:24       ` incoming Greg KH
  1 sibling, 1 reply; 485+ messages in thread
From: Roland McGrath @ 2007-05-04 18:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Greg KH, Linus Torvalds, Hugh Dickins, Christoph Lameter,
	David S. Miller, Andi Kleen, Luck, Tony, Rik van Riel,
	Benjamin Herrenschmidt, linux-kernel, linux-mm, Stephen Smalley

> Ah.  The patch affects security code, but it doesn't actually address any
> insecurity.  I didn't think it was needed for -stable?

I would not recommend it for -stable.  
It is an ABI change for the case of a security refusal.


Thanks,
Roland

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2007-05-04 16:14   ` incoming Andrew Morton
@ 2007-05-04 17:02     ` Greg KH
  2007-05-04 18:57     ` incoming Roland McGrath
  1 sibling, 0 replies; 485+ messages in thread
From: Greg KH @ 2007-05-04 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Hugh Dickins, Christoph Lameter, David S. Miller,
	Andi Kleen, Luck, Tony, Rik van Riel, Benjamin Herrenschmidt,
	linux-kernel, linux-mm, Roland McGrath, Stephen Smalley

On Fri, May 04, 2007 at 09:14:34AM -0700, Andrew Morton wrote:
> On Fri, 4 May 2007 06:37:28 -0700 Greg KH <greg@kroah.com> wrote:
> 
> > On Wed, May 02, 2007 at 03:02:52PM -0700, Andrew Morton wrote:
> > > - One little security patch
> > 
> > Care to cc: linux-stable with it so we can do a new 2.6.21 release with
> > it if needed?
> > 
> 
> Ah.  The patch affects security code, but it doesn't actually address any
> insecurity.  I didn't think it was needed for -stable?

Ah, ok, I read "security" as fixing a insecure problem, my mistake :)

thanks,

greg k-h

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2007-05-04 13:37 ` incoming Greg KH
@ 2007-05-04 16:14   ` Andrew Morton
  2007-05-04 17:02     ` incoming Greg KH
  2007-05-04 18:57     ` incoming Roland McGrath
  0 siblings, 2 replies; 485+ messages in thread
From: Andrew Morton @ 2007-05-04 16:14 UTC (permalink / raw)
  To: Greg KH
  Cc: Linus Torvalds, Hugh Dickins, Christoph Lameter, David S. Miller,
	Andi Kleen, Luck, Tony, Rik van Riel, Benjamin Herrenschmidt,
	linux-kernel, linux-mm, Roland McGrath, Stephen Smalley

On Fri, 4 May 2007 06:37:28 -0700 Greg KH <greg@kroah.com> wrote:

> On Wed, May 02, 2007 at 03:02:52PM -0700, Andrew Morton wrote:
> > - One little security patch
> 
> Care to cc: linux-stable with it so we can do a new 2.6.21 release with
> it if needed?
> 

Ah.  The patch affects security code, but it doesn't actually address any
insecurity.  I didn't think it was needed for -stable?



From: Roland McGrath <roland@redhat.com>

wait* syscalls return -ECHILD even when an individual PID of a live child
was requested explicitly, when security_task_wait denies the operation. 
This means that something like a broken SELinux policy can produce an
unexpected failure that looks just like a bug with wait or ptrace or
something.

This patch makes do_wait return -EACCES (or other appropriate error returned
from security_task_wait() instead of -ECHILD if some children were ruled out
solely because security_task_wait failed.

[jmorris@namei.org: switch error code to EACCES]
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/exit.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff -puN kernel/exit.c~return-eperm-not-echild-on-security_task_wait-failure kernel/exit.c
--- a/kernel/exit.c~return-eperm-not-echild-on-security_task_wait-failure
+++ a/kernel/exit.c
@@ -1033,6 +1033,8 @@ asmlinkage void sys_exit_group(int error
 
 static int eligible_child(pid_t pid, int options, struct task_struct *p)
 {
+	int err;
+
 	if (pid > 0) {
 		if (p->pid != pid)
 			return 0;
@@ -1066,8 +1068,9 @@ static int eligible_child(pid_t pid, int
 	if (delay_group_leader(p))
 		return 2;
 
-	if (security_task_wait(p))
-		return 0;
+	err = security_task_wait(p);
+	if (err)
+		return err;
 
 	return 1;
 }
@@ -1449,6 +1452,7 @@ static long do_wait(pid_t pid, int optio
 	DECLARE_WAITQUEUE(wait, current);
 	struct task_struct *tsk;
 	int flag, retval;
+	int allowed, denied;
 
 	add_wait_queue(&current->signal->wait_chldexit,&wait);
 repeat:
@@ -1457,6 +1461,7 @@ repeat:
 	 * match our criteria, even if we are not able to reap it yet.
 	 */
 	flag = 0;
+	allowed = denied = 0;
 	current->state = TASK_INTERRUPTIBLE;
 	read_lock(&tasklist_lock);
 	tsk = current;
@@ -1472,6 +1477,12 @@ repeat:
 			if (!ret)
 				continue;
 
+			if (unlikely(ret < 0)) {
+				denied = ret;
+				continue;
+			}
+			allowed = 1;
+
 			switch (p->state) {
 			case TASK_TRACED:
 				/*
@@ -1570,6 +1581,8 @@ check_continued:
 		goto repeat;
 	}
 	retval = -ECHILD;
+	if (unlikely(denied) && !allowed)
+		retval = denied;
 end:
 	current->state = TASK_RUNNING;
 	remove_wait_queue(&current->signal->wait_chldexit,&wait);
_


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2007-05-02 22:02 incoming Andrew Morton
  2007-05-02 22:31 ` incoming Benjamin Herrenschmidt
  2007-05-03  7:55 ` incoming Russell King
@ 2007-05-04 13:37 ` Greg KH
  2007-05-04 16:14   ` incoming Andrew Morton
  2 siblings, 1 reply; 485+ messages in thread
From: Greg KH @ 2007-05-04 13:37 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Hugh Dickins, Christoph Lameter, David S. Miller,
	Andi Kleen, Luck, Tony, Rik van Riel, Benjamin Herrenschmidt,
	linux-kernel, linux-mm

On Wed, May 02, 2007 at 03:02:52PM -0700, Andrew Morton wrote:
> - One little security patch

Care to cc: linux-stable with it so we can do a new 2.6.21 release with
it if needed?

thanks,

greg k-h

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2007-05-03  7:55 ` incoming Russell King
@ 2007-05-03  8:05   ` Andrew Morton
  0 siblings, 0 replies; 485+ messages in thread
From: Andrew Morton @ 2007-05-03  8:05 UTC (permalink / raw)
  To: Russell King
  Cc: Linus Torvalds, Hugh Dickins, Christoph Lameter, David S. Miller,
	Andi Kleen, Luck, Tony, Rik van Riel, Benjamin Herrenschmidt,
	linux-kernel, linux-mm

On Thu, 3 May 2007 08:55:43 +0100 Russell King <rmk+lkml@arm.linux.org.uk> wrote:

> On Wed, May 02, 2007 at 03:02:52PM -0700, Andrew Morton wrote:
> > So this is what I have lined up for the first mm->2.6.22 batch.  I won't be
> > sending it off for another 12-24 hours yet.  To give people time for final
> > comment and to give me time to see if it actually works.
> 
> I assume you're going to update this list with my comments I sent
> yesterday?
> 

Serial drivers?  Well you saw me drop a bunch of them.  I now have:

serial-driver-pmc-msp71xx.patch
rm9000-serial-driver.patch
serial-define-fixed_port-flag-for-serial_core.patch
mpsc-serial-driver-tx-locking.patch
serial-serial_core-use-pr_debug.patch

I'll also be holding off on MADV_FREE - Nick has some performance things to
share and I'm assuming they're not as good as he'd like.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2007-05-02 22:02 incoming Andrew Morton
  2007-05-02 22:31 ` incoming Benjamin Herrenschmidt
@ 2007-05-03  7:55 ` Russell King
  2007-05-03  8:05   ` incoming Andrew Morton
  2007-05-04 13:37 ` incoming Greg KH
  2 siblings, 1 reply; 485+ messages in thread
From: Russell King @ 2007-05-03  7:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Hugh Dickins, Christoph Lameter, David S. Miller,
	Andi Kleen, Luck, Tony, Rik van Riel, Benjamin Herrenschmidt,
	linux-kernel, linux-mm

On Wed, May 02, 2007 at 03:02:52PM -0700, Andrew Morton wrote:
> So this is what I have lined up for the first mm->2.6.22 batch.  I won't be
> sending it off for another 12-24 hours yet.  To give people time for final
> comment and to give me time to see if it actually works.

I assume you're going to update this list with my comments I sent
yesterday?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

* Re: incoming
  2007-05-02 22:02 incoming Andrew Morton
@ 2007-05-02 22:31 ` Benjamin Herrenschmidt
  2007-05-03  7:55 ` incoming Russell King
  2007-05-04 13:37 ` incoming Greg KH
  2 siblings, 0 replies; 485+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-02 22:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Hugh Dickins, Christoph Lameter, David S. Miller,
	Andi Kleen, Luck, Tony, Rik van Riel, linux-kernel, linux-mm

On Wed, 2007-05-02 at 15:02 -0700, Andrew Morton wrote:
> So this is what I have lined up for the first mm->2.6.22 batch.  I won't be
> sending it off for another 12-24 hours yet.  To give people time for final
> comment and to give me time to see if it actually works.

Thanks.

I have some powerpc bits that depend on that stuff that will go through
Paulus after these show up in git and I've rebased.

Cheers,
Ben.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

* incoming
@ 2007-05-02 22:02 Andrew Morton
  2007-05-02 22:31 ` incoming Benjamin Herrenschmidt
                   ` (2 more replies)
  0 siblings, 3 replies; 485+ messages in thread
From: Andrew Morton @ 2007-05-02 22:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hugh Dickins, Christoph Lameter, David S. Miller, Andi Kleen,
	Luck, Tony, Rik van Riel, Benjamin Herrenschmidt, linux-kernel,
	linux-mm

So this is what I have lined up for the first mm->2.6.22 batch.  I won't be
sending it off for another 12-24 hours yet.  To give people time for final
comment and to give me time to see if it actually works.



- A few serial bits.

- A few pcmcia bits.

- Some of the MM queue.  Includes:

  - An enhancement to /proc/pid/smaps to permit monitoring of a running
    program's working set.

    There's another patchset which builds on this quite a lot from Matt
    Mackall, but it's not quite ready yet.

  - The SLUB allocator.  It's pretty green but I do want to push ahead
    with this pretty aggressively with a view to replacing slab altogether.

    If it ends up not working out then we should remove slub altogether
    again, but I doubt if that will occur.

    If SLUB isn't in good shape by 2.6.22 we should hide it in Kconfig
    to prevent people from hitting known problems.  It'll remain
    EXPERIMENTAL.

  - generic pagetable quicklist management.  We have x86_64 and ia64
    and sparc64 implementations, but I'll only include David's sparc64
    implementation here.  I'll send the x86_64 and ia64 implementations
    through maintainers.

  - Various random MM bits

  - Benh's teach-get_unmapped_area-about-MAP_FIXED changes

  - madvise(MADV_FREE)



  This means I'm holding back Mel's page allocator work, and Andy's
  lumpy-reclaim.

  A shame in a way - I have high hopes for lumpy reclaim against the
  moveable zone, but these things are not to be done lightly.

  A few MM things have been held back awaiting subsystem tree merges
  (probably x86 - I didn't check).


- One little security patch

- the blackfin architecture

- small h8300 update

- small alpha update

- swsusp updates

- m68k bits

- cris udpates

- Lots of UML updates

- v850, xtensa



slab-introduce-krealloc.patch
at91_cf-minor-fix.patch
add-new_id-to-pcmcia-drivers.patch
ide-cs-recognize-2gb-compactflash-from-transcend.patch
serial-driver-pmc-msp71xx.patch
rm9000-serial-driver.patch
serial-define-fixed_port-flag-for-serial_core.patch
serial-use-resource_size_t-for-serial-port-io-addresses.patch
mpsc-serial-driver-tx-locking.patch
8250_pci-fix-pci-must_checks.patch
serial-serial_core-use-pr_debug.patch
add-apply_to_page_range-which-applies-a-function-to-a-pte-range.patch
safer-nr_node_ids-and-nr_node_ids-determination-and-initial.patch
use-zvc-counters-to-establish-exact-size-of-dirtyable-pages.patch
proper-prototype-for-hugetlb_get_unmapped_area.patch
mm-remove-gcc-workaround.patch
slab-ensure-cache_alloc_refill-terminates.patch
mm-make-read_cache_page-synchronous.patch
fs-buffer-dont-pageuptodate-without-page-locked.patch
allow-oom_adj-of-saintly-processes.patch
introduce-config_has_dma.patch
mm-slabc-proper-prototypes.patch
add-pfn_valid_within-helper-for-sub-max_order-hole-detection.patch
mm-simplify-filemap_nopage.patch
add-unitialized_var-macro-for-suppressing-gcc-warnings.patch
i386-add-ptep_test_and_clear_dirtyyoung.patch
i386-use-pte_update_defer-in-ptep_test_and_clear_dirtyyoung.patch
smaps-extract-pmd-walker-from-smaps-code.patch
smaps-add-pages-referenced-count-to-smaps.patch
smaps-add-clear_refs-file-to-clear-reference.patch
readahead-improve-heuristic-detecting-sequential-reads.patch
readahead-code-cleanup.patch
slab-use-num_possible_cpus-in-enable_cpucache.patch
slab-dont-allocate-empty-shared-caches.patch
slab-numa-kmem_cache-diet.patch
do-not-disable-interrupts-when-reading-min_free_kbytes.patch
slab-mark-set_up_list3s-__init.patch
cpusets-allow-tif_memdie-threads-to-allocate-anywhere.patch
i386-use-page-allocator-to-allocate-thread_info-structure.patch
slub-core.patch
make-page-private-usable-in-compound-pages-v1.patch
optimize-compound_head-by-avoiding-a-shared-page.patch
add-virt_to_head_page-and-consolidate-code-in-slab-and-slub.patch
slub-fix-object-tracking.patch
slub-enable-tracking-of-full-slabs.patch
slub-validation-of-slabs-metadata-and-guard-zones.patch
slub-add-min_partial.patch
slub-add-ability-to-list-alloc--free-callers-per-slab.patch
slub-free-slabs-and-sort-partial-slab-lists-in-kmem_cache_shrink.patch
slub-remove-object-activities-out-of-checking-functions.patch
slub-user-documentation.patch
slub-add-slabinfo-tool.patch
quicklists-for-page-table-pages.patch
quicklist-support-for-sparc64.patch
slob-handle-slab_panic-flag.patch
include-kern_-constant-in-printk-calls-in-mm-slabc.patch
mm-madvise-avoid-exclusive-mmap_sem.patch
mm-remove-destroy_dirty_buffers-from-invalidate_bdev.patch
mm-optimize-kill_bdev.patch
mm-optimize-acorn-partition-truncate.patch
slab-allocators-remove-obsolete-slab_must_hwcache_align.patch
kmem_cache-simplify-slab-cache-creation.patch
slab-allocators-remove-multiple-alignment-specifications.patch
fault-injection-fix-failslab-with-config_numa.patch
mm-fix-handling-of-panic_on_oom-when-cpusets-are-in-use.patch
oom-fix-constraint-deadlock.patch
get_unmapped_area-handles-map_fixed-on-powerpc.patch
get_unmapped_area-handles-map_fixed-on-alpha.patch
get_unmapped_area-handles-map_fixed-on-arm.patch
get_unmapped_area-handles-map_fixed-on-frv.patch
get_unmapped_area-handles-map_fixed-on-i386.patch
get_unmapped_area-handles-map_fixed-on-ia64.patch
get_unmapped_area-handles-map_fixed-on-parisc.patch
get_unmapped_area-handles-map_fixed-on-sparc64.patch
get_unmapped_area-handles-map_fixed-on-x86_64.patch
get_unmapped_area-handles-map_fixed-in-hugetlbfs.patch
get_unmapped_area-handles-map_fixed-in-generic-code.patch
get_unmapped_area-doesnt-need-hugetlbfs-hacks-anymore.patch
slab-allocators-remove-slab_debug_initial-flag.patch
slab-allocators-remove-slab_ctor_atomic.patch
slab-allocators-remove-useless-__gfp_no_grow-flag.patch
lazy-freeing-of-memory-through-madv_free.patch
restore-madv_dontneed-to-its-original-linux-behaviour.patch
hugetlbfs-add-null-check-in-hugetlb_zero_setup.patch
slob-fix-page-order-calculation-on-not-4kb-page.patch
page-migration-only-migrate-pages-if-allocation-in-the-highest-zone-is-possible.patch
return-eperm-not-echild-on-security_task_wait-failure.patch
blackfin-arch.patch
driver_bfin_serial_core.patch
blackfin-on-chip-ethernet-mac-controller-driver.patch
blackfin-patch-add-blackfin-support-in-smc91x.patch
blackfin-on-chip-rtc-controller-driver.patch
blackfin-blackfin-on-chip-spi-controller-driver.patch
convert-h8-300-to-generic-timekeeping.patch
h8300-generic-irq.patch
h8300-add-zimage-support.patch
round_up-macro-cleanup-in-arch-alpha-kernel-osf_sysc.patch
alpha-fix-bootp-image-creation.patch
alpha-prctl-macros.patch
srmcons-fix-kmallocgfp_kernel-inside-spinlock.patch
arm26-remove-useless-config-option-generic_bust_spinlock.patch
fix-refrigerator-vs-thaw_process-race.patch
swsusp-use-inline-functions-for-changing-page-flags.patch
swsusp-do-not-use-page-flags.patch
mm-remove-unused-page-flags.patch
swsusp-fix-error-paths-in-snapshot_open.patch
swsusp-use-gfp_kernel-for-creating-basic-data-structures.patch
freezer-remove-pf_nofreeze-from-handle_initrd.patch
swsusp-use-rbtree-for-tracking-allocated-swap.patch
freezer-fix-racy-usage-of-try_to_freeze-in-kswapd.patch
remove-software_suspend.patch
power-management-change-sys-power-disk-display.patch
kconfig-mentioneds-hibernation-not-just-swsusp.patch
swsusp-fix-snapshot_release.patch
swsusp-free-more-memory.patch
remove-unused-header-file-arch-m68k-atari-atasoundh.patch
spin_lock_unlocked-cleanup-in-arch-m68k.patch
remove-unused-header-file-drivers-serial-crisv10h.patch
cris-check-for-memory-allocation.patch
cris-remove-code-related-to-pre-22-kernel.patch
uml-delete-unused-code.patch
uml-formatting-fixes.patch
uml-host_info-tidying.patch
uml-mark-tt-mode-code-for-future-removal.patch
uml-print-coredump-limits.patch
uml-handle-block-device-hotplug-errors.patch
uml-driver-formatting-fixes.patch
uml-driver-formatting-fixes-fix.patch
uml-network-interface-hotplug-error-handling.patch
array_size-check-for-type.patch
uml-move-sigio-testing-to-sigioc.patch
uml-create-archh.patch
uml-create-as-layouth.patch
uml-move-remaining-useful-contents-of-user_utilh.patch
uml-remove-user_utilh.patch
uml-add-missing-__init-declarations.patch
remove-unused-header-file-arch-um-kernel-tt-include-mode_kern-tth.patch
uml-improve-checking-and-diagnostics-of-ethernet-macs.patch
uml-eliminate-temporary-buffer-in-eth_configure.patch
uml-replace-one-element-array-with-zero-element-array.patch
uml-fix-umid-in-xterm-titles.patch
uml-speed-up-exec.patch
uml-no-locking-needed-in-tlsc.patch
uml-tidy-processc.patch
uml-remove-page_size.patch
uml-kernel_thread-shouldnt-panic.patch
uml-tidy-fault-code.patch
uml-kernel-segfaults-should-dump-proper-registers.patch
uml-comment-early-boot-locking.patch
uml-irq-locking-commentary.patch
uml-delete-host_frame_size.patch
uml-drivers-get-release-methods.patch
uml-dump-registers-on-ptrace-or-wait-failure.patch
uml-speed-up-page-table-walking.patch
uml-remove-unused-x86_64-code.patch
uml-start-fixing-os_read_file-and-os_write_file.patch
uml-tidy-libc-code.patch
uml-convert-libc-layer-to-call-read-and-write.patch
uml-batch-i-o-requests.patch
uml-send-pointers-instead-of-structures-to-i-o-thread.patch
uml-send-pointers-instead-of-structures-to-i-o-thread-fix.patch
uml-dump-core-on-panic.patch
uml-dont-try-to-handle-signals-on-initial-process-stack.patch
uml-change-remaining-callers-of-os_read_write_file.patch
uml-formatting-fixes-around-os_read_write_file-callers.patch
uml-remove-debugging-remnants.patch
uml-rename-os_read_write_file_k-back-to-os_read_write_file.patch
uml-aio-deadlock-avoidance.patch
uml-speed-page-fault-path.patch
uml-eliminate-a-piece-of-debugging-code.patch
uml-more-page-fault-path-trimming.patch
uml-only-flush-areas-covered-by-vma.patch
uml-out-of-tmpfs-space-error-clarification.patch
uml-virtualized-time-fix.patch
uml-fix-prototypes.patch
v850-generic-timekeeping-conversion.patch
xtensa-strlcpy-is-smart-enough.patch

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 485+ messages in thread

end of thread, other threads:[~2022-04-27 19:41 UTC | newest]

Thread overview: 485+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-05 20:34 incoming Andrew Morton
2021-11-05 20:34 ` [patch 001/262] scripts/spelling.txt: add more spellings to spelling.txt Andrew Morton
2021-11-05 20:34 ` [patch 002/262] scripts/spelling.txt: fix "mistake" version of "synchronization" Andrew Morton
2021-11-05 20:34 ` [patch 003/262] scripts/decodecode: fix faulting instruction no print when opps.file is DOS format Andrew Morton
2021-11-05 20:34 ` [patch 004/262] ocfs2: fix handle refcount leak in two exception handling paths Andrew Morton
2021-11-05 20:34 ` [patch 005/262] ocfs2: cleanup journal init and shutdown Andrew Morton
2021-11-05 20:34 ` [patch 006/262] ocfs2/dlm: remove redundant assignment of variable ret Andrew Morton
2021-11-05 20:34 ` [patch 007/262] ocfs2: fix data corruption on truncate Andrew Morton
2021-11-05 20:34 ` [patch 008/262] ocfs2: do not zero pages beyond i_size Andrew Morton
2021-11-05 20:35 ` [patch 009/262] fs/posix_acl.c: avoid -Wempty-body warning Andrew Morton
2021-11-05 20:35 ` [patch 010/262] d_path: fix Kernel doc validator complaining Andrew Morton
2021-11-05 20:35 ` [patch 011/262] mm: move kvmalloc-related functions to slab.h Andrew Morton
2021-11-05 20:35 ` [patch 012/262] mm/slab.c: remove useless lines in enable_cpucache() Andrew Morton
2021-11-05 20:35 ` [patch 013/262] slub: add back check for free nonslab objects Andrew Morton
2021-11-05 20:35 ` [patch 014/262] mm, slub: change percpu partial accounting from objects to pages Andrew Morton
2021-11-05 20:35 ` [patch 015/262] mm/slub: increase default cpu partial list sizes Andrew Morton
2021-11-05 20:35 ` [patch 016/262] mm, slub: use prefetchw instead of prefetch Andrew Morton
2021-11-05 20:35 ` [patch 017/262] mm: disable NUMA_BALANCING_DEFAULT_ENABLED and TRANSPARENT_HUGEPAGE on PREEMPT_RT Andrew Morton
2021-11-05 20:35 ` [patch 018/262] mm: don't include <linux/dax.h> in <linux/mempolicy.h> Andrew Morton
2021-11-05 20:35 ` [patch 019/262] lib/stackdepot: include gfp.h Andrew Morton
2021-11-05 20:35 ` [patch 020/262] lib/stackdepot: remove unused function argument Andrew Morton
2021-11-05 20:35 ` [patch 021/262] lib/stackdepot: introduce __stack_depot_save() Andrew Morton
2021-11-05 20:35 ` [patch 022/262] kasan: common: provide can_alloc in kasan_save_stack() Andrew Morton
2021-11-05 20:35 ` [patch 023/262] kasan: generic: introduce kasan_record_aux_stack_noalloc() Andrew Morton
2021-11-05 20:35 ` [patch 024/262] workqueue, kasan: avoid alloc_pages() when recording stack Andrew Morton
2021-11-05 20:35 ` [patch 025/262] kasan: fix tag for large allocations when using CONFIG_SLAB Andrew Morton
2021-11-05 20:35 ` [patch 026/262] kasan: test: add memcpy test that avoids out-of-bounds write Andrew Morton
2021-11-05 20:35 ` [patch 027/262] mm/smaps: fix shmem pte hole swap calculation Andrew Morton
2021-11-05 20:36 ` [patch 028/262] mm/smaps: use vma->vm_pgoff directly when counting partial swap Andrew Morton
2021-11-05 20:36 ` [patch 029/262] mm/smaps: simplify shmem handling of pte holes Andrew Morton
2021-11-05 20:36 ` [patch 030/262] mm: debug_vm_pgtable: don't use __P000 directly Andrew Morton
2021-11-05 20:36 ` [patch 031/262] kasan: test: bypass __alloc_size checks Andrew Morton
2021-11-05 20:36 ` [patch 032/262] rapidio: avoid bogus __alloc_size warning Andrew Morton
2021-11-05 20:36 ` [patch 033/262] Compiler Attributes: add __alloc_size() for better bounds checking Andrew Morton
2021-11-05 20:36 ` [patch 034/262] slab: clean up function prototypes Andrew Morton
2021-11-05 20:36 ` [patch 035/262] slab: add __alloc_size attributes for better bounds checking Andrew Morton
2021-11-05 20:36 ` [patch 036/262] mm/kvmalloc: " Andrew Morton
2021-11-05 20:36 ` [patch 037/262] mm/vmalloc: " Andrew Morton
2021-11-05 20:36 ` [patch 038/262] mm/page_alloc: " Andrew Morton
2021-11-05 20:36 ` [patch 039/262] percpu: " Andrew Morton
2021-11-05 20:36 ` [patch 040/262] mm/page_ext.c: fix a comment Andrew Morton
2021-11-05 20:36 ` [patch 041/262] mm: stop filemap_read() from grabbing a superfluous page Andrew Morton
2021-11-05 20:36 ` [patch 042/262] mm: export bdi_unregister Andrew Morton
2021-11-05 20:36 ` [patch 043/262] mtd: call bdi_unregister explicitly Andrew Morton
2021-11-05 20:36 ` [patch 044/262] fs: explicitly unregister per-superblock BDIs Andrew Morton
2021-11-05 20:37 ` [patch 045/262] mm: don't automatically unregister bdis Andrew Morton
2021-11-05 20:37 ` [patch 046/262] mm: simplify bdi refcounting Andrew Morton
2021-11-05 20:37 ` [patch 047/262] mm: don't read i_size of inode unless we need it Andrew Morton
2021-11-05 20:37 ` [patch 048/262] mm/filemap.c: remove bogus VM_BUG_ON Andrew Morton
2021-11-05 20:37 ` [patch 049/262] mm: move more expensive part of XA setup out of mapping check Andrew Morton
2021-11-05 20:37 ` [patch 050/262] mm/gup: further simplify __gup_device_huge() Andrew Morton
2021-11-05 20:37 ` [patch 051/262] mm/swapfile: remove needless request_queue NULL pointer check Andrew Morton
2021-11-05 20:37 ` [patch 052/262] mm/swapfile: fix an integer overflow in swap_show() Andrew Morton
2021-11-05 20:37 ` [patch 053/262] mm: optimise put_pages_list() Andrew Morton
2021-11-05 20:37 ` [patch 054/262] mm/memcg: drop swp_entry_t* in mc_handle_file_pte() Andrew Morton
2021-11-05 20:37 ` [patch 055/262] memcg: flush stats only if updated Andrew Morton
2021-11-05 20:37 ` [patch 056/262] memcg: unify memcg stat flushing Andrew Morton
2021-11-05 20:37 ` [patch 057/262] mm/memcg: remove obsolete memcg_free_kmem() Andrew Morton
2021-11-05 20:37 ` [patch 058/262] mm/list_lru.c: prefer struct_size over open coded arithmetic Andrew Morton
2021-11-05 20:37 ` [patch 059/262] memcg, kmem: further deprecate kmem.limit_in_bytes Andrew Morton
2021-11-05 20:37 ` [patch 060/262] mm: list_lru: remove holding lru lock Andrew Morton
2021-11-05 20:37 ` [patch 061/262] mm: list_lru: fix the return value of list_lru_count_one() Andrew Morton
2021-11-05 20:37 ` [patch 062/262] mm: memcontrol: remove kmemcg_id reparenting Andrew Morton
2021-11-05 20:37 ` [patch 063/262] mm: memcontrol: remove the kmem states Andrew Morton
2021-11-05 20:37 ` [patch 064/262] mm: list_lru: only add memcg-aware lrus to the global lru list Andrew Morton
2021-11-05 20:38 ` [patch 065/262] mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks Andrew Morton
2021-11-05 20:38 ` [patch 066/262] mm, oom: do not trigger out_of_memory from the #PF Andrew Morton
2021-11-05 20:38 ` [patch 067/262] memcg: prohibit unconditional exceeding the limit of dying tasks Andrew Morton
2021-11-05 20:38 ` [patch 068/262] mm/mmap.c: fix a data race of mm->total_vm Andrew Morton
2021-11-05 20:38 ` [patch 069/262] mm: use __pfn_to_section() instead of open coding it Andrew Morton
2021-11-05 20:38 ` [patch 070/262] mm/memory.c: avoid unnecessary kernel/user pointer conversion Andrew Morton
2021-11-05 20:38 ` [patch 071/262] mm/memory.c: use correct VMA flags when freeing page-tables Andrew Morton
2021-11-05 20:57   ` Nadav Amit
2021-11-06 18:54     ` Linus Torvalds
2021-11-05 20:38 ` [patch 072/262] mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte Andrew Morton
2021-11-05 20:38 ` [patch 073/262] mm: clear vmf->pte after pte_unmap_same() returns Andrew Morton
2021-11-05 20:38 ` [patch 074/262] mm: drop first_index/last_index in zap_details Andrew Morton
2021-11-05 20:38 ` [patch 075/262] mm: add zap_skip_check_mapping() helper Andrew Morton
2021-11-05 20:38 ` [patch 076/262] mm: introduce pmd_install() helper Andrew Morton
2021-11-05 20:38 ` [patch 077/262] mm: remove redundant smp_wmb() Andrew Morton
2021-11-05 20:38 ` [patch 078/262] Documentation: update pagemap with shmem exceptions Andrew Morton
2021-11-05 20:38 ` [patch 079/262] lazy tlb: introduce lazy mm refcount helper functions Andrew Morton
2021-11-05 20:38 ` [patch 080/262] lazy tlb: allow lazy tlb mm refcounting to be configurable Andrew Morton
2021-11-06  4:29   ` Andy Lutomirski
2021-11-06 19:10     ` Linus Torvalds
2021-11-05 20:38 ` [patch 081/262] lazy tlb: shoot lazies, a non-refcounting lazy tlb option Andrew Morton
2021-11-05 20:38 ` [patch 082/262] powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN Andrew Morton
2021-11-05 20:39 ` [patch 083/262] memory: remove unused CONFIG_MEM_BLOCK_SIZE Andrew Morton
2021-11-05 20:39 ` [patch 084/262] mm/mprotect.c: avoid repeated assignment in do_mprotect_pkey() Andrew Morton
2021-11-05 20:39 ` [patch 085/262] mm/mremap: don't account pages in vma_to_resize() Andrew Morton
2021-11-05 20:39 ` [patch 086/262] include/linux/io-mapping.h: remove fallback for writecombine Andrew Morton
2021-11-05 20:39 ` [patch 087/262] mm: mmap_lock: remove redundant newline in TP_printk Andrew Morton
2021-11-05 20:39 ` [patch 088/262] mm: mmap_lock: use DECLARE_EVENT_CLASS and DEFINE_EVENT_FN Andrew Morton
2021-11-05 20:39 ` [patch 089/262] mm/vmalloc: repair warn_alloc()s in __vmalloc_area_node() Andrew Morton
2021-11-05 20:39 ` [patch 090/262] mm/vmalloc: don't allow VM_NO_GUARD on vmap() Andrew Morton
2021-11-05 20:39 ` [patch 091/262] mm/vmalloc: make show_numa_info() aware of hugepage mappings Andrew Morton
2021-11-05 20:39 ` [patch 092/262] mm/vmalloc: make sure to dump unpurged areas in /proc/vmallocinfo Andrew Morton
2021-11-05 20:39 ` [patch 093/262] mm/vmalloc: do not adjust the search size for alignment overhead Andrew Morton
2021-11-05 20:39 ` [patch 094/262] mm/vmalloc: check various alignments when debugging Andrew Morton
2021-11-05 20:39 ` [patch 095/262] vmalloc: back off when the current task is OOM-killed Andrew Morton
2021-11-05 20:39 ` [patch 096/262] vmalloc: choose a better start address in vm_area_register_early() Andrew Morton
2021-11-05 20:39 ` [patch 097/262] arm64: support page mapping percpu first chunk allocator Andrew Morton
2021-11-05 20:39 ` [patch 098/262] kasan: arm64: fix pcpu_page_first_chunk crash with KASAN_VMALLOC Andrew Morton
2021-11-05 20:39 ` [patch 099/262] mm/vmalloc: be more explicit about supported gfp flags Andrew Morton
2021-11-08  9:25   ` Michal Hocko
2021-11-08 17:15     ` Linus Torvalds
2021-11-08 17:30       ` Michal Hocko
2021-11-05 20:39 ` [patch 100/262] mm/vmalloc: introduce alloc_pages_bulk_array_mempolicy to accelerate memory allocation Andrew Morton
2021-11-05 20:39 ` [patch 101/262] lib/test_vmalloc.c: use swap() to make code cleaner Andrew Morton
2021-11-05 20:39 ` [patch 102/262] mm/large system hash: avoid possible NULL deref in alloc_large_system_hash Andrew Morton
2021-11-05 20:40 ` [patch 103/262] mm/page_alloc.c: remove meaningless VM_BUG_ON() in pindex_to_order() Andrew Morton
2021-11-05 20:40 ` [patch 104/262] mm/page_alloc.c: simplify the code by using macro K() Andrew Morton
2021-11-05 20:40 ` [patch 105/262] mm/page_alloc.c: fix obsolete comment in free_pcppages_bulk() Andrew Morton
2021-11-05 20:40 ` [patch 106/262] mm/page_alloc.c: use helper function zone_spans_pfn() Andrew Morton
2021-11-05 20:40 ` [patch 107/262] mm/page_alloc.c: avoid allocating highmem pages via alloc_pages_exact[_nid] Andrew Morton
2021-11-05 20:40 ` [patch 108/262] mm/page_alloc: print node fallback order Andrew Morton
2021-11-05 20:40 ` [patch 109/262] mm/page_alloc: use accumulated load when building node fallback list Andrew Morton
2021-11-05 20:40 ` [patch 110/262] mm: move node_reclaim_distance to fix NUMA without SMP Andrew Morton
2021-11-05 20:40 ` [patch 111/262] mm: move fold_vm_numa_events() " Andrew Morton
2021-11-05 20:40 ` [patch 112/262] mm/page_alloc.c: do not acquire zone lock in is_free_buddy_page() Andrew Morton
2021-11-05 20:40 ` [patch 113/262] mm/page_alloc: detect allocation forbidden by cpuset and bail out early Andrew Morton
2021-11-05 20:40 ` [patch 114/262] mm/page_alloc.c: show watermark_boost of zone in zoneinfo Andrew Morton
2021-11-05 20:40 ` [patch 115/262] mm: create a new system state and fix core_kernel_text() Andrew Morton
2021-11-05 20:40 ` [patch 116/262] mm: make generic arch_is_kernel_initmem_freed() do what it says Andrew Morton
2021-11-05 20:40 ` [patch 117/262] powerpc: use generic version of arch_is_kernel_initmem_freed() Andrew Morton
2021-11-05 20:40 ` [patch 118/262] s390: " Andrew Morton
2021-11-05 20:40 ` [patch 119/262] mm: page_alloc: use migrate_disable() in drain_local_pages_wq() Andrew Morton
2021-11-05 20:40 ` [patch 120/262] mm/page_alloc: use clamp() to simplify code Andrew Morton
2021-11-05 20:40 ` [patch 121/262] mm: fix data race in PagePoisoned() Andrew Morton
2021-11-05 20:41 ` [patch 122/262] mm/memory_failure: constify static mm_walk_ops Andrew Morton
2021-11-05 20:41 ` [patch 123/262] mm: filemap: coding style cleanup for filemap_map_pmd() Andrew Morton
2021-11-05 20:41 ` [patch 124/262] mm: hwpoison: refactor refcount check handling Andrew Morton
2021-11-05 20:41 ` [patch 125/262] mm: shmem: don't truncate page if memory failure happens Andrew Morton
2021-11-05 20:41 ` [patch 126/262] mm: hwpoison: handle non-anonymous THP correctly Andrew Morton
2021-11-05 20:41 ` [patch 127/262] mm/hugetlb: drop __unmap_hugepage_range definition from hugetlb.h Andrew Morton
2021-11-05 20:41 ` [patch 128/262] hugetlb: add demote hugetlb page sysfs interfaces Andrew Morton
2021-11-05 20:41 ` [patch 129/262] mm/cma: add cma_pages_valid to determine if pages are in CMA Andrew Morton
2021-11-05 20:41 ` [patch 130/262] hugetlb: be sure to free demoted CMA pages to CMA Andrew Morton
2021-11-05 20:41 ` [patch 131/262] hugetlb: add demote bool to gigantic page routines Andrew Morton
2021-11-05 20:41 ` [patch 132/262] hugetlb: add hugetlb demote page support Andrew Morton
2021-11-05 20:41 ` [patch 133/262] mm: khugepaged: recalculate min_free_kbytes after stopping khugepaged Andrew Morton
2021-11-05 20:41 ` [patch 134/262] mm, hugepages: add mremap() support for hugepage backed vma Andrew Morton
2021-11-05 20:41 ` [patch 135/262] mm, hugepages: add hugetlb vma mremap() test Andrew Morton
2021-11-05 20:41 ` [patch 136/262] hugetlb: support node specified when using cma for gigantic hugepages Andrew Morton
2021-11-05 20:41 ` [patch 137/262] mm: remove duplicate include in hugepage-mremap.c Andrew Morton
2021-11-05 20:41 ` [patch 138/262] hugetlb_cgroup: remove unused hugetlb_cgroup_from_counter macro Andrew Morton
2021-11-05 20:41 ` [patch 139/262] hugetlb: replace the obsolete hugetlb_instantiation_mutex in the comments Andrew Morton
2021-11-05 20:41 ` [patch 140/262] hugetlb: remove redundant validation in has_same_uncharge_info() Andrew Morton
2021-11-05 20:42 ` [patch 141/262] hugetlb: remove redundant VM_BUG_ON() in add_reservation_in_range() Andrew Morton
2021-11-05 20:42 ` [patch 142/262] hugetlb: remove unnecessary set_page_count in prep_compound_gigantic_page Andrew Morton
2021-11-05 20:42 ` [patch 143/262] userfaultfd/selftests: don't rely on GNU extensions for random numbers Andrew Morton
2021-11-05 20:42 ` [patch 144/262] userfaultfd/selftests: fix feature support detection Andrew Morton
2021-11-05 20:42 ` [patch 145/262] userfaultfd/selftests: fix calculation of expected ioctls Andrew Morton
2021-11-05 20:42 ` [patch 146/262] mm/page_isolation: fix potential missing call to unset_migratetype_isolate() Andrew Morton
2021-11-05 20:42 ` [patch 147/262] mm/page_isolation: guard against possible putback unisolated page Andrew Morton
2021-11-05 20:42 ` [patch 148/262] mm/vmscan.c: fix -Wunused-but-set-variable warning Andrew Morton
2021-11-05 20:42 ` [patch 149/262] mm/vmscan: throttle reclaim until some writeback completes if congested Andrew Morton
2021-11-05 21:02   ` Matthew Wilcox
2021-11-06 20:49     ` Linus Torvalds
2021-11-06 21:12       ` Linus Torvalds
2021-11-06 21:13         ` Vlastimil Babka
2021-11-06 21:20           ` Andrew Morton
2021-11-06 21:20           ` Linus Torvalds
2021-11-06 22:45         ` Matthew Wilcox
2021-11-06 23:26           ` Linus Torvalds
2021-11-05 20:42 ` [patch 150/262] mm/vmscan: throttle reclaim and compaction when too may pages are isolated Andrew Morton
2021-11-05 20:42 ` [patch 151/262] mm/vmscan: throttle reclaim when no progress is being made Andrew Morton
2021-11-05 20:42 ` [patch 152/262] mm/writeback: throttle based on page writeback instead of congestion Andrew Morton
2021-11-05 20:42 ` [patch 153/262] mm/page_alloc: remove the throttling logic from the page allocator Andrew Morton
2021-11-05 20:42 ` [patch 154/262] mm/vmscan: centralise timeout values for reclaim_throttle Andrew Morton
2021-11-05 20:42 ` [patch 155/262] mm/vmscan: increase the timeout if page reclaim is not making progress Andrew Morton
2021-11-05 20:42 ` [patch 156/262] mm/vmscan: delay waking of tasks throttled on NOPROGRESS Andrew Morton
2021-11-05 20:42 ` [patch 157/262] mm/vmpressure: fix data-race with memcg->socket_pressure Andrew Morton
2021-11-05 20:42 ` [patch 158/262] tools/vm/page_owner_sort.c: count and sort by mem Andrew Morton
2021-11-05 20:42 ` [patch 159/262] tools/vm/page-types.c: make walk_file() aware of address range option Andrew Morton
2021-11-05 20:43 ` [patch 160/262] tools/vm/page-types.c: move show_file() to summary output Andrew Morton
2021-11-05 20:43 ` [patch 161/262] tools/vm/page-types.c: print file offset in hexadecimal Andrew Morton
2021-11-05 20:43 ` [patch 162/262] arch_numa: simplify numa_distance allocation Andrew Morton
2021-11-05 20:43 ` [patch 163/262] xen/x86: free_p2m_page: use memblock_free_ptr() to free a virtual pointer Andrew Morton
2021-11-05 20:43 ` [patch 164/262] memblock: drop memblock_free_early_nid() and memblock_free_early() Andrew Morton
2021-11-05 20:43 ` [patch 165/262] memblock: stop aliasing __memblock_free_late with memblock_free_late Andrew Morton
2021-11-05 20:43 ` [patch 166/262] memblock: rename memblock_free to memblock_phys_free Andrew Morton
2021-11-05 20:43 ` [patch 167/262] memblock: use memblock_free for freeing virtual pointers Andrew Morton
2021-11-05 20:43 ` [patch 168/262] mm: mark the OOM reaper thread as freezable Andrew Morton
2021-11-05 20:43 ` [patch 169/262] hugetlbfs: extend the definition of hugepages parameter to support node allocation Andrew Morton
2021-11-05 20:43 ` [patch 170/262] mm/migrate: de-duplicate migrate_reason strings Andrew Morton
2021-11-05 20:43 ` [patch 171/262] mm: migrate: make demotion knob depend on migration Andrew Morton
2021-11-05 20:43 ` [patch 172/262] selftests/vm/transhuge-stress: fix ram size thinko Andrew Morton
2021-11-05 20:43 ` [patch 173/262] mm, thp: lock filemap when truncating page cache Andrew Morton
2021-11-05 20:43 ` [patch 174/262] mm, thp: fix incorrect unmap behavior for private pages Andrew Morton
2021-11-05 20:43 ` [patch 175/262] mm/readahead.c: fix incorrect comments for get_init_ra_size Andrew Morton
2021-11-05 20:43 ` [patch 176/262] mm: nommu: kill arch_get_unmapped_area() Andrew Morton
2021-11-05 20:43 ` [patch 177/262] selftest/vm: fix ksm selftest to run with different NUMA topologies Andrew Morton
2021-11-05 20:43 ` [patch 178/262] selftests: vm: add KSM huge pages merging time test Andrew Morton
2021-11-05 20:43 ` [patch 179/262] mm/vmstat: annotate data race for zone->free_area[order].nr_free Andrew Morton
2021-11-05 20:44 ` [patch 180/262] mm: vmstat.c: make extfrag_index show more pretty Andrew Morton
2021-11-05 20:44 ` [patch 181/262] selftests/vm: make MADV_POPULATE_(READ|WRITE) use in-tree headers Andrew Morton
2021-11-05 20:44 ` [patch 182/262] mm/memory_hotplug: add static qualifier for online_policy_to_str() Andrew Morton
2021-11-05 20:44 ` [patch 183/262] memory-hotplug.rst: fix two instances of "movablecore" that should be "movable_node" Andrew Morton
2021-11-05 20:44 ` [patch 184/262] memory-hotplug.rst: fix wrong /sys/module/memory_hotplug/parameters/ path Andrew Morton
2021-11-05 20:44 ` [patch 185/262] memory-hotplug.rst: document the "auto-movable" online policy Andrew Morton
2021-11-05 20:44 ` [patch 186/262] mm/memory_hotplug: remove CONFIG_X86_64_ACPI_NUMA dependency from CONFIG_MEMORY_HOTPLUG Andrew Morton
2021-11-05 20:44 ` [patch 187/262] mm/memory_hotplug: remove CONFIG_MEMORY_HOTPLUG_SPARSE Andrew Morton
2021-11-05 20:44 ` [patch 188/262] mm/memory_hotplug: restrict CONFIG_MEMORY_HOTPLUG to 64 bit Andrew Morton
2021-11-05 20:44 ` [patch 189/262] mm/memory_hotplug: remove HIGHMEM leftovers Andrew Morton
2021-11-05 20:44 ` [patch 190/262] mm/memory_hotplug: remove stale function declarations Andrew Morton
2021-11-05 20:44 ` [patch 191/262] x86: remove memory hotplug support on X86_32 Andrew Morton
2021-11-05 20:44 ` [patch 192/262] mm/memory_hotplug: handle memblock_add_node() failures in add_memory_resource() Andrew Morton
2021-11-05 20:44 ` [patch 193/262] memblock: improve MEMBLOCK_HOTPLUG documentation Andrew Morton
2021-11-05 20:44 ` [patch 194/262] memblock: allow to specify flags with memblock_add_node() Andrew Morton
2021-11-05 20:44 ` [patch 195/262] memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED Andrew Morton
2021-11-05 20:44 ` [patch 196/262] mm/memory_hotplug: indicate MEMBLOCK_DRIVER_MANAGED with IORESOURCE_SYSRAM_DRIVER_MANAGED Andrew Morton
2021-11-05 20:45 ` [patch 197/262] mm/rmap.c: avoid double faults migrating device private pages Andrew Morton
2021-11-05 20:45 ` [patch 198/262] mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration() Andrew Morton
2021-11-05 20:45 ` [patch 199/262] mm/highmem: remove deprecated kmap_atomic Andrew Morton
2021-11-05 20:45 ` [patch 200/262] zram_drv: allow reclaim on bio_alloc Andrew Morton
2021-11-05 20:45 ` [patch 201/262] zram: off by one in read_block_state() Andrew Morton
2021-11-05 20:45 ` [patch 202/262] zram: introduce an aged idle interface Andrew Morton
2021-11-05 20:45 ` [patch 203/262] mm: remove HARDENED_USERCOPY_FALLBACK Andrew Morton
2021-11-05 20:45 ` [patch 204/262] include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h Andrew Morton
2021-11-05 20:45 ` [patch 205/262] stacktrace: move filter_irq_stacks() to kernel/stacktrace.c Andrew Morton
2021-11-05 20:45 ` [patch 206/262] kfence: count unexpectedly skipped allocations Andrew Morton
2021-11-05 20:45 ` [patch 207/262] kfence: move saving stack trace of allocations into __kfence_alloc() Andrew Morton
2021-11-05 20:45 ` [patch 208/262] kfence: limit currently covered allocations when pool nearly full Andrew Morton
2021-11-05 20:45 ` [patch 209/262] kfence: add note to documentation about skipping covered allocations Andrew Morton
2021-11-05 20:45 ` [patch 210/262] kfence: test: use kunit_skip() to skip tests Andrew Morton
2021-11-05 20:45 ` [patch 211/262] kfence: shorten critical sections of alloc/free Andrew Morton
2021-11-05 20:45 ` [patch 212/262] kfence: always use static branches to guard kfence_alloc() Andrew Morton
2021-11-05 20:45 ` [patch 213/262] kfence: default to dynamic branch instead of static keys mode Andrew Morton
2021-11-05 20:45 ` [patch 214/262] mm/damon: grammar s/works/work/ Andrew Morton
2021-11-05 20:45 ` [patch 215/262] Documentation/vm: move user guides to admin-guide/mm/ Andrew Morton
2021-11-05 20:45 ` [patch 216/262] MAINTAINERS: update SeongJae's email address Andrew Morton
2021-11-05 20:46 ` [patch 217/262] docs/vm/damon: remove broken reference Andrew Morton
2021-11-05 20:46 ` [patch 218/262] include/linux/damon.h: fix kernel-doc comments for 'damon_callback' Andrew Morton
2021-11-05 20:46 ` [patch 219/262] mm/damon/core: print kdamond start log in debug mode only Andrew Morton
2021-11-05 20:46 ` [patch 220/262] mm/damon: remove unnecessary do_exit() from kdamond Andrew Morton
2021-11-05 20:46 ` [patch 221/262] mm/damon: needn't hold kdamond_lock to print pid of kdamond Andrew Morton
2021-11-05 20:46 ` [patch 222/262] mm/damon/core: nullify pointer ctx->kdamond with a NULL Andrew Morton
2021-11-05 20:46 ` [patch 223/262] mm/damon/core: account age of target regions Andrew Morton
2021-11-05 20:46 ` [patch 224/262] mm/damon/core: implement DAMON-based Operation Schemes (DAMOS) Andrew Morton
2021-11-05 20:46 ` [patch 225/262] mm/damon/vaddr: support DAMON-based Operation Schemes Andrew Morton
2021-11-05 20:46 ` [patch 226/262] mm/damon/dbgfs: " Andrew Morton
2021-11-05 20:46 ` [patch 227/262] mm/damon/schemes: implement statistics feature Andrew Morton
2021-11-05 20:46 ` [patch 228/262] selftests/damon: add 'schemes' debugfs tests Andrew Morton
2021-11-05 20:46 ` [patch 229/262] Docs/admin-guide/mm/damon: document DAMON-based Operation Schemes Andrew Morton
2021-11-05 20:46 ` [patch 230/262] mm/damon/dbgfs: allow users to set initial monitoring target regions Andrew Morton
2021-11-05 20:46 ` [patch 231/262] mm/damon/dbgfs-test: add a unit test case for 'init_regions' Andrew Morton
2021-11-05 20:46 ` [patch 232/262] Docs/admin-guide/mm/damon: document 'init_regions' feature Andrew Morton
2021-11-05 20:46 ` [patch 233/262] mm/damon/vaddr: separate commonly usable functions Andrew Morton
2021-11-05 20:46 ` [patch 234/262] mm/damon: implement primitives for physical address space monitoring Andrew Morton
2021-11-05 20:47 ` [patch 235/262] mm/damon/dbgfs: support physical memory monitoring Andrew Morton
2021-11-05 20:47 ` [patch 236/262] Docs/DAMON: document physical memory monitoring support Andrew Morton
2021-11-05 20:47 ` [patch 237/262] mm/damon/vaddr: constify static mm_walk_ops Andrew Morton
2021-11-05 20:47 ` [patch 238/262] mm/damon/dbgfs: remove unnecessary variables Andrew Morton
2021-11-05 20:47 ` [patch 239/262] mm/damon/paddr: support the pageout scheme Andrew Morton
2021-11-05 20:47 ` [patch 240/262] mm/damon/schemes: implement size quota for schemes application speed control Andrew Morton
2021-11-05 20:47 ` [patch 241/262] mm/damon/schemes: skip already charged targets and regions Andrew Morton
2021-11-05 20:47 ` [patch 242/262] mm/damon/schemes: implement time quota Andrew Morton
2021-11-05 20:47 ` [patch 243/262] mm/damon/dbgfs: support quotas of schemes Andrew Morton
2021-11-05 20:47 ` [patch 244/262] mm/damon/selftests: support schemes quotas Andrew Morton
2021-11-05 20:47 ` [patch 245/262] mm/damon/schemes: prioritize regions within the quotas Andrew Morton
2021-11-05 20:47 ` [patch 246/262] mm/damon/vaddr,paddr: support pageout prioritization Andrew Morton
2021-11-05 20:47 ` [patch 247/262] mm/damon/dbgfs: support prioritization weights Andrew Morton
2021-11-05 20:47 ` [patch 248/262] tools/selftests/damon: update for regions prioritization of schemes Andrew Morton
2021-11-05 20:47 ` [patch 249/262] mm/damon/schemes: activate schemes based on a watermarks mechanism Andrew Morton
2021-11-05 20:47 ` [patch 250/262] mm/damon/dbgfs: support watermarks Andrew Morton
2021-11-05 20:47 ` [patch 251/262] selftests/damon: " Andrew Morton
2021-11-05 20:47 ` [patch 252/262] mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM) Andrew Morton
2021-11-05 20:48 ` [patch 253/262] Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM Andrew Morton
2021-11-05 20:48 ` [patch 254/262] mm/damon: remove unnecessary variable initialization Andrew Morton
2021-11-05 20:48 ` [patch 255/262] mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on Andrew Morton
2021-11-05 20:48 ` [patch 256/262] Docs/admin-guide/mm/damon/start: fix wrong example commands Andrew Morton
2021-11-05 20:48 ` [patch 257/262] Docs/admin-guide/mm/damon/start: fix a wrong link Andrew Morton
2021-11-05 20:48 ` [patch 258/262] Docs/admin-guide/mm/damon/start: simplify the content Andrew Morton
2021-11-05 20:48 ` [patch 259/262] Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions Andrew Morton
2021-11-05 20:48 ` [patch 260/262] mm/damon: simplify stop mechanism Andrew Morton
2021-11-05 20:48 ` [patch 261/262] mm/damon: fix a few spelling mistakes in comments and a pr_debug message Andrew Morton
2021-11-05 20:48 ` [patch 262/262] mm/damon: remove return value from before_terminate callback Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2022-04-27 19:41 incoming Andrew Morton
2022-04-21 23:35 incoming Andrew Morton
2022-04-15  2:12 incoming Andrew Morton
2022-04-08 20:08 incoming Andrew Morton
2022-04-01 18:27 incoming Andrew Morton
2022-04-01 18:20 incoming Andrew Morton
2022-04-01 18:27 ` incoming Andrew Morton
2022-03-25  1:07 incoming Andrew Morton
2022-03-23 23:04 incoming Andrew Morton
2022-03-22 21:38 incoming Andrew Morton
2022-03-16 23:14 incoming Andrew Morton
2022-03-05  4:28 incoming Andrew Morton
2022-02-26  3:10 incoming Andrew Morton
2022-02-12  0:27 incoming Andrew Morton
2022-02-12  2:02 ` incoming Linus Torvalds
2022-02-12  5:24   ` incoming Andrew Morton
2022-02-04  4:48 incoming Andrew Morton
2022-01-29 21:40 incoming Andrew Morton
2022-01-29  2:13 incoming Andrew Morton
2022-01-29  4:25 ` incoming Matthew Wilcox
2022-01-29  6:23   ` incoming Andrew Morton
2022-01-22  6:10 incoming Andrew Morton
2022-01-20  2:07 incoming Andrew Morton
2022-01-14 22:02 incoming Andrew Morton
2021-12-31  4:12 incoming Andrew Morton
2021-12-25  5:11 incoming Andrew Morton
2021-12-10 22:45 incoming Andrew Morton
2021-11-20  0:42 incoming Andrew Morton
2021-11-11  4:32 incoming Andrew Morton
2021-11-09  2:30 incoming Andrew Morton
2021-10-28 21:35 incoming Andrew Morton
2021-10-18 22:14 incoming Andrew Morton
2021-09-24 22:42 incoming Andrew Morton
2021-09-10  3:09 incoming Andrew Morton
2021-09-10 17:11 ` incoming Kees Cook
2021-09-10 20:13   ` incoming Kees Cook
2021-09-09  1:08 incoming Andrew Morton
2021-09-08 22:17 incoming Andrew Morton
2021-09-08  2:52 incoming Andrew Morton
2021-09-08  8:57 ` incoming Vlastimil Babka
2021-09-02 21:48 incoming Andrew Morton
2021-09-02 21:49 ` incoming Andrew Morton
2021-08-25 19:17 incoming Andrew Morton
2021-08-20  2:03 incoming Andrew Morton
2021-08-13 23:53 incoming Andrew Morton
2021-07-29 21:52 incoming Andrew Morton
2021-07-23 22:49 incoming Andrew Morton
2021-07-15  4:26 incoming Andrew Morton
2021-07-08  0:59 incoming Andrew Morton
2021-07-01  1:46 incoming Andrew Morton
2021-07-03  0:28 ` incoming Linus Torvalds
2021-07-03  1:06   ` incoming Linus Torvalds
2021-06-29  2:32 incoming Andrew Morton
2021-06-25  1:38 incoming Andrew Morton
2021-06-16  1:22 incoming Andrew Morton
2021-06-05  3:00 incoming Andrew Morton
2021-05-23  0:41 incoming Andrew Morton
2021-05-15  0:26 incoming Andrew Morton
2021-05-07  1:01 incoming Andrew Morton
2021-05-07  7:12 ` incoming Linus Torvalds
2021-05-05  1:32 incoming Andrew Morton
2021-05-05  1:47 ` incoming Linus Torvalds
2021-05-05  3:16   ` incoming Andrew Morton
2021-05-05 17:10     ` incoming Linus Torvalds
2021-05-05 17:44       ` incoming Andrew Morton
2021-05-06  3:19         ` incoming Anshuman Khandual
2021-04-30  5:52 incoming Andrew Morton
2021-04-23 21:28 incoming Andrew Morton
2021-04-16 22:45 incoming Andrew Morton
2021-04-09 20:26 incoming Andrew Morton
2021-03-25  4:36 incoming Andrew Morton
2021-03-13  5:06 incoming Andrew Morton
2021-02-26  1:14 incoming Andrew Morton
2021-02-26 17:55 ` incoming Linus Torvalds
2021-02-26 19:16   ` incoming Andrew Morton
2021-02-24 19:58 incoming Andrew Morton
2021-02-24 21:30 ` incoming Linus Torvalds
2021-02-24 21:37   ` incoming Linus Torvalds
2021-02-25  8:53     ` incoming Arnd Bergmann
2021-02-25  9:12       ` incoming Andrey Ryabinin
2021-02-25 11:07         ` incoming Walter Wu
2021-02-13  4:52 incoming Andrew Morton
2021-02-09 21:41 incoming Andrew Morton
2021-02-10 19:30 ` incoming Linus Torvalds
2021-02-05  2:31 incoming Andrew Morton
2021-01-24  5:00 incoming Andrew Morton
2021-01-12 23:48 incoming Andrew Morton
2021-01-15 23:32 ` incoming Linus Torvalds
2020-12-29 23:13 incoming Andrew Morton
2020-12-22 19:58 incoming Andrew Morton
2020-12-22 21:43 ` incoming Linus Torvalds
2020-12-18 22:00 incoming Andrew Morton
2020-12-16  4:41 incoming Andrew Morton
2020-12-15 20:32 incoming Andrew Morton
2020-12-15 21:00 ` incoming Linus Torvalds
2020-12-15 22:48 ` incoming Linus Torvalds
2020-12-15 22:49   ` incoming Linus Torvalds
2020-12-15 22:55     ` incoming Andrew Morton
2020-12-15  3:02 incoming Andrew Morton
2020-12-15  3:25 ` incoming Linus Torvalds
2020-12-15  3:30   ` incoming Linus Torvalds
2020-12-15 14:04     ` incoming Konstantin Ryabitsev
2020-12-11 21:35 incoming Andrew Morton
2020-12-06  6:14 incoming Andrew Morton
2020-11-22  6:16 incoming Andrew Morton
2020-11-14  6:51 incoming Andrew Morton
2020-11-02  1:06 incoming Andrew Morton
2020-10-17 23:13 incoming Andrew Morton
2020-10-16  2:40 incoming Andrew Morton
2020-10-16  3:03 ` incoming Andrew Morton
2020-10-13 23:46 incoming Andrew Morton
2020-10-11  6:15 incoming Andrew Morton
2020-10-03  5:20 incoming Andrew Morton
2020-09-26  4:17 incoming Andrew Morton
2020-09-19  4:19 incoming Andrew Morton
2020-09-04 23:34 incoming Andrew Morton
2020-08-21  0:41 incoming Andrew Morton
2020-08-15  0:29 incoming Andrew Morton
2020-08-12  1:29 incoming Andrew Morton
2020-08-07  6:16 incoming Andrew Morton
2020-07-24  4:14 incoming Andrew Morton
2020-07-03 22:14 incoming Andrew Morton
2020-06-26  3:28 incoming Andrew Morton
2020-06-26  6:51 ` incoming Linus Torvalds
2020-06-26  7:31   ` incoming Linus Torvalds
2020-06-26 17:39   ` incoming Konstantin Ryabitsev
2020-06-26 17:40     ` incoming Konstantin Ryabitsev
2020-06-12  0:30 incoming Andrew Morton
2020-06-11  1:40 incoming Andrew Morton
2020-06-09  4:29 incoming Andrew Morton
2020-06-09 16:58 ` incoming Linus Torvalds
2020-06-08  4:35 incoming Andrew Morton
2020-06-04 23:45 incoming Andrew Morton
2020-06-03 22:55 incoming Andrew Morton
2020-06-02 20:09 incoming Andrew Morton
2020-06-02  4:44 incoming Andrew Morton
2020-06-02 20:08 ` incoming Andrew Morton
2020-06-02 20:45   ` incoming Linus Torvalds
2020-06-02 21:38     ` incoming Andrew Morton
2020-06-02 22:18       ` incoming Linus Torvalds
2020-05-28  5:20 incoming Andrew Morton
2020-05-28 20:10 ` incoming Linus Torvalds
2020-05-29 20:31   ` incoming Andrew Morton
2020-05-29 20:38     ` incoming Linus Torvalds
2020-05-29 21:12       ` incoming Andrew Morton
2020-05-29 21:20         ` incoming Linus Torvalds
2020-05-23  5:22 incoming Andrew Morton
2020-05-14  0:50 incoming Andrew Morton
2020-05-08  1:35 incoming Andrew Morton
2020-04-21  1:13 incoming Andrew Morton
2020-04-12  7:41 incoming Andrew Morton
2020-04-10 21:30 incoming Andrew Morton
2020-04-07  3:02 incoming Andrew Morton
2020-04-02  4:01 incoming Andrew Morton
2020-03-29  2:14 incoming Andrew Morton
2020-03-22  1:19 incoming Andrew Morton
2020-03-06  6:27 incoming Andrew Morton
2020-02-21  4:00 incoming Andrew Morton
2020-02-21  4:03 ` incoming Andrew Morton
2020-02-21 18:21 ` incoming Linus Torvalds
2020-02-21 18:32   ` incoming Konstantin Ryabitsev
2020-02-27  9:59     ` incoming Vlastimil Babka
2020-02-21 19:33   ` incoming Linus Torvalds
2020-02-04  1:33 incoming Andrew Morton
2020-02-04  2:27 ` incoming Linus Torvalds
2020-02-04  2:46   ` incoming Andrew Morton
2020-02-04  3:11     ` incoming Linus Torvalds
2020-01-31  6:10 incoming Andrew Morton
2020-01-14  0:28 incoming Andrew Morton
2020-01-04 20:55 incoming Andrew Morton
2019-12-18  4:50 incoming Andrew Morton
2019-12-05  0:48 incoming Andrew Morton
2019-12-01  1:47 incoming Andrew Morton
2019-12-01  5:17 ` incoming James Bottomley
2019-12-01 21:07 ` incoming Linus Torvalds
2019-12-02  8:21   ` incoming Steven Price
2019-11-22  1:53 incoming Andrew Morton
2019-11-16  1:34 incoming Andrew Morton
2019-11-06  5:16 incoming Andrew Morton
2019-10-19  3:19 incoming Andrew Morton
2019-10-14 21:11 incoming Andrew Morton
2019-10-07  0:57 incoming Andrew Morton
2019-09-25 23:45 incoming Andrew Morton
2019-09-23 22:31 incoming Andrew Morton
2019-09-24  0:55 ` incoming Linus Torvalds
2019-09-24  4:31   ` incoming Andrew Morton
2019-09-24  7:48     ` incoming Michal Hocko
2019-09-24 15:34       ` incoming Linus Torvalds
2019-09-25  6:36         ` incoming Michal Hocko
2019-09-24 19:55       ` incoming Vlastimil Babka
2019-08-30 23:04 incoming Andrew Morton
2019-08-25  0:54 incoming Andrew Morton
     [not found] <20190716162536.bb52b8f34a8ecf5331a86a42@linux-foundation.org>
2019-07-17  8:47 ` incoming Vlastimil Babka
2019-07-17  8:57   ` incoming Bhaskar Chowdhury
2019-07-17 16:13   ` incoming Linus Torvalds
2019-07-17 17:09     ` incoming Christian Brauner
2019-07-17 18:13     ` incoming Vlastimil Babka
2007-05-02 22:02 incoming Andrew Morton
2007-05-02 22:31 ` incoming Benjamin Herrenschmidt
2007-05-03  7:55 ` incoming Russell King
2007-05-03  8:05   ` incoming Andrew Morton
2007-05-04 13:37 ` incoming Greg KH
2007-05-04 16:14   ` incoming Andrew Morton
2007-05-04 17:02     ` incoming Greg KH
2007-05-04 18:57     ` incoming Roland McGrath
2007-05-04 19:24       ` incoming Greg KH
2007-05-04 19:29         ` incoming Roland McGrath

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).