* incoming
From: Andrew Morton @ 2020-04-10 21:30 UTC
  To: Linus Torvalds; +Cc: mm-commits, linux-mm


Almost all of the rest of MM.  Various other things.

35 patches, based on c0cc271173b2e1c2d8d0ceaef14e4dfa79eefc0d.

Subsystems affected by this patch series:

  hfs
  mm/memcg
  mm/slab-generic
  mm/slab
  mm/pagealloc
  mm/gup
  ocfs2
  mm/hugetlb
  mm/pagemap
  mm/memremap
  kmod
  misc
  seqfile

Subsystem: hfs

    Simon Gander <simon@tuxera.com>:
      hfsplus: fix crash and filesystem corruption when deleting files

Subsystem: mm/memcg

    Jakub Kicinski <kuba@kernel.org>:
      mm, memcg: do not high throttle allocators based on wraparound

Subsystem: mm/slab-generic

    Qiujun Huang <hqjagain@gmail.com>:
      mm, slab_common: fix a typo in comment "eariler"->"earlier"

Subsystem: mm/slab

    Mauro Carvalho Chehab <mchehab+huawei@kernel.org>:
      docs: mm: slab.h: fix a broken cross-reference

Subsystem: mm/pagealloc

    Randy Dunlap <rdunlap@infradead.org>:
      mm/page_alloc.c: fix kernel-doc warning

    Jason Yan <yanaijie@huawei.com>:
      mm/page_alloc: make pcpu_drain_mutex and pcpu_drain static

Subsystem: mm/gup

    Miles Chen <miles.chen@mediatek.com>:
      mm/gup: fix null pointer dereference detected by coverity

Subsystem: ocfs2

    Changwei Ge <chge@linux.alibaba.com>:
      ocfs2: no need try to truncate file beyond i_size

Subsystem: mm/hugetlb

    Aslan Bakirov <aslan@fb.com>:
      mm: cma: NUMA node interface

    Roman Gushchin <guro@fb.com>:
      mm: hugetlb: optionally allocate gigantic hugepages using cma

Subsystem: mm/pagemap

    Jaewon Kim <jaewon31.kim@samsung.com>:
      mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area

    Arjun Roy <arjunroy@google.com>:
      mm/memory.c: refactor insert_page to prepare for batched-lock insert
      mm: bring sparc pte_index() semantics inline with other platforms
      mm: define pte_index as macro for x86
      mm/memory.c: add vm_insert_pages()

    Anshuman Khandual <anshuman.khandual@arm.com>:
      mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
      mm/vma: introduce VM_ACCESS_FLAGS
      mm/special: create generic fallbacks for pte_special() and pte_mkspecial()

Subsystem: mm/memremap

    Logan Gunthorpe <logang@deltatee.com>:
    Patch series "Allow setting caching mode in arch_add_memory() for P2PDMA", v4:
      mm/memory_hotplug: drop the flags field from struct mhp_restrictions
      mm/memory_hotplug: rename mhp_restrictions to mhp_params
      x86/mm: thread pgprot_t through init_memory_mapping()
      x86/mm: introduce __set_memory_prot()
      powerpc/mm: thread pgprot_t through create_section_mapping()
      mm/memory_hotplug: add pgprot_t to mhp_params
      mm/memremap: set caching mode for PCI P2PDMA memory to WC

Subsystem: kmod

    Eric Biggers <ebiggers@google.com>:
    Patch series "module autoloading fixes and cleanups", v5:
      kmod: make request_module() return an error when autoloading is disabled
      fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
      docs: admin-guide: document the kernel.modprobe sysctl
      selftests: kmod: fix handling test numbers above 9
      selftests: kmod: test disabling module autoloading

Subsystem: misc

    Pali Rohár <pali@kernel.org>:
      change email address for Pali Rohár

    kbuild test robot <lkp@intel.com>:
      drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings

Subsystem: seqfile

    Vasily Averin <vvs@virtuozzo.com>:
    Patch series "seq_file .next functions should increase position index":
      fs/seq_file.c: seq_read(): add info message about buggy .next functions
      kernel/gcov/fs.c: gcov_seq_next() should increase position index
      ipc/util.c: sysvipc_find_ipc() should increase position index

 Documentation/ABI/testing/sysfs-platform-dell-laptop |    8 
 Documentation/admin-guide/kernel-parameters.txt      |    8 
 Documentation/admin-guide/sysctl/kernel.rst          |   21 ++
 MAINTAINERS                                          |   16 -
 arch/alpha/include/asm/page.h                        |    3 
 arch/alpha/include/asm/pgtable.h                     |    2 
 arch/arc/include/asm/page.h                          |    2 
 arch/arm/include/asm/page.h                          |    4 
 arch/arm/include/asm/pgtable-2level.h                |    2 
 arch/arm/include/asm/pgtable.h                       |   15 -
 arch/arm/mach-omap2/omap-secure.c                    |    2 
 arch/arm/mach-omap2/omap-secure.h                    |    2 
 arch/arm/mach-omap2/omap-smc.S                       |    2 
 arch/arm/mm/fault.c                                  |    2 
 arch/arm/mm/mmu.c                                    |   14 +
 arch/arm64/include/asm/page.h                        |    4 
 arch/arm64/mm/fault.c                                |    2 
 arch/arm64/mm/init.c                                 |    6 
 arch/arm64/mm/mmu.c                                  |    7 
 arch/c6x/include/asm/page.h                          |    5 
 arch/csky/include/asm/page.h                         |    3 
 arch/csky/include/asm/pgtable.h                      |    3 
 arch/h8300/include/asm/page.h                        |    2 
 arch/hexagon/include/asm/page.h                      |    3 
 arch/hexagon/include/asm/pgtable.h                   |    2 
 arch/ia64/include/asm/page.h                         |    5 
 arch/ia64/include/asm/pgtable.h                      |    2 
 arch/ia64/mm/init.c                                  |    7 
 arch/m68k/include/asm/mcf_pgtable.h                  |   10 -
 arch/m68k/include/asm/motorola_pgtable.h             |    2 
 arch/m68k/include/asm/page.h                         |    3 
 arch/m68k/include/asm/sun3_pgtable.h                 |    2 
 arch/microblaze/include/asm/page.h                   |    2 
 arch/microblaze/include/asm/pgtable.h                |    4 
 arch/mips/include/asm/page.h                         |    5 
 arch/mips/include/asm/pgtable.h                      |   44 +++-
 arch/nds32/include/asm/page.h                        |    3 
 arch/nds32/include/asm/pgtable.h                     |    9 -
 arch/nds32/mm/fault.c                                |    2 
 arch/nios2/include/asm/page.h                        |    3 
 arch/nios2/include/asm/pgtable.h                     |    3 
 arch/openrisc/include/asm/page.h                     |    5 
 arch/openrisc/include/asm/pgtable.h                  |    2 
 arch/parisc/include/asm/page.h                       |    3 
 arch/parisc/include/asm/pgtable.h                    |    2 
 arch/powerpc/include/asm/book3s/64/hash.h            |    3 
 arch/powerpc/include/asm/book3s/64/radix.h           |    3 
 arch/powerpc/include/asm/page.h                      |    9 -
 arch/powerpc/include/asm/page_64.h                   |    7 
 arch/powerpc/include/asm/sparsemem.h                 |    3 
 arch/powerpc/mm/book3s64/hash_utils.c                |    5 
 arch/powerpc/mm/book3s64/pgtable.c                   |    7 
 arch/powerpc/mm/book3s64/pkeys.c                     |    2 
 arch/powerpc/mm/book3s64/radix_pgtable.c             |   18 +-
 arch/powerpc/mm/mem.c                                |   12 -
 arch/riscv/include/asm/page.h                        |    3 
 arch/s390/include/asm/page.h                         |    3 
 arch/s390/mm/fault.c                                 |    2 
 arch/s390/mm/init.c                                  |    9 -
 arch/sh/include/asm/page.h                           |    3 
 arch/sh/mm/init.c                                    |    7 
 arch/sparc/include/asm/page_32.h                     |    3 
 arch/sparc/include/asm/page_64.h                     |    3 
 arch/sparc/include/asm/pgtable_32.h                  |    7 
 arch/sparc/include/asm/pgtable_64.h                  |   10 -
 arch/um/include/asm/pgtable.h                        |   10 -
 arch/unicore32/include/asm/page.h                    |    3 
 arch/unicore32/include/asm/pgtable.h                 |    3 
 arch/unicore32/mm/fault.c                            |    2 
 arch/x86/include/asm/page_types.h                    |    7 
 arch/x86/include/asm/pgtable.h                       |    6 
 arch/x86/include/asm/set_memory.h                    |    1 
 arch/x86/kernel/amd_gart_64.c                        |    3 
 arch/x86/kernel/setup.c                              |    4 
 arch/x86/mm/init.c                                   |    9 -
 arch/x86/mm/init_32.c                                |   19 +-
 arch/x86/mm/init_64.c                                |   42 ++--
 arch/x86/mm/mm_internal.h                            |    3 
 arch/x86/mm/pat/set_memory.c                         |   13 +
 arch/x86/mm/pkeys.c                                  |    2 
 arch/x86/platform/uv/bios_uv.c                       |    3 
 arch/x86/um/asm/vm-flags.h                           |   10 -
 arch/xtensa/include/asm/page.h                       |    3 
 arch/xtensa/include/asm/pgtable.h                    |    3 
 drivers/char/hw_random/omap3-rom-rng.c               |    4 
 drivers/dma/tegra20-apb-dma.c                        |    1 
 drivers/hwmon/dell-smm-hwmon.c                       |    4 
 drivers/platform/x86/dell-laptop.c                   |    4 
 drivers/platform/x86/dell-rbtn.c                     |    4 
 drivers/platform/x86/dell-rbtn.h                     |    2 
 drivers/platform/x86/dell-smbios-base.c              |    4 
 drivers/platform/x86/dell-smbios-smm.c               |    2 
 drivers/platform/x86/dell-smbios.h                   |    2 
 drivers/platform/x86/dell-smo8800.c                  |    2 
 drivers/platform/x86/dell-wmi.c                      |    4 
 drivers/power/supply/bq2415x_charger.c               |    4 
 drivers/power/supply/bq27xxx_battery.c               |    2 
 drivers/power/supply/isp1704_charger.c               |    2 
 drivers/power/supply/rx51_battery.c                  |    4 
 drivers/staging/gasket/gasket_core.c                 |    2 
 fs/filesystems.c                                     |    4 
 fs/hfsplus/attributes.c                              |    4 
 fs/ocfs2/alloc.c                                     |    4 
 fs/seq_file.c                                        |    7 
 fs/udf/ecma_167.h                                    |    2 
 fs/udf/osta_udf.h                                    |    2 
 include/linux/cma.h                                  |   14 +
 include/linux/hugetlb.h                              |   12 +
 include/linux/memblock.h                             |    3 
 include/linux/memory_hotplug.h                       |   21 +-
 include/linux/mm.h                                   |   34 +++
 include/linux/power/bq2415x_charger.h                |    2 
 include/linux/slab.h                                 |    2 
 ipc/util.c                                           |    2 
 kernel/gcov/fs.c                                     |    2 
 kernel/kmod.c                                        |    4 
 mm/cma.c                                             |   16 +
 mm/gup.c                                             |    3 
 mm/hugetlb.c                                         |  109 ++++++++++++
 mm/memblock.c                                        |    2 
 mm/memcontrol.c                                      |    3 
 mm/memory.c                                          |  168 +++++++++++++++++--
 mm/memory_hotplug.c                                  |   13 -
 mm/memremap.c                                        |   17 +
 mm/mmap.c                                            |    4 
 mm/mprotect.c                                        |    4 
 mm/page_alloc.c                                      |    5 
 mm/slab_common.c                                     |    2 
 tools/laptop/freefall/freefall.c                     |    2 
 tools/testing/selftests/kmod/kmod.sh                 |   43 ++++
 130 files changed, 710 insertions(+), 370 deletions(-)

* [patch 01/35] hfsplus: fix crash and filesystem corruption when deleting files
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, anton, linux-mm, mm-commits, simon, stable, torvalds

From: Simon Gander <simon@tuxera.com>
Subject: hfsplus: fix crash and filesystem corruption when deleting files

When removing files containing extended attributes, the hfsplus driver may
remove the wrong entries from the attributes b-tree, causing major
filesystem damage and in some cases even kernel crashes.

To remove a file, all its extended attributes have to be removed as well. 
The driver does this by looking up all keys in the attributes b-tree with
the cnid of the file.  Each of these entries then gets deleted using the
key used for searching, which doesn't contain the attribute's name when it
should.  Since the key doesn't contain the name, the deletion routine will
not find the correct entry and instead remove the one in front of it.  If
parent nodes have to be modified, these become corrupt as well.  This
causes invalid links and unsorted entries that not even macOS's fsck_hfs
is able to fix.

To fix this, modify the search key before an entry is deleted from the
attributes b-tree by copying the found entry's key into the search key,
therefore ensuring that the correct entry gets removed from the tree.
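
To make the failure mode concrete, here is an annotated sketch (simplified,
with a hypothetical attribute name; the actual fix is the four-line hunk
below):

/*
 * Attribute b-tree records are keyed by (cnid, attr_name).  The
 * deletion loop searches with a partial key:
 *
 *     fd->search_key == (cnid, "")         <- attribute name missing
 *
 * The lookup still positions fd at a record whose real on-disk key
 * is, say, (cnid, "com.apple.FinderInfo").  But hfs_brec_remove()
 * works from fd->search_key rather than from the record that was
 * found, so it can land on and remove the preceding entry.  The fix
 * re-reads the found record's full key into the search key first:
 */
hfs_bnode_read(fd->bnode, fd->search_key, fd->keyoffset, fd->keylength);
err = hfs_brec_remove(fd);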

Link: http://lkml.kernel.org/r/20200327155541.1521-1-simon@tuxera.com
Signed-off-by: Simon Gander <simon@tuxera.com>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/hfsplus/attributes.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/fs/hfsplus/attributes.c~hfsplus-fix-crash-and-filesystem-corruption-when-deleting-files
+++ a/fs/hfsplus/attributes.c
@@ -292,6 +292,10 @@ static int __hfsplus_delete_attr(struct
 		return -ENOENT;
 	}
 
+	/* Avoid btree corruption */
+	hfs_bnode_read(fd->bnode, fd->search_key,
+			fd->keyoffset, fd->keylength);
+
 	err = hfs_brec_remove(fd);
 	if (err)
 		return err;
_

* [patch 02/35] mm, memcg: do not high throttle allocators based on wraparound
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, chris, hannes, kuba, linux-mm, mhocko, mm-commits, stable,
	torvalds

From: Jakub Kicinski <kuba@kernel.org>
Subject: mm, memcg: do not high throttle allocators based on wraparound

If a cgroup violates its memory.high constraints, we may end up unduly
penalising it.  For example, for the following hierarchy:

A:   max high, 20 usage
A/B: 9 high, 10 usage
A/C: max high, 10 usage

We would end up doing the following calculation below when calculating
high delay for A/B:

A/B: 10 - 9 = 1...
A:   20 - PAGE_COUNTER_MAX wraps around to an enormous positive value
     (the subtraction is unsigned), so max_overage gets set to that.

This gets worse with higher disparities in usage in the parent.

I have no idea how this disappeared from the final version of the patch,
but it is certainly Not Good(tm).  This wasn't obvious in testing because,
for a simple cgroup hierarchy with only one child, the result is usually
roughly the same.  It's only in more complex hierarchies that things go
really awry (although even then, the effect is limited to a maximum of 2
seconds in schedule_timeout_killable()).
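
The wraparound itself can be reproduced in plain userspace C.  This is a
standalone sketch, not the kernel code; calculate_high_delay() performs the
analogous subtraction on page counts:

#include <stdio.h>

int main(void)
{
	/* Numbers from the hierarchy above, in pages.  A "max" high
	 * is PAGE_COUNTER_MAX, i.e. LONG_MAX / PAGE_SIZE on 64-bit
	 * with 4K pages. */
	unsigned long page_counter_max = (~0UL >> 1) / 4096;
	unsigned long usage = 20, high = page_counter_max;

	/* Unsigned subtraction wraps around instead of going
	 * negative, producing an enormous bogus overage. */
	printf("overage = %lu\n", usage - high);
	return 0;
}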

[chris@chrisdown.name: changelog]
Link: http://lkml.kernel.org/r/20200331152424.GA1019937@chrisdown.name
Fixes: e26733e0d0ec ("mm, memcg: throttle allocators based on ancestral memory.high")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Chris Down <chris@chrisdown.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>	[5.4.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memcontrol.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/mm/memcontrol.c~mm-memcg-do-not-high-throttle-allocators-based-on-wraparound
+++ a/mm/memcontrol.c
@@ -2336,6 +2336,9 @@ static unsigned long calculate_high_dela
 		usage = page_counter_read(&memcg->memory);
 		high = READ_ONCE(memcg->high);
 
+		if (usage <= high)
+			continue;
+
 		/*
 		 * Prevent division by 0 in overage calculation by acting as if
 		 * it was a threshold of 1 page
_

* [patch 03/35] mm, slab_common: fix a typo in comment "eariler"->"earlier"
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, cl, hqjagain, linux-mm, mm-commits, torvalds

From: Qiujun Huang <hqjagain@gmail.com>
Subject: mm, slab_common: fix a typo in comment "eariler"->"earlier"

There is a typo in a comment; fix it:
s/eariler/earlier/

Link: http://lkml.kernel.org/r/20200405160544.1246-1-hqjagain@gmail.com
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/slab_common.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/slab_common.c~mm-slab_common-fix-a-typo-in-comment-eariler-earlier
+++ a/mm/slab_common.c
@@ -731,7 +731,7 @@ static void kmemcg_rcufn(struct rcu_head
 	/*
 	 * We need to grab blocking locks.  Bounce to ->work.  The
 	 * work item shares the space with the RCU head and can't be
-	 * initialized eariler.
+	 * initialized earlier.
 	 */
 	INIT_WORK(&s->memcg_params.work, kmemcg_workfn);
 	queue_work(memcg_kmem_cache_wq, &s->memcg_params.work);
_

* [patch 04/35] docs: mm: slab.h: fix a broken cross-reference
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, cl, corbet, iamjoonsoo.kim, linux-mm, mchehab+huawei,
	mm-commits, penberg, rientjes, torvalds

From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Subject: docs: mm: slab.h: fix a broken cross-reference

There is a typo at the cross-reference link, causing this warning:

	./include/linux/slab.h:11: WARNING: undefined label: memory-allocation (if the link has no caption the label must precede a section header)

Link: http://lkml.kernel.org/r/0aeac24235d356ebd935d11e147dcc6edbb6465c.1586359676.git.mchehab+huawei@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/slab.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/slab.h~docs-mm-slabh-fix-a-broken-cross-reference
+++ a/include/linux/slab.h
@@ -501,7 +501,7 @@ static __always_inline void *kmalloc_lar
  * :ref:`Documentation/core-api/mm-api.rst <mm-api-gfp-flags>`
  *
  * The recommended usage of the @flags is described at
- * :ref:`Documentation/core-api/memory-allocation.rst <memory-allocation>`
+ * :ref:`Documentation/core-api/memory-allocation.rst <memory_allocation>`
  *
  * Below is a brief outline of the most useful GFP flags
  *
_

* [patch 05/35] mm/page_alloc.c: fix kernel-doc warning
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, linux-mm, mm-commits, pankaj.gupta.linux, rdunlap, torvalds

From: Randy Dunlap <rdunlap@infradead.org>
Subject: mm/page_alloc.c: fix kernel-doc warning

Add description of function parameter 'mt' to fix kernel-doc warning:

../mm/page_alloc.c:3246: warning: Function parameter or member 'mt' not described in '__putback_isolated_page'

Link: http://lkml.kernel.org/r/02998bd4-0b82-2f15-2570-f86130304d1e@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    1 +
 1 file changed, 1 insertion(+)

--- a/mm/page_alloc.c~mm-page_alloc-fix-kernel-doc-warning
+++ a/mm/page_alloc.c
@@ -3224,6 +3224,7 @@ int __isolate_free_page(struct page *pag
  * __putback_isolated_page - Return a now-isolated page back where we got it
  * @page: Page that was isolated
  * @order: Order of the isolated page
+ * @mt: The page's pageblock's migratetype
  *
  * This function is meant to return a page pulled from the free lists via
  * __isolate_free_page back to the free lists they were pulled from.
_

* [patch 06/35] mm/page_alloc: make pcpu_drain_mutex and pcpu_drain static
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, hulkci, linux-mm, mm-commits, torvalds, yanaijie

From: Jason Yan <yanaijie@huawei.com>
Subject: mm/page_alloc: make pcpu_drain_mutex and pcpu_drain static

Fix the following sparse warning:

mm/page_alloc.c:106:1: warning: symbol 'pcpu_drain_mutex' was not
declared. Should it be static?
mm/page_alloc.c:107:1: warning: symbol '__pcpu_scope_pcpu_drain' was not
declared. Should it be static?

Link: http://lkml.kernel.org/r/20200407023925.46438-1-yanaijie@huawei.com
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-make-pcpu_drain_mutex-and-pcpu_drain-static
+++ a/mm/page_alloc.c
@@ -103,8 +103,8 @@ struct pcpu_drain {
 	struct zone *zone;
 	struct work_struct work;
 };
-DEFINE_MUTEX(pcpu_drain_mutex);
-DEFINE_PER_CPU(struct pcpu_drain, pcpu_drain);
+static DEFINE_MUTEX(pcpu_drain_mutex);
+static DEFINE_PER_CPU(struct pcpu_drain, pcpu_drain);
 
 #ifdef CONFIG_GCC_PLUGIN_LATENT_ENTROPY
 volatile unsigned long latent_entropy __latent_entropy;
_

* [patch 07/35] mm/gup: fix null pointer dereference detected by coverity
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, ira.weiny, linux-mm, miles.chen, mm-commits, torvalds

From: Miles Chen <miles.chen@mediatek.com>
Subject: mm/gup: fix null pointer dereference detected by coverity

In fixup_user_fault(), it is possible that unlocked is NULL,
so we should test unlocked before using it.

For example, in arch/arc/kernel/process.c, NULL is passed
to fixup_user_fault().

SYSCALL_DEFINE3(arc_usr_cmpxchg, int *, uaddr, int, expected, int, new)
{
...
	ret = fixup_user_fault(current, current->mm, (unsigned long) uaddr,
			       FAULT_FLAG_WRITE, NULL);
...
}

Link: http://lkml.kernel.org/r/20200407095107.1988-1-miles.chen@mediatek.com
Fixes: 4a9e1cda2748 ("mm: bring in additional flag for fixup_user_fault to signal unlock")
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/gup.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/gup.c~mm-gup-fix-null-pointer-dereference-detected-by-coverity
+++ a/mm/gup.c
@@ -1231,7 +1231,8 @@ retry:
 	if (ret & VM_FAULT_RETRY) {
 		down_read(&mm->mmap_sem);
 		if (!(fault_flags & FAULT_FLAG_TRIED)) {
-			*unlocked = true;
+			if (unlocked)
+				*unlocked = true;
 			fault_flags |= FAULT_FLAG_TRIED;
 			goto retry;
 		}
_

* [patch 08/35] ocfs2: no need try to truncate file beyond i_size
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, chge, gechangwei, ghe, jlbec, joseph.qi, junxiao.bi,
	linux-mm, mark, mm-commits, piaojun, stable, torvalds

From: Changwei Ge <chge@linux.alibaba.com>
Subject: ocfs2: no need try to truncate file beyond i_size

With Linux fallocate(2) in FALLOC_FL_PUNCH_HOLE mode, the offset can
exceed the inode size.  ocfs2 currently doesn't allow an offset beyond
the inode size.  This restriction is unnecessary and violates
fallocate(2) semantics.

If the fallocate(2) offset is beyond the inode size, just return success
and do nothing further.

Otherwise, ocfs2 will crash the kernel.

kernel BUG at fs/ocfs2//alloc.c:7264!
 ocfs2_truncate_inline+0x20f/0x360 [ocfs2]
 ? ocfs2_read_blocks+0x2f3/0x5f0 [ocfs2]
 ocfs2_remove_inode_range+0x23c/0xcb0 [ocfs2]
 ? ocfs2_read_inode_block+0x10/0x20 [ocfs2]
 ? ocfs2_allocate_extend_trans+0x1a0/0x1a0 [ocfs2]
 __ocfs2_change_file_space+0x4a5/0x650 [ocfs2]
 ocfs2_fallocate+0x83/0xa0 [ocfs2]
 ? __audit_syscall_entry+0xb8/0x100
 ? __sb_start_write+0x3b/0x70
 vfs_fallocate+0x148/0x230
 SyS_fallocate+0x48/0x80
 do_syscall_64+0x79/0x170
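
A minimal userspace reproducer sketch (the mount point and sizes are made
up; per fallocate(2), punching a hole entirely beyond EOF must succeed as
a no-op):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/mnt/ocfs2/testfile", O_CREAT | O_RDWR, 0644);

	if (fd < 0)
		return 1;
	/* i_size is 0 here; punch a hole at offset 1 MiB.  Unpatched,
	 * this could hit the BUG() in ocfs2_truncate_inline(); with
	 * the fix it simply returns 0. */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      1024 * 1024, 4096))
		perror("fallocate");
	close(fd);
	return 0;
}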

Link: http://lkml.kernel.org/r/20200407082754.17565-1-chge@linux.alibaba.com
Signed-off-by: Changwei Ge <chge@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/alloc.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/fs/ocfs2/alloc.c~ocfs2-no-need-try-to-truncate-file-beyond-i_size
+++ a/fs/ocfs2/alloc.c
@@ -7402,6 +7402,10 @@ int ocfs2_truncate_inline(struct inode *
 	struct ocfs2_dinode *di = (struct ocfs2_dinode *)di_bh->b_data;
 	struct ocfs2_inline_data *idata = &di->id2.i_data;
 
+	/* No need to punch hole beyond i_size. */
+	if (start >= i_size_read(inode))
+		return 0;
+
 	if (end > i_size_read(inode))
 		end = i_size_read(inode);
 
_

* [patch 09/35] mm: cma: NUMA node interface
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, andreas.schaufler, aslan, guro, js1304, linux-mm, mhocko,
	mike.kravetz, mm-commits, riel, torvalds

From: Aslan Bakirov <aslan@fb.com>
Subject: mm: cma: NUMA node interface

I've noticed that there is no interface exposed by CMA which would let me
declare contiguous memory on a particular NUMA node.

This patchset adds the ability to try to allocate contiguous memory on a
specific node.  It will fall back to other nodes if the specified one
doesn't work.

Implement a new method for declaring contiguous memory on a particular
node and keep cma_declare_contiguous() as a wrapper.
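
For example, a hypothetical early-boot caller could look like the sketch
below (patch 10 of this series adds the first real user,
hugetlb_cma_reserve()); passing NUMA_NO_NODE instead of a node id gives
the old wrapper behavior:

#include <linux/cma.h>
#include <linux/init.h>
#include <linux/sizes.h>

static struct cma *example_area;

/* Sketch: try to reserve a 1 GiB CMA area on node 1.  The allocator
 * falls back to other nodes if node 1 has no suitable range. */
static int __init example_cma_reserve(void)
{
	return cma_declare_contiguous_nid(0, SZ_1G, 0, 0, 0, false,
					  "example", &example_area, 1);
}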

[akpm@linux-foundation.org: build fix]
Link: http://lkml.kernel.org/r/20200407163840.92263-2-guro@fb.com
Signed-off-by: Aslan Bakirov <aslan@fb.com>
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/cma.h      |   14 ++++++++++++--
 include/linux/memblock.h |    3 +++
 mm/cma.c                 |   16 +++++++++-------
 mm/memblock.c            |    2 +-
 4 files changed, 25 insertions(+), 10 deletions(-)

--- a/include/linux/cma.h~mm-cma-numa-node-interface
+++ a/include/linux/cma.h
@@ -4,6 +4,7 @@
 
 #include <linux/init.h>
 #include <linux/types.h>
+#include <linux/numa.h>
 
 /*
  * There is always at least global CMA area and a few optional
@@ -24,10 +25,19 @@ extern phys_addr_t cma_get_base(const st
 extern unsigned long cma_get_size(const struct cma *cma);
 extern const char *cma_get_name(const struct cma *cma);
 
-extern int __init cma_declare_contiguous(phys_addr_t base,
+extern int __init cma_declare_contiguous_nid(phys_addr_t base,
 			phys_addr_t size, phys_addr_t limit,
 			phys_addr_t alignment, unsigned int order_per_bit,
-			bool fixed, const char *name, struct cma **res_cma);
+			bool fixed, const char *name, struct cma **res_cma,
+			int nid);
+static inline int __init cma_declare_contiguous(phys_addr_t base,
+			phys_addr_t size, phys_addr_t limit,
+			phys_addr_t alignment, unsigned int order_per_bit,
+			bool fixed, const char *name, struct cma **res_cma)
+{
+	return cma_declare_contiguous_nid(base, size, limit, alignment,
+			order_per_bit, fixed, name, res_cma, NUMA_NO_NODE);
+}
 extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					unsigned int order_per_bit,
 					const char *name,
--- a/include/linux/memblock.h~mm-cma-numa-node-interface
+++ a/include/linux/memblock.h
@@ -348,6 +348,9 @@ static inline int memblock_get_region_no
 
 phys_addr_t memblock_phys_alloc_range(phys_addr_t size, phys_addr_t align,
 				      phys_addr_t start, phys_addr_t end);
+phys_addr_t memblock_alloc_range_nid(phys_addr_t size,
+				      phys_addr_t align, phys_addr_t start,
+				      phys_addr_t end, int nid, bool exact_nid);
 phys_addr_t memblock_phys_alloc_try_nid(phys_addr_t size, phys_addr_t align, int nid);
 
 static inline phys_addr_t memblock_phys_alloc(phys_addr_t size,
--- a/mm/cma.c~mm-cma-numa-node-interface
+++ a/mm/cma.c
@@ -220,7 +220,7 @@ int __init cma_init_reserved_mem(phys_ad
 }
 
 /**
- * cma_declare_contiguous() - reserve custom contiguous area
+ * cma_declare_contiguous_nid() - reserve custom contiguous area
  * @base: Base address of the reserved area optional, use 0 for any
  * @size: Size of the reserved area (in bytes),
  * @limit: End address of the reserved memory (optional, 0 for any).
@@ -229,6 +229,7 @@ int __init cma_init_reserved_mem(phys_ad
  * @fixed: hint about where to place the reserved area
  * @name: The name of the area. See function cma_init_reserved_mem()
  * @res_cma: Pointer to store the created cma region.
+ * @nid: nid of the free area to find, %NUMA_NO_NODE for any node
  *
  * This function reserves memory from early allocator. It should be
  * called by arch specific code once the early allocator (memblock or bootmem)
@@ -238,10 +239,11 @@ int __init cma_init_reserved_mem(phys_ad
  * If @fixed is true, reserve contiguous area at exactly @base.  If false,
  * reserve in range from @base to @limit.
  */
-int __init cma_declare_contiguous(phys_addr_t base,
+int __init cma_declare_contiguous_nid(phys_addr_t base,
 			phys_addr_t size, phys_addr_t limit,
 			phys_addr_t alignment, unsigned int order_per_bit,
-			bool fixed, const char *name, struct cma **res_cma)
+			bool fixed, const char *name, struct cma **res_cma,
+			int nid)
 {
 	phys_addr_t memblock_end = memblock_end_of_DRAM();
 	phys_addr_t highmem_start;
@@ -336,14 +338,14 @@ int __init cma_declare_contiguous(phys_a
 		 * memory in case of failure.
 		 */
 		if (base < highmem_start && limit > highmem_start) {
-			addr = memblock_phys_alloc_range(size, alignment,
-							 highmem_start, limit);
+			addr = memblock_alloc_range_nid(size, alignment,
+					highmem_start, limit, nid, false);
 			limit = highmem_start;
 		}
 
 		if (!addr) {
-			addr = memblock_phys_alloc_range(size, alignment, base,
-							 limit);
+			addr = memblock_alloc_range_nid(size, alignment, base,
+					limit, nid, false);
 			if (!addr) {
 				ret = -ENOMEM;
 				goto err;
--- a/mm/memblock.c~mm-cma-numa-node-interface
+++ a/mm/memblock.c
@@ -1349,7 +1349,7 @@ __next_mem_pfn_range_in_zone(u64 *idx, s
  * Return:
  * Physical address of allocated memory block on success, %0 on failure.
  */
-static phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
+phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 					phys_addr_t align, phys_addr_t start,
 					phys_addr_t end, int nid,
 					bool exact_nid)
_

* [patch 10/35] mm: hugetlb: optionally allocate gigantic hugepages using cma
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, andreas.schaufler, aslan, guro, js1304, linux-mm, mhocko,
	mike.kravetz, mm-commits, rdunlap, riel, torvalds

From: Roman Gushchin <guro@fb.com>
Subject: mm: hugetlb: optionally allocate gigantic hugepages using cma

Commit 944d9fec8d7a ("hugetlb: add support for gigantic page allocation at
runtime") added run-time allocation of gigantic pages.  However, it only
works reliably during the early stages of boot, when most memory is still
free.  After some time the memory gets fragmented by non-movable pages, so
the chances of finding a contiguous 1 GB block approach zero.  Even
dropping caches manually doesn't help much.

At large scale, rebooting servers in order to allocate gigantic hugepages
is quite expensive and complex.  At the same time, keeping some constant
percentage of memory reserved as hugepages even when the workload isn't
using it is a big waste: not all workloads can benefit from 1 GB pages.

The following solution can solve the problem:
1) At boot time a dedicated cma area* is reserved.  The size is passed
   as a kernel argument.
2) Run-time allocations of gigantic hugepages are performed using the
   cma allocator and the dedicated cma area.

In this case gigantic hugepages can be allocated successfully with a high
probability, however the memory isn't completely wasted if nobody is using
1GB hugepages: it can be used for pagecache, anon memory, THPs, etc.

* On a multi-node machine a per-node cma area is allocated on each node.
  Subsequent gigantic hugetlb allocations use the first available NUMA
  node if a mask isn't specified by the user.

Usage:
1) configure the kernel to allocate a cma area for hugetlb allocations:
   pass hugetlb_cma=10G as a kernel argument

2) allocate hugetlb pages as usual, e.g.
   echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

If the option isn't enabled or the allocation of the cma area fails,
the current behavior of the system is preserved.

x86 and arm64 are covered by this patch; other architectures can be
added trivially later.

The patch contains clean-ups and fixes proposed and implemented by Aslan
Bakirov and Randy Dunlap.  It also contains ideas and suggestions proposed
by Rik van Riel, Michal Hocko and Mike Kravetz.  Thanks!

Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/kernel-parameters.txt |    8 +
 arch/arm64/mm/init.c                            |    6 
 arch/x86/kernel/setup.c                         |    4 
 include/linux/hugetlb.h                         |   12 +
 mm/hugetlb.c                                    |  109 ++++++++++++++
 5 files changed, 139 insertions(+)

--- a/arch/arm64/mm/init.c~mm-hugetlb-optionally-allocate-gigantic-hugepages-using-cma
+++ a/arch/arm64/mm/init.c
@@ -29,6 +29,7 @@
 #include <linux/mm.h>
 #include <linux/kexec.h>
 #include <linux/crash_dump.h>
+#include <linux/hugetlb.h>
 
 #include <asm/boot.h>
 #include <asm/fixmap.h>
@@ -457,6 +458,11 @@ void __init arm64_memblock_init(void)
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
 
 	dma_contiguous_reserve(arm64_dma32_phys_limit);
+
+#ifdef CONFIG_ARM64_4K_PAGES
+	hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+#endif
+
 }
 
 void __init bootmem_init(void)
--- a/arch/x86/kernel/setup.c~mm-hugetlb-optionally-allocate-gigantic-hugepages-using-cma
+++ a/arch/x86/kernel/setup.c
@@ -16,6 +16,7 @@
 #include <linux/pci.h>
 #include <linux/root_dev.h>
 #include <linux/sfi.h>
+#include <linux/hugetlb.h>
 #include <linux/tboot.h>
 #include <linux/usb/xhci-dbgp.h>
 
@@ -1157,6 +1158,9 @@ void __init setup_arch(char **cmdline_p)
 	initmem_init();
 	dma_contiguous_reserve(max_pfn_mapped << PAGE_SHIFT);
 
+	if (boot_cpu_has(X86_FEATURE_GBPAGES))
+		hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+
 	/*
 	 * Reserve memory for crash kernel after SRAT is parsed so that it
 	 * won't consume hotpluggable memory.
--- a/Documentation/admin-guide/kernel-parameters.txt~mm-hugetlb-optionally-allocate-gigantic-hugepages-using-cma
+++ a/Documentation/admin-guide/kernel-parameters.txt
@@ -1475,6 +1475,14 @@
 	hpet_mmap=	[X86, HPET_MMAP] Allow userspace to mmap HPET
 			registers.  Default set by CONFIG_HPET_MMAP_DEFAULT.
 
+	hugetlb_cma=	[HW] The size of a cma area used for allocation
+			of gigantic hugepages.
+			Format: nn[KMGTPE]
+
+			Reserve a cma area of given size and allocate gigantic
+			hugepages using the cma allocator. If enabled, the
+			boot-time allocation of gigantic hugepages is skipped.
+
 	hugepages=	[HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
 	hugepagesz=	[HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
 			On x86-64 and powerpc, this option can be specified
--- a/include/linux/hugetlb.h~mm-hugetlb-optionally-allocate-gigantic-hugepages-using-cma
+++ a/include/linux/hugetlb.h
@@ -895,4 +895,16 @@ static inline spinlock_t *huge_pte_lock(
 	return ptl;
 }
 
+#if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_CMA)
+extern void __init hugetlb_cma_reserve(int order);
+extern void __init hugetlb_cma_check(void);
+#else
+static inline __init void hugetlb_cma_reserve(int order)
+{
+}
+static inline __init void hugetlb_cma_check(void)
+{
+}
+#endif
+
 #endif /* _LINUX_HUGETLB_H */
--- a/mm/hugetlb.c~mm-hugetlb-optionally-allocate-gigantic-hugepages-using-cma
+++ a/mm/hugetlb.c
@@ -28,6 +28,7 @@
 #include <linux/jhash.h>
 #include <linux/numa.h>
 #include <linux/llist.h>
+#include <linux/cma.h>
 
 #include <asm/page.h>
 #include <asm/pgtable.h>
@@ -44,6 +45,9 @@
 int hugetlb_max_hstate __read_mostly;
 unsigned int default_hstate_idx;
 struct hstate hstates[HUGE_MAX_HSTATE];
+
+static struct cma *hugetlb_cma[MAX_NUMNODES];
+
 /*
  * Minimum page order among possible hugepage sizes, set to a proper value
  * at boot time.
@@ -1228,6 +1232,14 @@ static void destroy_compound_gigantic_pa
 
 static void free_gigantic_page(struct page *page, unsigned int order)
 {
+	/*
+	 * If the page isn't allocated using the cma allocator,
+	 * cma_release() returns false.
+	 */
+	if (IS_ENABLED(CONFIG_CMA) &&
+	    cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
+		return;
+
 	free_contig_range(page_to_pfn(page), 1 << order);
 }
 
@@ -1237,6 +1249,21 @@ static struct page *alloc_gigantic_page(
 {
 	unsigned long nr_pages = 1UL << huge_page_order(h);
 
+	if (IS_ENABLED(CONFIG_CMA)) {
+		struct page *page;
+		int node;
+
+		for_each_node_mask(node, *nodemask) {
+			if (!hugetlb_cma[node])
+				continue;
+
+			page = cma_alloc(hugetlb_cma[node], nr_pages,
+					 huge_page_order(h), true);
+			if (page)
+				return page;
+		}
+	}
+
 	return alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask);
 }
 
@@ -1281,8 +1308,14 @@ static void update_and_free_page(struct
 	set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
 	set_page_refcounted(page);
 	if (hstate_is_gigantic(h)) {
+		/*
+		 * Temporarily drop the hugetlb_lock, because
+		 * we might block in free_gigantic_page().
+		 */
+		spin_unlock(&hugetlb_lock);
 		destroy_compound_gigantic_page(page, huge_page_order(h));
 		free_gigantic_page(page, huge_page_order(h));
+		spin_lock(&hugetlb_lock);
 	} else {
 		__free_pages(page, huge_page_order(h));
 	}
@@ -2539,6 +2572,10 @@ static void __init hugetlb_hstate_alloc_
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
+			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
+				break;
+			}
 			if (!alloc_bootmem_huge_page(h))
 				break;
 		} else if (!alloc_pool_huge_page(h,
@@ -3194,6 +3231,7 @@ static int __init hugetlb_init(void)
 			default_hstate.max_huge_pages = default_hstate_max_huge_pages;
 	}
 
+	hugetlb_cma_check();
 	hugetlb_init_hstates();
 	gather_bootmem_prealloc();
 	report_hugepages();
@@ -5506,3 +5544,74 @@ void move_hugetlb_state(struct page *old
 		spin_unlock(&hugetlb_lock);
 	}
 }
+
+#ifdef CONFIG_CMA
+static unsigned long hugetlb_cma_size __initdata;
+static bool cma_reserve_called __initdata;
+
+static int __init cmdline_parse_hugetlb_cma(char *p)
+{
+	hugetlb_cma_size = memparse(p, &p);
+	return 0;
+}
+
+early_param("hugetlb_cma", cmdline_parse_hugetlb_cma);
+
+void __init hugetlb_cma_reserve(int order)
+{
+	unsigned long size, reserved, per_node;
+	int nid;
+
+	cma_reserve_called = true;
+
+	if (!hugetlb_cma_size)
+		return;
+
+	if (hugetlb_cma_size < (PAGE_SIZE << order)) {
+		pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n",
+			(PAGE_SIZE << order) / SZ_1M);
+		return;
+	}
+
+	/*
+	 * If 3 GB area is requested on a machine with 4 numa nodes,
+	 * let's allocate 1 GB on first three nodes and ignore the last one.
+	 */
+	per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes);
+	pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n",
+		hugetlb_cma_size / SZ_1M, per_node / SZ_1M);
+
+	reserved = 0;
+	for_each_node_state(nid, N_ONLINE) {
+		int res;
+
+		size = min(per_node, hugetlb_cma_size - reserved);
+		size = round_up(size, PAGE_SIZE << order);
+
+		res = cma_declare_contiguous_nid(0, size, 0, PAGE_SIZE << order,
+						 0, false, "hugetlb",
+						 &hugetlb_cma[nid], nid);
+		if (res) {
+			pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
+				res, nid);
+			continue;
+		}
+
+		reserved += size;
+		pr_info("hugetlb_cma: reserved %lu MiB on node %d\n",
+			size / SZ_1M, nid);
+
+		if (reserved >= hugetlb_cma_size)
+			break;
+	}
+}
+
+void __init hugetlb_cma_check(void)
+{
+	if (!hugetlb_cma_size || cma_reserve_called)
+		return;
+
+	pr_warn("hugetlb_cma: the option isn't supported by current arch\n");
+}
+
+#endif /* CONFIG_CMA */
_

* [patch 11/35] mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, bp, jaewon31.kim, linux-mm, mm-commits, torvalds, walken, willy

From: Jaewon Kim <jaewon31.kim@samsung.com>
Subject: mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area

When passing their requirements to vm_unmapped_area(),
arch_get_unmapped_area() and arch_get_unmapped_area_topdown() did not set
align_offset.  Internally, in both unmapped_area() and
unmapped_area_topdown(), if info->align_mask is 0 then info->align_offset
is meaningless.

But commit df529cabb7a2 ("mm: mmap: add trace point of vm_unmapped_area")
always prints info->align_offset even though it is uninitialized.  Fix
this uninitialized value issue by setting it to 0 explicitly.

Before
92.291104: vm_unmapped_area: addr=0x755b155000 err=0 total_vm=0x15aaf0 flags=0x1 len=0x109000 lo=0x8000 hi=0x75eed48000 mask=0x0 ofs=0x4022

After
68.584210: vm_unmapped_area: addr=0x74a4ca1000 err=0 total_vm=0x168ab1 flags=0x1 len=0x9000 lo=0x8000 hi=0x753d94b000 mask=0x0 ofs=0x0

Link: http://lkml.kernel.org/r/20200409094035.19457-1-jaewon31.kim@samsung.com
Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mmap.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/mm/mmap.c~mm-mmap-initialize-align_offset-explicitly-for-vm_unmapped_area
+++ a/mm/mmap.c
@@ -2123,6 +2123,7 @@ arch_get_unmapped_area(struct file *filp
 	info.low_limit = mm->mmap_base;
 	info.high_limit = mmap_end;
 	info.align_mask = 0;
+	info.align_offset = 0;
 	return vm_unmapped_area(&info);
 }
 #endif
@@ -2164,6 +2165,7 @@ arch_get_unmapped_area_topdown(struct fi
 	info.low_limit = max(PAGE_SIZE, mmap_min_addr);
 	info.high_limit = arch_get_mmap_base(addr, mm->mmap_base);
 	info.align_mask = 0;
+	info.align_offset = 0;
 	addr = vm_unmapped_area(&info);
 
 	/*
_

* [patch 12/35] mm/memory.c: refactor insert_page to prepare for batched-lock insert
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, arjunroy, davem, edumazet, jgg, linux-mm, mm-commits, sfr,
	soheil, torvalds, willy

From: Arjun Roy <arjunroy@google.com>
Subject: mm/memory.c: refactor insert_page to prepare for batched-lock insert

Add helper methods for vm_insert_page()/insert_page() to prepare for
vm_insert_pages(), which batch-inserts pages to reduce spinlock operations
when inserting multiple consecutive pages into the user page table.

The intention of this patch-set is to reduce atomic ops for tcp zerocopy
receives, which normally hit the same spinlock multiple times
consecutively.

Link: http://lkml.kernel.org/r/20200128025958.43490-1-arjunroy.kdev@gmail.com
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |   39 ++++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

--- a/mm/memory.c~mm-refactor-insert_page-to-prepare-for-batched-lock-insert
+++ a/mm/memory.c
@@ -1442,6 +1442,27 @@ pte_t *__get_locked_pte(struct mm_struct
 	return pte_alloc_map_lock(mm, pmd, addr, ptl);
 }
 
+static int validate_page_before_insert(struct page *page)
+{
+	if (PageAnon(page) || PageSlab(page) || page_has_type(page))
+		return -EINVAL;
+	flush_dcache_page(page);
+	return 0;
+}
+
+static int insert_page_into_pte_locked(struct mm_struct *mm, pte_t *pte,
+			unsigned long addr, struct page *page, pgprot_t prot)
+{
+	if (!pte_none(*pte))
+		return -EBUSY;
+	/* Ok, finally just insert the thing.. */
+	get_page(page);
+	inc_mm_counter_fast(mm, mm_counter_file(page));
+	page_add_file_rmap(page, false);
+	set_pte_at(mm, addr, pte, mk_pte(page, prot));
+	return 0;
+}
+
 /*
  * This is the old fallback for page remapping.
  *
@@ -1457,26 +1478,14 @@ static int insert_page(struct vm_area_st
 	pte_t *pte;
 	spinlock_t *ptl;
 
-	retval = -EINVAL;
-	if (PageAnon(page) || PageSlab(page) || page_has_type(page))
+	retval = validate_page_before_insert(page);
+	if (retval)
 		goto out;
 	retval = -ENOMEM;
-	flush_dcache_page(page);
 	pte = get_locked_pte(mm, addr, &ptl);
 	if (!pte)
 		goto out;
-	retval = -EBUSY;
-	if (!pte_none(*pte))
-		goto out_unlock;
-
-	/* Ok, finally just insert the thing.. */
-	get_page(page);
-	inc_mm_counter_fast(mm, mm_counter_file(page));
-	page_add_file_rmap(page, false);
-	set_pte_at(mm, addr, pte, mk_pte(page, prot));
-
-	retval = 0;
-out_unlock:
+	retval = insert_page_into_pte_locked(mm, pte, addr, page, prot);
 	pte_unmap_unlock(pte, ptl);
 out:
 	return retval;
_

* [patch 13/35] mm: bring sparc pte_index() semantics inline with other platforms
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, arjunroy.kdev, arjunroy, davem, edumazet, jgg, linux-mm,
	mm-commits, rppt, sfr, soheil, torvalds, willy

From: Arjun Roy <arjunroy@google.com>
Subject: mm: bring sparc pte_index() semantics inline with other platforms

pte_index() on platforms other than sparc returns a numerical index.  On
sparc, it returns a pte_t*.  This presents an issue for vm_insert_pages(),
which relies on pte_index() to find the offset for a pte within a pmd, for
batched inserts.

This patch:
1. Modifies pte_index() for sparc to return a numerical index, like
   other platforms,
2. Defines pte_entry() for sparc which returns a pte_t*
   (as pte_index() used to),
3. Converts existing sparc callers for pte_index() to use pte_entry().

[sfr@canb.auug.org.au: remove pte_entry and just directly modified pte_offset_kernel instead]
Link: http://lkml.kernel.org/r/20200227105045.6b421d9f@canb.auug.org.au
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Arjun Roy <arjunroy.kdev@gmail.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/sparc/include/asm/pgtable_64.h |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- a/arch/sparc/include/asm/pgtable_64.h~mm-bring-sparc-pte_index-semantics-inline-with-other-platforms
+++ a/arch/sparc/include/asm/pgtable_64.h
@@ -907,11 +907,11 @@ static inline unsigned long pud_pfn(pud_
 	 (((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1)))
 
 /* Find an entry in the third-level page table.. */
-#define pte_index(dir, address)	\
-	((pte_t *) __pmd_page(*(dir)) + \
-	 ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)))
-#define pte_offset_kernel		pte_index
-#define pte_offset_map			pte_index
+#define pte_index(address)			\
+	 ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
+#define pte_offset_kernel(dir, address)	\
+	((pte_t *) __pmd_page(*(dir)) + pte_index(address))
+#define pte_offset_map(dir, address)	pte_offset_kernel((dir), (address))
 #define pte_unmap(pte)			do { } while (0)
 
 /* We cannot include <linux/mm_types.h> at this point yet: */
_

* [patch 14/35] mm: define pte_index as macro for x86
From: Andrew Morton @ 2020-04-10 21:32 UTC
  To: akpm, arjunroy, davem, edumazet, jgg, linux-mm, mm-commits, sfr,
	soheil, torvalds, willy

From: Arjun Roy <arjunroy@google.com>
Subject: mm: define pte_index as macro for x86

pte_index() is either defined as a macro (e.g.  sparc64) or as an inlined
function (e.g.  x86).  vm_insert_pages() depends on pte_index but it is
not defined on all platforms (e.g.  m68k).

To fix compilation of vm_insert_pages() on architectures not providing
pte_index(), we perform the following fix:

0. For platforms where it is meaningful, and defined as a macro, no
    change is needed.
1. For platforms where it is meaningful and defined as an inlined
    function, and we want to use it with vm_insert_pages(), we define
    a degenerate macro of the form:  #define pte_index pte_index
2. vm_insert_pages() checks for the existence of a pte_index macro
   definition. If found, it implements a batched insert. If not found,
   it devolves to calling vm_insert_page() in a loop.

This patch implements step 1 for x86.
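
The detection trick in steps 1 and 2 is ordinary preprocessor
feature-testing.  A small userspace sketch of the same idiom (illustrative
only, not kernel code):

#include <stdio.h>

/* Step 1: an arch that implements pte_index() as an inline function
 * also defines a degenerate, self-referential macro so consumers can
 * test for it. */
static inline unsigned long pte_index(unsigned long address)
{
	return (address >> 12) & 511;	/* 4K pages, 512 ptes per table */
}
#define pte_index pte_index

int main(void)
{
/* Step 2: the consumer branches on the macro's existence. */
#ifdef pte_index
	printf("batched path: pte_index(0x5000) = %lu\n",
	       pte_index(0x5000));
#else
	printf("fallback path: insert pages one at a time\n");
#endif
	return 0;
}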

v3 of this patch fixes a compilation warning for an unused method.
v2 of this patch moved a macro definition to a more readable location.

Link: http://lkml.kernel.org/r/20200228054714.204424-1-arjunroy.kdev@gmail.com
Signed-off-by: Arjun Roy <arjunroy@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/include/asm/pgtable.h |    3 +++
 1 file changed, 3 insertions(+)

--- a/arch/x86/include/asm/pgtable.h~mm-define-pte_index-as-macro-for-x86
+++ a/arch/x86/include/asm/pgtable.h
@@ -860,7 +860,10 @@ static inline unsigned long pmd_index(un
  *
  * this function returns the index of the entry in the pte page which would
  * control the given virtual address
+ *
+ * Also define macro so we can test if pte_index is defined for arch.
  */
+#define pte_index pte_index
 static inline unsigned long pte_index(unsigned long address)
 {
 	return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
_

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [patch 15/35] mm/memory.c: add vm_insert_pages()
  2020-04-10 21:30 incoming Andrew Morton
                   ` (13 preceding siblings ...)
  2020-04-10 21:32 ` [patch 14/35] mm: define pte_index as macro for x86 Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 16/35] mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS Andrew Morton
                   ` (19 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, arjunroy, davem, edumazet, jgg, linux-mm, mm-commits, sfr,
	soheil, torvalds, willy

From: Arjun Roy <arjunroy@google.com>
Subject: mm/memory.c: add vm_insert_pages()

Add the ability to insert multiple pages at once into a user VM with
fewer PTE spinlock operations.

The intention of this patch set is to reduce atomic ops for TCP zerocopy
receives, which normally hit the same spinlock multiple times
consecutively.
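
For context, a caller would use the new interface roughly as follows (a
hypothetical sketch, not part of this patch; 'vma', 'pages' and 'nr' are
assumed to be set up by the caller):

	/* Try to map 'nr' kernel pages into the user VMA in one call. */
	unsigned long num = nr;
	int err = vm_insert_pages(vma, vma->vm_start, pages, &num);

	/* On return, 'num' holds the count of pages that were NOT mapped. */
	if (err)
		pr_warn("mapped %lu of %lu pages, error %d\n",
			nr - num, nr, err);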

[akpm@linux-foundation.org: pte_alloc() no longer takes the `addr' argument]
[arjunroy@google.com: add missing page_count() check to vm_insert_pages()]
  Link: http://lkml.kernel.org/r/20200214005929.104481-1-arjunroy.kdev@gmail.com
[arjunroy@google.com: vm_insert_pages() checks if pte_index defined]
  Link: http://lkml.kernel.org/r/20200228054714.204424-2-arjunroy.kdev@gmail.com
Link: http://lkml.kernel.org/r/20200128025958.43490-2-arjunroy.kdev@gmail.com
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mm.h |    2 
 mm/memory.c        |  129 ++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 129 insertions(+), 2 deletions(-)

--- a/include/linux/mm.h~mm-add-vm_insert_pages
+++ a/include/linux/mm.h
@@ -2689,6 +2689,8 @@ struct vm_area_struct *find_extend_vma(s
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
+int vm_insert_pages(struct vm_area_struct *vma, unsigned long addr,
+			struct page **pages, unsigned long *num);
 int vm_map_pages(struct vm_area_struct *vma, struct page **pages,
 				unsigned long num);
 int vm_map_pages_zero(struct vm_area_struct *vma, struct page **pages,
--- a/mm/memory.c~mm-add-vm_insert_pages
+++ a/mm/memory.c
@@ -1419,8 +1419,7 @@ void zap_vma_ptes(struct vm_area_struct
 }
 EXPORT_SYMBOL_GPL(zap_vma_ptes);
 
-pte_t *__get_locked_pte(struct mm_struct *mm, unsigned long addr,
-			spinlock_t **ptl)
+static pmd_t *walk_to_pmd(struct mm_struct *mm, unsigned long addr)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -1439,6 +1438,16 @@ pte_t *__get_locked_pte(struct mm_struct
 		return NULL;
 
 	VM_BUG_ON(pmd_trans_huge(*pmd));
+	return pmd;
+}
+
+pte_t *__get_locked_pte(struct mm_struct *mm, unsigned long addr,
+			spinlock_t **ptl)
+{
+	pmd_t *pmd = walk_to_pmd(mm, addr);
+
+	if (!pmd)
+		return NULL;
 	return pte_alloc_map_lock(mm, pmd, addr, ptl);
 }
 
@@ -1491,6 +1500,122 @@ out:
 	return retval;
 }
 
+#ifdef pte_index
+static int insert_page_in_batch_locked(struct mm_struct *mm, pmd_t *pmd,
+			unsigned long addr, struct page *page, pgprot_t prot)
+{
+	int err;
+
+	if (!page_count(page))
+		return -EINVAL;
+	err = validate_page_before_insert(page);
+	return err ? err : insert_page_into_pte_locked(
+		mm, pte_offset_map(pmd, addr), addr, page, prot);
+}
+
+/* insert_pages() amortizes the cost of spinlock operations
+ * when inserting pages in a loop. Arch *must* define pte_index.
+ */
+static int insert_pages(struct vm_area_struct *vma, unsigned long addr,
+			struct page **pages, unsigned long *num, pgprot_t prot)
+{
+	pmd_t *pmd = NULL;
+	spinlock_t *pte_lock = NULL;
+	struct mm_struct *const mm = vma->vm_mm;
+	unsigned long curr_page_idx = 0;
+	unsigned long remaining_pages_total = *num;
+	unsigned long pages_to_write_in_pmd;
+	int ret;
+more:
+	ret = -EFAULT;
+	pmd = walk_to_pmd(mm, addr);
+	if (!pmd)
+		goto out;
+
+	pages_to_write_in_pmd = min_t(unsigned long,
+		remaining_pages_total, PTRS_PER_PTE - pte_index(addr));
+
+	/* Allocate the PTE if necessary; takes PMD lock once only. */
+	ret = -ENOMEM;
+	if (pte_alloc(mm, pmd))
+		goto out;
+	pte_lock = pte_lockptr(mm, pmd);
+
+	while (pages_to_write_in_pmd) {
+		int pte_idx = 0;
+		const int batch_size = min_t(int, pages_to_write_in_pmd, 8);
+
+		spin_lock(pte_lock);
+		for (; pte_idx < batch_size; ++pte_idx) {
+			int err = insert_page_in_batch_locked(mm, pmd,
+				addr, pages[curr_page_idx], prot);
+			if (unlikely(err)) {
+				spin_unlock(pte_lock);
+				ret = err;
+				remaining_pages_total -= pte_idx;
+				goto out;
+			}
+			addr += PAGE_SIZE;
+			++curr_page_idx;
+		}
+		spin_unlock(pte_lock);
+		pages_to_write_in_pmd -= batch_size;
+		remaining_pages_total -= batch_size;
+	}
+	if (remaining_pages_total)
+		goto more;
+	ret = 0;
+out:
+	*num = remaining_pages_total;
+	return ret;
+}
+#endif  /* ifdef pte_index */
+
+/**
+ * vm_insert_pages - insert multiple pages into user vma, batching the pmd lock.
+ * @vma: user vma to map to
+ * @addr: target start user address of these pages
+ * @pages: source kernel pages
+ * @num: in: number of pages to map. out: number of pages that were *not*
+ * mapped. (0 means all pages were successfully mapped).
+ *
+ * Preferred over vm_insert_page() when inserting multiple pages.
+ *
+ * In case of error, we may have mapped a subset of the provided
+ * pages. It is the caller's responsibility to account for this case.
+ *
+ * The same restrictions apply as in vm_insert_page().
+ */
+int vm_insert_pages(struct vm_area_struct *vma, unsigned long addr,
+			struct page **pages, unsigned long *num)
+{
+#ifdef pte_index
+	const unsigned long end_addr = addr + (*num * PAGE_SIZE) - 1;
+
+	if (addr < vma->vm_start || end_addr >= vma->vm_end)
+		return -EFAULT;
+	if (!(vma->vm_flags & VM_MIXEDMAP)) {
+		BUG_ON(down_read_trylock(&vma->vm_mm->mmap_sem));
+		BUG_ON(vma->vm_flags & VM_PFNMAP);
+		vma->vm_flags |= VM_MIXEDMAP;
+	}
+	/* Defer page refcount checking till we're about to map that page. */
+	return insert_pages(vma, addr, pages, num, vma->vm_page_prot);
+#else
+	unsigned long idx = 0, pgcount = *num;
+	int err;
+
+	for (; idx < pgcount; ++idx) {
+		err = vm_insert_page(vma, addr + (PAGE_SIZE * idx), pages[idx]);
+		if (err)
+			break;
+	}
+	*num = pgcount - idx;
+	return err;
+#endif  /* ifdef pte_index */
+}
+EXPORT_SYMBOL(vm_insert_pages);
+
 /**
  * vm_insert_page - insert single page into user vma
  * @vma: user vma to map to
_

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [patch 16/35] mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
  2020-04-10 21:30 incoming Andrew Morton
                   ` (14 preceding siblings ...)
  2020-04-10 21:33 ` [patch 15/35] mm/memory.c: add vm_insert_pages() Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 17/35] mm/vma: introduce VM_ACCESS_FLAGS Andrew Morton
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, anshuman.khandual, bcain, catalin.marinas, chris, dalias,
	davem, geert, guoren, gxt, heiko.carstens, James.Bottomley,
	jdike, jonas, ley.foon.tan, linux-mm, linux, mm-commits, monstr,
	mpe, msalter, nickhu, paul.walmsley, paulburton, ralf, rth, tglx,
	tony.luck, torvalds, vbabka, vgupta, ysato

From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS

There are many platforms with the exact same value for VM_DATA_DEFAULT_FLAGS.
This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the
existing VM_STACK_DEFAULT_FLAGS.  While here, also define some more macros
with standard VMA access flag combinations that are used frequently across
many platforms.  Apart from simplification, this reduces code duplication
as well.
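
The new common macros can be inferred from the per-arch conversions
below; a sketch of what they expand to (the include/linux/mm.h hunk
itself is not shown in this excerpt, so treat this as a reconstruction):

	/* Inferred from the per-arch replacements in this patch: */
	#define VM_DATA_FLAGS_NON_EXEC	(VM_READ | VM_WRITE | VM_MAYREAD | \
					 VM_MAYWRITE | VM_MAYEXEC)
	#define VM_DATA_FLAGS_EXEC	(VM_READ | VM_WRITE | VM_EXEC | \
					 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
	/* VM_EXEC only if the task's personality implies exec-on-read: */
	#define VM_DATA_FLAGS_TSK_EXEC	(VM_READ | VM_WRITE | \
		((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
		 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)

Platforms whose value matches one of these simply pick the right macro;
VM_DATA_DEFAULT_FLAGS then only needs an arch override when it differs.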

Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/include/asm/page.h      |    3 ---
 arch/arc/include/asm/page.h        |    2 +-
 arch/arm/include/asm/page.h        |    4 +---
 arch/arm64/include/asm/page.h      |    4 +---
 arch/c6x/include/asm/page.h        |    5 +----
 arch/csky/include/asm/page.h       |    3 ---
 arch/h8300/include/asm/page.h      |    2 --
 arch/hexagon/include/asm/page.h    |    3 +--
 arch/ia64/include/asm/page.h       |    5 +----
 arch/m68k/include/asm/page.h       |    3 ---
 arch/microblaze/include/asm/page.h |    2 --
 arch/mips/include/asm/page.h       |    5 +----
 arch/nds32/include/asm/page.h      |    3 ---
 arch/nios2/include/asm/page.h      |    3 +--
 arch/openrisc/include/asm/page.h   |    5 -----
 arch/parisc/include/asm/page.h     |    3 ---
 arch/powerpc/include/asm/page.h    |    9 ++-------
 arch/powerpc/include/asm/page_64.h |    7 ++-----
 arch/riscv/include/asm/page.h      |    3 +--
 arch/s390/include/asm/page.h       |    3 +--
 arch/sh/include/asm/page.h         |    3 ---
 arch/sparc/include/asm/page_32.h   |    3 ---
 arch/sparc/include/asm/page_64.h   |    3 ---
 arch/unicore32/include/asm/page.h  |    3 ---
 arch/x86/include/asm/page_types.h  |    4 +---
 arch/x86/um/asm/vm-flags.h         |   10 ++--------
 arch/xtensa/include/asm/page.h     |    3 ---
 include/linux/mm.h                 |   14 ++++++++++++++
 28 files changed, 31 insertions(+), 89 deletions(-)

--- a/arch/alpha/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/alpha/include/asm/page.h
@@ -90,9 +90,6 @@ typedef struct page *pgtable_t;
 #define virt_addr_valid(kaddr)	pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
 #endif /* CONFIG_DISCONTIGMEM */
 
-#define VM_DATA_DEFAULT_FLAGS		(VM_READ | VM_WRITE | VM_EXEC | \
-					 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
 
--- a/arch/arc/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/arc/include/asm/page.h
@@ -102,7 +102,7 @@ typedef pte_t * pgtable_t;
 #define virt_addr_valid(kaddr)  pfn_valid(virt_to_pfn(kaddr))
 
 /* Default Permissions for stack/heaps pages (Non Executable) */
-#define VM_DATA_DEFAULT_FLAGS   (VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_NON_EXEC
 
 #define WANT_PAGE_VIRTUAL   1
 
--- a/arch/arm64/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/arm64/include/asm/page.h
@@ -36,9 +36,7 @@ extern int pfn_valid(unsigned long);
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS \
-	(((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-	 VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
 
 #include <asm-generic/getorder.h>
 
--- a/arch/arm/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/arm/include/asm/page.h
@@ -161,9 +161,7 @@ extern int pfn_valid(unsigned long);
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS \
-	(((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-	 VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
 
 #include <asm-generic/getorder.h>
 
--- a/arch/c6x/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/c6x/include/asm/page.h
@@ -2,10 +2,7 @@
 #ifndef _ASM_C6X_PAGE_H
 #define _ASM_C6X_PAGE_H
 
-#define VM_DATA_DEFAULT_FLAGS \
-	(VM_READ | VM_WRITE | \
-	((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-		 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
 
 #include <asm-generic/page.h>
 
--- a/arch/csky/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/csky/include/asm/page.h
@@ -85,9 +85,6 @@ extern unsigned long va_pa_offset;
 				 PHYS_OFFSET_OFFSET)
 #define virt_to_page(x)	(mem_map + MAP_NR(x))
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #define pfn_to_kaddr(x)	__va(PFN_PHYS(x))
 
 #include <asm-generic/memory_model.h>
--- a/arch/h8300/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/h8300/include/asm/page.h
@@ -6,8 +6,6 @@
 #include <linux/types.h>
 
 #define MAP_NR(addr) (((uintptr_t)(addr)-PAGE_OFFSET) >> PAGE_SHIFT)
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
 
 #ifndef __ASSEMBLY__
 extern unsigned long rom_length;
--- a/arch/hexagon/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/hexagon/include/asm/page.h
@@ -93,8 +93,7 @@ struct page;
 #define virt_to_page(kaddr) pfn_to_page(PFN_DOWN(__pa(kaddr)))
 
 /* Default vm area behavior is non-executable.  */
-#define VM_DATA_DEFAULT_FLAGS (VM_READ | VM_WRITE | \
-				VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_NON_EXEC
 
 #define pfn_valid(pfn) ((pfn) < max_mapnr)
 #define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
--- a/arch/ia64/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/ia64/include/asm/page.h
@@ -218,10 +218,7 @@ get_order (unsigned long size)
 
 #define PAGE_OFFSET			RGN_BASE(RGN_KERNEL)
 
-#define VM_DATA_DEFAULT_FLAGS		(VM_READ | VM_WRITE |					\
-					 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC |		\
-					 (((current->personality & READ_IMPLIES_EXEC) != 0)	\
-					  ? VM_EXEC : 0))
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
 
 #define GATE_ADDR		RGN_BASE(RGN_GATE)
 
--- a/arch/m68k/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/m68k/include/asm/page.h
@@ -65,9 +65,6 @@ extern unsigned long _ramend;
 #define __phys_to_pfn(paddr)	((unsigned long)((paddr) >> PAGE_SHIFT))
 #define __pfn_to_phys(pfn)	PFN_PHYS(pfn)
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #include <asm-generic/getorder.h>
 
 #endif /* _M68K_PAGE_H */
--- a/arch/microblaze/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/microblaze/include/asm/page.h
@@ -194,8 +194,6 @@ extern int page_is_ram(unsigned long pfn
 
 #ifdef CONFIG_MMU
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
 #endif /* CONFIG_MMU */
 
 #endif /* __KERNEL__ */
--- a/arch/mips/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/mips/include/asm/page.h
@@ -253,10 +253,7 @@ extern bool __virt_addr_valid(const vola
 #define virt_addr_valid(kaddr)						\
 	__virt_addr_valid((const volatile void *) (kaddr))
 
-#define VM_DATA_DEFAULT_FLAGS \
-	(VM_READ | VM_WRITE | \
-	 ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-	 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
 
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
--- a/arch/nds32/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/nds32/include/asm/page.h
@@ -59,9 +59,6 @@ typedef struct page *pgtable_t;
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #endif /* __KERNEL__ */
 
 #endif
--- a/arch/nios2/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/nios2/include/asm/page.h
@@ -98,8 +98,7 @@ static inline bool pfn_valid(unsigned lo
 # define virt_to_page(vaddr)	pfn_to_page(PFN_DOWN(virt_to_phys(vaddr)))
 # define virt_addr_valid(vaddr)	pfn_valid(PFN_DOWN(virt_to_phys(vaddr)))
 
-# define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+# define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_NON_EXEC
 
 #include <asm-generic/memory_model.h>
 
--- a/arch/openrisc/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/openrisc/include/asm/page.h
@@ -86,11 +86,6 @@ typedef struct page *pgtable_t;
 
 #endif /* __ASSEMBLY__ */
 
-
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
-
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
 
--- a/arch/parisc/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/parisc/include/asm/page.h
@@ -180,9 +180,6 @@ extern int npmem_ranges;
 #define page_to_phys(page)	(page_to_pfn(page) << PAGE_SHIFT)
 #define virt_to_page(kaddr)     pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
 #include <asm/pdc.h>
--- a/arch/powerpc/include/asm/page_64.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/powerpc/include/asm/page_64.h
@@ -94,11 +94,8 @@ extern u64 ppc64_pft_size;
  * stack by default, so in the absence of a PT_GNU_STACK program header
  * we turn execute permission off.
  */
-#define VM_STACK_DEFAULT_FLAGS32	(VM_READ | VM_WRITE | VM_EXEC | \
-					 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
-#define VM_STACK_DEFAULT_FLAGS64	(VM_READ | VM_WRITE | \
-					 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_STACK_DEFAULT_FLAGS32	VM_DATA_FLAGS_EXEC
+#define VM_STACK_DEFAULT_FLAGS64	VM_DATA_FLAGS_NON_EXEC
 
 #define VM_STACK_DEFAULT_FLAGS \
 	(is_32bit_task() ? \
--- a/arch/powerpc/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/powerpc/include/asm/page.h
@@ -240,13 +240,8 @@ static inline bool pfn_valid(unsigned lo
  * and needs to be executable.  This means the whole heap ends
  * up being executable.
  */
-#define VM_DATA_DEFAULT_FLAGS32 \
-	(((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-				 VM_READ | VM_WRITE | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
-#define VM_DATA_DEFAULT_FLAGS64	(VM_READ | VM_WRITE | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS32	VM_DATA_FLAGS_TSK_EXEC
+#define VM_DATA_DEFAULT_FLAGS64	VM_DATA_FLAGS_NON_EXEC
 
 #ifdef __powerpc64__
 #include <asm/page_64.h>
--- a/arch/riscv/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/riscv/include/asm/page.h
@@ -137,8 +137,7 @@ extern phys_addr_t __phys_addr_symbol(un
 
 #define virt_addr_valid(vaddr)	(pfn_valid(virt_to_pfn(vaddr)))
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_NON_EXEC
 
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
--- a/arch/s390/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/s390/include/asm/page.h
@@ -181,8 +181,7 @@ int arch_make_page_accessible(struct pag
 
 #define virt_addr_valid(kaddr)	pfn_valid(virt_to_pfn(kaddr))
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_NON_EXEC
 
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
--- a/arch/sh/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/sh/include/asm/page.h
@@ -182,9 +182,6 @@ typedef struct page *pgtable_t;
 #endif
 #define virt_addr_valid(kaddr)	pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
 
--- a/arch/sparc/include/asm/page_32.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/sparc/include/asm/page_32.h
@@ -133,9 +133,6 @@ extern unsigned long pfn_base;
 #define pfn_valid(pfn)		(((pfn) >= (pfn_base)) && (((pfn)-(pfn_base)) < max_mapnr))
 #define virt_addr_valid(kaddr)	((((unsigned long)(kaddr)-PAGE_OFFSET)>>PAGE_SHIFT) < max_mapnr)
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #include <asm-generic/memory_model.h>
 #include <asm-generic/getorder.h>
 
--- a/arch/sparc/include/asm/page_64.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/sparc/include/asm/page_64.h
@@ -158,9 +158,6 @@ extern unsigned long PAGE_OFFSET;
 
 #endif /* !(__ASSEMBLY__) */
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #include <asm-generic/getorder.h>
 
 #endif /* _SPARC64_PAGE_H */
--- a/arch/unicore32/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/unicore32/include/asm/page.h
@@ -69,9 +69,6 @@ extern int pfn_valid(unsigned long);
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS \
-	(VM_READ | VM_WRITE | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #include <asm-generic/getorder.h>
 
 #endif
--- a/arch/x86/include/asm/page_types.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/x86/include/asm/page_types.h
@@ -35,9 +35,7 @@
 
 #define PAGE_OFFSET		((unsigned long)__PAGE_OFFSET)
 
-#define VM_DATA_DEFAULT_FLAGS \
-	(((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0 ) | \
-	 VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
 
 #define __PHYSICAL_START	ALIGN(CONFIG_PHYSICAL_START, \
 				      CONFIG_PHYSICAL_ALIGN)
--- a/arch/x86/um/asm/vm-flags.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/x86/um/asm/vm-flags.h
@@ -9,17 +9,11 @@
 
 #ifdef CONFIG_X86_32
 
-#define VM_DATA_DEFAULT_FLAGS \
-	(VM_READ | VM_WRITE | \
-	((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0 ) | \
-		 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
 
 #else
 
-#define VM_DATA_DEFAULT_FLAGS (VM_READ | VM_WRITE | VM_EXEC | \
-	VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-#define VM_STACK_DEFAULT_FLAGS (VM_GROWSDOWN | VM_READ | VM_WRITE | \
-	VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_STACK_DEFAULT_FLAGS (VM_GROWSDOWN | VM_DATA_FLAGS_EXEC)
 
 #endif
 #endif
--- a/arch/xtensa/include/asm/page.h~mm-vma-define-a-default-value-for-vm_data_default_flags
+++ a/arch/xtensa/include/asm/page.h
@@ -203,8 +203,5 @@ static inline unsigned long ___pa(unsign
 
 #endif /* __ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [patch 17/35] mm/vma: introduce VM_ACCESS_FLAGS
  2020-04-10 21:30 incoming Andrew Morton
                   ` (15 preceding siblings ...)
  2020-04-10 21:33 ` [patch 16/35] mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 18/35] mm/special: create generic fallbacks for pte_special() and pte_mkspecial() Andrew Morton
                   ` (17 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, anshuman.khandual, catalin.marinas, dave.hansen, geert,
	gregkh, gxt, heiko.carstens, ley.foon.tan, linux-mm, linux,
	mm-commits, mpe, msalter, nickhu, rspringer, tglx, torvalds,
	vbabka, ysato

From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/vma: introduce VM_ACCESS_FLAGS

There are many places where all basic VMA access flags (read, write, exec)
are initialized or checked against as a group.  One such example is during
page fault.  The existing vma_is_accessible() wrapper already creates the
notion of VMA accessibility as a group of access permissions.  Hence let's
just create VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC), which will not
only reduce code duplication but also extend the VMA accessibility concept
in general.
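
The consolidation is mechanical; a minimal sketch of the pattern (the
do_something() callee is hypothetical, standing in for any call site):

	/* Before: every call site spells out the group by hand. */
	if (vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC))
		do_something(vma);

	/* After: one named mask, defined once in include/linux/mm.h. */
	#define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC)

	if (vma->vm_flags & VM_ACCESS_FLAGS)
		do_something(vma);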

Link: http://lkml.kernel.org/r/1583391014-8170-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Springer <rspringer@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm/mm/fault.c                  |    2 +-
 arch/arm64/mm/fault.c                |    2 +-
 arch/nds32/mm/fault.c                |    2 +-
 arch/powerpc/mm/book3s64/pkeys.c     |    2 +-
 arch/s390/mm/fault.c                 |    2 +-
 arch/unicore32/mm/fault.c            |    2 +-
 arch/x86/mm/pkeys.c                  |    2 +-
 drivers/staging/gasket/gasket_core.c |    2 +-
 include/linux/mm.h                   |    6 +++++-
 mm/mmap.c                            |    2 +-
 mm/mprotect.c                        |    4 ++--
 11 files changed, 16 insertions(+), 12 deletions(-)

--- a/arch/arm64/mm/fault.c~mm-vma-introduce-vm_access_flags
+++ a/arch/arm64/mm/fault.c
@@ -445,7 +445,7 @@ static int __kprobes do_page_fault(unsig
 	const struct fault_info *inf;
 	struct mm_struct *mm = current->mm;
 	vm_fault_t fault, major = 0;
-	unsigned long vm_flags = VM_READ | VM_WRITE | VM_EXEC;
+	unsigned long vm_flags = VM_ACCESS_FLAGS;
 	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
 
 	if (kprobe_page_fault(regs, esr))
--- a/arch/arm/mm/fault.c~mm-vma-introduce-vm_access_flags
+++ a/arch/arm/mm/fault.c
@@ -189,7 +189,7 @@ void do_bad_area(unsigned long addr, uns
  */
 static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma)
 {
-	unsigned int mask = VM_READ | VM_WRITE | VM_EXEC;
+	unsigned int mask = VM_ACCESS_FLAGS;
 
 	if ((fsr & FSR_WRITE) && !(fsr & FSR_CM))
 		mask = VM_WRITE;
--- a/arch/nds32/mm/fault.c~mm-vma-introduce-vm_access_flags
+++ a/arch/nds32/mm/fault.c
@@ -79,7 +79,7 @@ void do_page_fault(unsigned long entry,
 	struct vm_area_struct *vma;
 	int si_code;
 	vm_fault_t fault;
-	unsigned int mask = VM_READ | VM_WRITE | VM_EXEC;
+	unsigned int mask = VM_ACCESS_FLAGS;
 	unsigned int flags = FAULT_FLAG_DEFAULT;
 
 	error_code = error_code & (ITYPE_mskINST | ITYPE_mskETYPE);
--- a/arch/powerpc/mm/book3s64/pkeys.c~mm-vma-introduce-vm_access_flags
+++ a/arch/powerpc/mm/book3s64/pkeys.c
@@ -315,7 +315,7 @@ int __execute_only_pkey(struct mm_struct
 static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
 {
 	/* Do this check first since the vm_flags should be hot */
-	if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
+	if ((vma->vm_flags & VM_ACCESS_FLAGS) != VM_EXEC)
 		return false;
 
 	return (vma_pkey(vma) == vma->vm_mm->context.execute_only_pkey);
--- a/arch/s390/mm/fault.c~mm-vma-introduce-vm_access_flags
+++ a/arch/s390/mm/fault.c
@@ -580,7 +580,7 @@ void do_dat_exception(struct pt_regs *re
 	int access;
 	vm_fault_t fault;
 
-	access = VM_READ | VM_EXEC | VM_WRITE;
+	access = VM_ACCESS_FLAGS;
 	fault = do_exception(regs, access);
 	if (unlikely(fault))
 		do_fault_error(regs, access, fault);
--- a/arch/unicore32/mm/fault.c~mm-vma-introduce-vm_access_flags
+++ a/arch/unicore32/mm/fault.c
@@ -149,7 +149,7 @@ void do_bad_area(unsigned long addr, uns
  */
 static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma)
 {
-	unsigned int mask = VM_READ | VM_WRITE | VM_EXEC;
+	unsigned int mask = VM_ACCESS_FLAGS;
 
 	if (!(fsr ^ 0x12))	/* write? */
 		mask = VM_WRITE;
--- a/arch/x86/mm/pkeys.c~mm-vma-introduce-vm_access_flags
+++ a/arch/x86/mm/pkeys.c
@@ -63,7 +63,7 @@ int __execute_only_pkey(struct mm_struct
 static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
 {
 	/* Do this check first since the vm_flags should be hot */
-	if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
+	if ((vma->vm_flags & VM_ACCESS_FLAGS) != VM_EXEC)
 		return false;
 	if (vma_pkey(vma) != vma->vm_mm->context.execute_only_pkey)
 		return false;
--- a/drivers/staging/gasket/gasket_core.c~mm-vma-introduce-vm_access_flags
+++ a/drivers/staging/gasket/gasket_core.c
@@ -689,7 +689,7 @@ static bool gasket_mmap_has_permissions(
 
 	/* Make sure that no wrong flags are set. */
 	requested_permissions =
-		(vma->vm_flags & (VM_WRITE | VM_READ | VM_EXEC));
+		(vma->vm_flags & VM_ACCESS_FLAGS);
 	if (requested_permissions & ~(bar_permissions)) {
 		dev_dbg(gasket_dev->dev,
 			"Attempting to map a region with requested permissions 0x%x, but region has permissions 0x%x.\n",
--- a/include/linux/mm.h~mm-vma-introduce-vm_access_flags
+++ a/include/linux/mm.h
@@ -369,6 +369,10 @@ extern unsigned int kobjsize(const void
 
 #define VM_STACK_FLAGS	(VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT)
 
+/* VMA basic access permission flags */
+#define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC)
+
+
 /*
  * Special vmas that are non-mergable, non-mlock()able.
  */
@@ -646,7 +650,7 @@ static inline bool vma_is_foreign(struct
 
 static inline bool vma_is_accessible(struct vm_area_struct *vma)
 {
-	return vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
+	return vma->vm_flags & VM_ACCESS_FLAGS;
 }
 
 #ifdef CONFIG_SHMEM
--- a/mm/mmap.c~mm-vma-introduce-vm_access_flags
+++ a/mm/mmap.c
@@ -1224,7 +1224,7 @@ static int anon_vma_compatible(struct vm
 	return a->vm_end == b->vm_start &&
 		mpol_equal(vma_policy(a), vma_policy(b)) &&
 		a->vm_file == b->vm_file &&
-		!((a->vm_flags ^ b->vm_flags) & ~(VM_READ|VM_WRITE|VM_EXEC|VM_SOFTDIRTY)) &&
+		!((a->vm_flags ^ b->vm_flags) & ~(VM_ACCESS_FLAGS | VM_SOFTDIRTY)) &&
 		b->vm_pgoff == a->vm_pgoff + ((b->vm_start - a->vm_start) >> PAGE_SHIFT);
 }
 
--- a/mm/mprotect.c~mm-vma-introduce-vm_access_flags
+++ a/mm/mprotect.c
@@ -419,7 +419,7 @@ mprotect_fixup(struct vm_area_struct *vm
 	 */
 	if (arch_has_pfn_modify_check() &&
 	    (vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) &&
-	    (newflags & (VM_READ|VM_WRITE|VM_EXEC)) == 0) {
+	    (newflags & VM_ACCESS_FLAGS) == 0) {
 		pgprot_t new_pgprot = vm_get_page_prot(newflags);
 
 		error = walk_page_range(current->mm, start, end,
@@ -598,7 +598,7 @@ static int do_mprotect_pkey(unsigned lon
 		newflags |= (vma->vm_flags & ~mask_off_old_flags);
 
 		/* newflags >> 4 shift VM_MAY% in place of VM_% */
-		if ((newflags & ~(newflags >> 4)) & (VM_READ | VM_WRITE | VM_EXEC)) {
+		if ((newflags & ~(newflags >> 4)) & VM_ACCESS_FLAGS) {
 			error = -EACCES;
 			goto out;
 		}
_

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [patch 18/35] mm/special: create generic fallbacks for pte_special() and pte_mkspecial()
  2020-04-10 21:30 incoming Andrew Morton
                   ` (16 preceding siblings ...)
  2020-04-10 21:33 ` [patch 17/35] mm/vma: introduce VM_ACCESS_FLAGS Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 19/35] mm/memory_hotplug: drop the flags field from struct mhp_restrictions Andrew Morton
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, anshuman.khandual, anton.ivanov, bcain, chris, davem,
	deanbo422, deller, fenghua.yu, geert, green.hu, guoren, gxt, ink,
	James.Bottomley, jcmvbkbc, jdike, jonas, ley.foon.tan, linux-mm,
	linux, mattst88, mm-commits, monstr, nickhu, paulburton, ralf,
	richard, rth, sammy, shorne, stefan.kristiansson, tony.luck,
	torvalds, tsbogend

From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/special: create generic fallbacks for pte_special() and pte_mkspecial()

Currently there are many platforms that don't enable ARCH_HAS_PTE_SPECIAL
but are required to define quite similar fallback stubs for special page
table entry helpers such as pte_special() and pte_mkspecial(), as these
get built into generic MM without a config check.  This creates two
generic fallback stub definitions for these helpers, eliminating much code
duplication.

The mips platform has a special case where pte_special() and
pte_mkspecial() visibility is wider than what ARCH_HAS_PTE_SPECIAL
enablement requires.  This restricts those symbols' visibility in order to
avoid the redefinitions that the new generic stubs would otherwise expose,
and the resulting build failure.  The arm platform's set_pte_at()
definition needs to be moved into a C file just to prevent a build
failure.
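
The resulting generic stubs (see the include/linux/mm.h hunk at the end
of this patch) reduce every non-ARCH_HAS_PTE_SPECIAL platform to:

	#ifndef CONFIG_ARCH_HAS_PTE_SPECIAL
	static inline int pte_special(pte_t pte)
	{
		return 0;
	}

	static inline pte_t pte_mkspecial(pte_t pte)
	{
		return pte;
	}
	#endif

Platforms that select ARCH_HAS_PTE_SPECIAL keep their own real
definitions, as the mips hunk below demonstrates.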

[anshuman.khandual@arm.com: use defined(CONFIG_ARCH_HAS_PTE_SPECIAL) in mips per Thomas]
  Link: http://lkml.kernel.org/r/1583851924-21603-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1583802551-15406-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Guo Ren <guoren@kernel.org>			[csky]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Acked-by: Stafford Horne <shorne@gmail.com>		[openrisc]
Acked-by: Helge Deller <deller@gmx.de>			[parisc]
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Sam Creasey <sammy@sammy.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/alpha/include/asm/pgtable.h         |    2 
 arch/arm/include/asm/pgtable-2level.h    |    2 
 arch/arm/include/asm/pgtable.h           |   15 -------
 arch/arm/mm/mmu.c                        |   14 ++++++
 arch/csky/include/asm/pgtable.h          |    3 -
 arch/hexagon/include/asm/pgtable.h       |    2 
 arch/ia64/include/asm/pgtable.h          |    2 
 arch/m68k/include/asm/mcf_pgtable.h      |   10 ----
 arch/m68k/include/asm/motorola_pgtable.h |    2 
 arch/m68k/include/asm/sun3_pgtable.h     |    2 
 arch/microblaze/include/asm/pgtable.h    |    4 -
 arch/mips/include/asm/pgtable.h          |   44 ++++++++++++++-------
 arch/nds32/include/asm/pgtable.h         |    9 ----
 arch/nios2/include/asm/pgtable.h         |    3 -
 arch/openrisc/include/asm/pgtable.h      |    2 
 arch/parisc/include/asm/pgtable.h        |    2 
 arch/sparc/include/asm/pgtable_32.h      |    7 ---
 arch/um/include/asm/pgtable.h            |   10 ----
 arch/unicore32/include/asm/pgtable.h     |    3 -
 arch/xtensa/include/asm/pgtable.h        |    3 -
 include/linux/mm.h                       |   12 +++++
 21 files changed, 58 insertions(+), 95 deletions(-)

--- a/arch/alpha/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/alpha/include/asm/pgtable.h
@@ -268,7 +268,6 @@ extern inline void pud_clear(pud_t * pud
 extern inline int pte_write(pte_t pte)		{ return !(pte_val(pte) & _PAGE_FOW); }
 extern inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
 extern inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
-extern inline int pte_special(pte_t pte)	{ return 0; }
 
 extern inline pte_t pte_wrprotect(pte_t pte)	{ pte_val(pte) |= _PAGE_FOW; return pte; }
 extern inline pte_t pte_mkclean(pte_t pte)	{ pte_val(pte) &= ~(__DIRTY_BITS); return pte; }
@@ -276,7 +275,6 @@ extern inline pte_t pte_mkold(pte_t pte)
 extern inline pte_t pte_mkwrite(pte_t pte)	{ pte_val(pte) &= ~_PAGE_FOW; return pte; }
 extern inline pte_t pte_mkdirty(pte_t pte)	{ pte_val(pte) |= __DIRTY_BITS; return pte; }
 extern inline pte_t pte_mkyoung(pte_t pte)	{ pte_val(pte) |= __ACCESS_BITS; return pte; }
-extern inline pte_t pte_mkspecial(pte_t pte)	{ return pte; }
 
 #define PAGE_DIR_OFFSET(tsk,address) pgd_offset((tsk),(address))
 
--- a/arch/arm/include/asm/pgtable-2level.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/arm/include/asm/pgtable-2level.h
@@ -211,8 +211,6 @@ static inline pmd_t *pmd_offset(pud_t *p
 #define pmd_addr_end(addr,end) (end)
 
 #define set_pte_ext(ptep,pte,ext) cpu_set_pte_ext(ptep,pte,ext)
-#define pte_special(pte)	(0)
-static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
 
 /*
  * We don't have huge page support for short descriptors, for the moment
--- a/arch/arm/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/arm/include/asm/pgtable.h
@@ -243,19 +243,8 @@ static inline void __sync_icache_dcache(
 extern void __sync_icache_dcache(pte_t pteval);
 #endif
 
-static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
-			      pte_t *ptep, pte_t pteval)
-{
-	unsigned long ext = 0;
-
-	if (addr < TASK_SIZE && pte_valid_user(pteval)) {
-		if (!pte_special(pteval))
-			__sync_icache_dcache(pteval);
-		ext |= PTE_EXT_NG;
-	}
-
-	set_pte_ext(ptep, pteval, ext);
-}
+void set_pte_at(struct mm_struct *mm, unsigned long addr,
+		      pte_t *ptep, pte_t pteval);
 
 static inline pte_t clear_pte_bit(pte_t pte, pgprot_t prot)
 {
--- a/arch/arm/mm/mmu.c~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/arm/mm/mmu.c
@@ -1646,3 +1646,17 @@ void __init early_mm_init(const struct m
 	build_mem_type_table();
 	early_paging_init(mdesc);
 }
+
+void set_pte_at(struct mm_struct *mm, unsigned long addr,
+			      pte_t *ptep, pte_t pteval)
+{
+	unsigned long ext = 0;
+
+	if (addr < TASK_SIZE && pte_valid_user(pteval)) {
+		if (!pte_special(pteval))
+			__sync_icache_dcache(pteval);
+		ext |= PTE_EXT_NG;
+	}
+
+	set_pte_ext(ptep, pteval, ext);
+}
--- a/arch/csky/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/csky/include/asm/pgtable.h
@@ -110,9 +110,6 @@ extern unsigned long empty_zero_page[PAG
 extern void load_pgd(unsigned long pg_dir);
 extern pte_t invalid_pte_table[PTRS_PER_PTE];
 
-static inline int pte_special(pte_t pte) { return 0; }
-static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
-
 static inline void set_pte(pte_t *p, pte_t pte)
 {
 	*p = pte;
--- a/arch/hexagon/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/hexagon/include/asm/pgtable.h
@@ -158,8 +158,6 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD
 
 /* Seems to be zero even in architectures where the zero page is firewalled? */
 #define FIRST_USER_ADDRESS 0UL
-#define pte_special(pte)	0
-#define pte_mkspecial(pte)	(pte)
 
 /*  HUGETLB not working currently  */
 #ifdef CONFIG_HUGETLB_PAGE
--- a/arch/ia64/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/ia64/include/asm/pgtable.h
@@ -298,7 +298,6 @@ extern unsigned long VMALLOC_END;
 #define pte_exec(pte)		((pte_val(pte) & _PAGE_AR_RX) != 0)
 #define pte_dirty(pte)		((pte_val(pte) & _PAGE_D) != 0)
 #define pte_young(pte)		((pte_val(pte) & _PAGE_A) != 0)
-#define pte_special(pte)	0
 
 /*
  * Note: we convert AR_RWX to AR_RX and AR_RW to AR_R by clearing the 2nd bit in the
@@ -311,7 +310,6 @@ extern unsigned long VMALLOC_END;
 #define pte_mkclean(pte)	(__pte(pte_val(pte) & ~_PAGE_D))
 #define pte_mkdirty(pte)	(__pte(pte_val(pte) | _PAGE_D))
 #define pte_mkhuge(pte)		(__pte(pte_val(pte)))
-#define pte_mkspecial(pte)	(pte)
 
 /*
  * Because ia64's Icache and Dcache is not coherent (on a cpu), we need to
--- a/arch/m68k/include/asm/mcf_pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/m68k/include/asm/mcf_pgtable.h
@@ -235,11 +235,6 @@ static inline int pte_young(pte_t pte)
 	return pte_val(pte) & CF_PAGE_ACCESSED;
 }
 
-static inline int pte_special(pte_t pte)
-{
-	return 0;
-}
-
 static inline pte_t pte_wrprotect(pte_t pte)
 {
 	pte_val(pte) &= ~CF_PAGE_WRITABLE;
@@ -312,11 +307,6 @@ static inline pte_t pte_mkcache(pte_t pt
 	return pte;
 }
 
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	return pte;
-}
-
 #define swapper_pg_dir kernel_pg_dir
 extern pgd_t kernel_pg_dir[PTRS_PER_PGD];
 
--- a/arch/m68k/include/asm/motorola_pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/m68k/include/asm/motorola_pgtable.h
@@ -174,7 +174,6 @@ static inline void pud_set(pud_t *pudp,
 static inline int pte_write(pte_t pte)		{ return !(pte_val(pte) & _PAGE_RONLY); }
 static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
 static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte)	{ return 0; }
 
 static inline pte_t pte_wrprotect(pte_t pte)	{ pte_val(pte) |= _PAGE_RONLY; return pte; }
 static inline pte_t pte_mkclean(pte_t pte)	{ pte_val(pte) &= ~_PAGE_DIRTY; return pte; }
@@ -192,7 +191,6 @@ static inline pte_t pte_mkcache(pte_t pt
 	pte_val(pte) = (pte_val(pte) & _CACHEMASK040) | m68k_supervisor_cachemode;
 	return pte;
 }
-static inline pte_t pte_mkspecial(pte_t pte)	{ return pte; }
 
 #define PAGE_DIR_OFFSET(tsk,address) pgd_offset((tsk),(address))
 
--- a/arch/m68k/include/asm/sun3_pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/m68k/include/asm/sun3_pgtable.h
@@ -155,7 +155,6 @@ static inline void pmd_clear (pmd_t *pmd
 static inline int pte_write(pte_t pte)		{ return pte_val(pte) & SUN3_PAGE_WRITEABLE; }
 static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & SUN3_PAGE_MODIFIED; }
 static inline int pte_young(pte_t pte)		{ return pte_val(pte) & SUN3_PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte)	{ return 0; }
 
 static inline pte_t pte_wrprotect(pte_t pte)	{ pte_val(pte) &= ~SUN3_PAGE_WRITEABLE; return pte; }
 static inline pte_t pte_mkclean(pte_t pte)	{ pte_val(pte) &= ~SUN3_PAGE_MODIFIED; return pte; }
@@ -168,7 +167,6 @@ static inline pte_t pte_mknocache(pte_t
 //static inline pte_t pte_mkcache(pte_t pte)	{ pte_val(pte) &= SUN3_PAGE_NOCACHE; return pte; }
 // until then, use:
 static inline pte_t pte_mkcache(pte_t pte)	{ return pte; }
-static inline pte_t pte_mkspecial(pte_t pte)	{ return pte; }
 
 extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 extern pgd_t kernel_pg_dir[PTRS_PER_PGD];
--- a/arch/microblaze/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/microblaze/include/asm/pgtable.h
@@ -77,10 +77,6 @@ extern pte_t *va_to_pte(unsigned long ad
  * Undefined behaviour if not..
  */
 
-static inline int pte_special(pte_t pte)	{ return 0; }
-
-static inline pte_t pte_mkspecial(pte_t pte)	{ return pte; }
-
 /* Start and end of the vmalloc area. */
 /* Make sure to map the vmalloc area above the pinned kernel memory area
    of 32Mb.  */
--- a/arch/mips/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/mips/include/asm/pgtable.h
@@ -270,6 +270,36 @@ cache_sync_done:
 extern pgd_t swapper_pg_dir[];
 
 /*
+ * Platform specific pte_special() and pte_mkspecial() definitions
+ * are required only when ARCH_HAS_PTE_SPECIAL is enabled.
+ */
+#if defined(CONFIG_ARCH_HAS_PTE_SPECIAL)
+#if defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32)
+static inline int pte_special(pte_t pte)
+{
+	return pte.pte_low & _PAGE_SPECIAL;
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	pte.pte_low |= _PAGE_SPECIAL;
+	return pte;
+}
+#else
+static inline int pte_special(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_SPECIAL;
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	pte_val(pte) |= _PAGE_SPECIAL;
+	return pte;
+}
+#endif
+#endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
+
+/*
  * The following only work if pte_present() is true.
  * Undefined behaviour if not..
  */
@@ -277,7 +307,6 @@ extern pgd_t swapper_pg_dir[];
 static inline int pte_write(pte_t pte)	{ return pte.pte_low & _PAGE_WRITE; }
 static inline int pte_dirty(pte_t pte)	{ return pte.pte_low & _PAGE_MODIFIED; }
 static inline int pte_young(pte_t pte)	{ return pte.pte_low & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte) { return pte.pte_low & _PAGE_SPECIAL; }
 
 static inline pte_t pte_wrprotect(pte_t pte)
 {
@@ -338,17 +367,10 @@ static inline pte_t pte_mkyoung(pte_t pt
 	}
 	return pte;
 }
-
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	pte.pte_low |= _PAGE_SPECIAL;
-	return pte;
-}
 #else
 static inline int pte_write(pte_t pte)	{ return pte_val(pte) & _PAGE_WRITE; }
 static inline int pte_dirty(pte_t pte)	{ return pte_val(pte) & _PAGE_MODIFIED; }
 static inline int pte_young(pte_t pte)	{ return pte_val(pte) & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte) { return pte_val(pte) & _PAGE_SPECIAL; }
 
 static inline pte_t pte_wrprotect(pte_t pte)
 {
@@ -392,12 +414,6 @@ static inline pte_t pte_mkyoung(pte_t pt
 	return pte;
 }
 
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	pte_val(pte) |= _PAGE_SPECIAL;
-	return pte;
-}
-
 #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
 static inline int pte_huge(pte_t pte)	{ return pte_val(pte) & _PAGE_HUGE; }
 
--- a/arch/nds32/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/nds32/include/asm/pgtable.h
@@ -286,15 +286,6 @@ PTE_BIT_FUNC(mkclean, &=~_PAGE_D);
 PTE_BIT_FUNC(mkdirty, |=_PAGE_D);
 PTE_BIT_FUNC(mkold, &=~_PAGE_YOUNG);
 PTE_BIT_FUNC(mkyoung, |=_PAGE_YOUNG);
-static inline int pte_special(pte_t pte)
-{
-	return 0;
-}
-
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	return pte;
-}
 
 /*
  * Mark the prot value as uncacheable and unbufferable.
--- a/arch/nios2/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/nios2/include/asm/pgtable.h
@@ -113,7 +113,6 @@ static inline int pte_dirty(pte_t pte)
 	{ return pte_val(pte) & _PAGE_DIRTY; }
 static inline int pte_young(pte_t pte)		\
 	{ return pte_val(pte) & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte)	{ return 0; }
 
 #define pgprot_noncached pgprot_noncached
 
@@ -168,8 +167,6 @@ static inline pte_t pte_mkdirty(pte_t pt
 	return pte;
 }
 
-static inline pte_t pte_mkspecial(pte_t pte)	{ return pte; }
-
 static inline pte_t pte_mkyoung(pte_t pte)
 {
 	pte_val(pte) |= _PAGE_ACCESSED;
--- a/arch/openrisc/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/openrisc/include/asm/pgtable.h
@@ -236,8 +236,6 @@ static inline int pte_write(pte_t pte) {
 static inline int pte_exec(pte_t pte)  { return pte_val(pte) & _PAGE_EXEC; }
 static inline int pte_dirty(pte_t pte) { return pte_val(pte) & _PAGE_DIRTY; }
 static inline int pte_young(pte_t pte) { return pte_val(pte) & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte) { return 0; }
-static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
 
 static inline pte_t pte_wrprotect(pte_t pte)
 {
--- a/arch/parisc/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/parisc/include/asm/pgtable.h
@@ -377,7 +377,6 @@ static inline void pud_clear(pud_t *pud)
 static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
 static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
 static inline int pte_write(pte_t pte)		{ return pte_val(pte) & _PAGE_WRITE; }
-static inline int pte_special(pte_t pte)	{ return 0; }
 
 static inline pte_t pte_mkclean(pte_t pte)	{ pte_val(pte) &= ~_PAGE_DIRTY; return pte; }
 static inline pte_t pte_mkold(pte_t pte)	{ pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
@@ -385,7 +384,6 @@ static inline pte_t pte_wrprotect(pte_t
 static inline pte_t pte_mkdirty(pte_t pte)	{ pte_val(pte) |= _PAGE_DIRTY; return pte; }
 static inline pte_t pte_mkyoung(pte_t pte)	{ pte_val(pte) |= _PAGE_ACCESSED; return pte; }
 static inline pte_t pte_mkwrite(pte_t pte)	{ pte_val(pte) |= _PAGE_WRITE; return pte; }
-static inline pte_t pte_mkspecial(pte_t pte)	{ return pte; }
 
 /*
  * Huge pte definitions.
--- a/arch/sparc/include/asm/pgtable_32.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/sparc/include/asm/pgtable_32.h
@@ -223,11 +223,6 @@ static inline int pte_young(pte_t pte)
 	return pte_val(pte) & SRMMU_REF;
 }
 
-static inline int pte_special(pte_t pte)
-{
-	return 0;
-}
-
 static inline pte_t pte_wrprotect(pte_t pte)
 {
 	return __pte(pte_val(pte) & ~SRMMU_WRITE);
@@ -258,8 +253,6 @@ static inline pte_t pte_mkyoung(pte_t pt
 	return __pte(pte_val(pte) | SRMMU_REF);
 }
 
-#define pte_mkspecial(pte)    (pte)
-
 #define pfn_pte(pfn, prot)		mk_pte(pfn_to_page(pfn), prot)
 
 static inline unsigned long pte_pfn(pte_t pte)
--- a/arch/um/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/um/include/asm/pgtable.h
@@ -167,11 +167,6 @@ static inline int pte_newprot(pte_t pte)
 	return(pte_present(pte) && (pte_get_bits(pte, _PAGE_NEWPROT)));
 }
 
-static inline int pte_special(pte_t pte)
-{
-	return 0;
-}
-
 /*
  * =================================
  * Flags setting section.
@@ -247,11 +242,6 @@ static inline pte_t pte_mknewpage(pte_t
 	return(pte);
 }
 
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	return(pte);
-}
-
 static inline void set_pte(pte_t *pteptr, pte_t pteval)
 {
 	pte_copy(*pteptr, pteval);
--- a/arch/unicore32/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/unicore32/include/asm/pgtable.h
@@ -177,7 +177,6 @@ extern struct page *empty_zero_page;
 #define pte_dirty(pte)		(pte_val(pte) & PTE_DIRTY)
 #define pte_young(pte)		(pte_val(pte) & PTE_YOUNG)
 #define pte_exec(pte)		(pte_val(pte) & PTE_EXEC)
-#define pte_special(pte)	(0)
 
 #define PTE_BIT_FUNC(fn, op) \
 static inline pte_t pte_##fn(pte_t pte) { pte_val(pte) op; return pte; }
@@ -189,8 +188,6 @@ PTE_BIT_FUNC(mkdirty,   |= PTE_DIRTY);
 PTE_BIT_FUNC(mkold,     &= ~PTE_YOUNG);
 PTE_BIT_FUNC(mkyoung,   |= PTE_YOUNG);
 
-static inline pte_t pte_mkspecial(pte_t pte) { return pte; }
-
 /*
  * Mark the prot value as uncacheable.
  */
--- a/arch/xtensa/include/asm/pgtable.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/arch/xtensa/include/asm/pgtable.h
@@ -266,7 +266,6 @@ static inline void paging_init(void) { }
 static inline int pte_write(pte_t pte) { return pte_val(pte) & _PAGE_WRITABLE; }
 static inline int pte_dirty(pte_t pte) { return pte_val(pte) & _PAGE_DIRTY; }
 static inline int pte_young(pte_t pte) { return pte_val(pte) & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte) { return 0; }
 
 static inline pte_t pte_wrprotect(pte_t pte)	
 	{ pte_val(pte) &= ~(_PAGE_WRITABLE | _PAGE_HW_WRITE); return pte; }
@@ -280,8 +279,6 @@ static inline pte_t pte_mkyoung(pte_t pt
 	{ pte_val(pte) |= _PAGE_ACCESSED; return pte; }
 static inline pte_t pte_mkwrite(pte_t pte)
 	{ pte_val(pte) |= _PAGE_WRITABLE; return pte; }
-static inline pte_t pte_mkspecial(pte_t pte)
-	{ return pte; }
 
 #define pgprot_noncached(prot) (__pgprot(pgprot_val(prot) & ~_PAGE_CA_MASK))
 
--- a/include/linux/mm.h~mm-special-create-generic-fallbacks-for-pte_special-and-pte_mkspecial
+++ a/include/linux/mm.h
@@ -1927,6 +1927,18 @@ static inline void sync_mm_rss(struct mm
 }
 #endif
 
+#ifndef CONFIG_ARCH_HAS_PTE_SPECIAL
+static inline int pte_special(pte_t pte)
+{
+	return 0;
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return pte;
+}
+#endif
+
 #ifndef CONFIG_ARCH_HAS_PTE_DEVMAP
 static inline int pte_devmap(pte_t pte)
 {
_

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [patch 19/35] mm/memory_hotplug: drop the flags field from struct mhp_restrictions
  2020-04-10 21:30 incoming Andrew Morton
                   ` (17 preceding siblings ...)
  2020-04-10 21:33 ` [patch 18/35] mm/special: create generic fallbacks for pte_special() and pte_mkspecial() Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 20/35] mm/memory_hotplug: rename mhp_restrictions to mhp_params Andrew Morton
                   ` (15 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, benh, bp, catalin.marinas, dan.j.williams, dave.hansen,
	david, ebadger, hch, hpa, jgg, linux-mm, logang, luto, mhocko,
	mingo, mm-commits, mpe, paulus, peterz, tglx, torvalds, will

From: Logan Gunthorpe <logang@deltatee.com>
Subject: mm/memory_hotplug: drop the flags field from struct mhp_restrictions

Patch series "Allow setting caching mode in arch_add_memory() for P2PDMA", v4.

Currently, the page tables created using memremap_pages() are always
created with the PAGE_KERNEL caching mode.  However, the P2PDMA code is
creating pages for PCI BAR memory which should never be accessed through
the cache and should instead use either WC or UC.  This still works in
most cases, on x86, because the MTRR registers typically override the
caching settings in the page tables for all of the IO memory, forcing it
to UC-.  However, this tends not to work so well on other arches or on
some rare x86 machines whose firmware does not set up the MTRR registers
in this way.

Instead, this series proposes a change to arch_add_memory() to take the
pgprot required by the mapping, which allows us to explicitly set page
table entries for P2PDMA memory to UC.

These changes are pretty routine for most of the arches: x86_64, arm64 and
powerpc simply need to thread the pgprot through to where the page tables
are set up.  x86_32 unfortunately sets up the page tables at boot so must
use __set_memory_prot() to change their caching mode.  ia64, s390 and sh
don't appear to have an easy way to change the page tables so, for now at
least, we just return -EINVAL on such mappings and thus they will not
support P2PDMA memory until the work for this is done.  This should be
fine as they don't yet support ZONE_DEVICE.
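
For orientation, the interface this series converges on looks roughly
like the following (a sketch assembled from patches 19-24 below, not a
literal hunk):

	/* extended hotplug parameters, filled in by the caller */
	struct mhp_params {
		struct vmem_altmap *altmap;	/* optional memmap allocator */
		pgprot_t pgprot;		/* required page protection */
	};

	int arch_add_memory(int nid, u64 start, u64 size,
			    struct mhp_params *params);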


This patch (of 7):

This variable is not used anywhere and should therefore be removed from
the structure.

Link: http://lkml.kernel.org/r/20200306170846.9333-2-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/memory_hotplug.h |    2 --
 1 file changed, 2 deletions(-)

--- a/include/linux/memory_hotplug.h~mm-memory_hotplug-drop-the-flags-field-from-struct-mhp_restrictions
+++ a/include/linux/memory_hotplug.h
@@ -59,11 +59,9 @@ enum {
 
 /*
  * Restrictions for the memory hotplug:
- * flags:  MHP_ flags
  * altmap: alternative allocator for memmap array
  */
 struct mhp_restrictions {
-	unsigned long flags;
 	struct vmem_altmap *altmap;
 };
 
_


* [patch 20/35] mm/memory_hotplug: rename mhp_restrictions to mhp_params
  2020-04-10 21:30 incoming Andrew Morton
                   ` (18 preceding siblings ...)
  2020-04-10 21:33 ` [patch 19/35] mm/memory_hotplug: drop the flags field from struct mhp_restrictions Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 21/35] x86/mm: thread pgprot_t through init_memory_mapping() Andrew Morton
                   ` (14 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, benh, bp, catalin.marinas, dan.j.williams, dave.hansen,
	david, ebadger, hch, hpa, jgg, linux-mm, logang, luto, mhocko,
	mingo, mm-commits, mpe, paulus, peterz, tglx, torvalds, will

From: Logan Gunthorpe <logang@deltatee.com>
Subject: mm/memory_hotplug: rename mhp_restrictions to mhp_params

The mhp_restrictions struct really doesn't specify anything resembling a
restriction anymore, so rename it to mhp_params, as it is a list of
extended parameters.

Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/mm/mmu.c            |    4 ++--
 arch/ia64/mm/init.c            |    4 ++--
 arch/powerpc/mm/mem.c          |    4 ++--
 arch/s390/mm/init.c            |    6 +++---
 arch/sh/mm/init.c              |    4 ++--
 arch/x86/mm/init_32.c          |    4 ++--
 arch/x86/mm/init_64.c          |    8 ++++----
 include/linux/memory_hotplug.h |   16 ++++++++--------
 mm/memory_hotplug.c            |    8 ++++----
 mm/memremap.c                  |    8 ++++----
 10 files changed, 33 insertions(+), 33 deletions(-)

--- a/arch/arm64/mm/mmu.c~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/arch/arm64/mm/mmu.c
@@ -1374,7 +1374,7 @@ static void __remove_pgd_mapping(pgd_t *
 }
 
 int arch_add_memory(int nid, u64 start, u64 size,
-			struct mhp_restrictions *restrictions)
+		    struct mhp_params *params)
 {
 	int ret, flags = 0;
 
@@ -1387,7 +1387,7 @@ int arch_add_memory(int nid, u64 start,
 	memblock_clear_nomap(start, size);
 
 	ret = __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
-			   restrictions);
+			   params);
 	if (ret)
 		__remove_pgd_mapping(swapper_pg_dir,
 				     __phys_to_virt(start), size);
--- a/arch/ia64/mm/init.c~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/arch/ia64/mm/init.c
@@ -670,13 +670,13 @@ mem_init (void)
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 int arch_add_memory(int nid, u64 start, u64 size,
-			struct mhp_restrictions *restrictions)
+		    struct mhp_params *params)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	int ret;
 
-	ret = __add_pages(nid, start_pfn, nr_pages, restrictions);
+	ret = __add_pages(nid, start_pfn, nr_pages, params);
 	if (ret)
 		printk("%s: Problem encountered in __add_pages() as ret=%d\n",
 		       __func__,  ret);
--- a/arch/powerpc/mm/mem.c~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/arch/powerpc/mm/mem.c
@@ -122,7 +122,7 @@ static void flush_dcache_range_chunked(u
 }
 
 int __ref arch_add_memory(int nid, u64 start, u64 size,
-			struct mhp_restrictions *restrictions)
+			  struct mhp_params *params)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -138,7 +138,7 @@ int __ref arch_add_memory(int nid, u64 s
 		return -EFAULT;
 	}
 
-	return __add_pages(nid, start_pfn, nr_pages, restrictions);
+	return __add_pages(nid, start_pfn, nr_pages, params);
 }
 
 void __ref arch_remove_memory(int nid, u64 start, u64 size,
--- a/arch/s390/mm/init.c~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/arch/s390/mm/init.c
@@ -268,20 +268,20 @@ device_initcall(s390_cma_mem_init);
 #endif /* CONFIG_CMA */
 
 int arch_add_memory(int nid, u64 start, u64 size,
-		struct mhp_restrictions *restrictions)
+		    struct mhp_params *params)
 {
 	unsigned long start_pfn = PFN_DOWN(start);
 	unsigned long size_pages = PFN_DOWN(size);
 	int rc;
 
-	if (WARN_ON_ONCE(restrictions->altmap))
+	if (WARN_ON_ONCE(params->altmap))
 		return -EINVAL;
 
 	rc = vmem_add_mapping(start, size);
 	if (rc)
 		return rc;
 
-	rc = __add_pages(nid, start_pfn, size_pages, restrictions);
+	rc = __add_pages(nid, start_pfn, size_pages, params);
 	if (rc)
 		vmem_remove_mapping(start, size);
 	return rc;
--- a/arch/sh/mm/init.c~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/arch/sh/mm/init.c
@@ -406,14 +406,14 @@ void __init mem_init(void)
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 int arch_add_memory(int nid, u64 start, u64 size,
-			struct mhp_restrictions *restrictions)
+		    struct mhp_params *params)
 {
 	unsigned long start_pfn = PFN_DOWN(start);
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	int ret;
 
 	/* We only have ZONE_NORMAL, so this is easy.. */
-	ret = __add_pages(nid, start_pfn, nr_pages, restrictions);
+	ret = __add_pages(nid, start_pfn, nr_pages, params);
 	if (unlikely(ret))
 		printk("%s: Failed, __add_pages() == %d\n", __func__, ret);
 
--- a/arch/x86/mm/init_32.c~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/arch/x86/mm/init_32.c
@@ -819,12 +819,12 @@ void __init mem_init(void)
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 int arch_add_memory(int nid, u64 start, u64 size,
-			struct mhp_restrictions *restrictions)
+		    struct mhp_params *params)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	return __add_pages(nid, start_pfn, nr_pages, restrictions);
+	return __add_pages(nid, start_pfn, nr_pages, params);
 }
 
 void arch_remove_memory(int nid, u64 start, u64 size,
--- a/arch/x86/mm/init_64.c~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/arch/x86/mm/init_64.c
@@ -843,11 +843,11 @@ static void update_end_of_memory_vars(u6
 }
 
 int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
-				struct mhp_restrictions *restrictions)
+	      struct mhp_params *params)
 {
 	int ret;
 
-	ret = __add_pages(nid, start_pfn, nr_pages, restrictions);
+	ret = __add_pages(nid, start_pfn, nr_pages, params);
 	WARN_ON_ONCE(ret);
 
 	/* update max_pfn, max_low_pfn and high_memory */
@@ -858,14 +858,14 @@ int add_pages(int nid, unsigned long sta
 }
 
 int arch_add_memory(int nid, u64 start, u64 size,
-			struct mhp_restrictions *restrictions)
+		    struct mhp_params *params)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
 	init_memory_mapping(start, start + size);
 
-	return add_pages(nid, start_pfn, nr_pages, restrictions);
+	return add_pages(nid, start_pfn, nr_pages, params);
 }
 
 #define PAGE_INUSE 0xFD
--- a/include/linux/memory_hotplug.h~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/include/linux/memory_hotplug.h
@@ -58,10 +58,10 @@ enum {
 };
 
 /*
- * Restrictions for the memory hotplug:
- * altmap: alternative allocator for memmap array
+ * Extended parameters for memory hotplug:
+ * altmap: alternative allocator for memmap array (optional)
  */
-struct mhp_restrictions {
+struct mhp_params {
 	struct vmem_altmap *altmap;
 };
 
@@ -112,7 +112,7 @@ extern int restore_online_page_callback(
 extern int try_online_node(int nid);
 
 extern int arch_add_memory(int nid, u64 start, u64 size,
-			struct mhp_restrictions *restrictions);
+			   struct mhp_params *params);
 extern u64 max_mem_size;
 
 extern int memhp_online_type_from_str(const char *str);
@@ -133,17 +133,17 @@ extern void __remove_pages(unsigned long
 
 /* reasonably generic interface to expand the physical pages */
 extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
-		       struct mhp_restrictions *restrictions);
+		       struct mhp_params *params);
 
 #ifndef CONFIG_ARCH_HAS_ADD_PAGES
 static inline int add_pages(int nid, unsigned long start_pfn,
-		unsigned long nr_pages, struct mhp_restrictions *restrictions)
+		unsigned long nr_pages, struct mhp_params *params)
 {
-	return __add_pages(nid, start_pfn, nr_pages, restrictions);
+	return __add_pages(nid, start_pfn, nr_pages, params);
 }
 #else /* ARCH_HAS_ADD_PAGES */
 int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
-	      struct mhp_restrictions *restrictions);
+	      struct mhp_params *params);
 #endif /* ARCH_HAS_ADD_PAGES */
 
 #ifdef CONFIG_NUMA
--- a/mm/memory_hotplug.c~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/mm/memory_hotplug.c
@@ -304,12 +304,12 @@ static int check_hotplug_memory_addressa
  * add the new pages.
  */
 int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
-		struct mhp_restrictions *restrictions)
+		struct mhp_params *params)
 {
 	const unsigned long end_pfn = pfn + nr_pages;
 	unsigned long cur_nr_pages;
 	int err;
-	struct vmem_altmap *altmap = restrictions->altmap;
+	struct vmem_altmap *altmap = params->altmap;
 
 	err = check_hotplug_memory_addressable(pfn, nr_pages);
 	if (err)
@@ -1002,7 +1002,7 @@ static int online_memory_block(struct me
  */
 int __ref add_memory_resource(int nid, struct resource *res)
 {
-	struct mhp_restrictions restrictions = {};
+	struct mhp_params params = {};
 	u64 start, size;
 	bool new_node = false;
 	int ret;
@@ -1030,7 +1030,7 @@ int __ref add_memory_resource(int nid, s
 	new_node = ret;
 
 	/* call arch's memory hotadd */
-	ret = arch_add_memory(nid, start, size, &restrictions);
+	ret = arch_add_memory(nid, start, size, &params);
 	if (ret < 0)
 		goto error;
 
--- a/mm/memremap.c~mm-memory_hotplug-rename-mhp_restrictions-to-mhp_params
+++ a/mm/memremap.c
@@ -184,7 +184,7 @@ void *memremap_pages(struct dev_pagemap
 {
 	struct resource *res = &pgmap->res;
 	struct dev_pagemap *conflict_pgmap;
-	struct mhp_restrictions restrictions = {
+	struct mhp_params params = {
 		/*
 		 * We do not want any optional features only our own memmap
 		 */
@@ -302,7 +302,7 @@ void *memremap_pages(struct dev_pagemap
 	 */
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
 		error = add_pages(nid, PHYS_PFN(res->start),
-				PHYS_PFN(resource_size(res)), &restrictions);
+				PHYS_PFN(resource_size(res)), &params);
 	} else {
 		error = kasan_add_zero_shadow(__va(res->start), resource_size(res));
 		if (error) {
@@ -311,7 +311,7 @@ void *memremap_pages(struct dev_pagemap
 		}
 
 		error = arch_add_memory(nid, res->start, resource_size(res),
-					&restrictions);
+					&params);
 	}
 
 	if (!error) {
@@ -319,7 +319,7 @@ void *memremap_pages(struct dev_pagemap
 
 		zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE];
 		move_pfn_range_to_zone(zone, PHYS_PFN(res->start),
-				PHYS_PFN(resource_size(res)), restrictions.altmap);
+				PHYS_PFN(resource_size(res)), params.altmap);
 	}
 
 	mem_hotplug_done();
_


* [patch 21/35] x86/mm: thread pgprot_t through init_memory_mapping()
  2020-04-10 21:30 incoming Andrew Morton
                   ` (19 preceding siblings ...)
  2020-04-10 21:33 ` [patch 20/35] mm/memory_hotplug: rename mhp_restrictions to mhp_params Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 22/35] x86/mm: introduce __set_memory_prot() Andrew Morton
                   ` (13 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, benh, bp, catalin.marinas, dan.j.williams, dave.hansen,
	david, ebadger, hch, hpa, jgg, linux-mm, logang, luto, mhocko,
	mingo, mm-commits, mpe, paulus, peterz, tglx, torvalds, will

From: Logan Gunthorpe <logang@deltatee.com>
Subject: x86/mm: thread pgprot_t through init_memory_mapping()

In preparation to support a pgprot_t argument for arch_add_memory().

The prototype of init_memory_mapping() must be moved because its original
location came before the definition of pgprot_t.
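
Existing callers keep their current behaviour by passing PAGE_KERNEL
explicitly, for example (from the init_mem_mapping() hunk below):

	/* the ISA range is always mapped regardless of memory holes */
	init_memory_mapping(0, ISA_END_ADDRESS, PAGE_KERNEL);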

Link: http://lkml.kernel.org/r/20200306170846.9333-4-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/include/asm/page_types.h |    3 --
 arch/x86/include/asm/pgtable.h    |    3 ++
 arch/x86/kernel/amd_gart_64.c     |    3 +-
 arch/x86/mm/init.c                |    9 ++++---
 arch/x86/mm/init_32.c             |    3 +-
 arch/x86/mm/init_64.c             |   32 +++++++++++++++-------------
 arch/x86/mm/mm_internal.h         |    3 +-
 arch/x86/platform/uv/bios_uv.c    |    3 +-
 8 files changed, 34 insertions(+), 25 deletions(-)

--- a/arch/x86/include/asm/page_types.h~x86-mm-thread-pgprot_t-through-init_memory_mapping
+++ a/arch/x86/include/asm/page_types.h
@@ -71,9 +71,6 @@ static inline phys_addr_t get_max_mapped
 
 bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn);
 
-extern unsigned long init_memory_mapping(unsigned long start,
-					 unsigned long end);
-
 extern void initmem_init(void);
 
 #endif	/* !__ASSEMBLY__ */
--- a/arch/x86/include/asm/pgtable.h~x86-mm-thread-pgprot_t-through-init_memory_mapping
+++ a/arch/x86/include/asm/pgtable.h
@@ -1081,6 +1081,9 @@ static inline void __meminit init_trampo
 
 void __init poking_init(void);
 
+unsigned long init_memory_mapping(unsigned long start,
+				  unsigned long end, pgprot_t prot);
+
 # ifdef CONFIG_RANDOMIZE_MEMORY
 void __meminit init_trampoline(void);
 # else
--- a/arch/x86/kernel/amd_gart_64.c~x86-mm-thread-pgprot_t-through-init_memory_mapping
+++ a/arch/x86/kernel/amd_gart_64.c
@@ -744,7 +744,8 @@ int __init gart_iommu_init(void)
 
 	start_pfn = PFN_DOWN(aper_base);
 	if (!pfn_range_is_mapped(start_pfn, end_pfn))
-		init_memory_mapping(start_pfn<<PAGE_SHIFT, end_pfn<<PAGE_SHIFT);
+		init_memory_mapping(start_pfn<<PAGE_SHIFT, end_pfn<<PAGE_SHIFT,
+				    PAGE_KERNEL);
 
 	pr_info("PCI-DMA: using GART IOMMU.\n");
 	iommu_size = check_iommu_size(info.aper_base, aper_size);
--- a/arch/x86/mm/init_32.c~x86-mm-thread-pgprot_t-through-init_memory_mapping
+++ a/arch/x86/mm/init_32.c
@@ -257,7 +257,8 @@ static inline int __is_kernel_text(unsig
 unsigned long __init
 kernel_physical_mapping_init(unsigned long start,
 			     unsigned long end,
-			     unsigned long page_size_mask)
+			     unsigned long page_size_mask,
+			     pgprot_t prot)
 {
 	int use_pse = page_size_mask == (1<<PG_LEVEL_2M);
 	unsigned long last_map_addr = end;
--- a/arch/x86/mm/init_64.c~x86-mm-thread-pgprot_t-through-init_memory_mapping
+++ a/arch/x86/mm/init_64.c
@@ -585,7 +585,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned
  */
 static unsigned long __meminit
 phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
-	      unsigned long page_size_mask, bool init)
+	      unsigned long page_size_mask, pgprot_t _prot, bool init)
 {
 	unsigned long pages = 0, paddr_next;
 	unsigned long paddr_last = paddr_end;
@@ -595,7 +595,7 @@ phys_pud_init(pud_t *pud_page, unsigned
 	for (; i < PTRS_PER_PUD; i++, paddr = paddr_next) {
 		pud_t *pud;
 		pmd_t *pmd;
-		pgprot_t prot = PAGE_KERNEL;
+		pgprot_t prot = _prot;
 
 		vaddr = (unsigned long)__va(paddr);
 		pud = pud_page + pud_index(vaddr);
@@ -644,9 +644,12 @@ phys_pud_init(pud_t *pud_page, unsigned
 		if (page_size_mask & (1<<PG_LEVEL_1G)) {
 			pages++;
 			spin_lock(&init_mm.page_table_lock);
+
+			prot = __pgprot(pgprot_val(prot) | __PAGE_KERNEL_LARGE);
+
 			set_pte_init((pte_t *)pud,
 				     pfn_pte((paddr & PUD_MASK) >> PAGE_SHIFT,
-					     PAGE_KERNEL_LARGE),
+					     prot),
 				     init);
 			spin_unlock(&init_mm.page_table_lock);
 			paddr_last = paddr_next;
@@ -669,7 +672,7 @@ phys_pud_init(pud_t *pud_page, unsigned
 
 static unsigned long __meminit
 phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
-	      unsigned long page_size_mask, bool init)
+	      unsigned long page_size_mask, pgprot_t prot, bool init)
 {
 	unsigned long vaddr, vaddr_end, vaddr_next, paddr_next, paddr_last;
 
@@ -679,7 +682,7 @@ phys_p4d_init(p4d_t *p4d_page, unsigned
 
 	if (!pgtable_l5_enabled())
 		return phys_pud_init((pud_t *) p4d_page, paddr, paddr_end,
-				     page_size_mask, init);
+				     page_size_mask, prot, init);
 
 	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
 		p4d_t *p4d = p4d_page + p4d_index(vaddr);
@@ -702,13 +705,13 @@ phys_p4d_init(p4d_t *p4d_page, unsigned
 		if (!p4d_none(*p4d)) {
 			pud = pud_offset(p4d, 0);
 			paddr_last = phys_pud_init(pud, paddr, __pa(vaddr_end),
-					page_size_mask, init);
+					page_size_mask, prot, init);
 			continue;
 		}
 
 		pud = alloc_low_page();
 		paddr_last = phys_pud_init(pud, paddr, __pa(vaddr_end),
-					   page_size_mask, init);
+					   page_size_mask, prot, init);
 
 		spin_lock(&init_mm.page_table_lock);
 		p4d_populate_init(&init_mm, p4d, pud, init);
@@ -722,7 +725,7 @@ static unsigned long __meminit
 __kernel_physical_mapping_init(unsigned long paddr_start,
 			       unsigned long paddr_end,
 			       unsigned long page_size_mask,
-			       bool init)
+			       pgprot_t prot, bool init)
 {
 	bool pgd_changed = false;
 	unsigned long vaddr, vaddr_start, vaddr_end, vaddr_next, paddr_last;
@@ -743,13 +746,13 @@ __kernel_physical_mapping_init(unsigned
 			paddr_last = phys_p4d_init(p4d, __pa(vaddr),
 						   __pa(vaddr_end),
 						   page_size_mask,
-						   init);
+						   prot, init);
 			continue;
 		}
 
 		p4d = alloc_low_page();
 		paddr_last = phys_p4d_init(p4d, __pa(vaddr), __pa(vaddr_end),
-					   page_size_mask, init);
+					   page_size_mask, prot, init);
 
 		spin_lock(&init_mm.page_table_lock);
 		if (pgtable_l5_enabled())
@@ -778,10 +781,10 @@ __kernel_physical_mapping_init(unsigned
 unsigned long __meminit
 kernel_physical_mapping_init(unsigned long paddr_start,
 			     unsigned long paddr_end,
-			     unsigned long page_size_mask)
+			     unsigned long page_size_mask, pgprot_t prot)
 {
 	return __kernel_physical_mapping_init(paddr_start, paddr_end,
-					      page_size_mask, true);
+					      page_size_mask, prot, true);
 }
 
 /*
@@ -796,7 +799,8 @@ kernel_physical_mapping_change(unsigned
 			       unsigned long page_size_mask)
 {
 	return __kernel_physical_mapping_init(paddr_start, paddr_end,
-					      page_size_mask, false);
+					      page_size_mask, PAGE_KERNEL,
+					      false);
 }
 
 #ifndef CONFIG_NUMA
@@ -863,7 +867,7 @@ int arch_add_memory(int nid, u64 start,
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	init_memory_mapping(start, start + size);
+	init_memory_mapping(start, start + size, PAGE_KERNEL);
 
 	return add_pages(nid, start_pfn, nr_pages, params);
 }
--- a/arch/x86/mm/init.c~x86-mm-thread-pgprot_t-through-init_memory_mapping
+++ a/arch/x86/mm/init.c
@@ -467,7 +467,7 @@ bool pfn_range_is_mapped(unsigned long s
  * the physical memory. To access them they are temporarily mapped.
  */
 unsigned long __ref init_memory_mapping(unsigned long start,
-					       unsigned long end)
+					unsigned long end, pgprot_t prot)
 {
 	struct map_range mr[NR_RANGE_MR];
 	unsigned long ret = 0;
@@ -481,7 +481,8 @@ unsigned long __ref init_memory_mapping(
 
 	for (i = 0; i < nr_range; i++)
 		ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
-						   mr[i].page_size_mask);
+						   mr[i].page_size_mask,
+						   prot);
 
 	add_pfn_range_mapped(start >> PAGE_SHIFT, ret >> PAGE_SHIFT);
 
@@ -521,7 +522,7 @@ static unsigned long __init init_range_m
 		 */
 		can_use_brk_pgt = max(start, (u64)pgt_buf_end<<PAGE_SHIFT) >=
 				    min(end, (u64)pgt_buf_top<<PAGE_SHIFT);
-		init_memory_mapping(start, end);
+		init_memory_mapping(start, end, PAGE_KERNEL);
 		mapped_ram_size += end - start;
 		can_use_brk_pgt = true;
 	}
@@ -661,7 +662,7 @@ void __init init_mem_mapping(void)
 #endif
 
 	/* the ISA range is always mapped regardless of memory holes */
-	init_memory_mapping(0, ISA_END_ADDRESS);
+	init_memory_mapping(0, ISA_END_ADDRESS, PAGE_KERNEL);
 
 	/* Init the trampoline, possibly with KASLR memory offset */
 	init_trampoline();
--- a/arch/x86/mm/mm_internal.h~x86-mm-thread-pgprot_t-through-init_memory_mapping
+++ a/arch/x86/mm/mm_internal.h
@@ -12,7 +12,8 @@ void early_ioremap_page_table_range_init
 
 unsigned long kernel_physical_mapping_init(unsigned long start,
 					     unsigned long end,
-					     unsigned long page_size_mask);
+					     unsigned long page_size_mask,
+					     pgprot_t prot);
 unsigned long kernel_physical_mapping_change(unsigned long start,
 					     unsigned long end,
 					     unsigned long page_size_mask);
--- a/arch/x86/platform/uv/bios_uv.c~x86-mm-thread-pgprot_t-through-init_memory_mapping
+++ a/arch/x86/platform/uv/bios_uv.c
@@ -352,7 +352,8 @@ void __iomem *__init efi_ioremap(unsigne
 	if (type == EFI_MEMORY_MAPPED_IO)
 		return ioremap(phys_addr, size);
 
-	last_map_pfn = init_memory_mapping(phys_addr, phys_addr + size);
+	last_map_pfn = init_memory_mapping(phys_addr, phys_addr + size,
+					   PAGE_KERNEL);
 	if ((last_map_pfn << PAGE_SHIFT) < phys_addr + size) {
 		unsigned long top = last_map_pfn << PAGE_SHIFT;
 		efi_ioremap(top, size - (top - phys_addr), type, attribute);
_


* [patch 22/35] x86/mm: introduce __set_memory_prot()
  2020-04-10 21:30 incoming Andrew Morton
                   ` (20 preceding siblings ...)
  2020-04-10 21:33 ` [patch 21/35] x86/mm: thread pgprot_t through init_memory_mapping() Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 23/35] powerpc/mm: thread pgprot_t through create_section_mapping() Andrew Morton
                   ` (12 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, benh, bp, catalin.marinas, dan.j.williams, dave.hansen,
	david, ebadger, hch, hpa, jgg, linux-mm, logang, luto, mhocko,
	mingo, mm-commits, mpe, paulus, peterz, tglx, torvalds, will

From: Logan Gunthorpe <logang@deltatee.com>
Subject: x86/mm: introduce __set_memory_prot()

For use in the 32bit arch_add_memory() to set the pgprot type of the
memory to add.
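
A rough sketch of the intended caller, for context (the real hunk lands
in patch 24/35's change to the 32-bit arch_add_memory()):

	/*
	 * The boot-time page tables are already mapped; remap them if
	 * the caller asked for anything other than PAGE_KERNEL.
	 */
	if (params->pgprot.pgprot != PAGE_KERNEL.pgprot) {
		ret = __set_memory_prot(start, nr_pages, params->pgprot);
		if (ret)
			return ret;
	}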

Link: http://lkml.kernel.org/r/20200306170846.9333-5-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/include/asm/set_memory.h |    1 +
 arch/x86/mm/pat/set_memory.c      |   13 +++++++++++++
 2 files changed, 14 insertions(+)

--- a/arch/x86/include/asm/set_memory.h~x86-mm-introduce-__set_memory_prot
+++ a/arch/x86/include/asm/set_memory.h
@@ -34,6 +34,7 @@
  * The caller is required to take care of these.
  */
 
+int __set_memory_prot(unsigned long addr, int numpages, pgprot_t prot);
 int _set_memory_uc(unsigned long addr, int numpages);
 int _set_memory_wc(unsigned long addr, int numpages);
 int _set_memory_wt(unsigned long addr, int numpages);
--- a/arch/x86/mm/pat/set_memory.c~x86-mm-introduce-__set_memory_prot
+++ a/arch/x86/mm/pat/set_memory.c
@@ -1795,6 +1795,19 @@ static inline int cpa_clear_pages_array(
 		CPA_PAGES_ARRAY, pages);
 }
 
+/*
+ * __set_memory_prot is an internal helper for callers that have been passed
+ * a pgprot_t value from upper layers and a reservation has already been taken.
+ * If you want to set the pgprot to a specific page protection, use the
+ * set_memory_xx() functions.
+ */
+int __set_memory_prot(unsigned long addr, int numpages, pgprot_t prot)
+{
+	return change_page_attr_set_clr(&addr, numpages, prot,
+					__pgprot(~pgprot_val(prot)), 0, 0,
+					NULL);
+}
+
 int _set_memory_uc(unsigned long addr, int numpages)
 {
 	/*
_


* [patch 23/35] powerpc/mm: thread pgprot_t through create_section_mapping()
  2020-04-10 21:30 incoming Andrew Morton
                   ` (21 preceding siblings ...)
  2020-04-10 21:33 ` [patch 22/35] x86/mm: introduce __set_memory_prot() Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 24/35] mm/memory_hotplug: add pgprot_t to mhp_params Andrew Morton
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, benh, bp, catalin.marinas, dan.j.williams, dave.hansen,
	david, ebadger, hch, hpa, jgg, linux-mm, logang, luto, mhocko,
	mingo, mm-commits, mpe, paulus, peterz, tglx, torvalds, will

From: Logan Gunthorpe <logang@deltatee.com>
Subject: powerpc/mm: thread pgprot_t through create_section_mapping()

In preparation to support a pgprot_t argument for arch_add_memory().
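
As on x86, existing callers simply pass PAGE_KERNEL for now, e.g. (from
the arch_add_memory() hunk below):

	rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);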

Link: http://lkml.kernel.org/r/20200306170846.9333-6-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/powerpc/include/asm/book3s/64/hash.h  |    3 ++-
 arch/powerpc/include/asm/book3s/64/radix.h |    3 ++-
 arch/powerpc/include/asm/sparsemem.h       |    3 ++-
 arch/powerpc/mm/book3s64/hash_utils.c      |    5 +++--
 arch/powerpc/mm/book3s64/pgtable.c         |    7 ++++---
 arch/powerpc/mm/book3s64/radix_pgtable.c   |   18 +++++++++++-------
 arch/powerpc/mm/mem.c                      |    5 +++--
 7 files changed, 27 insertions(+), 17 deletions(-)

--- a/arch/powerpc/include/asm/book3s/64/hash.h~powerpc-mm-thread-pgprot_t-through-create_section_mapping
+++ a/arch/powerpc/include/asm/book3s/64/hash.h
@@ -251,7 +251,8 @@ extern int __meminit hash__vmemmap_creat
 extern void hash__vmemmap_remove_mapping(unsigned long start,
 				     unsigned long page_size);
 
-int hash__create_section_mapping(unsigned long start, unsigned long end, int nid);
+int hash__create_section_mapping(unsigned long start, unsigned long end,
+				 int nid, pgprot_t prot);
 int hash__remove_section_mapping(unsigned long start, unsigned long end);
 
 #endif /* !__ASSEMBLY__ */
--- a/arch/powerpc/include/asm/book3s/64/radix.h~powerpc-mm-thread-pgprot_t-through-create_section_mapping
+++ a/arch/powerpc/include/asm/book3s/64/radix.h
@@ -294,7 +294,8 @@ static inline unsigned long radix__get_t
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-int radix__create_section_mapping(unsigned long start, unsigned long end, int nid);
+int radix__create_section_mapping(unsigned long start, unsigned long end,
+				  int nid, pgprot_t prot);
 int radix__remove_section_mapping(unsigned long start, unsigned long end);
 #endif /* CONFIG_MEMORY_HOTPLUG */
 #endif /* __ASSEMBLY__ */
--- a/arch/powerpc/include/asm/sparsemem.h~powerpc-mm-thread-pgprot_t-through-create_section_mapping
+++ a/arch/powerpc/include/asm/sparsemem.h
@@ -13,7 +13,8 @@
 #endif /* CONFIG_SPARSEMEM */
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-extern int create_section_mapping(unsigned long start, unsigned long end, int nid);
+extern int create_section_mapping(unsigned long start, unsigned long end,
+				  int nid, pgprot_t prot);
 extern int remove_section_mapping(unsigned long start, unsigned long end);
 
 #ifdef CONFIG_PPC_BOOK3S_64
--- a/arch/powerpc/mm/book3s64/hash_utils.c~powerpc-mm-thread-pgprot_t-through-create_section_mapping
+++ a/arch/powerpc/mm/book3s64/hash_utils.c
@@ -809,7 +809,8 @@ int resize_hpt_for_hotplug(unsigned long
 	return 0;
 }
 
-int hash__create_section_mapping(unsigned long start, unsigned long end, int nid)
+int hash__create_section_mapping(unsigned long start, unsigned long end,
+				 int nid, pgprot_t prot)
 {
 	int rc;
 
@@ -819,7 +820,7 @@ int hash__create_section_mapping(unsigne
 	}
 
 	rc = htab_bolt_mapping(start, end, __pa(start),
-			       pgprot_val(PAGE_KERNEL), mmu_linear_psize,
+			       pgprot_val(prot), mmu_linear_psize,
 			       mmu_kernel_ssize);
 
 	if (rc < 0) {
--- a/arch/powerpc/mm/book3s64/pgtable.c~powerpc-mm-thread-pgprot_t-through-create_section_mapping
+++ a/arch/powerpc/mm/book3s64/pgtable.c
@@ -171,12 +171,13 @@ void mmu_cleanup_all(void)
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-int __meminit create_section_mapping(unsigned long start, unsigned long end, int nid)
+int __meminit create_section_mapping(unsigned long start, unsigned long end,
+				     int nid, pgprot_t prot)
 {
 	if (radix_enabled())
-		return radix__create_section_mapping(start, end, nid);
+		return radix__create_section_mapping(start, end, nid, prot);
 
-	return hash__create_section_mapping(start, end, nid);
+	return hash__create_section_mapping(start, end, nid, prot);
 }
 
 int __meminit remove_section_mapping(unsigned long start, unsigned long end)
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c~powerpc-mm-thread-pgprot_t-through-create_section_mapping
+++ a/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -254,7 +254,7 @@ static unsigned long next_boundary(unsig
 
 static int __meminit create_physical_mapping(unsigned long start,
 					     unsigned long end,
-					     int nid)
+					     int nid, pgprot_t _prot)
 {
 	unsigned long vaddr, addr, mapping_size = 0;
 	bool prev_exec, exec = false;
@@ -290,7 +290,7 @@ static int __meminit create_physical_map
 			prot = PAGE_KERNEL_X;
 			exec = true;
 		} else {
-			prot = PAGE_KERNEL;
+			prot = _prot;
 			exec = false;
 		}
 
@@ -334,7 +334,7 @@ static void __init radix_init_pgtable(vo
 
 		WARN_ON(create_physical_mapping(reg->base,
 						reg->base + reg->size,
-						-1));
+						-1, PAGE_KERNEL));
 	}
 
 	/* Find out how many PID bits are supported */
@@ -713,8 +713,10 @@ static int __meminit stop_machine_change
 
 	spin_unlock(&init_mm.page_table_lock);
 	pte_clear(&init_mm, params->aligned_start, params->pte);
-	create_physical_mapping(__pa(params->aligned_start), __pa(params->start), -1);
-	create_physical_mapping(__pa(params->end), __pa(params->aligned_end), -1);
+	create_physical_mapping(__pa(params->aligned_start),
+				__pa(params->start), -1, PAGE_KERNEL);
+	create_physical_mapping(__pa(params->end), __pa(params->aligned_end),
+				-1, PAGE_KERNEL);
 	spin_lock(&init_mm.page_table_lock);
 	return 0;
 }
@@ -871,14 +873,16 @@ static void __meminit remove_pagetable(u
 	radix__flush_tlb_kernel_range(start, end);
 }
 
-int __meminit radix__create_section_mapping(unsigned long start, unsigned long end, int nid)
+int __meminit radix__create_section_mapping(unsigned long start,
+					    unsigned long end, int nid,
+					    pgprot_t prot)
 {
 	if (end >= RADIX_VMALLOC_START) {
 		pr_warn("Outside the supported range\n");
 		return -1;
 	}
 
-	return create_physical_mapping(__pa(start), __pa(end), nid);
+	return create_physical_mapping(__pa(start), __pa(end), nid, prot);
 }
 
 int __meminit radix__remove_section_mapping(unsigned long start, unsigned long end)
--- a/arch/powerpc/mm/mem.c~powerpc-mm-thread-pgprot_t-through-create_section_mapping
+++ a/arch/powerpc/mm/mem.c
@@ -90,7 +90,8 @@ int memory_add_physaddr_to_nid(u64 start
 }
 #endif
 
-int __weak create_section_mapping(unsigned long start, unsigned long end, int nid)
+int __weak create_section_mapping(unsigned long start, unsigned long end,
+				  int nid, pgprot_t prot)
 {
 	return -ENODEV;
 }
@@ -131,7 +132,7 @@ int __ref arch_add_memory(int nid, u64 s
 	resize_hpt_for_hotplug(memblock_phys_mem_size());
 
 	start = (unsigned long)__va(start);
-	rc = create_section_mapping(start, start + size, nid);
+	rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);
 	if (rc) {
 		pr_warn("Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
 			start, start + size, rc);
_


* [patch 24/35] mm/memory_hotplug: add pgprot_t to mhp_params
  2020-04-10 21:30 incoming Andrew Morton
                   ` (22 preceding siblings ...)
  2020-04-10 21:33 ` [patch 23/35] powerpc/mm: thread pgprot_t through create_section_mapping() Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 25/35] mm/memremap: set caching mode for PCI P2PDMA memory to WC Andrew Morton
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, benh, bp, catalin.marinas, dan.j.williams, dave.hansen,
	david, ebadger, hch, hpa, jgg, linux-mm, logang, luto, mhocko,
	mingo, mm-commits, mpe, paulus, peterz, tglx, torvalds, will

From: Logan Gunthorpe <logang@deltatee.com>
Subject: mm/memory_hotplug: add pgprot_t to mhp_params

devm_memremap_pages() is currently used by the PCI P2PDMA code to create
struct page mappings for IO memory.  At present, these mappings are
created with PAGE_KERNEL, which implies setting the PAT bits to be WB.
However, on x86, an MTRR register will typically override this and force
the cache type to be UC-.  In the case the firmware doesn't set this
register, the mapping is effectively WB and accessing it will typically
result in a machine check exception.

Other arches are currently unlikely to function correctly, seeing they
don't have any MTRR registers to fall back on.

To solve this, provide a way to specify the pgprot value explicitly to
arch_add_memory().

Of the arches that support MEMORY_HOTPLUG: x86_64 and arm64 need a simple
change to pass the pgprot_t down to their respective functions which set
up the page tables.  For x86_32, set the page tables explicitly using
_set_memory_prot() (seeing they are already mapped).  For ia64, s390 and
sh, reject anything but PAGE_KERNEL settings -- this should be fine, for
now, seeing these architectures don't support ZONE_DEVICE.

A check in __add_pages() is also added to ensure the pgprot parameter was
set for all arches.
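
For illustration, a caller wanting the default behaviour now fills in
the pgprot explicitly (see the add_memory_resource() hunk below):

	struct mhp_params params = { .pgprot = PAGE_KERNEL };

	ret = arch_add_memory(nid, start, size, &params);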

Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/mm/mmu.c            |    3 ++-
 arch/ia64/mm/init.c            |    3 +++
 arch/powerpc/mm/mem.c          |    3 ++-
 arch/s390/mm/init.c            |    3 +++
 arch/sh/mm/init.c              |    3 +++
 arch/x86/mm/init_32.c          |   12 ++++++++++++
 arch/x86/mm/init_64.c          |    2 +-
 include/linux/memory_hotplug.h |    3 +++
 mm/memory_hotplug.c            |    5 ++++-
 mm/memremap.c                  |    6 +++---
 10 files changed, 36 insertions(+), 7 deletions(-)

--- a/arch/arm64/mm/mmu.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/arm64/mm/mmu.c
@@ -1382,7 +1382,8 @@ int arch_add_memory(int nid, u64 start,
 		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
 	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
-			     size, PAGE_KERNEL, __pgd_pgtable_alloc, flags);
+			     size, params->pgprot, __pgd_pgtable_alloc,
+			     flags);
 
 	memblock_clear_nomap(start, size);
 
--- a/arch/ia64/mm/init.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/ia64/mm/init.c
@@ -676,6 +676,9 @@ int arch_add_memory(int nid, u64 start,
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	int ret;
 
+	if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
+		return -EINVAL;
+
 	ret = __add_pages(nid, start_pfn, nr_pages, params);
 	if (ret)
 		printk("%s: Problem encountered in __add_pages() as ret=%d\n",
--- a/arch/powerpc/mm/mem.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/powerpc/mm/mem.c
@@ -132,7 +132,8 @@ int __ref arch_add_memory(int nid, u64 s
 	resize_hpt_for_hotplug(memblock_phys_mem_size());
 
 	start = (unsigned long)__va(start);
-	rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);
+	rc = create_section_mapping(start, start + size, nid,
+				    params->pgprot);
 	if (rc) {
 		pr_warn("Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
 			start, start + size, rc);
--- a/arch/s390/mm/init.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/s390/mm/init.c
@@ -277,6 +277,9 @@ int arch_add_memory(int nid, u64 start,
 	if (WARN_ON_ONCE(params->altmap))
 		return -EINVAL;
 
+	if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
+		return -EINVAL;
+
 	rc = vmem_add_mapping(start, size);
 	if (rc)
 		return rc;
--- a/arch/sh/mm/init.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/sh/mm/init.c
@@ -412,6 +412,9 @@ int arch_add_memory(int nid, u64 start,
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	int ret;
 
+	if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
+		return -EINVAL;
+
 	/* We only have ZONE_NORMAL, so this is easy.. */
 	ret = __add_pages(nid, start_pfn, nr_pages, params);
 	if (unlikely(ret))
--- a/arch/x86/mm/init_32.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/x86/mm/init_32.c
@@ -824,6 +824,18 @@ int arch_add_memory(int nid, u64 start,
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
+	int ret;
+
+	/*
+	 * The page tables were already mapped at boot so if the caller
+	 * requests a different mapping type then we must change all the
+	 * pages with __set_memory_prot().
+	 */
+	if (params->pgprot.pgprot != PAGE_KERNEL.pgprot) {
+		ret = __set_memory_prot(start, nr_pages, params->pgprot);
+		if (ret)
+			return ret;
+	}
 
 	return __add_pages(nid, start_pfn, nr_pages, params);
 }
--- a/arch/x86/mm/init_64.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/x86/mm/init_64.c
@@ -867,7 +867,7 @@ int arch_add_memory(int nid, u64 start,
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	init_memory_mapping(start, start + size, PAGE_KERNEL);
+	init_memory_mapping(start, start + size, params->pgprot);
 
 	return add_pages(nid, start_pfn, nr_pages, params);
 }
--- a/include/linux/memory_hotplug.h~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/include/linux/memory_hotplug.h
@@ -60,9 +60,12 @@ enum {
 /*
  * Extended parameters for memory hotplug:
  * altmap: alternative allocator for memmap array (optional)
+ * pgprot: page protection flags to apply to newly created page tables
+ *	(required)
  */
 struct mhp_params {
 	struct vmem_altmap *altmap;
+	pgprot_t pgprot;
 };
 
 /*
--- a/mm/memory_hotplug.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/mm/memory_hotplug.c
@@ -311,6 +311,9 @@ int __ref __add_pages(int nid, unsigned
 	int err;
 	struct vmem_altmap *altmap = params->altmap;
 
+	if (WARN_ON_ONCE(!params->pgprot.pgprot))
+		return -EINVAL;
+
 	err = check_hotplug_memory_addressable(pfn, nr_pages);
 	if (err)
 		return err;
@@ -1002,7 +1005,7 @@ static int online_memory_block(struct me
  */
 int __ref add_memory_resource(int nid, struct resource *res)
 {
-	struct mhp_params params = {};
+	struct mhp_params params = { .pgprot = PAGE_KERNEL };
 	u64 start, size;
 	bool new_node = false;
 	int ret;
--- a/mm/memremap.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/mm/memremap.c
@@ -189,8 +189,8 @@ void *memremap_pages(struct dev_pagemap
 		 * We do not want any optional features only our own memmap
 		 */
 		.altmap = pgmap_altmap(pgmap),
+		.pgprot = PAGE_KERNEL,
 	};
-	pgprot_t pgprot = PAGE_KERNEL;
 	int error, is_ram;
 	bool need_devmap_managed = true;
 
@@ -282,8 +282,8 @@ void *memremap_pages(struct dev_pagemap
 	if (nid < 0)
 		nid = numa_mem_id();
 
-	error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
-			resource_size(res));
+	error = track_pfn_remap(NULL, &params.pgprot, PHYS_PFN(res->start),
+				0, resource_size(res));
 	if (error)
 		goto err_pfn_remap;
 
_


* [patch 25/35] mm/memremap: set caching mode for PCI P2PDMA memory to WC
  2020-04-10 21:30 incoming Andrew Morton
                   ` (23 preceding siblings ...)
  2020-04-10 21:33 ` [patch 24/35] mm/memory_hotplug: add pgprot_t to mhp_params Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 26/35] kmod: make request_module() return an error when autoloading is disabled Andrew Morton
                   ` (9 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, benh, bp, catalin.marinas, dan.j.williams, dave.hansen,
	david, ebadger, hch, hpa, jgg, linux-mm, logang, luto, mhocko,
	mingo, mm-commits, mpe, paulus, peterz, tglx, torvalds, will

From: Logan Gunthorpe <logang@deltatee.com>
Subject: mm/memremap: set caching mode for PCI P2PDMA memory to WC

PCI BAR IO memory should never be mapped as WB; however, prior to this,
the PAT bits were set to WB and were typically overridden by MTRR
registers set by the firmware.

Set PCI P2PDMA memory to be UC, as this is what it typically ends up
being mapped as on x86 today, after the MTRR registers override the
cache setting.

Future use-cases may need to generalize this by adding flags to select the
caching type, as some P2PDMA cases may not want UC.  However, those
use-cases are not upstream yet and this can be changed when they arrive.

Link: http://lkml.kernel.org/r/20200306170846.9333-8-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memremap.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/mm/memremap.c~mm-memremap-set-caching-mode-for-pci-p2pdma-memory-to-wc
+++ a/mm/memremap.c
@@ -217,7 +217,10 @@ void *memremap_pages(struct dev_pagemap
 		}
 		break;
 	case MEMORY_DEVICE_DEVDAX:
+		need_devmap_managed = false;
+		break;
 	case MEMORY_DEVICE_PCI_P2PDMA:
+		params.pgprot = pgprot_noncached(params.pgprot);
 		need_devmap_managed = false;
 		break;
 	default:
_


* [patch 26/35] kmod: make request_module() return an error when autoloading is disabled
  2020-04-10 21:30 incoming Andrew Morton
                   ` (24 preceding siblings ...)
  2020-04-10 21:33 ` [patch 25/35] mm/memremap: set caching mode for PCI P2PDMA memory to WC Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 27/35] fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() Andrew Morton
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, ast, benh, ebiggers, gregkh, jeffv, jeyu, josh, keescook,
	linux-mm, mcgrof, mm-commits, stable, torvalds

From: Eric Biggers <ebiggers@google.com>
Subject: kmod: make request_module() return an error when autoloading is disabled

Patch series "module autoloading fixes and cleanups", v5.

This series fixes a bug where request_module() was reporting success to
kernel code when module autoloading had been completely disabled via 'echo
> /proc/sys/kernel/modprobe'.

It also addresses the issues raised on the original thread
(https://lkml.kernel.org/lkml/20200310223731.126894-1-ebiggers@kernel.org/T/#u)
by documenting the modprobe sysctl, adding a self-test for the empty path
case, and downgrading a user-reachable WARN_ONCE().


This patch (of 4):

It's long been possible to disable kernel module autoloading completely
(while still allowing manual module insertion) by setting
/proc/sys/kernel/modprobe to the empty string.  This can be preferable to
setting it to a nonexistent file since it avoids the overhead of an
attempted execve(), avoids potential deadlocks, and avoids the call to
security_kernel_module_request() and thus on SELinux-based systems
eliminates the need to write SELinux rules to dontaudit module_request.

However, when module autoloading is disabled in this way, request_module()
returns 0.  This is broken because callers expect 0 to mean that the
module was successfully loaded.

Apparently this was never noticed because this method of disabling module
autoloading isn't used much, and also most callers don't use the return
value of request_module() since it's always necessary to check whether the
module registered its functionality or not anyway.  But improperly
returning 0 can indeed confuse a few callers, for example get_fs_type() in
fs/filesystems.c where it causes a WARNING to be hit:

	if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
		fs = __get_fs_type(name, len);
		WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
	}

This is easily reproduced with:

	echo > /proc/sys/kernel/modprobe
	mount -t NONEXISTENT none /

It causes:

	request_module fs-NONEXISTENT succeeded, but still no fs?
	WARNING: CPU: 1 PID: 1106 at fs/filesystems.c:275 get_fs_type+0xd6/0xf0
	[...]

This should actually use pr_warn_once() rather than WARN_ONCE(), since
it's also user-reachable if userspace immediately unloads the module. 
Regardless, request_module() should correctly return an error when it
fails.  So let's make it return -ENOENT, which matches the error when the
modprobe binary doesn't exist.
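
With that, the caller pattern above becomes reliable; a minimal sketch
(modelled on get_fs_type()):

	/* 0 now really means the usermode helper ran successfully */
	if (request_module("fs-%.*s", len, name) == 0)
		fs = __get_fs_type(name, len);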

I've also sent patches to document and test this case.

Link: http://lkml.kernel.org/r/20200310223731.126894-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200312202552.241885-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Ben Hutchings <benh@debian.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/kmod.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/kmod.c~kmod-make-request_module-return-an-error-when-autoloading-is-disabled
+++ a/kernel/kmod.c
@@ -120,7 +120,7 @@ out:
  * invoke it.
  *
  * If module auto-loading support is disabled then this function
- * becomes a no-operation.
+ * simply returns -ENOENT.
  */
 int __request_module(bool wait, const char *fmt, ...)
 {
@@ -137,7 +137,7 @@ int __request_module(bool wait, const ch
 	WARN_ON_ONCE(wait && current_is_async());
 
 	if (!modprobe_path[0])
-		return 0;
+		return -ENOENT;
 
 	va_start(args, fmt);
 	ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
_


* [patch 27/35] fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
  2020-04-10 21:30 incoming Andrew Morton
                   ` (25 preceding siblings ...)
  2020-04-10 21:33 ` [patch 26/35] kmod: make request_module() return an error when autoloading is disabled Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 28/35] docs: admin-guide: document the kernel.modprobe sysctl Andrew Morton
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, ast, ebiggers, gregkh, jeffv, jeyu, keescook, linux-mm,
	mcgrof, mm-commits, neilb, stable, torvalds

From: Eric Biggers <ebiggers@google.com>
Subject: fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()

After request_module(), nothing is stopping the module from being unloaded
until someone takes a reference to it via try_get_module().

The WARN_ONCE() in get_fs_type() is thus user-reachable, via userspace
running 'rmmod' concurrently.

Since WARN_ONCE() is for kernel bugs only, not for user-reachable
situations, downgrade this warning to pr_warn_once().

Keep it printed once only, since the intent of this warning is to detect a
bug in modprobe at boot time.  Printing the warning more than once
wouldn't really provide any useful extra information.

Link: http://lkml.kernel.org/r/20200312202552.241885-3-ebiggers@kernel.org
Fixes: 41124db869b7 ("fs: warn in case userspace lied about modprobe return")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Jessica Yu <jeyu@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Cc: <stable@vger.kernel.org>		[4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/filesystems.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/filesystems.c~fs-filesystemsc-downgrade-user-reachable-warn_once-to-pr_warn_once
+++ a/fs/filesystems.c
@@ -272,7 +272,9 @@ struct file_system_type *get_fs_type(con
 	fs = __get_fs_type(name, len);
 	if (!fs && (request_module("fs-%.*s", len, name) == 0)) {
 		fs = __get_fs_type(name, len);
-		WARN_ONCE(!fs, "request_module fs-%.*s succeeded, but still no fs?\n", len, name);
+		if (!fs)
+			pr_warn_once("request_module fs-%.*s succeeded, but still no fs?\n",
+				     len, name);
 	}
 
 	if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
_


* [patch 28/35] docs: admin-guide: document the kernel.modprobe sysctl
  2020-04-10 21:30 incoming Andrew Morton
                   ` (26 preceding siblings ...)
  2020-04-10 21:33 ` [patch 27/35] fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 29/35] selftests: kmod: fix handling test numbers above 9 Andrew Morton
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, ast, ebiggers, gregkh, jeffv, jeyu, keescook, linux-mm,
	mcgrof, mm-commits, neilb, torvalds

From: Eric Biggers <ebiggers@google.com>
Subject: docs: admin-guide: document the kernel.modprobe sysctl

Document the kernel.modprobe sysctl in the same place that all the other
kernel.* sysctls are documented.  Make sure to mention how to use this
sysctl to completely disable module autoloading, and how this sysctl
relates to CONFIG_STATIC_USERMODEHELPER.

[ebiggers@google.com: v5]
  Link: http://lkml.kernel.org/r/20200318230515.171692-4-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200312202552.241885-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/admin-guide/sysctl/kernel.rst |   21 ++++++++++++++++++
 1 file changed, 21 insertions(+)

--- a/Documentation/admin-guide/sysctl/kernel.rst~docs-admin-guide-document-the-kernelmodprobe-sysctl
+++ a/Documentation/admin-guide/sysctl/kernel.rst
@@ -446,6 +446,27 @@ Notes:
      successful IPC object allocation. If an IPC object allocation syscall
      fails, it is undefined if the value remains unmodified or is reset to -1.
 
+modprobe:
+=========
+
+The path to the usermode helper for autoloading kernel modules, by
+default "/sbin/modprobe".  This binary is executed when the kernel
+requests a module.  For example, if userspace passes an unknown
+filesystem type to mount(), then the kernel will automatically request
+the corresponding filesystem module by executing this usermode helper.
+This usermode helper should insert the needed module into the kernel.
+
+This sysctl only affects module autoloading.  It has no effect on the
+ability to explicitly insert modules.
+
+If this sysctl is set to the empty string, then module autoloading is
+completely disabled.  The kernel will not try to execute a usermode
+helper at all, nor will it call the kernel_module_request LSM hook.
+
+If CONFIG_STATIC_USERMODEHELPER=y is set in the kernel configuration,
+then the configured static usermode helper overrides this sysctl,
+except that the empty string is still accepted to completely disable
+module autoloading as described above.
 
 nmi_watchdog
 ============
_

* [patch 29/35] selftests: kmod: fix handling test numbers above 9
  2020-04-10 21:30 incoming Andrew Morton
                   ` (27 preceding siblings ...)
  2020-04-10 21:33 ` [patch 28/35] docs: admin-guide: document the kernel.modprobe sysctl Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:33 ` [patch 30/35] selftests: kmod: test disabling module autoloading Andrew Morton
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, ast, ebiggers, gregkh, jeffv, jeyu, keescook, linux-mm,
	mcgrof, mm-commits, neilb, torvalds

From: Eric Biggers <ebiggers@google.com>
Subject: selftests: kmod: fix handling test numbers above 9

get_test_count() and get_test_enabled() were broken for test numbers above
9 due to awk interpreting a field specification like '$0010' as octal
rather than decimal.  Fix it by stripping the leading zeroes.
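
A quick demonstration of the quirk (output as produced by gawk; other awk
implementations may differ):

	$ echo 'one two three four five six seven eight nine ten' |
	> awk '{print $0010}'	# 0010 parsed as octal, i.e. field 8
	eight
	$ echo 'one two three four five six seven eight nine ten' |
	> awk '{print $10}'	# what the script actually wants
	ten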

Link: http://lkml.kernel.org/r/20200318230515.171692-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/kmod/kmod.sh |   13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

--- a/tools/testing/selftests/kmod/kmod.sh~selftests-kmod-fix-handling-test-numbers-above-9
+++ a/tools/testing/selftests/kmod/kmod.sh
@@ -505,18 +505,23 @@ function test_num()
 	fi
 }
 
-function get_test_count()
+function get_test_data()
 {
 	test_num $1
-	TEST_DATA=$(echo $ALL_TESTS | awk '{print $'$1'}')
+	local field_num=$(echo $1 | sed 's/^0*//')
+	echo $ALL_TESTS | awk '{print $'$field_num'}'
+}
+
+function get_test_count()
+{
+	TEST_DATA=$(get_test_data $1)
 	LAST_TWO=${TEST_DATA#*:*}
 	echo ${LAST_TWO%:*}
 }
 
 function get_test_enabled()
 {
-	test_num $1
-	TEST_DATA=$(echo $ALL_TESTS | awk '{print $'$1'}')
+	TEST_DATA=$(get_test_data $1)
 	echo ${TEST_DATA#*:*:}
 }
 
_

* [patch 30/35] selftests: kmod: test disabling module autoloading
  2020-04-10 21:30 incoming Andrew Morton
                   ` (28 preceding siblings ...)
  2020-04-10 21:33 ` [patch 29/35] selftests: kmod: fix handling test numbers above 9 Andrew Morton
@ 2020-04-10 21:33 ` Andrew Morton
  2020-04-10 21:34 ` [patch 31/35] change email address for Pali Rohár Andrew Morton
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:33 UTC (permalink / raw)
  To: akpm, ast, ebiggers, gregkh, jeffv, jeyu, keescook, linux-mm,
	mcgrof, mm-commits, neilb, torvalds

From: Eric Biggers <ebiggers@google.com>
Subject: selftests: kmod: test disabling module autoloading

Test that request_module() fails with -ENOENT when
/proc/sys/kernel/modprobe contains (a) a nonexistent path, and (b) an
empty path.

Case (b) is a regression test for the patch "kmod: make request_module()
return an error when autoloading is disabled".

Tested with 'kmod.sh -t 0010 && kmod.sh -t 0011', and also simply with
'kmod.sh' to run all kmod tests.
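
The same scenario as case (b) can be reproduced by hand outside the test
harness (rough sketch, run as root; the filesystem name and mount point
are deliberately bogus placeholders):

	saved=$(cat /proc/sys/kernel/modprobe)
	echo > /proc/sys/kernel/modprobe	# disable autoloading
	mount -t bogusfs none /mnt		# fails: the fs module cannot be autoloaded
	echo "$saved" > /proc/sys/kernel/modprobe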

Link: http://lkml.kernel.org/r/20200312202552.241885-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 tools/testing/selftests/kmod/kmod.sh |   30 +++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

--- a/tools/testing/selftests/kmod/kmod.sh~selftests-kmod-test-disabling-module-autoloading
+++ a/tools/testing/selftests/kmod/kmod.sh
@@ -61,6 +61,8 @@ ALL_TESTS="$ALL_TESTS 0006:10:1"
 ALL_TESTS="$ALL_TESTS 0007:5:1"
 ALL_TESTS="$ALL_TESTS 0008:150:1"
 ALL_TESTS="$ALL_TESTS 0009:150:1"
+ALL_TESTS="$ALL_TESTS 0010:1:1"
+ALL_TESTS="$ALL_TESTS 0011:1:1"
 
 # Kselftest framework requirement - SKIP code is 4.
 ksft_skip=4
@@ -149,6 +151,7 @@ function load_req_mod()
 
 test_finish()
 {
+	echo "$MODPROBE" > /proc/sys/kernel/modprobe
 	echo "Test completed"
 }
 
@@ -443,6 +446,30 @@ kmod_test_0009()
 	config_expect_result ${FUNCNAME[0]} SUCCESS
 }
 
+kmod_test_0010()
+{
+	kmod_defaults_driver
+	config_num_threads 1
+	echo "/KMOD_TEST_NONEXISTENT" > /proc/sys/kernel/modprobe
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -ENOENT
+	echo "$MODPROBE" > /proc/sys/kernel/modprobe
+}
+
+kmod_test_0011()
+{
+	kmod_defaults_driver
+	config_num_threads 1
+	# This causes the kernel to not even try executing modprobe.  The error
+	# code is still -ENOENT like when modprobe doesn't exist, so we can't
+	# easily test for the exact difference.  But this still is a useful test
+	# since there was a bug where request_module() returned 0 in this case.
+	echo > /proc/sys/kernel/modprobe
+	config_trigger ${FUNCNAME[0]}
+	config_expect_result ${FUNCNAME[0]} -ENOENT
+	echo "$MODPROBE" > /proc/sys/kernel/modprobe
+}
+
 list_tests()
 {
 	echo "Test ID list:"
@@ -460,6 +487,8 @@ list_tests()
 	echo "0007 x $(get_test_count 0007) - multithreaded tests with default setup test request_module() and get_fs_type()"
 	echo "0008 x $(get_test_count 0008) - multithreaded - push kmod_concurrent over max_modprobes for request_module()"
 	echo "0009 x $(get_test_count 0009) - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()"
+	echo "0010 x $(get_test_count 0010) - test nonexistent modprobe path"
+	echo "0011 x $(get_test_count 0011) - test completely disabling module autoloading"
 }
 
 usage()
@@ -616,6 +645,7 @@ test_reqs
 allow_user_defaults
 load_req_mod
 
+MODPROBE=$(</proc/sys/kernel/modprobe)
 trap "test_finish" EXIT
 
 parse_args $@
_

* [patch 31/35] change email address for Pali Rohár
  2020-04-10 21:30 incoming Andrew Morton
                   ` (29 preceding siblings ...)
  2020-04-10 21:33 ` [patch 30/35] selftests: kmod: test disabling module autoloading Andrew Morton
@ 2020-04-10 21:34 ` Andrew Morton
  2020-04-10 21:34 ` [patch 32/35] drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings Andrew Morton
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:34 UTC (permalink / raw)
  To: akpm, gregkh, joe, linux-mm, mm-commits, pali, torvalds

From: Pali Rohár <pali@kernel.org>
Subject: change email address for Pali Rohár

For security reasons I stopped using my gmail account, and my kernel.org
address is now an up-to-date alias to my personal address.

People periodically send me email at addresses they found in the source
code of drivers, so this change reflects where people can actually reach
me.

Link: http://lkml.kernel.org/r/20200307104237.8199-1-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 Documentation/ABI/testing/sysfs-platform-dell-laptop |    8 ++---
 MAINTAINERS                                          |   16 +++++-----
 arch/arm/mach-omap2/omap-secure.c                    |    2 -
 arch/arm/mach-omap2/omap-secure.h                    |    2 -
 arch/arm/mach-omap2/omap-smc.S                       |    2 -
 drivers/char/hw_random/omap3-rom-rng.c               |    4 +-
 drivers/hwmon/dell-smm-hwmon.c                       |    4 +-
 drivers/platform/x86/dell-laptop.c                   |    4 +-
 drivers/platform/x86/dell-rbtn.c                     |    4 +-
 drivers/platform/x86/dell-rbtn.h                     |    2 -
 drivers/platform/x86/dell-smbios-base.c              |    4 +-
 drivers/platform/x86/dell-smbios-smm.c               |    2 -
 drivers/platform/x86/dell-smbios.h                   |    2 -
 drivers/platform/x86/dell-smo8800.c                  |    2 -
 drivers/platform/x86/dell-wmi.c                      |    4 +-
 drivers/power/supply/bq2415x_charger.c               |    4 +-
 drivers/power/supply/bq27xxx_battery.c               |    2 -
 drivers/power/supply/isp1704_charger.c               |    2 -
 drivers/power/supply/rx51_battery.c                  |    4 +-
 fs/udf/ecma_167.h                                    |    2 -
 fs/udf/osta_udf.h                                    |    2 -
 include/linux/power/bq2415x_charger.h                |    2 -
 tools/laptop/freefall/freefall.c                     |    2 -
 23 files changed, 41 insertions(+), 41 deletions(-)

--- a/arch/arm/mach-omap2/omap-secure.c~change-email-address-for-pali-rohar
+++ a/arch/arm/mach-omap2/omap-secure.c
@@ -5,7 +5,7 @@
  * Copyright (C) 2011 Texas Instruments, Inc.
  *	Santosh Shilimkar <santosh.shilimkar@ti.com>
  * Copyright (C) 2012 Ivaylo Dimitrov <freemangordon@abv.bg>
- * Copyright (C) 2013 Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2013 Pali Rohár <pali@kernel.org>
  */
 
 #include <linux/arm-smccc.h>
--- a/arch/arm/mach-omap2/omap-secure.h~change-email-address-for-pali-rohar
+++ a/arch/arm/mach-omap2/omap-secure.h
@@ -5,7 +5,7 @@
  * Copyright (C) 2011 Texas Instruments, Inc.
  *	Santosh Shilimkar <santosh.shilimkar@ti.com>
  * Copyright (C) 2012 Ivaylo Dimitrov <freemangordon@abv.bg>
- * Copyright (C) 2013 Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2013 Pali Rohár <pali@kernel.org>
  */
 #ifndef OMAP_ARCH_OMAP_SECURE_H
 #define OMAP_ARCH_OMAP_SECURE_H
--- a/arch/arm/mach-omap2/omap-smc.S~change-email-address-for-pali-rohar
+++ a/arch/arm/mach-omap2/omap-smc.S
@@ -6,7 +6,7 @@
  * Written by Santosh Shilimkar <santosh.shilimkar@ti.com>
  *
  * Copyright (C) 2012 Ivaylo Dimitrov <freemangordon@abv.bg>
- * Copyright (C) 2013 Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2013 Pali Rohár <pali@kernel.org>
  */
 
 #include <linux/linkage.h>
--- a/Documentation/ABI/testing/sysfs-platform-dell-laptop~change-email-address-for-pali-rohar
+++ a/Documentation/ABI/testing/sysfs-platform-dell-laptop
@@ -2,7 +2,7 @@ What:		/sys/class/leds/dell::kbd_backlig
 Date:		December 2014
 KernelVersion:	3.19
 Contact:	Gabriele Mazzotta <gabriele.mzt@gmail.com>,
-		Pali Rohár <pali.rohar@gmail.com>
+		Pali Rohár <pali@kernel.org>
 Description:
 		This file allows to control the automatic keyboard
 		illumination mode on some systems that have an ambient
@@ -13,7 +13,7 @@ What:		/sys/class/leds/dell::kbd_backlig
 Date:		December 2014
 KernelVersion:	3.19
 Contact:	Gabriele Mazzotta <gabriele.mzt@gmail.com>,
-		Pali Rohár <pali.rohar@gmail.com>
+		Pali Rohár <pali@kernel.org>
 Description:
 		This file allows to specify the on/off threshold value,
 		as reported by the ambient light sensor.
@@ -22,7 +22,7 @@ What:		/sys/class/leds/dell::kbd_backlig
 Date:		December 2014
 KernelVersion:	3.19
 Contact:	Gabriele Mazzotta <gabriele.mzt@gmail.com>,
-		Pali Rohár <pali.rohar@gmail.com>
+		Pali Rohár <pali@kernel.org>
 Description:
 		This file allows to control the input triggers that
 		turn on the keyboard backlight illumination that is
@@ -45,7 +45,7 @@ What:		/sys/class/leds/dell::kbd_backlig
 Date:		December 2014
 KernelVersion:	3.19
 Contact:	Gabriele Mazzotta <gabriele.mzt@gmail.com>,
-		Pali Rohár <pali.rohar@gmail.com>
+		Pali Rohár <pali@kernel.org>
 Description:
 		This file allows to specify the interval after which the
 		keyboard illumination is disabled because of inactivity.
--- a/drivers/char/hw_random/omap3-rom-rng.c~change-email-address-for-pali-rohar
+++ a/drivers/char/hw_random/omap3-rom-rng.c
@@ -4,7 +4,7 @@
  * Copyright (C) 2009 Nokia Corporation
  * Author: Juha Yrjola <juha.yrjola@solidboot.com>
  *
- * Copyright (C) 2013 Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2013 Pali Rohár <pali@kernel.org>
  *
  * This file is licensed under  the terms of the GNU General Public
  * License version 2. This program is licensed "as is" without any
@@ -178,5 +178,5 @@ module_platform_driver(omap3_rom_rng_dri
 
 MODULE_ALIAS("platform:omap3-rom-rng");
 MODULE_AUTHOR("Juha Yrjola");
-MODULE_AUTHOR("Pali Rohár <pali.rohar@gmail.com>");
+MODULE_AUTHOR("Pali Rohár <pali@kernel.org>");
 MODULE_LICENSE("GPL");
--- a/drivers/hwmon/dell-smm-hwmon.c~change-email-address-for-pali-rohar
+++ a/drivers/hwmon/dell-smm-hwmon.c
@@ -7,7 +7,7 @@
  * Hwmon integration:
  * Copyright (C) 2011  Jean Delvare <jdelvare@suse.de>
  * Copyright (C) 2013, 2014  Guenter Roeck <linux@roeck-us.net>
- * Copyright (C) 2014, 2015  Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2014, 2015  Pali Rohár <pali@kernel.org>
  */
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
@@ -86,7 +86,7 @@ static unsigned int auto_fan;
 #define I8K_HWMON_HAVE_FAN3	(1 << 12)
 
 MODULE_AUTHOR("Massimo Dal Zotto (dz@debian.org)");
-MODULE_AUTHOR("Pali Rohár <pali.rohar@gmail.com>");
+MODULE_AUTHOR("Pali Rohár <pali@kernel.org>");
 MODULE_DESCRIPTION("Dell laptop SMM BIOS hwmon driver");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("i8k");
--- a/drivers/platform/x86/dell-laptop.c~change-email-address-for-pali-rohar
+++ a/drivers/platform/x86/dell-laptop.c
@@ -4,7 +4,7 @@
  *
  *  Copyright (c) Red Hat <mjg@redhat.com>
  *  Copyright (c) 2014 Gabriele Mazzotta <gabriele.mzt@gmail.com>
- *  Copyright (c) 2014 Pali Rohár <pali.rohar@gmail.com>
+ *  Copyright (c) 2014 Pali Rohár <pali@kernel.org>
  *
  *  Based on documentation in the libsmbios package:
  *  Copyright (C) 2005-2014 Dell Inc.
@@ -2295,6 +2295,6 @@ module_exit(dell_exit);
 
 MODULE_AUTHOR("Matthew Garrett <mjg@redhat.com>");
 MODULE_AUTHOR("Gabriele Mazzotta <gabriele.mzt@gmail.com>");
-MODULE_AUTHOR("Pali Rohár <pali.rohar@gmail.com>");
+MODULE_AUTHOR("Pali Rohár <pali@kernel.org>");
 MODULE_DESCRIPTION("Dell laptop driver");
 MODULE_LICENSE("GPL");
--- a/drivers/platform/x86/dell-rbtn.c~change-email-address-for-pali-rohar
+++ a/drivers/platform/x86/dell-rbtn.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
     Dell Airplane Mode Switch driver
-    Copyright (C) 2014-2015  Pali Rohár <pali.rohar@gmail.com>
+    Copyright (C) 2014-2015  Pali Rohár <pali@kernel.org>
 
 */
 
@@ -495,5 +495,5 @@ MODULE_PARM_DESC(auto_remove_rfkill, "Au
 				     "(default true)");
 MODULE_DEVICE_TABLE(acpi, rbtn_ids);
 MODULE_DESCRIPTION("Dell Airplane Mode Switch driver");
-MODULE_AUTHOR("Pali Rohár <pali.rohar@gmail.com>");
+MODULE_AUTHOR("Pali Rohár <pali@kernel.org>");
 MODULE_LICENSE("GPL");
--- a/drivers/platform/x86/dell-rbtn.h~change-email-address-for-pali-rohar
+++ a/drivers/platform/x86/dell-rbtn.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
 /*
     Dell Airplane Mode Switch driver
-    Copyright (C) 2014-2015  Pali Rohár <pali.rohar@gmail.com>
+    Copyright (C) 2014-2015  Pali Rohár <pali@kernel.org>
 
 */
 
--- a/drivers/platform/x86/dell-smbios-base.c~change-email-address-for-pali-rohar
+++ a/drivers/platform/x86/dell-smbios-base.c
@@ -4,7 +4,7 @@
  *
  *  Copyright (c) Red Hat <mjg@redhat.com>
  *  Copyright (c) 2014 Gabriele Mazzotta <gabriele.mzt@gmail.com>
- *  Copyright (c) 2014 Pali Rohár <pali.rohar@gmail.com>
+ *  Copyright (c) 2014 Pali Rohár <pali@kernel.org>
  *
  *  Based on documentation in the libsmbios package:
  *  Copyright (C) 2005-2014 Dell Inc.
@@ -645,7 +645,7 @@ module_exit(dell_smbios_exit);
 
 MODULE_AUTHOR("Matthew Garrett <mjg@redhat.com>");
 MODULE_AUTHOR("Gabriele Mazzotta <gabriele.mzt@gmail.com>");
-MODULE_AUTHOR("Pali Rohár <pali.rohar@gmail.com>");
+MODULE_AUTHOR("Pali Rohár <pali@kernel.org>");
 MODULE_AUTHOR("Mario Limonciello <mario.limonciello@dell.com>");
 MODULE_DESCRIPTION("Common functions for kernel modules using Dell SMBIOS");
 MODULE_LICENSE("GPL");
--- a/drivers/platform/x86/dell-smbios.h~change-email-address-for-pali-rohar
+++ a/drivers/platform/x86/dell-smbios.h
@@ -4,7 +4,7 @@
  *
  *  Copyright (c) Red Hat <mjg@redhat.com>
  *  Copyright (c) 2014 Gabriele Mazzotta <gabriele.mzt@gmail.com>
- *  Copyright (c) 2014 Pali Rohár <pali.rohar@gmail.com>
+ *  Copyright (c) 2014 Pali Rohár <pali@kernel.org>
  *
  *  Based on documentation in the libsmbios package:
  *  Copyright (C) 2005-2014 Dell Inc.
--- a/drivers/platform/x86/dell-smbios-smm.c~change-email-address-for-pali-rohar
+++ a/drivers/platform/x86/dell-smbios-smm.c
@@ -4,7 +4,7 @@
  *
  *  Copyright (c) Red Hat <mjg@redhat.com>
  *  Copyright (c) 2014 Gabriele Mazzotta <gabriele.mzt@gmail.com>
- *  Copyright (c) 2014 Pali Rohár <pali.rohar@gmail.com>
+ *  Copyright (c) 2014 Pali Rohár <pali@kernel.org>
  *  Copyright (c) 2017 Dell Inc.
  */
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
--- a/drivers/platform/x86/dell-smo8800.c~change-email-address-for-pali-rohar
+++ a/drivers/platform/x86/dell-smo8800.c
@@ -3,7 +3,7 @@
  *  dell-smo8800.c - Dell Latitude ACPI SMO88XX freefall sensor driver
  *
  *  Copyright (C) 2012 Sonal Santan <sonal.santan@gmail.com>
- *  Copyright (C) 2014 Pali Rohár <pali.rohar@gmail.com>
+ *  Copyright (C) 2014 Pali Rohár <pali@kernel.org>
  *
  *  This is loosely based on lis3lv02d driver.
  */
--- a/drivers/platform/x86/dell-wmi.c~change-email-address-for-pali-rohar
+++ a/drivers/platform/x86/dell-wmi.c
@@ -3,7 +3,7 @@
  * Dell WMI hotkeys
  *
  * Copyright (C) 2008 Red Hat <mjg@redhat.com>
- * Copyright (C) 2014-2015 Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2014-2015 Pali Rohár <pali@kernel.org>
  *
  * Portions based on wistron_btns.c:
  * Copyright (C) 2005 Miloslav Trmac <mitr@volny.cz>
@@ -29,7 +29,7 @@
 #include "dell-wmi-descriptor.h"
 
 MODULE_AUTHOR("Matthew Garrett <mjg@redhat.com>");
-MODULE_AUTHOR("Pali Rohár <pali.rohar@gmail.com>");
+MODULE_AUTHOR("Pali Rohár <pali@kernel.org>");
 MODULE_DESCRIPTION("Dell laptop WMI hotkeys driver");
 MODULE_LICENSE("GPL");
 
--- a/drivers/power/supply/bq2415x_charger.c~change-email-address-for-pali-rohar
+++ a/drivers/power/supply/bq2415x_charger.c
@@ -2,7 +2,7 @@
 /*
  * bq2415x charger driver
  *
- * Copyright (C) 2011-2013  Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2011-2013  Pali Rohár <pali@kernel.org>
  *
  * Datasheets:
  * http://www.ti.com/product/bq24150
@@ -1788,6 +1788,6 @@ static struct i2c_driver bq2415x_driver
 };
 module_i2c_driver(bq2415x_driver);
 
-MODULE_AUTHOR("Pali Rohár <pali.rohar@gmail.com>");
+MODULE_AUTHOR("Pali Rohár <pali@kernel.org>");
 MODULE_DESCRIPTION("bq2415x charger driver");
 MODULE_LICENSE("GPL");
--- a/drivers/power/supply/bq27xxx_battery.c~change-email-address-for-pali-rohar
+++ a/drivers/power/supply/bq27xxx_battery.c
@@ -4,7 +4,7 @@
  * Copyright (C) 2008 Rodolfo Giometti <giometti@linux.it>
  * Copyright (C) 2008 Eurotech S.p.A. <info@eurotech.it>
  * Copyright (C) 2010-2011 Lars-Peter Clausen <lars@metafoo.de>
- * Copyright (C) 2011 Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2011 Pali Rohár <pali@kernel.org>
  * Copyright (C) 2017 Liam Breck <kernel@networkimprov.net>
  *
  * Based on a previous work by Copyright (C) 2008 Texas Instruments, Inc.
--- a/drivers/power/supply/isp1704_charger.c~change-email-address-for-pali-rohar
+++ a/drivers/power/supply/isp1704_charger.c
@@ -3,7 +3,7 @@
  * ISP1704 USB Charger Detection driver
  *
  * Copyright (C) 2010 Nokia Corporation
- * Copyright (C) 2012 - 2013 Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2012 - 2013 Pali Rohár <pali@kernel.org>
  */
 
 #include <linux/kernel.h>
--- a/drivers/power/supply/rx51_battery.c~change-email-address-for-pali-rohar
+++ a/drivers/power/supply/rx51_battery.c
@@ -2,7 +2,7 @@
 /*
  * Nokia RX-51 battery driver
  *
- * Copyright (C) 2012  Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2012  Pali Rohár <pali@kernel.org>
  */
 
 #include <linux/module.h>
@@ -278,6 +278,6 @@ static struct platform_driver rx51_batte
 module_platform_driver(rx51_battery_driver);
 
 MODULE_ALIAS("platform:rx51-battery");
-MODULE_AUTHOR("Pali Rohár <pali.rohar@gmail.com>");
+MODULE_AUTHOR("Pali Rohár <pali@kernel.org>");
 MODULE_DESCRIPTION("Nokia RX-51 battery driver");
 MODULE_LICENSE("GPL");
--- a/fs/udf/ecma_167.h~change-email-address-for-pali-rohar
+++ a/fs/udf/ecma_167.h
@@ -5,7 +5,7 @@
  * http://www.ecma.ch
  *
  * Copyright (c) 2001-2002  Ben Fennema
- * Copyright (c) 2017-2019  Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (c) 2017-2019  Pali Rohár <pali@kernel.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
--- a/fs/udf/osta_udf.h~change-email-address-for-pali-rohar
+++ a/fs/udf/osta_udf.h
@@ -5,7 +5,7 @@
  * http://www.osta.org
  *
  * Copyright (c) 2001-2004  Ben Fennema
- * Copyright (c) 2017-2019  Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (c) 2017-2019  Pali Rohár <pali@kernel.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
--- a/include/linux/power/bq2415x_charger.h~change-email-address-for-pali-rohar
+++ a/include/linux/power/bq2415x_charger.h
@@ -2,7 +2,7 @@
 /*
  * bq2415x charger driver
  *
- * Copyright (C) 2011-2013  Pali Rohár <pali.rohar@gmail.com>
+ * Copyright (C) 2011-2013  Pali Rohár <pali@kernel.org>
  */
 
 #ifndef BQ2415X_CHARGER_H
--- a/MAINTAINERS~change-email-address-for-pali-rohar
+++ a/MAINTAINERS
@@ -727,7 +727,7 @@ L:	linux-alpha@vger.kernel.org
 F:	arch/alpha/
 
 ALPS PS/2 TOUCHPAD DRIVER
-R:	Pali Rohár <pali.rohar@gmail.com>
+R:	Pali Rohár <pali@kernel.org>
 F:	drivers/input/mouse/alps.*
 
 ALTERA I2C CONTROLLER DRIVER
@@ -4774,23 +4774,23 @@ F:	drivers/net/fddi/defza.*
 
 DELL LAPTOP DRIVER
 M:	Matthew Garrett <mjg59@srcf.ucam.org>
-M:	Pali Rohár <pali.rohar@gmail.com>
+M:	Pali Rohár <pali@kernel.org>
 L:	platform-driver-x86@vger.kernel.org
 S:	Maintained
 F:	drivers/platform/x86/dell-laptop.c
 
 DELL LAPTOP FREEFALL DRIVER
-M:	Pali Rohár <pali.rohar@gmail.com>
+M:	Pali Rohár <pali@kernel.org>
 S:	Maintained
 F:	drivers/platform/x86/dell-smo8800.c
 
 DELL LAPTOP RBTN DRIVER
-M:	Pali Rohár <pali.rohar@gmail.com>
+M:	Pali Rohár <pali@kernel.org>
 S:	Maintained
 F:	drivers/platform/x86/dell-rbtn.*
 
 DELL LAPTOP SMM DRIVER
-M:	Pali Rohár <pali.rohar@gmail.com>
+M:	Pali Rohár <pali@kernel.org>
 S:	Maintained
 F:	drivers/hwmon/dell-smm-hwmon.c
 F:	include/uapi/linux/i8k.h
@@ -4802,7 +4802,7 @@ S:	Maintained
 F:	drivers/platform/x86/dell_rbu.c
 
 DELL SMBIOS DRIVER
-M:	Pali Rohár <pali.rohar@gmail.com>
+M:	Pali Rohár <pali@kernel.org>
 M:	Mario Limonciello <mario.limonciello@dell.com>
 L:	platform-driver-x86@vger.kernel.org
 S:	Maintained
@@ -4835,7 +4835,7 @@ F:	drivers/platform/x86/dell-wmi-descrip
 
 DELL WMI NOTIFICATIONS DRIVER
 M:	Matthew Garrett <mjg59@srcf.ucam.org>
-M:	Pali Rohár <pali.rohar@gmail.com>
+M:	Pali Rohár <pali@kernel.org>
 S:	Maintained
 F:	drivers/platform/x86/dell-wmi.c
 
@@ -11950,7 +11950,7 @@ F:	drivers/media/i2c/et8ek8
 F:	drivers/media/i2c/ad5820.c
 
 NOKIA N900 POWER SUPPLY DRIVERS
-R:	Pali Rohár <pali.rohar@gmail.com>
+R:	Pali Rohár <pali@kernel.org>
 F:	include/linux/power/bq2415x_charger.h
 F:	include/linux/power/bq27xxx_battery.h
 F:	drivers/power/supply/bq2415x_charger.c
--- a/tools/laptop/freefall/freefall.c~change-email-address-for-pali-rohar
+++ a/tools/laptop/freefall/freefall.c
@@ -4,7 +4,7 @@
  * Copyright 2008 Eric Piel
  * Copyright 2009 Pavel Machek <pavel@ucw.cz>
  * Copyright 2012 Sonal Santan
- * Copyright 2014 Pali Rohár <pali.rohar@gmail.com>
+ * Copyright 2014 Pali Rohár <pali@kernel.org>
  */
 
 #include <stdio.h>
_

* [patch 32/35] drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings
  2020-04-10 21:30 incoming Andrew Morton
                   ` (30 preceding siblings ...)
  2020-04-10 21:34 ` [patch 31/35] change email address for Pali Rohár Andrew Morton
@ 2020-04-10 21:34 ` Andrew Morton
  2020-04-10 21:34 ` [patch 33/35] fs/seq_file.c: seq_read(): add info message about buggy .next functions Andrew Morton
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:34 UTC (permalink / raw)
  To: akpm, digetx, jonathanh, julia.lawall, ldewangan, linux-mm, lkp,
	mm-commits, swarren, torvalds, treding, vinod.koul

From: kbuild test robot <lkp@intel.com>
Subject: drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings

Remove dev_err() messages after platform_get_irq*() failures. 
platform_get_irq() already prints an error.

Generated by: scripts/coccinelle/api/platform_get_irq.cocci
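
The pattern the coccinelle rule enforces, sketched generically
(hypothetical driver code, not taken from this patch):

	irq = platform_get_irq(pdev, 0);
	if (irq < 0)
		return irq;	/* platform_get_irq() has already logged the failure */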

Link: http://lkml.kernel.org/r/alpine.DEB.2.21.2002271133450.2973@hadrien
Fixes: 6c41ac96ad92 ("dmaengine: tegra-apb: Support COMPILE_TEST")
Signed-off-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Cc: Laxman Dewangan <ldewangan@nvidia.com>
Cc: Vinod Koul <vinod.koul@linux.intel.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/dma/tegra20-apb-dma.c |    1 -
 1 file changed, 1 deletion(-)

--- a/drivers/dma/tegra20-apb-dma.c~dmaengine-tegra-apb-fix-platform_get_irqcocci-warnings
+++ a/drivers/dma/tegra20-apb-dma.c
@@ -1493,7 +1493,6 @@ static int tegra_dma_probe(struct platfo
 		irq = platform_get_irq(pdev, i);
 		if (irq < 0) {
 			ret = irq;
-			dev_err(&pdev->dev, "No irq resource for chan %d\n", i);
 			goto err_pm_disable;
 		}
 
_

* [patch 33/35] fs/seq_file.c: seq_read(): add info message about buggy .next functions
  2020-04-10 21:30 incoming Andrew Morton
                   ` (31 preceding siblings ...)
  2020-04-10 21:34 ` [patch 32/35] drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings Andrew Morton
@ 2020-04-10 21:34 ` Andrew Morton
  2020-04-10 21:34 ` [patch 34/35] kernel/gcov/fs.c: gcov_seq_next() should increase position index Andrew Morton
  2020-04-10 21:34 ` [patch 35/35] ipc/util.c: sysvipc_find_ipc() " Andrew Morton
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:34 UTC (permalink / raw)
  To: akpm, dave, linux-mm, longman, manfred, mingo, mm-commits, neilb,
	oberpar, rostedt, torvalds, viro, vvs

From: Vasily Averin <vvs@virtuozzo.com>
Subject: fs/seq_file.c: seq_read(): add info message about buggy .next functions

Patch series "seq_file .next functions should increase position index".

In Aug 2018, NeilBrown noted in commit 1f4aace60b0e ("fs/seq_file.c:
simplify seq_file iteration code and interface"):

"Some ->next functions do not increment *pos when they return NULL... 
Note that such ->next functions are buggy and should be fixed.  A simple
demonstration is dd if=/proc/swaps bs=1000 skip=1.  Choose any block size
larger than the size of /proc/swaps.  This will always show the whole last
line of /proc/swaps."

The described problem is still present.  If you lseek into the middle of
the last output line, the following read will output the end of the last
line and then the whole last line once again.

$ dd if=/proc/swaps bs=1  # usual output
Filename				Type		Size	Used	Priority
/dev/dm-0                               partition	4194812	97536	-2
104+0 records in
104+0 records out
104 bytes copied

$ dd if=/proc/swaps bs=40 skip=1    # last line was generated twice
dd: /proc/swaps: cannot skip to specified offset
v/dm-0                               partition	4194812	97536	-2
/dev/dm-0                               partition	4194812	97536	-2 
3+1 records in
3+1 records out
131 bytes copied

There are a lot of other affected files; I've found 30+, including
/proc/net/ip_tables_matches and /proc/sysvipc/*.

I've already sent patches to the mailing lists of the affected
subsystems; this patch set fixes the problem in files related to pstore,
tracing, gcov, sysvipc, and other subsystems handled directly via the
linux-kernel@ mailing list.

https://bugzilla.kernel.org/show_bug.cgi?id=206283


This patch (of 4):

Add debug code to seq_read() to detect incorrect .next seq_file
functions, whether missed in-tree or in out-of-tree modules.
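
For reference, the contract being enforced: a .next function must advance
the position index even when it returns NULL.  A minimal correct sketch
(hypothetical iterator; the names are illustrative only):

	static void *example_seq_next(struct seq_file *m, void *v, loff_t *pos)
	{
		(*pos)++;			/* always advance, even at the end */
		if (*pos >= example_nr_items)
			return NULL;		/* iteration finished */
		return &example_items[*pos];
	}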

[akpm@linux-foundation.org: s/pr_info/pr_info_ratelimited/, per Qian Cai]
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Link: http://lkml.kernel.org/r/244674e5-760c-86bd-d08a-047042881748@virtuozzo.com
Link: http://lkml.kernel.org/r/7c24087c-e280-e580-5b0c-0cdaeb14cd18@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/seq_file.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/fs/seq_file.c~seq_read-info-message-about-buggy-next-functions
+++ a/fs/seq_file.c
@@ -232,9 +232,12 @@ Fill:
 		loff_t pos = m->index;
 
 		p = m->op->next(m, p, &m->index);
-		if (pos == m->index)
-			/* Buggy ->next function */
+		if (pos == m->index) {
+			pr_info_ratelimited("buggy seq_file .next function %ps "
+				"did not update position index\n",
+				m->op->next);
 			m->index++;
+		}
 		if (!p || IS_ERR(p)) {
 			err = PTR_ERR(p);
 			break;
_

* [patch 34/35] kernel/gcov/fs.c: gcov_seq_next() should increase position index
  2020-04-10 21:30 incoming Andrew Morton
                   ` (32 preceding siblings ...)
  2020-04-10 21:34 ` [patch 33/35] fs/seq_file.c: seq_read(): add info message about buggy .next functions Andrew Morton
@ 2020-04-10 21:34 ` Andrew Morton
  2020-04-10 21:34 ` [patch 35/35] ipc/util.c: sysvipc_find_ipc() " Andrew Morton
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:34 UTC (permalink / raw)
  To: akpm, dave, linux-mm, longman, manfred, mingo, mm-commits, neilb,
	oberpar, rostedt, torvalds, viro, vvs

From: Vasily Averin <vvs@virtuozzo.com>
Subject: kernel/gcov/fs.c: gcov_seq_next() should increase position index

If a seq_file .next function does not change the position index, a read
after an lseek can generate unexpected output.  Increment *pos before the
end-of-iteration check so that the index advances even when
gcov_iter_next() reports that iteration is finished.

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Link: http://lkml.kernel.org/r/f65c6ee7-bd00-f910-2f8a-37cc67e4ff88@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 kernel/gcov/fs.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/gcov/fs.c~gcov_seq_next-should-increase-position-index
+++ a/kernel/gcov/fs.c
@@ -108,9 +108,9 @@ static void *gcov_seq_next(struct seq_fi
 {
 	struct gcov_iterator *iter = data;
 
+	(*pos)++;
 	if (gcov_iter_next(iter))
 		return NULL;
-	(*pos)++;
 
 	return iter;
 }
_

* [patch 35/35] ipc/util.c: sysvipc_find_ipc() should increase position index
  2020-04-10 21:30 incoming Andrew Morton
                   ` (33 preceding siblings ...)
  2020-04-10 21:34 ` [patch 34/35] kernel/gcov/fs.c: gcov_seq_next() should increase position index Andrew Morton
@ 2020-04-10 21:34 ` Andrew Morton
  34 siblings, 0 replies; 36+ messages in thread
From: Andrew Morton @ 2020-04-10 21:34 UTC (permalink / raw)
  To: akpm, dave, linux-mm, longman, manfred, mingo, mm-commits, neilb,
	oberpar, rostedt, torvalds, viro, vvs

From: Vasily Averin <vvs@virtuozzo.com>
Subject: ipc/util.c: sysvipc_find_ipc() should increase position index

If a seq_file .next function does not change the position index, a read
after an lseek can generate unexpected output.  Set *new_pos before the
early return so that the index advances even when no further IPC object
is found.

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Link: http://lkml.kernel.org/r/b7a20945-e315-8bb0-21e6-3875c14a8494@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Waiman Long <longman@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: NeilBrown <neilb@suse.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 ipc/util.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/ipc/util.c~sysvipc_find_ipc-should-increase-position-index
+++ a/ipc/util.c
@@ -764,13 +764,13 @@ static struct kern_ipc_perm *sysvipc_fin
 			total++;
 	}
 
+	*new_pos = pos + 1;
 	if (total >= ids->in_use)
 		return NULL;
 
 	for (; pos < ipc_mni; pos++) {
 		ipc = idr_find(&ids->ipcs_idr, pos);
 		if (ipc != NULL) {
-			*new_pos = pos + 1;
 			rcu_read_lock();
 			ipc_lock_object(ipc);
 			return ipc;
_
