linux-mm.kvack.org archive mirror
* [RFC 00/11] Remove 'order' argument from many mm functions
@ 2019-05-07  4:05 Matthew Wilcox
  2019-05-07  4:05 ` [PATCH 01/11] fix function alignment Matthew Wilcox
                   ` (12 more replies)
  0 siblings, 13 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:05 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

It's possible to save a few hundred bytes of kernel text by moving
the 'order' argument into the GFP flags.  I had the idea while playing
with THP pagecache (notably, I didn't want to add an 'order' parameter
to pagecache_get_page()).
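
To illustrate the shape of the change (a sketch using the __GFP_ORDER()
helper introduced in patch 2; the function below is hypothetical and not
part of any patch):

	#include <linux/gfp.h>

	/* Hypothetical caller, shown only to illustrate the new call shape. */
	static struct page *sketch_alloc_node(int nid, unsigned int order)
	{
		/* before this series: __alloc_pages_node(nid, GFP_KERNEL, order) */
		/* after: the order travels inside the gfp_t itself */
		return __alloc_pages_node(nid, GFP_KERNEL | __GFP_ORDER(order));
	}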

Here is what I got for page_alloc.o (tinyconfig, x86-32) after each
step:

   text	   data	    bss	    dec	    hex	filename
  21462	    349	     44	  21855	   555f	1.o
  21447	    349	     44	  21840	   5550	2.o
  21415	    349	     44	  21808	   5530	3.o
  21399	    349	     44	  21792	   5520	4.o
  21399	    349	     44	  21792	   5520	5.o
  21367	    349	     44	  21760	   5500	6.o
  21303	    349	     44	  21696	   54c0	7.o
  21303	    349	     44	  21696	   54c0	8.o
  21303	    349	     44	  21696	   54c0	9.o
  21303	    349	     44	  21696	   54c0	A.o
  21303	    349	     44	  21696	   54c0	B.o

I assure you that the callers all shrink as well.  vmscan.o also
shrinks, but I didn't keep detailed records.

Anyway, this is just a quick POC, written while I was on an aeroplane
for most of today.  Maybe we don't want to spend five GFP bits on this.
Some parts could be pulled out and applied even if we don't want to go
for the main objective, e.g. rmqueue_pcplist() doesn't use its
gfp_flags argument.

Matthew Wilcox (Oracle) (11):
  fix function alignment
  mm: Pass order to __alloc_pages_nodemask in GFP flags
  mm: Pass order to __get_free_pages() in GFP flags
  mm: Pass order to prep_new_page in GFP flags
  mm: Remove gfp_flags argument from rmqueue_pcplist
  mm: Pass order to rmqueue in GFP flags
  mm: Pass order to get_page_from_freelist in GFP flags
  mm: Pass order to __alloc_pages_cpuset_fallback in GFP flags
  mm: Pass order to prepare_alloc_pages in GFP flags
  mm: Pass order to try_to_free_pages in GFP flags
  mm: Pass order to node_reclaim() in GFP flags

 arch/x86/Makefile_32.cpu      |  2 +
 arch/x86/events/intel/ds.c    |  4 +-
 arch/x86/kvm/vmx/vmx.c        |  4 +-
 arch/x86/mm/init.c            |  3 +-
 arch/x86/mm/pgtable.c         |  7 +--
 drivers/base/devres.c         |  2 +-
 include/linux/gfp.h           | 57 +++++++++++---------
 include/linux/migrate.h       |  2 +-
 include/linux/swap.h          |  2 +-
 include/trace/events/vmscan.h | 28 +++++-----
 mm/filemap.c                  |  2 +-
 mm/gup.c                      |  4 +-
 mm/hugetlb.c                  |  5 +-
 mm/internal.h                 |  5 +-
 mm/khugepaged.c               |  2 +-
 mm/mempolicy.c                | 30 +++++------
 mm/migrate.c                  |  2 +-
 mm/mmu_gather.c               |  2 +-
 mm/page_alloc.c               | 97 +++++++++++++++++------------------
 mm/shmem.c                    |  5 +-
 mm/slub.c                     |  2 +-
 mm/vmscan.c                   | 26 +++++-----
 22 files changed, 147 insertions(+), 146 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 01/11] fix function alignment
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
@ 2019-05-07  4:05 ` Matthew Wilcox
  2019-05-09 10:55   ` Kirill A. Shutemov
  2019-05-07  4:06 ` [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags Matthew Wilcox
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:05 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

---
 arch/x86/Makefile_32.cpu | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/Makefile_32.cpu b/arch/x86/Makefile_32.cpu
index 1f5faf8606b4..55d333187d13 100644
--- a/arch/x86/Makefile_32.cpu
+++ b/arch/x86/Makefile_32.cpu
@@ -45,6 +45,8 @@ cflags-$(CONFIG_MGEODE_LX)	+= $(call cc-option,-march=geode,-march=pentium-mmx)
 # cpu entries
 cflags-$(CONFIG_X86_GENERIC) 	+= $(call tune,generic,$(call tune,i686))
 
+cflags-y			+= $(call cc-option,-falign-functions=1)
+
 # Bug fix for binutils: this option is required in order to keep
 # binutils from generating NOPL instructions against our will.
 ifneq ($(CONFIG_X86_P6_NOP),y)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
  2019-05-07  4:05 ` [PATCH 01/11] fix function alignment Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-09  1:50   ` Ira Weiny
  2019-05-09 10:59   ` Kirill A. Shutemov
  2019-05-07  4:06 ` [PATCH 03/11] mm: Pass order to __get_free_pages() " Matthew Wilcox
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Save marshalling an extra argument in all the callers at the expense of
using five bits of the GFP flags.  We still have three GFP bits remaining
after doing this (and we can release one more by reallocating NORETRY,
RETRY_MAYFAIL and NOFAIL).
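
As an illustration of the encoding (a sketch using the __GFP_ORDER() and
gfp_order() macros added to include/linux/gfp.h below; the variables are
hypothetical):

	gfp_t gfp = GFP_KERNEL | __GFP_ORDER(3);	/* an order-3 request */
	unsigned int order = gfp_order(gfp);		/* yields 3 again */

The order occupies the bits above __GFP_BITS_SHIFT, so extracting it is
a single shift.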

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 arch/x86/events/intel/ds.c |  4 +--
 arch/x86/kvm/vmx/vmx.c     |  4 +--
 include/linux/gfp.h        | 51 ++++++++++++++++++++++----------------
 include/linux/migrate.h    |  2 +-
 mm/filemap.c               |  2 +-
 mm/gup.c                   |  4 +--
 mm/hugetlb.c               |  5 ++--
 mm/khugepaged.c            |  2 +-
 mm/mempolicy.c             | 30 +++++++++++-----------
 mm/migrate.c               |  2 +-
 mm/page_alloc.c            |  4 +--
 mm/shmem.c                 |  5 ++--
 mm/slub.c                  |  2 +-
 13 files changed, 63 insertions(+), 54 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 10c99ce1fead..82fee9845b87 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -315,13 +315,13 @@ static void ds_clear_cea(void *cea, size_t size)
 	preempt_enable();
 }
 
-static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)
+static void *dsalloc_pages(size_t size, gfp_t gfp, int cpu)
 {
 	unsigned int order = get_order(size);
 	int node = cpu_to_node(cpu);
 	struct page *page;
 
-	page = __alloc_pages_node(node, flags | __GFP_ZERO, order);
+	page = __alloc_pages_node(node, gfp | __GFP_ZERO | __GFP_ORDER(order));
 	return page ? page_address(page) : NULL;
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ab432a930ae8..323a0f6ffe13 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2380,13 +2380,13 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	return 0;
 }
 
-struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags)
+struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t gfp)
 {
 	int node = cpu_to_node(cpu);
 	struct page *pages;
 	struct vmcs *vmcs;
 
-	pages = __alloc_pages_node(node, flags, vmcs_config.order);
+	pages = __alloc_pages_node(node, gfp | __GFP_ORDER(vmcs_config.order));
 	if (!pages)
 		return NULL;
 	vmcs = page_address(pages);
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index fb07b503dc45..e7845c2510db 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -219,6 +219,18 @@ struct vm_area_struct;
 /* Room for N __GFP_FOO bits */
 #define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
+#define __GFP_ORDER(order) ((__force gfp_t)(order << __GFP_BITS_SHIFT))
+#define __GFP_ORDER_PMD	__GFP_ORDER(PMD_SHIFT - PAGE_SHIFT)
+#define __GFP_ORDER_PUD	__GFP_ORDER(PUD_SHIFT - PAGE_SHIFT)
+
+/*
+ * Extract the order from a GFP bitmask.
+ * Must be the top bits to avoid an AND operation.  Don't let
+ * __GFP_BITS_SHIFT get over 27, or we won't be able to encode orders
+ * above 15 (some architectures allow configuring MAX_ORDER up to 64,
+ * but I doubt larger than 31 are ever used).
+ */
+#define gfp_order(gfp)	(((__force unsigned int)gfp) >> __GFP_BITS_SHIFT)
 
 /**
  * DOC: Useful GFP flag combinations
@@ -464,26 +476,23 @@ static inline void arch_alloc_page(struct page *page, int order) { }
 #endif
 
 struct page *
-__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
-							nodemask_t *nodemask);
+__alloc_pages_nodemask(gfp_t gfp_mask, int preferred_nid, nodemask_t *nodemask);
 
-static inline struct page *
-__alloc_pages(gfp_t gfp_mask, unsigned int order, int preferred_nid)
+static inline struct page *__alloc_pages(gfp_t gfp_mask, int preferred_nid)
 {
-	return __alloc_pages_nodemask(gfp_mask, order, preferred_nid, NULL);
+	return __alloc_pages_nodemask(gfp_mask, preferred_nid, NULL);
 }
 
 /*
  * Allocate pages, preferring the node given as nid. The node must be valid and
  * online. For more general interface, see alloc_pages_node().
  */
-static inline struct page *
-__alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
+static inline struct page *__alloc_pages_node(int nid, gfp_t gfp)
 {
 	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
-	VM_WARN_ON((gfp_mask & __GFP_THISNODE) && !node_online(nid));
+	VM_WARN_ON((gfp & __GFP_THISNODE) && !node_online(nid));
 
-	return __alloc_pages(gfp_mask, order, nid);
+	return __alloc_pages(gfp, nid);
 }
 
 /*
@@ -497,35 +506,35 @@ static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
 	if (nid == NUMA_NO_NODE)
 		nid = numa_mem_id();
 
-	return __alloc_pages_node(nid, gfp_mask, order);
+	return __alloc_pages_node(nid, gfp_mask | __GFP_ORDER(order));
 }
 
 #ifdef CONFIG_NUMA
-extern struct page *alloc_pages_current(gfp_t gfp_mask, unsigned order);
+extern struct page *alloc_pages_current(gfp_t gfp_mask);
 
 static inline struct page *
 alloc_pages(gfp_t gfp_mask, unsigned int order)
 {
-	return alloc_pages_current(gfp_mask, order);
+	return alloc_pages_current(gfp_mask | __GFP_ORDER(order));
 }
-extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
-			struct vm_area_struct *vma, unsigned long addr,
-			int node, bool hugepage);
+extern struct page *alloc_pages_vma(gfp_t gfp_mask, struct vm_area_struct *vma,
+		unsigned long addr, int node, bool hugepage);
 #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
-	alloc_pages_vma(gfp_mask, order, vma, addr, numa_node_id(), true)
+	alloc_pages_vma(gfp_mask | __GFP_ORDER(order), vma, addr, \
+			numa_node_id(), true)
 #else
 #define alloc_pages(gfp_mask, order) \
-		alloc_pages_node(numa_node_id(), gfp_mask, order)
-#define alloc_pages_vma(gfp_mask, order, vma, addr, node, false)\
-	alloc_pages(gfp_mask, order)
+	alloc_pages_node(numa_node_id(), gfp_mask, order)
+#define alloc_pages_vma(gfp_mask, vma, addr, node, false) \
+	alloc_pages(gfp_mask, 0)
 #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
 	alloc_pages(gfp_mask, order)
 #endif
 #define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
 #define alloc_page_vma(gfp_mask, vma, addr)			\
-	alloc_pages_vma(gfp_mask, 0, vma, addr, numa_node_id(), false)
+	alloc_pages_vma(gfp_mask, vma, addr, numa_node_id(), false)
 #define alloc_page_vma_node(gfp_mask, vma, addr, node)		\
-	alloc_pages_vma(gfp_mask, 0, vma, addr, node, false)
+	alloc_pages_vma(gfp_mask, vma, addr, node, false)
 
 extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
 extern unsigned long get_zeroed_page(gfp_t gfp_mask);
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index e13d9bf2f9a5..ba4385144cc9 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -50,7 +50,7 @@ static inline struct page *new_page_nodemask(struct page *page,
 	if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
 		gfp_mask |= __GFP_HIGHMEM;
 
-	new_page = __alloc_pages_nodemask(gfp_mask, order,
+	new_page = __alloc_pages_nodemask(gfp_mask | __GFP_ORDER(order),
 				preferred_nid, nodemask);
 
 	if (new_page && PageTransHuge(new_page))
diff --git a/mm/filemap.c b/mm/filemap.c
index 3ad18fa56057..b7b0841312c9 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -945,7 +945,7 @@ struct page *__page_cache_alloc(gfp_t gfp)
 		do {
 			cpuset_mems_cookie = read_mems_allowed_begin();
 			n = cpuset_mem_spread_node();
-			page = __alloc_pages_node(n, gfp, 0);
+			page = __alloc_pages_node(n, gfp);
 		} while (!page && read_mems_allowed_retry(cpuset_mems_cookie));
 
 		return page;
diff --git a/mm/gup.c b/mm/gup.c
index 294e87ae5b9a..7b06962a4630 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1306,14 +1306,14 @@ static struct page *new_non_cma_page(struct page *page, unsigned long private)
 		 * CMA area again.
 		 */
 		thp_gfpmask &= ~__GFP_MOVABLE;
-		thp = __alloc_pages_node(nid, thp_gfpmask, HPAGE_PMD_ORDER);
+		thp = __alloc_pages_node(nid, thp_gfpmask | __GFP_PMD_ORDER);
 		if (!thp)
 			return NULL;
 		prep_transhuge_page(thp);
 		return thp;
 	}
 
-	return __alloc_pages_node(nid, gfp_mask, 0);
+	return __alloc_pages_node(nid, gfp_mask);
 }
 
 static long check_and_migrate_cma_pages(struct task_struct *tsk,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 109f5de82910..f3f0f2902a52 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1401,10 +1401,11 @@ static struct page *alloc_buddy_huge_page(struct hstate *h,
 	int order = huge_page_order(h);
 	struct page *page;
 
-	gfp_mask |= __GFP_COMP|__GFP_RETRY_MAYFAIL|__GFP_NOWARN;
+	gfp_mask |= __GFP_COMP | __GFP_RETRY_MAYFAIL | __GFP_NOWARN |
+			__GFP_ORDER(order);
 	if (nid == NUMA_NO_NODE)
 		nid = numa_mem_id();
-	page = __alloc_pages_nodemask(gfp_mask, order, nid, nmask);
+	page = __alloc_pages_nodemask(gfp_mask, nid, nmask);
 	if (page)
 		__count_vm_event(HTLB_BUDDY_PGALLOC);
 	else
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a335f7c1fac4..3d9267394881 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -770,7 +770,7 @@ khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
 {
 	VM_BUG_ON_PAGE(*hpage, *hpage);
 
-	*hpage = __alloc_pages_node(node, gfp, HPAGE_PMD_ORDER);
+	*hpage = __alloc_pages_node(node, gfp | __GFP_PMD_ORDER);
 	if (unlikely(!*hpage)) {
 		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
 		*hpage = ERR_PTR(-ENOMEM);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 2219e747df49..bad60476d5ad 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -975,7 +975,7 @@ struct page *alloc_new_node_page(struct page *page, unsigned long node)
 		return thp;
 	} else
 		return __alloc_pages_node(node, GFP_HIGHUSER_MOVABLE |
-						    __GFP_THISNODE, 0);
+						    __GFP_THISNODE);
 }
 
 /*
@@ -2006,12 +2006,11 @@ bool mempolicy_nodemask_intersects(struct task_struct *tsk,
 
 /* Allocate a page in interleaved policy.
    Own path because it needs to do special accounting. */
-static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
-					unsigned nid)
+static struct page *alloc_page_interleave(gfp_t gfp, unsigned nid)
 {
 	struct page *page;
 
-	page = __alloc_pages(gfp, order, nid);
+	page = __alloc_pages(gfp, nid);
 	/* skip NUMA_INTERLEAVE_HIT counter update if numa stats is disabled */
 	if (!static_branch_likely(&vm_numa_stat_key))
 		return page;
@@ -2033,7 +2032,6 @@ static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
  *      %GFP_FS      allocation should not call back into a file system.
  *      %GFP_ATOMIC  don't sleep.
  *
- *	@order:Order of the GFP allocation.
  * 	@vma:  Pointer to VMA or NULL if not available.
  *	@addr: Virtual Address of the allocation. Must be inside the VMA.
  *	@node: Which node to prefer for allocation (modulo policy).
@@ -2047,8 +2045,8 @@ static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
  *	NULL when no page can be allocated.
  */
 struct page *
-alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
-		unsigned long addr, int node, bool hugepage)
+alloc_pages_vma(gfp_t gfp, struct vm_area_struct *vma, unsigned long addr,
+		int node, bool hugepage)
 {
 	struct mempolicy *pol;
 	struct page *page;
@@ -2060,9 +2058,10 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
 	if (pol->mode == MPOL_INTERLEAVE) {
 		unsigned nid;
 
-		nid = interleave_nid(pol, vma, addr, PAGE_SHIFT + order);
+		nid = interleave_nid(pol, vma, addr,
+				PAGE_SHIFT + gfp_order(gfp));
 		mpol_cond_put(pol);
-		page = alloc_page_interleave(gfp, order, nid);
+		page = alloc_page_interleave(gfp, nid);
 		goto out;
 	}
 
@@ -2086,14 +2085,14 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
 		if (!nmask || node_isset(hpage_node, *nmask)) {
 			mpol_cond_put(pol);
 			page = __alloc_pages_node(hpage_node,
-						gfp | __GFP_THISNODE, order);
+						gfp | __GFP_THISNODE);
 			goto out;
 		}
 	}
 
 	nmask = policy_nodemask(gfp, pol);
 	preferred_nid = policy_node(gfp, pol, node);
-	page = __alloc_pages_nodemask(gfp, order, preferred_nid, nmask);
+	page = __alloc_pages_nodemask(gfp, preferred_nid, nmask);
 	mpol_cond_put(pol);
 out:
 	return page;
@@ -2108,13 +2107,12 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
  *      	%GFP_HIGHMEM highmem allocation,
  *      	%GFP_FS     don't call back into a file system.
  *      	%GFP_ATOMIC don't sleep.
- *	@order: Power of two of allocation size in pages. 0 is a single page.
  *
  *	Allocate a page from the kernel page pool.  When not in
- *	interrupt context and apply the current process NUMA policy.
+ *	interrupt context apply the current process NUMA policy.
  *	Returns NULL when no page can be allocated.
  */
-struct page *alloc_pages_current(gfp_t gfp, unsigned order)
+struct page *alloc_pages_current(gfp_t gfp)
 {
 	struct mempolicy *pol = &default_policy;
 	struct page *page;
@@ -2127,9 +2125,9 @@ struct page *alloc_pages_current(gfp_t gfp, unsigned order)
 	 * nor system default_policy
 	 */
 	if (pol->mode == MPOL_INTERLEAVE)
-		page = alloc_page_interleave(gfp, order, interleave_nodes(pol));
+		page = alloc_page_interleave(gfp, interleave_nodes(pol));
 	else
-		page = __alloc_pages_nodemask(gfp, order,
+		page = __alloc_pages_nodemask(gfp,
 				policy_node(gfp, pol, numa_node_id()),
 				policy_nodemask(gfp, pol));
 
diff --git a/mm/migrate.c b/mm/migrate.c
index f2ecc2855a12..acb479132398 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1884,7 +1884,7 @@ static struct page *alloc_misplaced_dst_page(struct page *page,
 					 (GFP_HIGHUSER_MOVABLE |
 					  __GFP_THISNODE | __GFP_NOMEMALLOC |
 					  __GFP_NORETRY | __GFP_NOWARN) &
-					 ~__GFP_RECLAIM, 0);
+					 ~__GFP_RECLAIM);
 
 	return newpage;
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index deea16489e2b..13191fe2f19d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4610,11 +4610,11 @@ static inline void finalise_ac(gfp_t gfp_mask, struct alloc_context *ac)
  * This is the 'heart' of the zoned buddy allocator.
  */
 struct page *
-__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
-							nodemask_t *nodemask)
+__alloc_pages_nodemask(gfp_t gfp_mask, int preferred_nid, nodemask_t *nodemask)
 {
 	struct page *page;
 	unsigned int alloc_flags = ALLOC_WMARK_LOW;
+	int order = gfp_order(gfp_mask);
 	gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
 	struct alloc_context ac = { };
 
diff --git a/mm/shmem.c b/mm/shmem.c
index a1e9f6194138..445e76e5c0c2 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1463,8 +1463,9 @@ static struct page *shmem_alloc_hugepage(gfp_t gfp,
 		return NULL;
 
 	shmem_pseudo_vma_init(&pvma, info, hindex);
-	page = alloc_pages_vma(gfp | __GFP_COMP | __GFP_NORETRY | __GFP_NOWARN,
-			HPAGE_PMD_ORDER, &pvma, 0, numa_node_id(), true);
+	page = alloc_pages_vma(gfp | __GFP_COMP | __GFP_NORETRY |
+				__GFP_NOWARN | __GFP_PMD_ORDER,
+			&pvma, 0, numa_node_id(), true);
 	shmem_pseudo_vma_destroy(&pvma);
 	if (page)
 		prep_transhuge_page(page);
diff --git a/mm/slub.c b/mm/slub.c
index a34fbe1f6ede..7504fa3f844b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1497,7 +1497,7 @@ static inline struct page *alloc_slab_page(struct kmem_cache *s,
 	if (node == NUMA_NO_NODE)
 		page = alloc_pages(flags, order);
 	else
-		page = __alloc_pages_node(node, flags, order);
+		page = __alloc_pages_node(node, flags | __GFP_ORDER(order));
 
 	if (page && memcg_charge_slab(page, flags, order, s)) {
 		__free_pages(page, order);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 03/11] mm: Pass order to __get_free_pages() in GFP flags
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
  2019-05-07  4:05 ` [PATCH 01/11] fix function alignment Matthew Wilcox
  2019-05-07  4:06 ` [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-07  4:06 ` [PATCH 04/11] mm: Pass order to prep_new_page " Matthew Wilcox
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Matches the change to the __alloc_pages_nodemask API.
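
A caller changes along these lines (sketch only; compare the arch/x86
and devres hunks below):

	/* before: addr = __get_free_pages(gfp_mask, order); */
	addr = __get_free_pages(gfp_mask | __GFP_ORDER(order));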

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 arch/x86/mm/init.c    | 3 ++-
 arch/x86/mm/pgtable.c | 7 ++++---
 drivers/base/devres.c | 2 +-
 include/linux/gfp.h   | 6 +++---
 mm/mmu_gather.c       | 2 +-
 mm/page_alloc.c       | 8 ++++----
 6 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index f905a2371080..963f30581291 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -94,7 +94,8 @@ __ref void *alloc_low_pages(unsigned int num)
 		unsigned int order;
 
 		order = get_order((unsigned long)num << PAGE_SHIFT);
-		return (void *)__get_free_pages(GFP_ATOMIC | __GFP_ZERO, order);
+		return (void *)__get_free_pages(GFP_ATOMIC | __GFP_ZERO |
+				__GFP_ORDER(order));
 	}
 
 	if ((pgt_buf_end + num) > pgt_buf_top || !can_use_brk_pgt) {
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 7bd01709a091..3d3d13f859e5 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -401,8 +401,8 @@ static inline pgd_t *_pgd_alloc(void)
 	 * We allocate one page for pgd.
 	 */
 	if (!SHARED_KERNEL_PMD)
-		return (pgd_t *)__get_free_pages(PGALLOC_GFP,
-						 PGD_ALLOCATION_ORDER);
+		return (pgd_t *)__get_free_pages(PGALLOC_GFP |
+					__GFP_ORDER(PGD_ALLOCATION_ORDER));
 
 	/*
 	 * Now PAE kernel is not running as a Xen domain. We can allocate
@@ -422,7 +422,8 @@ static inline void _pgd_free(pgd_t *pgd)
 
 static inline pgd_t *_pgd_alloc(void)
 {
-	return (pgd_t *)__get_free_pages(PGALLOC_GFP, PGD_ALLOCATION_ORDER);
+	return (pgd_t *)__get_free_pages(PGALLOC_GFP |
+					 __GFP_ORDER(PGD_ALLOCATION_ORDER));
 }
 
 static inline void _pgd_free(pgd_t *pgd)
diff --git a/drivers/base/devres.c b/drivers/base/devres.c
index e038e2b3b7ea..572e81282285 100644
--- a/drivers/base/devres.c
+++ b/drivers/base/devres.c
@@ -992,7 +992,7 @@ unsigned long devm_get_free_pages(struct device *dev,
 	struct pages_devres *devres;
 	unsigned long addr;
 
-	addr = __get_free_pages(gfp_mask, order);
+	addr = __get_free_pages(gfp_mask | __GFP_ORDER(order));
 
 	if (unlikely(!addr))
 		return 0;
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index e7845c2510db..23fbd6da1fb6 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -536,7 +536,7 @@ extern struct page *alloc_pages_vma(gfp_t gfp_mask, struct vm_area_struct *vma,
 #define alloc_page_vma_node(gfp_mask, vma, addr, node)		\
 	alloc_pages_vma(gfp_mask, vma, addr, node, false)
 
-extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
+extern unsigned long __get_free_pages(gfp_t gfp_mask);
 extern unsigned long get_zeroed_page(gfp_t gfp_mask);
 
 void *alloc_pages_exact(size_t size, gfp_t gfp_mask);
@@ -544,10 +544,10 @@ void free_pages_exact(void *virt, size_t size);
 void * __meminit alloc_pages_exact_nid(int nid, size_t size, gfp_t gfp_mask);
 
 #define __get_free_page(gfp_mask) \
-		__get_free_pages((gfp_mask), 0)
+		__get_free_pages(gfp_mask)
 
 #define __get_dma_pages(gfp_mask, order) \
-		__get_free_pages((gfp_mask) | GFP_DMA, (order))
+		__get_free_pages((gfp_mask) | GFP_DMA | __GFP_ORDER(order))
 
 extern void __free_pages(struct page *page, unsigned int order);
 extern void free_pages(unsigned long addr, unsigned int order);
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index f2f03c655807..d370621c8c5d 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -26,7 +26,7 @@ static bool tlb_next_batch(struct mmu_gather *tlb)
 	if (tlb->batch_count == MAX_GATHER_BATCH_COUNT)
 		return false;
 
-	batch = (void *)__get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
+	batch = (void *)__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
 	if (!batch)
 		return false;
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 13191fe2f19d..e26536825a0b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4681,11 +4681,11 @@ EXPORT_SYMBOL(__alloc_pages_nodemask);
  * address cannot represent highmem pages. Use alloc_pages and then kmap if
  * you need to access high mem.
  */
-unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order)
+unsigned long __get_free_pages(gfp_t gfp_mask)
 {
 	struct page *page;
 
-	page = alloc_pages(gfp_mask & ~__GFP_HIGHMEM, order);
+	page = __alloc_pages(gfp_mask & ~__GFP_HIGHMEM, numa_mem_id());
 	if (!page)
 		return 0;
 	return (unsigned long) page_address(page);
@@ -4694,7 +4694,7 @@ EXPORT_SYMBOL(__get_free_pages);
 
 unsigned long get_zeroed_page(gfp_t gfp_mask)
 {
-	return __get_free_pages(gfp_mask | __GFP_ZERO, 0);
+	return __get_free_page(gfp_mask | __GFP_ZERO);
 }
 EXPORT_SYMBOL(get_zeroed_page);
 
@@ -4869,7 +4869,7 @@ void *alloc_pages_exact(size_t size, gfp_t gfp_mask)
 	if (WARN_ON_ONCE(gfp_mask & __GFP_COMP))
 		gfp_mask &= ~__GFP_COMP;
 
-	addr = __get_free_pages(gfp_mask, order);
+	addr = __get_free_pages(gfp_mask | __GFP_ORDER(order));
 	return make_alloc_exact(addr, order, size);
 }
 EXPORT_SYMBOL(alloc_pages_exact);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 04/11] mm: Pass order to prep_new_page in GFP flags
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (2 preceding siblings ...)
  2019-05-07  4:06 ` [PATCH 03/11] mm: Pass order to __get_free_pages() " Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-07  4:06 ` [PATCH 05/11] mm: Remove gfp_flags argument from rmqueue_pcplist Matthew Wilcox
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Matches the change to the __alloc_pages_nodemask API.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e26536825a0b..cb997c41c384 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2056,10 +2056,11 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 	set_page_owner(page, order, gfp_flags);
 }
 
-static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
-							unsigned int alloc_flags)
+static void prep_new_page(struct page *page, gfp_t gfp_flags,
+						unsigned int alloc_flags)
 {
 	int i;
+	unsigned int order = gfp_order(gfp_flags);
 
 	post_alloc_hook(page, order, gfp_flags);
 
@@ -3598,7 +3599,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 		page = rmqueue(ac->preferred_zoneref->zone, zone, order,
 				gfp_mask, alloc_flags, ac->migratetype);
 		if (page) {
-			prep_new_page(page, order, gfp_mask, alloc_flags);
+			prep_new_page(page, gfp_mask, alloc_flags);
 
 			/*
 			 * If this is a high-order atomic allocation then check
@@ -3828,7 +3829,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 
 	/* Prep a captured page if available */
 	if (page)
-		prep_new_page(page, order, gfp_mask, alloc_flags);
+		prep_new_page(page, gfp_mask, alloc_flags);
 
 	/* Try get a page from the freelist if available */
 	if (!page)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 05/11] mm: Remove gfp_flags argument from rmqueue_pcplist
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (3 preceding siblings ...)
  2019-05-07  4:06 ` [PATCH 04/11] mm: Pass order to prep_new_page " Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-07  4:06 ` [PATCH 06/11] mm: Pass order to rmqueue in GFP flags Matthew Wilcox
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Unused argument.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cb997c41c384..987d47c9bb37 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3148,8 +3148,7 @@ static struct page *__rmqueue_pcplist(struct zone *zone, int migratetype,
 /* Lock and remove page from the per-cpu list */
 static struct page *rmqueue_pcplist(struct zone *preferred_zone,
 			struct zone *zone, unsigned int order,
-			gfp_t gfp_flags, int migratetype,
-			unsigned int alloc_flags)
+			int migratetype, unsigned int alloc_flags)
 {
 	struct per_cpu_pages *pcp;
 	struct list_head *list;
@@ -3182,7 +3181,7 @@ struct page *rmqueue(struct zone *preferred_zone,
 
 	if (likely(order == 0)) {
 		page = rmqueue_pcplist(preferred_zone, zone, order,
-				gfp_flags, migratetype, alloc_flags);
+				migratetype, alloc_flags);
 		goto out;
 	}
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 06/11] mm: Pass order to rmqueue in GFP flags
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (4 preceding siblings ...)
  2019-05-07  4:06 ` [PATCH 05/11] mm: Remove gfp_flags argument from rmqueue_pcplist Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-07  4:06 ` [PATCH 07/11] mm: Pass order to get_page_from_freelist " Matthew Wilcox
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Matches the change to the __alloc_pages_nodemask API.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 987d47c9bb37..4705d0e7cf6f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3171,11 +3171,10 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
  * Allocate a page from the given zone. Use pcplists for order-0 allocations.
  */
 static inline
-struct page *rmqueue(struct zone *preferred_zone,
-			struct zone *zone, unsigned int order,
-			gfp_t gfp_flags, unsigned int alloc_flags,
-			int migratetype)
+struct page *rmqueue(struct zone *preferred_zone, struct zone *zone,
+		gfp_t gfp_flags, unsigned int alloc_flags, int migratetype)
 {
+	unsigned int order = gfp_order(gfp_flags);
 	unsigned long flags;
 	struct page *page;
 
@@ -3595,7 +3594,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 		}
 
 try_this_zone:
-		page = rmqueue(ac->preferred_zoneref->zone, zone, order,
+		page = rmqueue(ac->preferred_zoneref->zone, zone,
 				gfp_mask, alloc_flags, ac->migratetype);
 		if (page) {
 			prep_new_page(page, gfp_mask, alloc_flags);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 07/11] mm: Pass order to get_page_from_freelist in GFP flags
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (5 preceding siblings ...)
  2019-05-07  4:06 ` [PATCH 06/11] mm: Pass order to rmqueue in GFP flags Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-07  4:06 ` [PATCH 08/11] mm: Pass order to __alloc_pages_cpuset_fallback " Matthew Wilcox
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Matches the change to the __alloc_pages_nodemask API.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4705d0e7cf6f..cf71547be903 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3482,13 +3482,14 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
  * a page.
  */
 static struct page *
-get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
-						const struct alloc_context *ac)
+get_page_from_freelist(gfp_t gfp_mask, int alloc_flags,
+			const struct alloc_context *ac)
 {
 	struct zoneref *z;
 	struct zone *zone;
 	struct pglist_data *last_pgdat_dirty_limit = NULL;
 	bool no_fallback;
+	unsigned int order = gfp_order(gfp_mask);
 
 retry:
 	/*
@@ -3684,15 +3685,13 @@ __alloc_pages_cpuset_fallback(gfp_t gfp_mask, unsigned int order,
 {
 	struct page *page;
 
-	page = get_page_from_freelist(gfp_mask, order,
-			alloc_flags|ALLOC_CPUSET, ac);
+	page = get_page_from_freelist(gfp_mask, alloc_flags|ALLOC_CPUSET, ac);
 	/*
 	 * fallback to ignore cpuset restriction if our nodes
 	 * are depleted
 	 */
 	if (!page)
-		page = get_page_from_freelist(gfp_mask, order,
-				alloc_flags, ac);
+		page = get_page_from_freelist(gfp_mask, alloc_flags, ac);
 
 	return page;
 }
@@ -3730,7 +3729,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	 * allocation which will never fail due to oom_lock already held.
 	 */
 	page = get_page_from_freelist((gfp_mask | __GFP_HARDWALL) &
-				      ~__GFP_DIRECT_RECLAIM, order,
+				      ~__GFP_DIRECT_RECLAIM,
 				      ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);
 	if (page)
 		goto out;
@@ -3831,7 +3830,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 
 	/* Try get a page from the freelist if available */
 	if (!page)
-		page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
+		page = get_page_from_freelist(gfp_mask, alloc_flags, ac);
 
 	if (page) {
 		struct zone *zone = page_zone(page);
@@ -4058,7 +4057,7 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 		return NULL;
 
 retry:
-	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
+	page = get_page_from_freelist(gfp_mask, alloc_flags, ac);
 
 	/*
 	 * If an allocation failed after direct reclaim, it could be because
@@ -4363,7 +4362,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * The adjusted alloc_flags might result in immediate success, so try
 	 * that first
 	 */
-	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
+	page = get_page_from_freelist(gfp_mask, alloc_flags, ac);
 	if (page)
 		goto got_pg;
 
@@ -4433,7 +4432,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	}
 
 	/* Attempt with potentially adjusted zonelist and alloc_flags */
-	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
+	page = get_page_from_freelist(gfp_mask, alloc_flags, ac);
 	if (page)
 		goto got_pg;
 
@@ -4640,7 +4639,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, int preferred_nid, nodemask_t *nodemask)
 	alloc_flags |= alloc_flags_nofragment(ac.preferred_zoneref->zone, gfp_mask);
 
 	/* First allocation attempt */
-	page = get_page_from_freelist(alloc_mask, order, alloc_flags, &ac);
+	page = get_page_from_freelist(alloc_mask, alloc_flags, &ac);
 	if (likely(page))
 		goto out;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 08/11] mm: Pass order to __alloc_pages_cpuset_fallback in GFP flags
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (6 preceding siblings ...)
  2019-05-07  4:06 ` [PATCH 07/11] mm: Pass order to get_page_from_freelist " Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-07  4:06 ` [PATCH 09/11] mm: Pass order to prepare_alloc_pages " Matthew Wilcox
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Matches the change to the __alloc_pages_nodemask API.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cf71547be903..f693fec5f555 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3679,8 +3679,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 }
 
 static inline struct page *
-__alloc_pages_cpuset_fallback(gfp_t gfp_mask, unsigned int order,
-			      unsigned int alloc_flags,
+__alloc_pages_cpuset_fallback(gfp_t gfp_mask, unsigned int alloc_flags,
 			      const struct alloc_context *ac)
 {
 	struct page *page;
@@ -3776,7 +3775,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 		 * reserves
 		 */
 		if (gfp_mask & __GFP_NOFAIL)
-			page = __alloc_pages_cpuset_fallback(gfp_mask, order,
+			page = __alloc_pages_cpuset_fallback(gfp_mask,
 					ALLOC_NO_WATERMARKS, ac);
 	}
 out:
@@ -4543,7 +4542,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		 * could deplete whole memory reserves which would just make
 		 * the situation worse
 		 */
-		page = __alloc_pages_cpuset_fallback(gfp_mask, order, ALLOC_HARDER, ac);
+		page = __alloc_pages_cpuset_fallback(gfp_mask, ALLOC_HARDER, ac);
 		if (page)
 			goto got_pg;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 09/11] mm: Pass order to prepare_alloc_pages in GFP flags
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (7 preceding siblings ...)
  2019-05-07  4:06 ` [PATCH 08/11] mm: Pass order to __alloc_pages_cpuset_fallback " Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-07  4:06 ` [PATCH 10/11] mm: Pass order to try_to_free_pages " Matthew Wilcox
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Also pass the order to should_fail_alloc_page() in the GFP flags;
prepare_alloc_pages() only used its order argument for that call.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/page_alloc.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f693fec5f555..94ad4727206e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3247,8 +3247,9 @@ static int __init setup_fail_page_alloc(char *str)
 }
 __setup("fail_page_alloc=", setup_fail_page_alloc);
 
-static bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
+static bool __should_fail_alloc_page(gfp_t gfp_mask)
 {
+	unsigned int order = gfp_order(gfp_mask);
 	if (order < fail_page_alloc.min_order)
 		return false;
 	if (gfp_mask & __GFP_NOFAIL)
@@ -3287,16 +3288,16 @@ late_initcall(fail_page_alloc_debugfs);
 
 #else /* CONFIG_FAIL_PAGE_ALLOC */
 
-static inline bool __should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
+static inline bool __should_fail_alloc_page(gfp_t gfp_mask)
 {
 	return false;
 }
 
 #endif /* CONFIG_FAIL_PAGE_ALLOC */
 
-static noinline bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
+static noinline bool should_fail_alloc_page(gfp_t gfp_mask)
 {
-	return __should_fail_alloc_page(gfp_mask, order);
+	return __should_fail_alloc_page(gfp_mask);
 }
 ALLOW_ERROR_INJECTION(should_fail_alloc_page, TRUE);
 
@@ -4556,7 +4557,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	return page;
 }
 
-static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
+static inline bool prepare_alloc_pages(gfp_t gfp_mask,
 		int preferred_nid, nodemask_t *nodemask,
 		struct alloc_context *ac, gfp_t *alloc_mask,
 		unsigned int *alloc_flags)
@@ -4579,7 +4580,7 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
 
 	might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
 
-	if (should_fail_alloc_page(gfp_mask, order))
+	if (should_fail_alloc_page(gfp_mask))
 		return false;
 
 	if (IS_ENABLED(CONFIG_CMA) && ac->migratetype == MIGRATE_MOVABLE)
@@ -4626,7 +4627,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, int preferred_nid, nodemask_t *nodemask)
 
 	gfp_mask &= gfp_allowed_mask;
 	alloc_mask = gfp_mask;
-	if (!prepare_alloc_pages(gfp_mask, order, preferred_nid, nodemask, &ac, &alloc_mask, &alloc_flags))
+	if (!prepare_alloc_pages(gfp_mask, preferred_nid, nodemask, &ac, &alloc_mask, &alloc_flags))
 		return NULL;
 
 	finalise_ac(gfp_mask, &ac);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 10/11] mm: Pass order to try_to_free_pages in GFP flags
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (8 preceding siblings ...)
  2019-05-07  4:06 ` [PATCH 09/11] mm: Pass order to prepare_alloc_pages " Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-07  4:06 ` [PATCH 11/11] mm: Pass order to node_reclaim() " Matthew Wilcox
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Also remove the order argument from __perform_reclaim() and
__alloc_pages_direct_reclaim(), which only passed the argument down.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/swap.h          |  2 +-
 include/trace/events/vmscan.h | 20 +++++++++-----------
 mm/page_alloc.c               | 15 ++++++---------
 mm/vmscan.c                   | 13 ++++++-------
 4 files changed, 22 insertions(+), 28 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4bfb5c4ac108..029737fec38b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -348,7 +348,7 @@ extern void lru_cache_add_active_or_unevictable(struct page *page,
 
 /* linux/mm/vmscan.c */
 extern unsigned long zone_reclaimable_pages(struct zone *zone);
-extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
+extern unsigned long try_to_free_pages(struct zonelist *zonelist,
 					gfp_t gfp_mask, nodemask_t *mask);
 extern int __isolate_lru_page(struct page *page, isolate_mode_t mode);
 extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 0aa882a4e870..fd8b468570c8 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -106,45 +106,43 @@ TRACE_EVENT(mm_vmscan_wakeup_kswapd,
 
 DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_begin_template,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(gfp_t gfp_flags),
 
-	TP_ARGS(order, gfp_flags),
+	TP_ARGS(gfp_flags),
 
 	TP_STRUCT__entry(
-		__field(	int,	order		)
 		__field(	gfp_t,	gfp_flags	)
 	),
 
 	TP_fast_assign(
-		__entry->order		= order;
 		__entry->gfp_flags	= gfp_flags;
 	),
 
 	TP_printk("order=%d gfp_flags=%s",
-		__entry->order,
+		gfp_order(__entry->gfp_flags),
 		show_gfp_flags(__entry->gfp_flags))
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_direct_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(gfp_t gfp_flags),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(gfp_flags)
 );
 
 #ifdef CONFIG_MEMCG
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_memcg_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(gfp_t gfp_flags),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(gfp_flags)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_memcg_softlimit_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(gfp_t gfp_flags),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(gfp_flags)
 );
 #endif /* CONFIG_MEMCG */
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 94ad4727206e..5ac2cbb105c3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4011,9 +4011,7 @@ EXPORT_SYMBOL_GPL(fs_reclaim_release);
 #endif
 
 /* Perform direct synchronous page reclaim */
-static int
-__perform_reclaim(gfp_t gfp_mask, unsigned int order,
-					const struct alloc_context *ac)
+static int __perform_reclaim(gfp_t gfp_mask, const struct alloc_context *ac)
 {
 	struct reclaim_state reclaim_state;
 	int progress;
@@ -4030,8 +4028,7 @@ __perform_reclaim(gfp_t gfp_mask, unsigned int order,
 	reclaim_state.reclaimed_slab = 0;
 	current->reclaim_state = &reclaim_state;
 
-	progress = try_to_free_pages(ac->zonelist, order, gfp_mask,
-								ac->nodemask);
+	progress = try_to_free_pages(ac->zonelist, gfp_mask, ac->nodemask);
 
 	current->reclaim_state = NULL;
 	memalloc_noreclaim_restore(noreclaim_flag);
@@ -4045,14 +4042,14 @@ __perform_reclaim(gfp_t gfp_mask, unsigned int order,
 
 /* The really slow allocator path where we enter direct reclaim */
 static inline struct page *
-__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
-		unsigned int alloc_flags, const struct alloc_context *ac,
+__alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int alloc_flags,
+		const struct alloc_context *ac,
 		unsigned long *did_some_progress)
 {
 	struct page *page = NULL;
 	bool drained = false;
 
-	*did_some_progress = __perform_reclaim(gfp_mask, order, ac);
+	*did_some_progress = __perform_reclaim(gfp_mask, ac);
 	if (unlikely(!(*did_some_progress)))
 		return NULL;
 
@@ -4445,7 +4442,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		goto nopage;
 
 	/* Try direct reclaim and then allocating */
-	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
+	page = __alloc_pages_direct_reclaim(gfp_mask, alloc_flags, ac,
 							&did_some_progress);
 	if (page)
 		goto got_pg;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 836b28913bd7..5d465bdaf225 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3206,15 +3206,15 @@ static bool throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
 	return false;
 }
 
-unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
-				gfp_t gfp_mask, nodemask_t *nodemask)
+unsigned long try_to_free_pages(struct zonelist *zonelist, gfp_t gfp_mask,
+		nodemask_t *nodemask)
 {
 	unsigned long nr_reclaimed;
 	struct scan_control sc = {
 		.nr_to_reclaim = SWAP_CLUSTER_MAX,
 		.gfp_mask = current_gfp_context(gfp_mask),
 		.reclaim_idx = gfp_zone(gfp_mask),
-		.order = order,
+		.order = gfp_order(gfp_mask),
 		.nodemask = nodemask,
 		.priority = DEF_PRIORITY,
 		.may_writepage = !laptop_mode,
@@ -3239,7 +3239,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 	if (throttle_direct_reclaim(sc.gfp_mask, zonelist, nodemask))
 		return 1;
 
-	trace_mm_vmscan_direct_reclaim_begin(order, sc.gfp_mask);
+	trace_mm_vmscan_direct_reclaim_begin(sc.gfp_mask);
 
 	nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
 
@@ -3268,8 +3268,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 	sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
 			(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
 
-	trace_mm_vmscan_memcg_softlimit_reclaim_begin(sc.order,
-						      sc.gfp_mask);
+	trace_mm_vmscan_memcg_softlimit_reclaim_begin(sc.gfp_mask);
 
 	/*
 	 * NOTE: Although we can get the priority field, using it
@@ -3318,7 +3317,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 
 	zonelist = &NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK];
 
-	trace_mm_vmscan_memcg_reclaim_begin(0, sc.gfp_mask);
+	trace_mm_vmscan_memcg_reclaim_begin(sc.gfp_mask);
 
 	psi_memstall_enter(&pflags);
 	noreclaim_flag = memalloc_noreclaim_save();
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 11/11] mm: Pass order to node_reclaim() in GFP flags
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (9 preceding siblings ...)
  2019-05-07  4:06 ` [PATCH 10/11] mm: Pass order to try_to_free_pages " Matthew Wilcox
@ 2019-05-07  4:06 ` Matthew Wilcox
  2019-05-09  1:58 ` [RFC 00/11] Remove 'order' argument from many mm functions Ira Weiny
  2019-05-09 11:07 ` Kirill A. Shutemov
  12 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-07  4:06 UTC (permalink / raw)
  To: linux-mm; +Cc: Matthew Wilcox (Oracle)

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/trace/events/vmscan.h |  8 +++-----
 mm/internal.h                 |  5 ++---
 mm/page_alloc.c               |  2 +-
 mm/vmscan.c                   | 13 ++++++-------
 4 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index fd8b468570c8..bc5a8a6f6e64 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -464,25 +464,23 @@ TRACE_EVENT(mm_vmscan_inactive_list_is_low,
 
 TRACE_EVENT(mm_vmscan_node_reclaim_begin,
 
-	TP_PROTO(int nid, int order, gfp_t gfp_flags),
+	TP_PROTO(int nid, gfp_t gfp_flags),
 
-	TP_ARGS(nid, order, gfp_flags),
+	TP_ARGS(nid, gfp_flags),
 
 	TP_STRUCT__entry(
 		__field(int, nid)
-		__field(int, order)
 		__field(gfp_t, gfp_flags)
 	),
 
 	TP_fast_assign(
 		__entry->nid = nid;
-		__entry->order = order;
 		__entry->gfp_flags = gfp_flags;
 	),
 
 	TP_printk("nid=%d order=%d gfp_flags=%s",
 		__entry->nid,
-		__entry->order,
+		gfp_order(__entry->gfp_flags),
 		show_gfp_flags(__entry->gfp_flags))
 );
 
diff --git a/mm/internal.h b/mm/internal.h
index 9eeaf2b95166..353cefdc3f34 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -457,10 +457,9 @@ static inline void mminit_validate_memmodel_limits(unsigned long *start_pfn,
 #define NODE_RECLAIM_SUCCESS	1
 
 #ifdef CONFIG_NUMA
-extern int node_reclaim(struct pglist_data *, gfp_t, unsigned int);
+extern int node_reclaim(struct pglist_data *, gfp_t);
 #else
-static inline int node_reclaim(struct pglist_data *pgdat, gfp_t mask,
-				unsigned int order)
+static inline int node_reclaim(struct pglist_data *pgdat, gfp_t mask)
 {
 	return NODE_RECLAIM_NOSCAN;
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5ac2cbb105c3..6ea7bda90100 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3577,7 +3577,7 @@ get_page_from_freelist(gfp_t gfp_mask, int alloc_flags,
 			    !zone_allows_reclaim(ac->preferred_zoneref->zone, zone))
 				continue;
 
-			ret = node_reclaim(zone->zone_pgdat, gfp_mask, order);
+			ret = node_reclaim(zone->zone_pgdat, gfp_mask);
 			switch (ret) {
 			case NODE_RECLAIM_NOSCAN:
 				/* did not scan */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5d465bdaf225..171844a2a8c0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4148,17 +4148,17 @@ static unsigned long node_pagecache_reclaimable(struct pglist_data *pgdat)
 /*
  * Try to free up some pages from this node through reclaim.
  */
-static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
+static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask)
 {
 	/* Minimum pages needed in order to stay on node */
-	const unsigned long nr_pages = 1 << order;
+	const unsigned long nr_pages = 1UL << gfp_order(gfp_mask);
 	struct task_struct *p = current;
 	struct reclaim_state reclaim_state;
 	unsigned int noreclaim_flag;
 	struct scan_control sc = {
 		.nr_to_reclaim = max(nr_pages, SWAP_CLUSTER_MAX),
 		.gfp_mask = current_gfp_context(gfp_mask),
-		.order = order,
+		.order = gfp_order(gfp_mask),
 		.priority = NODE_RECLAIM_PRIORITY,
 		.may_writepage = !!(node_reclaim_mode & RECLAIM_WRITE),
 		.may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
@@ -4166,8 +4166,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 		.reclaim_idx = gfp_zone(gfp_mask),
 	};
 
-	trace_mm_vmscan_node_reclaim_begin(pgdat->node_id, order,
-					   sc.gfp_mask);
+	trace_mm_vmscan_node_reclaim_begin(pgdat->node_id, sc.gfp_mask);
 
 	cond_resched();
 	fs_reclaim_acquire(sc.gfp_mask);
@@ -4201,7 +4200,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 	return sc.nr_reclaimed >= nr_pages;
 }
 
-int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
+int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask)
 {
 	int ret;
 
@@ -4237,7 +4236,7 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
 	if (test_and_set_bit(PGDAT_RECLAIM_LOCKED, &pgdat->flags))
 		return NODE_RECLAIM_NOSCAN;
 
-	ret = __node_reclaim(pgdat, gfp_mask, order);
+	ret = __node_reclaim(pgdat, gfp_mask);
 	clear_bit(PGDAT_RECLAIM_LOCKED, &pgdat->flags);
 
 	if (!ret)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags
  2019-05-07  4:06 ` [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags Matthew Wilcox
@ 2019-05-09  1:50   ` Ira Weiny
  2019-05-09 13:58     ` Matthew Wilcox
  2019-05-09 10:59   ` Kirill A. Shutemov
  1 sibling, 1 reply; 24+ messages in thread
From: Ira Weiny @ 2019-05-09  1:50 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On Mon, May 06, 2019 at 09:06:00PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> Save marshalling an extra argument in all the callers at the expense of
> using five bits of the GFP flags.  We still have three GFP bits remaining
> after doing this (and we can release one more by reallocating NORETRY,
> RETRY_MAYFAIL and NOFAIL).
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  arch/x86/events/intel/ds.c |  4 +--
>  arch/x86/kvm/vmx/vmx.c     |  4 +--
>  include/linux/gfp.h        | 51 ++++++++++++++++++++++----------------
>  include/linux/migrate.h    |  2 +-
>  mm/filemap.c               |  2 +-
>  mm/gup.c                   |  4 +--
>  mm/hugetlb.c               |  5 ++--
>  mm/khugepaged.c            |  2 +-
>  mm/mempolicy.c             | 30 +++++++++++-----------
>  mm/migrate.c               |  2 +-
>  mm/page_alloc.c            |  4 +--
>  mm/shmem.c                 |  5 ++--
>  mm/slub.c                  |  2 +-
>  13 files changed, 63 insertions(+), 54 deletions(-)
> 
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index 10c99ce1fead..82fee9845b87 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -315,13 +315,13 @@ static void ds_clear_cea(void *cea, size_t size)
>  	preempt_enable();
>  }
>  
> -static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)
> +static void *dsalloc_pages(size_t size, gfp_t gfp, int cpu)
>  {
>  	unsigned int order = get_order(size);
>  	int node = cpu_to_node(cpu);
>  	struct page *page;
>  
> -	page = __alloc_pages_node(node, flags | __GFP_ZERO, order);
> +	page = __alloc_pages_node(node, gfp | __GFP_ZERO | __GFP_ORDER(order));

Order was derived from size in this function.  Is this truly equal to
the old function?

At a minimum, if I am wrong, the get_order() call above should be
removed, no?

Ira

>  	return page ? page_address(page) : NULL;
>  }
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index ab432a930ae8..323a0f6ffe13 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2380,13 +2380,13 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  	return 0;
>  }
>  
> -struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags)
> +struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t gfp)
>  {
>  	int node = cpu_to_node(cpu);
>  	struct page *pages;
>  	struct vmcs *vmcs;
>  
> -	pages = __alloc_pages_node(node, flags, vmcs_config.order);
> +	pages = __alloc_pages_node(node, gfp | __GFP_ORDER(vmcs_config.order));
>  	if (!pages)
>  		return NULL;
>  	vmcs = page_address(pages);
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index fb07b503dc45..e7845c2510db 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -219,6 +219,18 @@ struct vm_area_struct;
>  /* Room for N __GFP_FOO bits */
>  #define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
>  #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
> +#define __GFP_ORDER(order) ((__force gfp_t)(order << __GFP_BITS_SHIFT))
> +#define __GFP_ORDER_PMD	__GFP_ORDER(PMD_SHIFT - PAGE_SHIFT)
> +#define __GFP_ORDER_PUD	__GFP_ORDER(PUD_SHIFT - PAGE_SHIFT)
> +
> +/*
> + * Extract the order from a GFP bitmask.
> + * Must be the top bits to avoid an AND operation.  Don't let
> + * __GFP_BITS_SHIFT get over 27, or we won't be able to encode orders
> + * above 15 (some architectures allow configuring MAX_ORDER up to 64,
> + * but I doubt larger than 31 are ever used).
> + */
> +#define gfp_order(gfp)	(((__force unsigned int)gfp) >> __GFP_BITS_SHIFT)
>  
>  /**
>   * DOC: Useful GFP flag combinations
> @@ -464,26 +476,23 @@ static inline void arch_alloc_page(struct page *page, int order) { }
>  #endif
>  
>  struct page *
> -__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
> -							nodemask_t *nodemask);
> +__alloc_pages_nodemask(gfp_t gfp_mask, int preferred_nid, nodemask_t *nodemask);
>  
> -static inline struct page *
> -__alloc_pages(gfp_t gfp_mask, unsigned int order, int preferred_nid)
> +static inline struct page *__alloc_pages(gfp_t gfp_mask, int preferred_nid)
>  {
> -	return __alloc_pages_nodemask(gfp_mask, order, preferred_nid, NULL);
> +	return __alloc_pages_nodemask(gfp_mask, preferred_nid, NULL);
>  }
>  
>  /*
>   * Allocate pages, preferring the node given as nid. The node must be valid and
>   * online. For more general interface, see alloc_pages_node().
>   */
> -static inline struct page *
> -__alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
> +static inline struct page *__alloc_pages_node(int nid, gfp_t gfp)
>  {
>  	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
> -	VM_WARN_ON((gfp_mask & __GFP_THISNODE) && !node_online(nid));
> +	VM_WARN_ON((gfp & __GFP_THISNODE) && !node_online(nid));
>  
> -	return __alloc_pages(gfp_mask, order, nid);
> +	return __alloc_pages(gfp, nid);
>  }
>  
>  /*
> @@ -497,35 +506,35 @@ static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
>  	if (nid == NUMA_NO_NODE)
>  		nid = numa_mem_id();
>  
> -	return __alloc_pages_node(nid, gfp_mask, order);
> +	return __alloc_pages_node(nid, gfp_mask | __GFP_ORDER(order));
>  }
>  
>  #ifdef CONFIG_NUMA
> -extern struct page *alloc_pages_current(gfp_t gfp_mask, unsigned order);
> +extern struct page *alloc_pages_current(gfp_t gfp_mask);
>  
>  static inline struct page *
>  alloc_pages(gfp_t gfp_mask, unsigned int order)
>  {
> -	return alloc_pages_current(gfp_mask, order);
> +	return alloc_pages_current(gfp_mask | __GFP_ORDER(order));
>  }
> -extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
> -			struct vm_area_struct *vma, unsigned long addr,
> -			int node, bool hugepage);
> +extern struct page *alloc_pages_vma(gfp_t gfp_mask, struct vm_area_struct *vma,
> +		unsigned long addr, int node, bool hugepage);
>  #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
> -	alloc_pages_vma(gfp_mask, order, vma, addr, numa_node_id(), true)
> +	alloc_pages_vma(gfp_mask | __GFP_ORDER(order), vma, addr, \
> +			numa_node_id(), true)
>  #else
>  #define alloc_pages(gfp_mask, order) \
> -		alloc_pages_node(numa_node_id(), gfp_mask, order)
> -#define alloc_pages_vma(gfp_mask, order, vma, addr, node, false)\
> -	alloc_pages(gfp_mask, order)
> +	alloc_pages_node(numa_node_id(), gfp_mask, order)
> +#define alloc_pages_vma(gfp_mask, vma, addr, node, false) \
> +	alloc_pages(gfp_mask, 0)
>  #define alloc_hugepage_vma(gfp_mask, vma, addr, order) \
>  	alloc_pages(gfp_mask, order)
>  #endif
>  #define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
>  #define alloc_page_vma(gfp_mask, vma, addr)			\
> -	alloc_pages_vma(gfp_mask, 0, vma, addr, numa_node_id(), false)
> +	alloc_pages_vma(gfp_mask, vma, addr, numa_node_id(), false)
>  #define alloc_page_vma_node(gfp_mask, vma, addr, node)		\
> -	alloc_pages_vma(gfp_mask, 0, vma, addr, node, false)
> +	alloc_pages_vma(gfp_mask, vma, addr, node, false)
>  
>  extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
>  extern unsigned long get_zeroed_page(gfp_t gfp_mask);
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index e13d9bf2f9a5..ba4385144cc9 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -50,7 +50,7 @@ static inline struct page *new_page_nodemask(struct page *page,
>  	if (PageHighMem(page) || (zone_idx(page_zone(page)) == ZONE_MOVABLE))
>  		gfp_mask |= __GFP_HIGHMEM;
>  
> -	new_page = __alloc_pages_nodemask(gfp_mask, order,
> +	new_page = __alloc_pages_nodemask(gfp_mask | __GFP_ORDER(order),
>  				preferred_nid, nodemask);
>  
>  	if (new_page && PageTransHuge(new_page))
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 3ad18fa56057..b7b0841312c9 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -945,7 +945,7 @@ struct page *__page_cache_alloc(gfp_t gfp)
>  		do {
>  			cpuset_mems_cookie = read_mems_allowed_begin();
>  			n = cpuset_mem_spread_node();
> -			page = __alloc_pages_node(n, gfp, 0);
> +			page = __alloc_pages_node(n, gfp);
>  		} while (!page && read_mems_allowed_retry(cpuset_mems_cookie));
>  
>  		return page;
> diff --git a/mm/gup.c b/mm/gup.c
> index 294e87ae5b9a..7b06962a4630 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1306,14 +1306,14 @@ static struct page *new_non_cma_page(struct page *page, unsigned long private)
>  		 * CMA area again.
>  		 */
>  		thp_gfpmask &= ~__GFP_MOVABLE;
> -		thp = __alloc_pages_node(nid, thp_gfpmask, HPAGE_PMD_ORDER);
> +		thp = __alloc_pages_node(nid, thp_gfpmask | __GFP_PMD_ORDER);
>  		if (!thp)
>  			return NULL;
>  		prep_transhuge_page(thp);
>  		return thp;
>  	}
>  
> -	return __alloc_pages_node(nid, gfp_mask, 0);
> +	return __alloc_pages_node(nid, gfp_mask);
>  }
>  
>  static long check_and_migrate_cma_pages(struct task_struct *tsk,
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 109f5de82910..f3f0f2902a52 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1401,10 +1401,11 @@ static struct page *alloc_buddy_huge_page(struct hstate *h,
>  	int order = huge_page_order(h);
>  	struct page *page;
>  
> -	gfp_mask |= __GFP_COMP|__GFP_RETRY_MAYFAIL|__GFP_NOWARN;
> +	gfp_mask |= __GFP_COMP | __GFP_RETRY_MAYFAIL | __GFP_NOWARN |
> +			__GFP_ORDER(order);
>  	if (nid == NUMA_NO_NODE)
>  		nid = numa_mem_id();
> -	page = __alloc_pages_nodemask(gfp_mask, order, nid, nmask);
> +	page = __alloc_pages_nodemask(gfp_mask, nid, nmask);
>  	if (page)
>  		__count_vm_event(HTLB_BUDDY_PGALLOC);
>  	else
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index a335f7c1fac4..3d9267394881 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -770,7 +770,7 @@ khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
>  {
>  	VM_BUG_ON_PAGE(*hpage, *hpage);
>  
> -	*hpage = __alloc_pages_node(node, gfp, HPAGE_PMD_ORDER);
> +	*hpage = __alloc_pages_node(node, gfp | __GFP_PMD_ORDER);
>  	if (unlikely(!*hpage)) {
>  		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
>  		*hpage = ERR_PTR(-ENOMEM);
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 2219e747df49..bad60476d5ad 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -975,7 +975,7 @@ struct page *alloc_new_node_page(struct page *page, unsigned long node)
>  		return thp;
>  	} else
>  		return __alloc_pages_node(node, GFP_HIGHUSER_MOVABLE |
> -						    __GFP_THISNODE, 0);
> +						    __GFP_THISNODE);
>  }
>  
>  /*
> @@ -2006,12 +2006,11 @@ bool mempolicy_nodemask_intersects(struct task_struct *tsk,
>  
>  /* Allocate a page in interleaved policy.
>     Own path because it needs to do special accounting. */
> -static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
> -					unsigned nid)
> +static struct page *alloc_page_interleave(gfp_t gfp, unsigned nid)
>  {
>  	struct page *page;
>  
> -	page = __alloc_pages(gfp, order, nid);
> +	page = __alloc_pages(gfp, nid);
>  	/* skip NUMA_INTERLEAVE_HIT counter update if numa stats is disabled */
>  	if (!static_branch_likely(&vm_numa_stat_key))
>  		return page;
> @@ -2033,7 +2032,6 @@ static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
>   *      %GFP_FS      allocation should not call back into a file system.
>   *      %GFP_ATOMIC  don't sleep.
>   *
> - *	@order:Order of the GFP allocation.
>   * 	@vma:  Pointer to VMA or NULL if not available.
>   *	@addr: Virtual Address of the allocation. Must be inside the VMA.
>   *	@node: Which node to prefer for allocation (modulo policy).
> @@ -2047,8 +2045,8 @@ static struct page *alloc_page_interleave(gfp_t gfp, unsigned order,
>   *	NULL when no page can be allocated.
>   */
>  struct page *
> -alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
> -		unsigned long addr, int node, bool hugepage)
> +alloc_pages_vma(gfp_t gfp, struct vm_area_struct *vma, unsigned long addr,
> +		int node, bool hugepage)
>  {
>  	struct mempolicy *pol;
>  	struct page *page;
> @@ -2060,9 +2058,10 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
>  	if (pol->mode == MPOL_INTERLEAVE) {
>  		unsigned nid;
>  
> -		nid = interleave_nid(pol, vma, addr, PAGE_SHIFT + order);
> +		nid = interleave_nid(pol, vma, addr,
> +				PAGE_SHIFT + gfp_order(gfp));
>  		mpol_cond_put(pol);
> -		page = alloc_page_interleave(gfp, order, nid);
> +		page = alloc_page_interleave(gfp, nid);
>  		goto out;
>  	}
>  
> @@ -2086,14 +2085,14 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
>  		if (!nmask || node_isset(hpage_node, *nmask)) {
>  			mpol_cond_put(pol);
>  			page = __alloc_pages_node(hpage_node,
> -						gfp | __GFP_THISNODE, order);
> +						gfp | __GFP_THISNODE);
>  			goto out;
>  		}
>  	}
>  
>  	nmask = policy_nodemask(gfp, pol);
>  	preferred_nid = policy_node(gfp, pol, node);
> -	page = __alloc_pages_nodemask(gfp, order, preferred_nid, nmask);
> +	page = __alloc_pages_nodemask(gfp, preferred_nid, nmask);
>  	mpol_cond_put(pol);
>  out:
>  	return page;
> @@ -2108,13 +2107,12 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
>   *      	%GFP_HIGHMEM highmem allocation,
>   *      	%GFP_FS     don't call back into a file system.
>   *      	%GFP_ATOMIC don't sleep.
> - *	@order: Power of two of allocation size in pages. 0 is a single page.
>   *
>   *	Allocate a page from the kernel page pool.  When not in
> - *	interrupt context and apply the current process NUMA policy.
> + *	interrupt context apply the current process NUMA policy.
>   *	Returns NULL when no page can be allocated.
>   */
> -struct page *alloc_pages_current(gfp_t gfp, unsigned order)
> +struct page *alloc_pages_current(gfp_t gfp)
>  {
>  	struct mempolicy *pol = &default_policy;
>  	struct page *page;
> @@ -2127,9 +2125,9 @@ struct page *alloc_pages_current(gfp_t gfp, unsigned order)
>  	 * nor system default_policy
>  	 */
>  	if (pol->mode == MPOL_INTERLEAVE)
> -		page = alloc_page_interleave(gfp, order, interleave_nodes(pol));
> +		page = alloc_page_interleave(gfp, interleave_nodes(pol));
>  	else
> -		page = __alloc_pages_nodemask(gfp, order,
> +		page = __alloc_pages_nodemask(gfp,
>  				policy_node(gfp, pol, numa_node_id()),
>  				policy_nodemask(gfp, pol));
>  
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f2ecc2855a12..acb479132398 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1884,7 +1884,7 @@ static struct page *alloc_misplaced_dst_page(struct page *page,
>  					 (GFP_HIGHUSER_MOVABLE |
>  					  __GFP_THISNODE | __GFP_NOMEMALLOC |
>  					  __GFP_NORETRY | __GFP_NOWARN) &
> -					 ~__GFP_RECLAIM, 0);
> +					 ~__GFP_RECLAIM);
>  
>  	return newpage;
>  }
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index deea16489e2b..13191fe2f19d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4610,11 +4610,11 @@ static inline void finalise_ac(gfp_t gfp_mask, struct alloc_context *ac)
>   * This is the 'heart' of the zoned buddy allocator.
>   */
>  struct page *
> -__alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
> -							nodemask_t *nodemask)
> +__alloc_pages_nodemask(gfp_t gfp_mask, int preferred_nid, nodemask_t *nodemask)
>  {
>  	struct page *page;
>  	unsigned int alloc_flags = ALLOC_WMARK_LOW;
> +	int order = gfp_order(gfp_mask);
>  	gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
>  	struct alloc_context ac = { };
>  
> diff --git a/mm/shmem.c b/mm/shmem.c
> index a1e9f6194138..445e76e5c0c2 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1463,8 +1463,9 @@ static struct page *shmem_alloc_hugepage(gfp_t gfp,
>  		return NULL;
>  
>  	shmem_pseudo_vma_init(&pvma, info, hindex);
> -	page = alloc_pages_vma(gfp | __GFP_COMP | __GFP_NORETRY | __GFP_NOWARN,
> -			HPAGE_PMD_ORDER, &pvma, 0, numa_node_id(), true);
> +	page = alloc_pages_vma(gfp | __GFP_COMP | __GFP_NORETRY |
> +				__GFP_NOWARN | __GFP_PMD_ORDER,
> +			&pvma, 0, numa_node_id(), true);
>  	shmem_pseudo_vma_destroy(&pvma);
>  	if (page)
>  		prep_transhuge_page(page);
> diff --git a/mm/slub.c b/mm/slub.c
> index a34fbe1f6ede..7504fa3f844b 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1497,7 +1497,7 @@ static inline struct page *alloc_slab_page(struct kmem_cache *s,
>  	if (node == NUMA_NO_NODE)
>  		page = alloc_pages(flags, order);
>  	else
> -		page = __alloc_pages_node(node, flags, order);
> +		page = __alloc_pages_node(node, flags | __GFP_ORDER(order));
>  
>  	if (page && memcg_charge_slab(page, flags, order, s)) {
>  		__free_pages(page, order);
> -- 
> 2.20.1
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 00/11] Remove 'order' argument from many mm functions
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (10 preceding siblings ...)
  2019-05-07  4:06 ` [PATCH 11/11] mm: Pass order to node_reclaim() " Matthew Wilcox
@ 2019-05-09  1:58 ` Ira Weiny
  2019-05-09 14:07   ` Matthew Wilcox
  2019-05-09 11:07 ` Kirill A. Shutemov
  12 siblings, 1 reply; 24+ messages in thread
From: Ira Weiny @ 2019-05-09  1:58 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> It's possible to save a few hundred bytes from the kernel text by moving
> the 'order' argument into the GFP flags.  I had the idea while I was
> playing with THP pagecache (notably, I didn't want to add an 'order'
> parameter to pagecache_get_page())
> 
> What I got for a -tiny config for page_alloc.o (with a tinyconfig,
> x86-32) after each step:
> 
>    text	   data	    bss	    dec	    hex	filename
>   21462	    349	     44	  21855	   555f	1.o
>   21447	    349	     44	  21840	   5550	2.o
>   21415	    349	     44	  21808	   5530	3.o
>   21399	    349	     44	  21792	   5520	4.o
>   21399	    349	     44	  21792	   5520	5.o
>   21367	    349	     44	  21760	   5500	6.o
>   21303	    349	     44	  21696	   54c0	7.o
>   21303	    349	     44	  21696	   54c0	8.o
>   21303	    349	     44	  21696	   54c0	9.o
>   21303	    349	     44	  21696	   54c0	A.o
>   21303	    349	     44	  21696	   54c0	B.o
> 
> I assure you that the callers all shrink as well.  vmscan.o also
> shrinks, but I didn't keep detailed records.
> 
> Anyway, this is just a quick POC due to me being on an aeroplane for
> most of today.  Maybe we don't want to spend five GFP bits on this.
> Some bits of this could be pulled out and applied even if we don't want
> to go for the main objective.  eg rmqueue_pcplist() doesn't use its
> gfp_flags argument.

Overall I may just be a simpleton WRT this, but I'm not sure that the added
complexity justifies the gain.

But other than the 1 patch I don't see anything technically wrong.  So I
guess...

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> 
> Matthew Wilcox (Oracle) (11):
>   fix function alignment
>   mm: Pass order to __alloc_pages_nodemask in GFP flags
>   mm: Pass order to __get_free_pages() in GFP flags
>   mm: Pass order to prep_new_page in GFP flags
>   mm: Remove gfp_flags argument from rmqueue_pcplist
>   mm: Pass order to rmqueue in GFP flags
>   mm: Pass order to get_page_from_freelist in GFP flags
>   mm: Pass order to __alloc_pages_cpuset_fallback in GFP flags
>   mm: Pass order to prepare_alloc_pages in GFP flags
>   mm: Pass order to try_to_free_pages in GFP flags
>   mm: Pass order to node_reclaim() in GFP flags
> 
>  arch/x86/Makefile_32.cpu      |  2 +
>  arch/x86/events/intel/ds.c    |  4 +-
>  arch/x86/kvm/vmx/vmx.c        |  4 +-
>  arch/x86/mm/init.c            |  3 +-
>  arch/x86/mm/pgtable.c         |  7 +--
>  drivers/base/devres.c         |  2 +-
>  include/linux/gfp.h           | 57 +++++++++++---------
>  include/linux/migrate.h       |  2 +-
>  include/linux/swap.h          |  2 +-
>  include/trace/events/vmscan.h | 28 +++++-----
>  mm/filemap.c                  |  2 +-
>  mm/gup.c                      |  4 +-
>  mm/hugetlb.c                  |  5 +-
>  mm/internal.h                 |  5 +-
>  mm/khugepaged.c               |  2 +-
>  mm/mempolicy.c                | 30 +++++------
>  mm/migrate.c                  |  2 +-
>  mm/mmu_gather.c               |  2 +-
>  mm/page_alloc.c               | 97 +++++++++++++++++------------------
>  mm/shmem.c                    |  5 +-
>  mm/slub.c                     |  2 +-
>  mm/vmscan.c                   | 26 +++++-----
>  22 files changed, 147 insertions(+), 146 deletions(-)
> 
> -- 
> 2.20.1
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 01/11] fix function alignment
  2019-05-07  4:05 ` [PATCH 01/11] fix function alignment Matthew Wilcox
@ 2019-05-09 10:55   ` Kirill A. Shutemov
  0 siblings, 0 replies; 24+ messages in thread
From: Kirill A. Shutemov @ 2019-05-09 10:55 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On Mon, May 06, 2019 at 09:05:59PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 

Hm?

-ENOENT;


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags
  2019-05-07  4:06 ` [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags Matthew Wilcox
  2019-05-09  1:50   ` Ira Weiny
@ 2019-05-09 10:59   ` Kirill A. Shutemov
  1 sibling, 0 replies; 24+ messages in thread
From: Kirill A. Shutemov @ 2019-05-09 10:59 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On Mon, May 06, 2019 at 09:06:00PM -0700, Matthew Wilcox wrote:
> +/*
> + * Extract the order from a GFP bitmask.
> + * Must be the top bits to avoid an AND operation.  Don't let
> + * __GFP_BITS_SHIFT get over 27

Should we have BUILD_BUG_ON() for this?
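
For illustration only (this is not code from the posted series): such a
guard might look like the two lines below, using the kernel's BUILD_BUG_ON()
from function context (an mm init path, say).  gfp_order() and
__GFP_BITS_SHIFT are the names from the patch; gfp_order((__force gfp_t)~0U)
is simply the largest order the encoding can represent.

	/* Sketch: enforce the comment's constraints at compile time. */
	BUILD_BUG_ON(__GFP_BITS_SHIFT > 27);
	BUILD_BUG_ON(MAX_ORDER - 1 > gfp_order((__force gfp_t)~0U));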


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 00/11] Remove 'order' argument from many mm functions
  2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
                   ` (11 preceding siblings ...)
  2019-05-09  1:58 ` [RFC 00/11] Remove 'order' argument from many mm functions Ira Weiny
@ 2019-05-09 11:07 ` Kirill A. Shutemov
  2019-05-14 14:51   ` Matthew Wilcox
  12 siblings, 1 reply; 24+ messages in thread
From: Kirill A. Shutemov @ 2019-05-09 11:07 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> It's possible to save a few hundred bytes from the kernel text by moving
> the 'order' argument into the GFP flags.  I had the idea while I was
> playing with THP pagecache (notably, I didn't want to add an 'order'
> parameter to pagecache_get_page())
> 
> What I got for a -tiny config for page_alloc.o (with a tinyconfig,
> x86-32) after each step:
> 
>    text	   data	    bss	    dec	    hex	filename
>   21462	    349	     44	  21855	   555f	1.o
>   21447	    349	     44	  21840	   5550	2.o
>   21415	    349	     44	  21808	   5530	3.o
>   21399	    349	     44	  21792	   5520	4.o
>   21399	    349	     44	  21792	   5520	5.o
>   21367	    349	     44	  21760	   5500	6.o
>   21303	    349	     44	  21696	   54c0	7.o
>   21303	    349	     44	  21696	   54c0	8.o
>   21303	    349	     44	  21696	   54c0	9.o
>   21303	    349	     44	  21696	   54c0	A.o
>   21303	    349	     44	  21696	   54c0	B.o
> 
> I assure you that the callers all shrink as well.  vmscan.o also
> shrinks, but I didn't keep detailed records.
> 
> Anyway, this is just a quick POC due to me being on an aeroplane for
> most of today.  Maybe we don't want to spend five GFP bits on this.
> Some bits of this could be pulled out and applied even if we don't want
> to go for the main objective.  eg rmqueue_pcplist() doesn't use its
> gfp_flags argument.

I like the idea. But I'm somewhat worried about running out of bits in
gfp_t. Is there anything preventing us from bumping gfp_t to u64 in the future?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags
  2019-05-09  1:50   ` Ira Weiny
@ 2019-05-09 13:58     ` Matthew Wilcox
  2019-05-09 16:22       ` Weiny, Ira
  0 siblings, 1 reply; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-09 13:58 UTC (permalink / raw)
  To: Ira Weiny; +Cc: linux-mm

On Wed, May 08, 2019 at 06:50:16PM -0700, Ira Weiny wrote:
> On Mon, May 06, 2019 at 09:06:00PM -0700, Matthew Wilcox wrote:
> > Save marshalling an extra argument in all the callers at the expense of
> > using five bits of the GFP flags.  We still have three GFP bits remaining
> > after doing this (and we can release one more by reallocating NORETRY,
> > RETRY_MAYFAIL and NOFAIL).

> > -static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)
> > +static void *dsalloc_pages(size_t size, gfp_t gfp, int cpu)
> >  {
> >  	unsigned int order = get_order(size);
> >  	int node = cpu_to_node(cpu);
> >  	struct page *page;
> >  
> > -	page = __alloc_pages_node(node, flags | __GFP_ZERO, order);
> > +	page = __alloc_pages_node(node, gfp | __GFP_ZERO | __GFP_ORDER(order));
> 
> Order was derived from size in this function.  Is this truly equal to the old
> function?
> 
> At a minimum if I am wrong the get_order call above should be removed, no?

I think you have a misunderstanding, but I'm not sure what it is.

Before this patch, we pass 'order' (a small integer generally less than 10)
in the bottom few bits of a parameter called 'order'.  After this patch,
we pass the order in some of the high bits of the GFP flags.  So we can't
remove the call to get_order() because that's what calculates 'order' from
'size'.

> > +#define __GFP_ORDER(order) ((__force gfp_t)(order << __GFP_BITS_SHIFT))
> > +#define __GFP_ORDER_PMD	__GFP_ORDER(PMD_SHIFT - PAGE_SHIFT)
> > +#define __GFP_ORDER_PUD	__GFP_ORDER(PUD_SHIFT - PAGE_SHIFT)
> > +
> > +/*
> > + * Extract the order from a GFP bitmask.
> > + * Must be the top bits to avoid an AND operation.  Don't let
> > + * __GFP_BITS_SHIFT get over 27, or we won't be able to encode orders
> > + * above 15 (some architectures allow configuring MAX_ORDER up to 64,
> > + * but I doubt larger than 31 are ever used).
> > + */
> > +#define gfp_order(gfp)	(((__force unsigned int)gfp) >> __GFP_BITS_SHIFT)
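
To make the mechanism concrete, here is a throwaway userspace mock-up of the
round-trip those macros perform.  A plain unsigned int stands in for gfp_t
and 23 for __GFP_BITS_SHIFT; the flag value and the program itself are
purely illustrative, not kernel code.

	#include <assert.h>
	#include <stdio.h>

	#define BITS_SHIFT	23			/* stand-in for __GFP_BITS_SHIFT */
	#define ORDER(o)	((unsigned int)(o) << BITS_SHIFT)
	#define ORDER_OF(gfp)	((gfp) >> BITS_SHIFT)

	int main(void)
	{
		unsigned int gfp = 0x0cc0;		/* some flag bits, all below bit 23 */
		unsigned int packed = gfp | ORDER(9);	/* order 9 == PMD order on x86-64 */

		assert(ORDER_OF(packed) == 9);		/* the order comes back out */
		assert((packed & ((1u << BITS_SHIFT) - 1)) == gfp);	/* flag bits untouched */
		printf("packed = %#x, order = %u\n", packed, ORDER_OF(packed));
		return 0;
	}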


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 00/11] Remove 'order' argument from many mm functions
  2019-05-09  1:58 ` [RFC 00/11] Remove 'order' argument from many mm functions Ira Weiny
@ 2019-05-09 14:07   ` Matthew Wilcox
  2019-05-09 16:48     ` Weiny, Ira
  0 siblings, 1 reply; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-09 14:07 UTC (permalink / raw)
  To: Ira Weiny; +Cc: linux-mm

On Wed, May 08, 2019 at 06:58:09PM -0700, Ira Weiny wrote:
> On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > It's possible to save a few hundred bytes from the kernel text by moving
> > the 'order' argument into the GFP flags.  I had the idea while I was
> > playing with THP pagecache (notably, I didn't want to add an 'order'
> > parameter to pagecache_get_page())
...
> > Anyway, this is just a quick POC due to me being on an aeroplane for
> > most of today.  Maybe we don't want to spend five GFP bits on this.
> > Some bits of this could be pulled out and applied even if we don't want
> > to go for the main objective.  eg rmqueue_pcplist() doesn't use its
> > gfp_flags argument.
> 
> > Overall I may just be a simpleton WRT this, but I'm not sure that the added
> complexity justifies the gain.

I'm disappointed that you see it as added complexity.  I see it as
reducing complexity.  With this patch, we can simply pass GFP_PMD as
a flag to pagecache_get_page(); without it, we have to add a fifth
parameter to pagecache_get_page() and change all the callers to pass '0'.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags
  2019-05-09 13:58     ` Matthew Wilcox
@ 2019-05-09 16:22       ` Weiny, Ira
  0 siblings, 0 replies; 24+ messages in thread
From: Weiny, Ira @ 2019-05-09 16:22 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

> 
> On Wed, May 08, 2019 at 06:50:16PM -0700, Ira Weiny wrote:
> > On Mon, May 06, 2019 at 09:06:00PM -0700, Matthew Wilcox wrote:
> > > Save marshalling an extra argument in all the callers at the expense
> > > of using five bits of the GFP flags.  We still have three GFP bits
> > > remaining after doing this (and we can release one more by
> > > reallocating NORETRY, RETRY_MAYFAIL and NOFAIL).
> 
> > > -static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)
> > > +static void *dsalloc_pages(size_t size, gfp_t gfp, int cpu)
> > >  {
> > >  	unsigned int order = get_order(size);
> > >  	int node = cpu_to_node(cpu);
> > >  	struct page *page;
> > >
> > > -	page = __alloc_pages_node(node, flags | __GFP_ZERO, order);
> > > +	page = __alloc_pages_node(node, gfp | __GFP_ZERO | __GFP_ORDER(order));
> >
> > Order was derived from size in this function.  Is this truly equal to
> > the old function?
> >
> > At a minimum if I am wrong the get_order call above should be removed, no?
> 
> I think you have a misunderstanding, but I'm not sure what it is.
> 
> Before this patch, we pass 'order' (a small integer generally less than 10) in
> the bottom few bits of a parameter called 'order'.  After this patch, we pass
> the order in some of the high bits of the GFP flags.  So we can't remove the
> call to get_order() because that's what calculates 'order' from 'size'.

Ah, I see it now.  Sorry, I was thinking the wrong thing when I saw that line.

Yep, you are correct,
Ira


> 
> > > +#define __GFP_ORDER(order) ((__force gfp_t)(order << __GFP_BITS_SHIFT))
> > > +#define __GFP_ORDER_PMD	__GFP_ORDER(PMD_SHIFT - PAGE_SHIFT)
> > > +#define __GFP_ORDER_PUD	__GFP_ORDER(PUD_SHIFT - PAGE_SHIFT)
> > > +
> > > +/*
> > > + * Extract the order from a GFP bitmask.
> > > + * Must be the top bits to avoid an AND operation.  Don't let
> > > + * __GFP_BITS_SHIFT get over 27, or we won't be able to encode orders
> > > + * above 15 (some architectures allow configuring MAX_ORDER up to 64,
> > > + * but I doubt larger than 31 are ever used).
> > > + */
> > > +#define gfp_order(gfp)	(((__force unsigned int)gfp) >> __GFP_BITS_SHIFT)


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [RFC 00/11] Remove 'order' argument from many mm functions
  2019-05-09 14:07   ` Matthew Wilcox
@ 2019-05-09 16:48     ` Weiny, Ira
  2019-05-09 18:29       ` Matthew Wilcox
  0 siblings, 1 reply; 24+ messages in thread
From: Weiny, Ira @ 2019-05-09 16:48 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

> On Wed, May 08, 2019 at 06:58:09PM -0700, Ira Weiny wrote:
> > On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > > It's possible to save a few hundred bytes from the kernel text by
> > > moving the 'order' argument into the GFP flags.  I had the idea
> > > while I was playing with THP pagecache (notably, I didn't want to add an
> 'order'
> > > parameter to pagecache_get_page())
> ...
> > > Anyway, this is just a quick POC due to me being on an aeroplane for
> > > most of today.  Maybe we don't want to spend five GFP bits on this.
> > > Some bits of this could be pulled out and applied even if we don't
> > > want to go for the main objective.  eg rmqueue_pcplist() doesn't use
> > > its gfp_flags argument.
> >
> > Over all I may just be a simpleton WRT this but I'm not sure that the
> > added complexity justifies the gain.
> 
> I'm disappointed that you see it as added complexity.  I see it as reducing
> complexity.  With this patch, we can simply pass GFP_PMD as a flag to
> pagecache_get_page(); without it, we have to add a fifth parameter to
> pagecache_get_page() and change all the callers to pass '0'.

I don't disagree for pagecache_get_page().

I'm not saying we should not do this.  But this seems odd to me.

Again I'm probably just being a simpleton...
Ira
 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 00/11] Remove 'order' argument from many mm functions
  2019-05-09 16:48     ` Weiny, Ira
@ 2019-05-09 18:29       ` Matthew Wilcox
  2019-05-29 21:44         ` Ira Weiny
  0 siblings, 1 reply; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-09 18:29 UTC (permalink / raw)
  To: Weiny, Ira; +Cc: linux-mm

On Thu, May 09, 2019 at 04:48:39PM +0000, Weiny, Ira wrote:
> > On Wed, May 08, 2019 at 06:58:09PM -0700, Ira Weiny wrote:
> > > On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > > > It's possible to save a few hundred bytes from the kernel text by
> > > > moving the 'order' argument into the GFP flags.  I had the idea
> > > > while I was playing with THP pagecache (notably, I didn't want to add an
> > 'order'
> > > > parameter to pagecache_get_page())
> > ...
> > > > Anyway, this is just a quick POC due to me being on an aeroplane for
> > > > most of today.  Maybe we don't want to spend five GFP bits on this.
> > > > Some bits of this could be pulled out and applied even if we don't
> > > > want to go for the main objective.  eg rmqueue_pcplist() doesn't use
> > > > its gfp_flags argument.
> > >
> > > Overall I may just be a simpleton WRT this, but I'm not sure that the
> > > added complexity justifies the gain.
> > 
> > I'm disappointed that you see it as added complexity.  I see it as reducing
> > complexity.  With this patch, we can simply pass GFP_PMD as a flag to
> > pagecache_get_page(); without it, we have to add a fifth parameter to
> > pagecache_get_page() and change all the callers to pass '0'.
> 
> I don't disagree for pagecache_get_page().
> 
> I'm not saying we should not do this.  But this seems odd to me.
> 
> Again I'm probably just being a simpleton...

This concerns me, though.  I see it as being a simplification, but if
other people see it as a complication, then it's not.  Perhaps I didn't
take the patches far enough for you to see benefit?  We have quite the
thicket of .*alloc_page.* functions, and I can't keep them all straight.
Between taking, or not taking, the nodeid, the gfp mask, the order, a VMA
and random other crap; not to mention the NUMA vs !NUMA implementations,
this is crying out for simplification.

It doesn't help that I screwed up the __get_free_pages patch.  I should
have grepped and realised that we had over 200 callers and it's not
worth changing them all as part of this patchset.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 00/11] Remove 'order' argument from many mm functions
  2019-05-09 11:07 ` Kirill A. Shutemov
@ 2019-05-14 14:51   ` Matthew Wilcox
  0 siblings, 0 replies; 24+ messages in thread
From: Matthew Wilcox @ 2019-05-14 14:51 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: linux-mm

On Thu, May 09, 2019 at 02:07:55PM +0300, Kirill A. Shutemov wrote:
> On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > Anyway, this is just a quick POC due to me being on an aeroplane for
> > most of today.  Maybe we don't want to spend five GFP bits on this.
> > Some bits of this could be pulled out and applied even if we don't want
> > to go for the main objective.  eg rmqueue_pcplist() doesn't use its
> > gfp_flags argument.
> 
> I like the idea. But I'm somewhat worried about running out of bits in
> gfp_t. Is there anything preventing us from bumping gfp_t to u64 in the future?

It's stored in a few structs that might not appreciate it growing,
like struct address_space.  I've been vaguely wondering about how to
combine order, gfp_t and nodeid into one parameter in a way that doesn't
grow those structs, but I don't have a solid idea yet.
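
Purely as back-of-the-envelope sizing (not a layout proposed anywhere in the
thread): with this series a 32-bit gfp_t is already nearly full, and a node
id for a large NUMA config would not fit alongside the rest, which is the
awkward part of packing all three without widening gfp_t and the structs
that embed it:

	flag bits	23 or 24	(__GFP_BITS_SHIFT)
	order		 5		(orders 0..31, as in this series)
	node id		10		(CONFIG_NODES_SHIFT can be up to 10 on x86-64)
	---------------------------------
	total		~39 bits, which does not fit in 32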


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 00/11] Remove 'order' argument from many mm functions
  2019-05-09 18:29       ` Matthew Wilcox
@ 2019-05-29 21:44         ` Ira Weiny
  0 siblings, 0 replies; 24+ messages in thread
From: Ira Weiny @ 2019-05-29 21:44 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm

On Thu, May 09, 2019 at 11:29:02AM -0700, Matthew Wilcox wrote:
> On Thu, May 09, 2019 at 04:48:39PM +0000, Weiny, Ira wrote:
> > > On Wed, May 08, 2019 at 06:58:09PM -0700, Ira Weiny wrote:
> > > > On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > > > > It's possible to save a few hundred bytes from the kernel text by
> > > > > moving the 'order' argument into the GFP flags.  I had the idea
> > > > > while I was playing with THP pagecache (notably, I didn't want to add an
> > > 'order'
> > > > > parameter to pagecache_get_page())
> > > ...
> > > > > Anyway, this is just a quick POC due to me being on an aeroplane for
> > > > > most of today.  Maybe we don't want to spend five GFP bits on this.
> > > > > Some bits of this could be pulled out and applied even if we don't
> > > > > want to go for the main objective.  eg rmqueue_pcplist() doesn't use
> > > > > its gfp_flags argument.
> > > >
> > > > Overall I may just be a simpleton WRT this, but I'm not sure that the
> > > > added complexity justifies the gain.
> > > 
> > > I'm disappointed that you see it as added complexity.  I see it as reducing
> > > complexity.  With this patch, we can simply pass GFP_PMD as a flag to
> > > pagecache_get_page(); without it, we have to add a fifth parameter to
> > > pagecache_get_page() and change all the callers to pass '0'.
> > 
> > I don't disagree for pagecache_get_page().
> > 
> > I'm not saying we should not do this.  But this seems odd to me.
> > 
> > Again I'm probably just being a simpleton...
> 
> This concerns me, though.  I see it as being a simplification, but if
> other people see it as a complication, then it's not.  Perhaps I didn't
> take the patches far enough for you to see benefit?  We have quite the
> thicket of .*alloc_page.* functions, and I can't keep them all straight.
> Between taking, or not taking, the nodeid, the gfp mask, the order, a VMA
> and random other crap; not to mention the NUMA vs !NUMA implementations,
> this is crying out for simplification.

Was there a new version of this coming?

Sorry, perhaps I dropped the ball here by not replying?

Ira

> 
> It doesn't help that I screwed up the __get_free_pages patch.  I should
> have grepped and realised that we had over 200 callers and it's not
> worth changing them all as part of this patchset.
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2019-05-29 21:43 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-07  4:05 [RFC 00/11] Remove 'order' argument from many mm functions Matthew Wilcox
2019-05-07  4:05 ` [PATCH 01/11] fix function alignment Matthew Wilcox
2019-05-09 10:55   ` Kirill A. Shutemov
2019-05-07  4:06 ` [PATCH 02/11] mm: Pass order to __alloc_pages_nodemask in GFP flags Matthew Wilcox
2019-05-09  1:50   ` Ira Weiny
2019-05-09 13:58     ` Matthew Wilcox
2019-05-09 16:22       ` Weiny, Ira
2019-05-09 10:59   ` Kirill A. Shutemov
2019-05-07  4:06 ` [PATCH 03/11] mm: Pass order to __get_free_pages() " Matthew Wilcox
2019-05-07  4:06 ` [PATCH 04/11] mm: Pass order to prep_new_page " Matthew Wilcox
2019-05-07  4:06 ` [PATCH 05/11] mm: Remove gfp_flags argument from rmqueue_pcplist Matthew Wilcox
2019-05-07  4:06 ` [PATCH 06/11] mm: Pass order to rmqueue in GFP flags Matthew Wilcox
2019-05-07  4:06 ` [PATCH 07/11] mm: Pass order to get_page_from_freelist " Matthew Wilcox
2019-05-07  4:06 ` [PATCH 08/11] mm: Pass order to __alloc_pages_cpuset_fallback " Matthew Wilcox
2019-05-07  4:06 ` [PATCH 09/11] mm: Pass order to prepare_alloc_pages " Matthew Wilcox
2019-05-07  4:06 ` [PATCH 10/11] mm: Pass order to try_to_free_pages " Matthew Wilcox
2019-05-07  4:06 ` [PATCH 11/11] mm: Pass order to node_reclaim() " Matthew Wilcox
2019-05-09  1:58 ` [RFC 00/11] Remove 'order' argument from many mm functions Ira Weiny
2019-05-09 14:07   ` Matthew Wilcox
2019-05-09 16:48     ` Weiny, Ira
2019-05-09 18:29       ` Matthew Wilcox
2019-05-29 21:44         ` Ira Weiny
2019-05-09 11:07 ` Kirill A. Shutemov
2019-05-14 14:51   ` Matthew Wilcox
