* [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-21 15:20 Huaisheng Ye
  2018-05-21 15:20 ` [RFC PATCH v2 01/12] include/linux/gfp.h: " Huaisheng Ye
                   ` (19 more replies)
  0 siblings, 20 replies; 72+ messages in thread
From: Huaisheng Ye @ 2018-05-21 15:20 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: mhocko, willy, vbabka, mgorman, kstewart, alexander.levin,
	gregkh, colyli, chengnt, hehy1, linux-kernel, iommu, xen-devel,
	linux-btrfs, Huaisheng Ye

From: Huaisheng Ye <yehs1@lenovo.com>

Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with an encoded zone number.

Delete ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 from the GFP bitmasks;
the bottom three bits of the GFP mask are reserved for storing the
encoded zone number.

The encoding method is XOR. Get the zone number from enum zone_type,
then XOR it with ZONE_NORMAL. The goal is to make sure ZONE_NORMAL
encodes to zero, so that compatibility is preserved: GFP_KERNEL and
GFP_ATOMIC can be used as before.

Reserve __GFP_MOVABLE in bit 3, so that it can continue to be used as
a flag. As before, __GFP_MOVABLE represents the movable migrate type
for ZONE_DMA, ZONE_DMA32, and ZONE_NORMAL. But when it is combined with
__GFP_HIGHMEM, ZONE_MOVABLE shall be returned instead of ZONE_HIGHMEM.
__GFP_ZONE_MOVABLE is created to achieve this.

With this patch, just enabling __GFP_MOVABLE and __GFP_HIGHMEM is not
enough to get ZONE_MOVABLE from gfp_zone. All callers should use
GFP_HIGHUSER_MOVABLE or __GFP_ZONE_MOVABLE directly to achieve that.
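
The difference between __GFP_HIGHMEM | __GFP_MOVABLE and
__GFP_ZONE_MOVABLE can be sketched in user space as follows. This is
an illustration only, not kernel code: the enum values assume a
hypothetical configuration in which all five zones exist, and every
SK_/sk_ name is an invented stand-in for the corresponding macro or
function in the patch.

```c
#include <assert.h>

/* Hypothetical zone numbering; real values depend on the kernel config. */
enum zone_type {
	ZONE_DMA,
	ZONE_DMA32,
	ZONE_NORMAL,
	ZONE_HIGHMEM,
	ZONE_MOVABLE,
};

#define SK_ZONE_MASK	0x07u	/* stand-in for ___GFP_ZONE_MASK */
#define SK_MOVABLE_BIT	0x08u	/* stand-in for ___GFP_MOVABLE (bit 3) */

/* XOR with ZONE_NORMAL so that ZONE_NORMAL encodes to zero. */
#define SK_ENC(z)	((unsigned int)(z) ^ (unsigned int)ZONE_NORMAL)

#define SK_GFP_HIGHMEM		SK_ENC(ZONE_HIGHMEM)
#define SK_GFP_MOVABLE		SK_MOVABLE_BIT
#define SK_GFP_ZONE_MOVABLE	(SK_ENC(ZONE_MOVABLE) | SK_MOVABLE_BIT)

/* Same shape as the new gfp_zone(): mask the low three bits, then XOR. */
static inline enum zone_type sk_gfp_zone(unsigned int flags)
{
	return (enum zone_type)((flags & SK_ZONE_MASK) ^
				(unsigned int)ZONE_NORMAL);
}

/* Bit 3 still works as a plain flag for the movable migrate type. */
static inline int sk_is_movable(unsigned int flags)
{
	return (flags & SK_MOVABLE_BIT) != 0;
}
```

In this sketch, sk_gfp_zone(SK_GFP_HIGHMEM | SK_GFP_MOVABLE) yields
ZONE_HIGHMEM, because bit 3 is outside the zone mask, while
sk_gfp_zone(SK_GFP_ZONE_MOVABLE) yields ZONE_MOVABLE. Both values still
report the movable migrate type through sk_is_movable().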

Decode the zone number directly from the bottom three bits of flags in
gfp_zone. Encoding and decoding both rely on the identity
        A ^ B ^ B = A
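
As a rough illustration of that identity, the round trip can be
written as a small user-space sketch. It is not the kernel
implementation: the enum below is a hypothetical stand-in for enum
zone_type with all zones present, and encode_zone(), decode_zone() and
ZONE_MASK are names made up for this example.

```c
#include <assert.h>

/* Hypothetical zone numbering; real values depend on the kernel config. */
enum zone_type {
	ZONE_DMA,
	ZONE_DMA32,
	ZONE_NORMAL,
	ZONE_HIGHMEM,
	ZONE_MOVABLE,
};

#define ZONE_MASK 0x07u	/* mirrors ___GFP_ZONE_MASK from the patch */

/* Encode: A ^ B, with B = ZONE_NORMAL, so ZONE_NORMAL becomes 0. */
static inline unsigned int encode_zone(enum zone_type z)
{
	/* The zone number must fit in the three reserved bits. */
	assert((unsigned int)z <= ZONE_MASK);
	return (unsigned int)z ^ (unsigned int)ZONE_NORMAL;
}

/* Decode: (A ^ B) ^ B = A, after masking off all non-zone bits. */
static inline enum zone_type decode_zone(unsigned int flags)
{
	return (enum zone_type)((flags & ZONE_MASK) ^
				(unsigned int)ZONE_NORMAL);
}
```

With this numbering encode_zone(ZONE_NORMAL) is 0, so a flag word that
sets no zone bits decodes to ZONE_NORMAL, and any other bits ORed into
the word above bit 2 do not disturb the decoded zone.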

Changes since v1:

v2: Add __GFP_ZONE_MOVABLE and modify GFP_HIGHUSER_MOVABLE to help
callers get ZONE_MOVABLE. Add __GFP_ZONE_MASK to mask the lowest 3
bits of the GFP bitmasks.
Modify some callers' gfp flags to update their usage of address zone
modifiers.
Modify the inline function gfp_zone for better performance, following
Matthew's suggestion.

Link: https://marc.info/?l=linux-mm&m=152596791931266&w=2

Huaisheng Ye (12):
  include/linux/gfp.h: get rid of GFP_ZONE_TABLE/BAD
  arch/x86/kernel/amd_gart_64: update usage of address zone modifiers
  arch/x86/kernel/pci-calgary_64: update usage of address zone modifiers
  drivers/iommu/amd_iommu: update usage of address zone modifiers
  include/linux/dma-mapping: update usage of address zone modifiers
  drivers/xen/swiotlb-xen: update usage of address zone modifiers
  fs/btrfs/extent_io: update usage of address zone modifiers
  drivers/block/zram/zram_drv: update usage of address zone modifiers
  mm/vmpressure: update usage of address zone modifiers
  mm/zsmalloc: update usage of address zone modifiers
  include/linux/highmem: update usage of movableflags
  arch/x86/include/asm/page.h: update usage of movableflags

 arch/x86/include/asm/page.h      |  3 +-
 arch/x86/kernel/amd_gart_64.c    |  2 +-
 arch/x86/kernel/pci-calgary_64.c |  2 +-
 drivers/block/zram/zram_drv.c    |  6 +--
 drivers/iommu/amd_iommu.c        |  2 +-
 drivers/xen/swiotlb-xen.c        |  2 +-
 fs/btrfs/extent_io.c             |  2 +-
 include/linux/dma-mapping.h      |  2 +-
 include/linux/gfp.h              | 98 +++++-----------------------------------
 include/linux/highmem.h          |  4 +-
 mm/vmpressure.c                  |  2 +-
 mm/zsmalloc.c                    |  4 +-
 12 files changed, 26 insertions(+), 103 deletions(-)

-- 
1.8.3.1

* [RFC PATCH v2 01/12] include/linux/gfp.h: get rid of GFP_ZONE_TABLE/BAD
  2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
@ 2018-05-21 15:20 ` Huaisheng Ye
  2018-05-21 15:20 ` Huaisheng Ye
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 72+ messages in thread
From: Huaisheng Ye @ 2018-05-21 15:20 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: mhocko, willy, vbabka, mgorman, kstewart, alexander.levin,
	gregkh, colyli, chengnt, hehy1, linux-kernel, iommu, xen-devel,
	linux-btrfs, Huaisheng Ye

From: Huaisheng Ye <yehs1@lenovo.com>

Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with an encoded zone number.

Delete ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 from the GFP bitmasks;
the bottom three bits of the GFP mask are reserved for storing the
encoded zone number.

The encoding method is XOR. Get the zone number from enum zone_type,
then XOR it with ZONE_NORMAL. The goal is to make sure ZONE_NORMAL
encodes to zero, so that compatibility is preserved: GFP_KERNEL and
GFP_ATOMIC can be used as before.

Reserve __GFP_MOVABLE in bit 3, so that it can continue to be used as
a flag. As before, __GFP_MOVABLE represents the movable migrate type
for ZONE_DMA, ZONE_DMA32, and ZONE_NORMAL. But when it is combined with
__GFP_HIGHMEM, ZONE_MOVABLE shall be returned instead of ZONE_HIGHMEM.
__GFP_ZONE_MOVABLE is created to achieve this.

With this patch, just enabling __GFP_MOVABLE and __GFP_HIGHMEM is not
enough to get ZONE_MOVABLE from gfp_zone. All subsystems should use
GFP_HIGHUSER_MOVABLE directly to achieve that.

Decode the zone number directly from the bottom three bits of flags in
gfp_zone. Encoding and decoding both rely on the identity
        A ^ B ^ B = A

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/gfp.h | 98 ++++++-----------------------------------------------
 1 file changed, 11 insertions(+), 87 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 1a4582b..ab0fb7f 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -16,9 +16,7 @@
  */
 
 /* Plain integer GFP bitmasks. Do not use this directly. */
-#define ___GFP_DMA		0x01u
-#define ___GFP_HIGHMEM		0x02u
-#define ___GFP_DMA32		0x04u
+#define ___GFP_ZONE_MASK	0x07u
 #define ___GFP_MOVABLE		0x08u
 #define ___GFP_RECLAIMABLE	0x10u
 #define ___GFP_HIGH		0x20u
@@ -53,11 +51,15 @@
  * without the underscores and use them consistently. The definitions here may
  * be used in bit comparisons.
  */
-#define __GFP_DMA	((__force gfp_t)___GFP_DMA)
-#define __GFP_HIGHMEM	((__force gfp_t)___GFP_HIGHMEM)
-#define __GFP_DMA32	((__force gfp_t)___GFP_DMA32)
+#define __GFP_DMA	((__force gfp_t)OPT_ZONE_DMA ^ ZONE_NORMAL)
+#define __GFP_HIGHMEM	((__force gfp_t)OPT_ZONE_HIGHMEM ^ ZONE_NORMAL)
+#define __GFP_DMA32	((__force gfp_t)OPT_ZONE_DMA32 ^ ZONE_NORMAL)
 #define __GFP_MOVABLE	((__force gfp_t)___GFP_MOVABLE)  /* ZONE_MOVABLE allowed */
-#define GFP_ZONEMASK	(__GFP_DMA|__GFP_HIGHMEM|__GFP_DMA32|__GFP_MOVABLE)
+#define GFP_ZONEMASK	((__force gfp_t)___GFP_ZONE_MASK | ___GFP_MOVABLE)
+/* bottom 3 bits of GFP bitmasks are used for the encoded zone number */
+#define __GFP_ZONE_MASK ((__force gfp_t)___GFP_ZONE_MASK)
+#define __GFP_ZONE_MOVABLE	\
+		((__force gfp_t)(ZONE_MOVABLE ^ ZONE_NORMAL) | ___GFP_MOVABLE)
 
 /*
  * Page mobility and placement hints
@@ -279,7 +281,7 @@
 #define GFP_DMA		__GFP_DMA
 #define GFP_DMA32	__GFP_DMA32
 #define GFP_HIGHUSER	(GFP_USER | __GFP_HIGHMEM)
-#define GFP_HIGHUSER_MOVABLE	(GFP_HIGHUSER | __GFP_MOVABLE)
+#define GFP_HIGHUSER_MOVABLE	(GFP_USER | __GFP_ZONE_MOVABLE)
 #define GFP_TRANSHUGE_LIGHT	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
 			 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
 #define GFP_TRANSHUGE	(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
@@ -326,87 +328,9 @@ static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
 #define OPT_ZONE_DMA32 ZONE_NORMAL
 #endif
 
-/*
- * GFP_ZONE_TABLE is a word size bitstring that is used for looking up the
- * zone to use given the lowest 4 bits of gfp_t. Entries are GFP_ZONES_SHIFT
- * bits long and there are 16 of them to cover all possible combinations of
- * __GFP_DMA, __GFP_DMA32, __GFP_MOVABLE and __GFP_HIGHMEM.
- *
- * The zone fallback order is MOVABLE=>HIGHMEM=>NORMAL=>DMA32=>DMA.
- * But GFP_MOVABLE is not only a zone specifier but also an allocation
- * policy. Therefore __GFP_MOVABLE plus another zone selector is valid.
- * Only 1 bit of the lowest 3 bits (DMA,DMA32,HIGHMEM) can be set to "1".
- *
- *       bit       result
- *       =================
- *       0x0    => NORMAL
- *       0x1    => DMA or NORMAL
- *       0x2    => HIGHMEM or NORMAL
- *       0x3    => BAD (DMA+HIGHMEM)
- *       0x4    => DMA32 or DMA or NORMAL
- *       0x5    => BAD (DMA+DMA32)
- *       0x6    => BAD (HIGHMEM+DMA32)
- *       0x7    => BAD (HIGHMEM+DMA32+DMA)
- *       0x8    => NORMAL (MOVABLE+0)
- *       0x9    => DMA or NORMAL (MOVABLE+DMA)
- *       0xa    => MOVABLE (Movable is valid only if HIGHMEM is set too)
- *       0xb    => BAD (MOVABLE+HIGHMEM+DMA)
- *       0xc    => DMA32 (MOVABLE+DMA32)
- *       0xd    => BAD (MOVABLE+DMA32+DMA)
- *       0xe    => BAD (MOVABLE+DMA32+HIGHMEM)
- *       0xf    => BAD (MOVABLE+DMA32+HIGHMEM+DMA)
- *
- * GFP_ZONES_SHIFT must be <= 2 on 32 bit platforms.
- */
-
-#if defined(CONFIG_ZONE_DEVICE) && (MAX_NR_ZONES-1) <= 4
-/* ZONE_DEVICE is not a valid GFP zone specifier */
-#define GFP_ZONES_SHIFT 2
-#else
-#define GFP_ZONES_SHIFT ZONES_SHIFT
-#endif
-
-#if 16 * GFP_ZONES_SHIFT > BITS_PER_LONG
-#error GFP_ZONES_SHIFT too large to create GFP_ZONE_TABLE integer
-#endif
-
-#define GFP_ZONE_TABLE ( \
-	(ZONE_NORMAL << 0 * GFP_ZONES_SHIFT)				       \
-	| (OPT_ZONE_DMA << ___GFP_DMA * GFP_ZONES_SHIFT)		       \
-	| (OPT_ZONE_HIGHMEM << ___GFP_HIGHMEM * GFP_ZONES_SHIFT)	       \
-	| (OPT_ZONE_DMA32 << ___GFP_DMA32 * GFP_ZONES_SHIFT)		       \
-	| (ZONE_NORMAL << ___GFP_MOVABLE * GFP_ZONES_SHIFT)		       \
-	| (OPT_ZONE_DMA << (___GFP_MOVABLE | ___GFP_DMA) * GFP_ZONES_SHIFT)    \
-	| (ZONE_MOVABLE << (___GFP_MOVABLE | ___GFP_HIGHMEM) * GFP_ZONES_SHIFT)\
-	| (OPT_ZONE_DMA32 << (___GFP_MOVABLE | ___GFP_DMA32) * GFP_ZONES_SHIFT)\
-)
-
-/*
- * GFP_ZONE_BAD is a bitmap for all combinations of __GFP_DMA, __GFP_DMA32
- * __GFP_HIGHMEM and __GFP_MOVABLE that are not permitted. One flag per
- * entry starting with bit 0. Bit is set if the combination is not
- * allowed.
- */
-#define GFP_ZONE_BAD ( \
-	1 << (___GFP_DMA | ___GFP_HIGHMEM)				      \
-	| 1 << (___GFP_DMA | ___GFP_DMA32)				      \
-	| 1 << (___GFP_DMA32 | ___GFP_HIGHMEM)				      \
-	| 1 << (___GFP_DMA | ___GFP_DMA32 | ___GFP_HIGHMEM)		      \
-	| 1 << (___GFP_MOVABLE | ___GFP_HIGHMEM | ___GFP_DMA)		      \
-	| 1 << (___GFP_MOVABLE | ___GFP_DMA32 | ___GFP_DMA)		      \
-	| 1 << (___GFP_MOVABLE | ___GFP_DMA32 | ___GFP_HIGHMEM)		      \
-	| 1 << (___GFP_MOVABLE | ___GFP_DMA32 | ___GFP_DMA | ___GFP_HIGHMEM)  \
-)
-
 static inline enum zone_type gfp_zone(gfp_t flags)
 {
-	enum zone_type z;
-	int bit = (__force int) (flags & GFP_ZONEMASK);
-
-	z = (GFP_ZONE_TABLE >> (bit * GFP_ZONES_SHIFT)) &
-					 ((1 << GFP_ZONES_SHIFT) - 1);
-	VM_BUG_ON((GFP_ZONE_BAD >> bit) & 1);
-	return z;
+	return ((__force unsigned int)flags & __GFP_ZONE_MASK) ^ ZONE_NORMAL;
 }
 
 /*
-- 
1.8.3.1

* [RFC PATCH v2 02/12] arch/x86/kernel/amd_gart_64: update usage of address zone modifiers
  2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
  2018-05-21 15:20 ` [RFC PATCH v2 01/12] include/linux/gfp.h: " Huaisheng Ye
  2018-05-21 15:20 ` Huaisheng Ye
@ 2018-05-21 15:20 ` Huaisheng Ye
  2018-05-22  9:38   ` Christoph Hellwig
  2018-05-22  9:38     ` Christoph Hellwig
  2018-05-21 15:20 ` Huaisheng Ye
                   ` (16 subsequent siblings)
  19 siblings, 2 replies; 72+ messages in thread
From: Huaisheng Ye @ 2018-05-21 15:20 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: mhocko, willy, vbabka, mgorman, kstewart, alexander.levin,
	gregkh, colyli, chengnt, hehy1, linux-kernel, iommu, xen-devel,
	linux-btrfs, Huaisheng Ye, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Robin Murphy

From: Huaisheng Ye <yehs1@lenovo.com>

Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32).

___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from the
GFP bitmasks; the bottom three bits of the GFP mask are reserved for
storing the encoded zone number.
__GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 should not be combined with
each other using OR.
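
Why ORing encoded zone values is unsafe, and why clearing with the
full mask is needed, can be shown with a small user-space sketch. It
is illustrative only: the enum numbering assumes a hypothetical
configuration with all five zones, and SK_ZONE_MASK, SK_ENC(),
sk_gfp_zone() and sk_set_zone() are invented names, not kernel
interfaces.

```c
#include <assert.h>

/* Hypothetical zone numbering; real values depend on the kernel config. */
enum zone_type {
	ZONE_DMA,
	ZONE_DMA32,
	ZONE_NORMAL,
	ZONE_HIGHMEM,
	ZONE_MOVABLE,
};

#define SK_ZONE_MASK	0x07u	/* stand-in for __GFP_ZONE_MASK */

/* XOR with ZONE_NORMAL so that ZONE_NORMAL encodes to zero. */
#define SK_ENC(z)	((unsigned int)(z) ^ (unsigned int)ZONE_NORMAL)

/* Same shape as the new gfp_zone(): mask the low three bits, then XOR. */
static inline enum zone_type sk_gfp_zone(unsigned int flags)
{
	return (enum zone_type)((flags & SK_ZONE_MASK) ^
				(unsigned int)ZONE_NORMAL);
}

/*
 * Replace whatever zone is encoded in flags with z, leaving every bit
 * above the three zone bits untouched.
 */
static inline unsigned int sk_set_zone(unsigned int flags, enum zone_type z)
{
	return (flags & ~SK_ZONE_MASK) | SK_ENC(z);
}
```

With this numbering, SK_ENC(ZONE_DMA) | SK_ENC(ZONE_HIGHMEM) is 3,
which decodes to ZONE_DMA32, a zone neither operand asked for. Clearing
all three bits first, as flag &= ~__GFP_ZONE_MASK does in the patch,
avoids depending on how the individual encodings happen to overlap.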

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Robin Murphy <robin.murphy@arm.com>
---
 arch/x86/kernel/amd_gart_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index ecd486c..1dd6971 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -485,7 +485,7 @@ static int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 	struct page *page;
 
 	if (force_iommu && !(flag & GFP_DMA)) {
-		flag &= ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32);
+		flag &= ~__GFP_ZONE_MASK;
 		page = alloc_pages(flag | __GFP_ZERO, get_order(size));
 		if (!page)
 			return NULL;
-- 
1.8.3.1

* [RFC PATCH v2 03/12] arch/x86/kernel/pci-calgary_64: update usage of address zone modifiers
  2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
                   ` (4 preceding siblings ...)
  2018-05-21 15:20 ` [RFC PATCH v2 03/12] arch/x86/kernel/pci-calgary_64: " Huaisheng Ye
@ 2018-05-21 15:20 ` Huaisheng Ye
  2018-05-21 15:20 ` [RFC PATCH v2 04/12] drivers/iommu/amd_iommu: " Huaisheng Ye
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 72+ messages in thread
From: Huaisheng Ye @ 2018-05-21 15:20 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: mhocko, willy, vbabka, mgorman, kstewart, alexander.levin,
	gregkh, colyli, chengnt, hehy1, linux-kernel, iommu, xen-devel,
	linux-btrfs, Huaisheng Ye, Muli Ben-Yehuda, Jon Mason,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin

From: Huaisheng Ye <yehs1@lenovo.com>

Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32).

___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from the
GFP bitmasks; the bottom three bits of the GFP mask are reserved for
storing the encoded zone number.
__GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 should not be combined with
each other using OR.

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Cc: Muli Ben-Yehuda <mulix@mulix.org>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
 arch/x86/kernel/pci-calgary_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/pci-calgary_64.c b/arch/x86/kernel/pci-calgary_64.c
index 35c461f..c89717d 100644
--- a/arch/x86/kernel/pci-calgary_64.c
+++ b/arch/x86/kernel/pci-calgary_64.c
@@ -445,7 +445,7 @@ static void* calgary_alloc_coherent(struct device *dev, size_t size,
 	npages = size >> PAGE_SHIFT;
 	order = get_order(size);
 
-	flag &= ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32);
+	flag &= ~__GFP_ZONE_MASK;
 
 	/* alloc enough pages (and possibly more) */
 	ret = (void *)__get_free_pages(flag, order);
-- 
1.8.3.1

* [RFC PATCH v2 04/12] drivers/iommu/amd_iommu: update usage of address zone modifiers
  2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
                   ` (6 preceding siblings ...)
  2018-05-21 15:20 ` [RFC PATCH v2 04/12] drivers/iommu/amd_iommu: " Huaisheng Ye
@ 2018-05-21 15:20 ` Huaisheng Ye
  2018-05-21 15:20 ` [RFC PATCH v2 05/12] include/linux/dma-mapping: " Huaisheng Ye
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 72+ messages in thread
From: Huaisheng Ye @ 2018-05-21 15:20 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: mhocko, willy, vbabka, mgorman, kstewart, alexander.levin,
	gregkh, colyli, chengnt, hehy1, linux-kernel, iommu, xen-devel,
	linux-btrfs, Huaisheng Ye, Joerg Roedel

From: Huaisheng Ye <yehs1@lenovo.com>

Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32).

___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from the
GFP bitmasks; the bottom three bits of the GFP mask are reserved for
storing the encoded zone number.
__GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 should not be combined with
each other using OR.

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Cc: Joerg Roedel <joro@8bytes.org>
---
 drivers/iommu/amd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 74788fd..3921d53 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2614,7 +2614,7 @@ static void *alloc_coherent(struct device *dev, size_t size,
 	dma_dom   = to_dma_ops_domain(domain);
 	size	  = PAGE_ALIGN(size);
 	dma_mask  = dev->coherent_dma_mask;
-	flag     &= ~(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32);
+	flag     &= ~__GFP_ZONE_MASK;
 	flag     |= __GFP_ZERO;
 
 	page = alloc_pages(flag | __GFP_NOWARN,  get_order(size));
-- 
1.8.3.1

* [RFC PATCH v2 05/12] include/linux/dma-mapping: update usage of address zone modifiers
  2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
                   ` (7 preceding siblings ...)
  2018-05-21 15:20 ` Huaisheng Ye
@ 2018-05-21 15:20 ` Huaisheng Ye
  2018-05-21 15:30   ` Christoph Hellwig
  2018-05-21 15:30     ` Christoph Hellwig
  2018-05-21 15:20 ` Huaisheng Ye
                   ` (10 subsequent siblings)
  19 siblings, 2 replies; 72+ messages in thread
From: Huaisheng Ye @ 2018-05-21 15:20 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: mhocko, willy, vbabka, mgorman, kstewart, alexander.levin,
	gregkh, colyli, chengnt, hehy1, linux-kernel, iommu, xen-devel,
	linux-btrfs, Huaisheng Ye, Christoph Hellwig, Marek Szyprowski,
	Robin Murphy

From: Huaisheng Ye <yehs1@lenovo.com>

Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32).

___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from the GFP
bitmasks; the bottom three bits of the GFP mask are now reserved for
storing the encoded zone number.
__GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 must therefore not be combined
with each other by OR.

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
---
 include/linux/dma-mapping.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index eb9eab4..3da0293 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -523,7 +523,7 @@ static inline void *dma_alloc_attrs(struct device *dev, size_t size,
 	 * decide on the way of zeroing the memory given that the memory
 	 * returned should always be zeroed.
 	 */
-	flag &= ~(__GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM | __GFP_ZERO);
+	flag &= ~(__GFP_ZONE_MASK | __GFP_ZERO);
 
 	if (!arch_dma_alloc_attrs(&dev, &flag))
 		return NULL;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC PATCH v2 10/12] mm/zsmalloc: update usage of address zone modifiers
  2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
                   ` (9 preceding siblings ...)
  2018-05-21 15:20 ` Huaisheng Ye
@ 2018-05-21 15:20 ` Huaisheng Ye
  2018-05-22 11:22     ` Matthew Wilcox
  2018-05-22 11:22   ` Matthew Wilcox
  2018-05-21 15:20 ` Huaisheng Ye
                   ` (8 subsequent siblings)
  19 siblings, 2 replies; 72+ messages in thread
From: Huaisheng Ye @ 2018-05-21 15:20 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: mhocko, willy, vbabka, mgorman, kstewart, alexander.levin,
	gregkh, colyli, chengnt, hehy1, linux-kernel, iommu, xen-devel,
	linux-btrfs, Huaisheng Ye, Minchan Kim, Nitin Gupta,
	Sergey Senozhatsky

From: Huaisheng Ye <yehs1@lenovo.com>

Use __GFP_ZONE_MOVABLE to replace (__GFP_HIGHMEM | __GFP_MOVABLE).

___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from the GFP
bitmasks; the bottom three bits of the GFP mask are now reserved for
storing the encoded zone number.

__GFP_ZONE_MOVABLE contains the encoded ZONE_MOVABLE and the
__GFP_MOVABLE flag.

With GFP_ZONE_TABLE, ORing __GFP_HIGHMEM with __GFP_MOVABLE made gfp_zone
return ZONE_MOVABLE. To stay compatible with that behavior, replace
(__GFP_HIGHMEM | __GFP_MOVABLE) with __GFP_ZONE_MOVABLE.
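
A rough userspace sketch of what such a combined value could look like, and
of which bits the mask in the zsmalloc helpers would clear. It assumes the
XOR-with-ZONE_NORMAL encoding and __GFP_MOVABLE in bit 3 from the cover
letter, plus a zone numbering with all zones configured; all model_/MODEL_
names are hypothetical.

```c
#include <assert.h>

/* Assumed zone numbering with every zone configured (model only). */
enum model_zone {
	MODEL_ZONE_DMA,
	MODEL_ZONE_DMA32,
	MODEL_ZONE_NORMAL,
	MODEL_ZONE_HIGHMEM,
	MODEL_ZONE_MOVABLE
};

#define MODEL_GFP_ZONE_MASK 0x07u /* bottom three bits: encoded zone */
#define MODEL_GFP_MOVABLE   0x08u /* bit 3 stays an ordinary flag */

/* XOR with ZONE_NORMAL so that ZONE_NORMAL encodes to zero. */
#define MODEL_ENCODE(z) ((unsigned)(z) ^ (unsigned)MODEL_ZONE_NORMAL)

/* Encoded ZONE_MOVABLE plus the movable flag, as the text describes. */
#define MODEL_GFP_ZONE_MOVABLE \
	(MODEL_ENCODE(MODEL_ZONE_MOVABLE) | MODEL_GFP_MOVABLE)

static enum model_zone model_gfp_zone(unsigned flags)
{
	return (enum model_zone)((flags & MODEL_GFP_ZONE_MASK) ^
				 (unsigned)MODEL_ZONE_NORMAL);
}

/* Models the mask applied before the slab allocation in the helpers. */
static unsigned model_strip_for_slab(unsigned flags)
{
	return flags & ~MODEL_GFP_ZONE_MOVABLE;
}
```

With these assumed values the combined constant is 0x0e. Stripping it from
a request that carries it decodes to ZONE_NORMAL, but a request carrying
only the encoded ZONE_HIGHMEM plus the movable flag would keep its zone
bits in this model, so callers would have to pass the combined value
consistently.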

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
---
 mm/zsmalloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index c301350..06b2902 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -343,7 +343,7 @@ static void destroy_cache(struct zs_pool *pool)
 static unsigned long cache_alloc_handle(struct zs_pool *pool, gfp_t gfp)
 {
 	return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
-			gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
+			gfp & ~__GFP_ZONE_MOVABLE);
 }
 
 static void cache_free_handle(struct zs_pool *pool, unsigned long handle)
@@ -354,7 +354,7 @@ static void cache_free_handle(struct zs_pool *pool, unsigned long handle)
 static struct zspage *cache_alloc_zspage(struct zs_pool *pool, gfp_t flags)
 {
 	return kmem_cache_alloc(pool->zspage_cachep,
-			flags & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
+			flags & ~__GFP_ZONE_MOVABLE);
 }
 
 static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC PATCH v2 11/12] include/linux/highmem: update usage of movableflags
  2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
                   ` (12 preceding siblings ...)
  2018-05-21 15:20 ` [RFC PATCH v2 11/12] include/linux/highmem: update usage of movableflags Huaisheng Ye
@ 2018-05-21 15:20 ` Huaisheng Ye
  2018-05-21 15:20 ` [RFC PATCH v2 12/12] arch/x86/include/asm/page.h: " Huaisheng Ye
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 72+ messages in thread
From: Huaisheng Ye @ 2018-05-21 15:20 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: mhocko, willy, vbabka, mgorman, kstewart, alexander.levin,
	gregkh, colyli, chengnt, hehy1, linux-kernel, iommu, xen-devel,
	linux-btrfs, Huaisheng Ye, Thomas Gleixner, Philippe Ombredanne

From: Huaisheng Ye <yehs1@lenovo.com>

GFP_HIGHUSER_MOVABLE is no longer equal to GFP_HIGHUSER | __GFP_MOVABLE.
Select GFP_HIGHUSER_MOVABLE explicitly when movableflags is set, to adapt
to the patch that gets rid of GFP_ZONE_TABLE/BAD.
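
The inequality can be checked with a small userspace model. This is only a
sketch: it assumes the XOR-with-ZONE_NORMAL encoding and __GFP_MOVABLE in
bit 3 from the cover letter, a zone numbering with all zones configured,
and a placeholder bit for the non-zone part of GFP_USER. All model_/MODEL_
names are hypothetical.

```c
#include <assert.h>

/* Assumed zone numbering with every zone configured (model only). */
enum model_zone {
	MODEL_ZONE_DMA,
	MODEL_ZONE_DMA32,
	MODEL_ZONE_NORMAL,
	MODEL_ZONE_HIGHMEM,
	MODEL_ZONE_MOVABLE
};

#define MODEL_GFP_ZONE_MASK 0x07u  /* bottom three bits: encoded zone */
#define MODEL_GFP_MOVABLE   0x08u  /* bit 3 stays an ordinary flag */
#define MODEL_GFP_USER_BITS 0x100u /* placeholder for non-zone user bits */

/* XOR with ZONE_NORMAL so that ZONE_NORMAL encodes to zero. */
#define MODEL_ENCODE(z) ((unsigned)(z) ^ (unsigned)MODEL_ZONE_NORMAL)

#define MODEL_GFP_HIGHUSER \
	(MODEL_GFP_USER_BITS | MODEL_ENCODE(MODEL_ZONE_HIGHMEM))
#define MODEL_GFP_HIGHUSER_MOVABLE \
	(MODEL_GFP_USER_BITS | MODEL_ENCODE(MODEL_ZONE_MOVABLE) | \
	 MODEL_GFP_MOVABLE)

static enum model_zone model_gfp_zone(unsigned flags)
{
	return (enum model_zone)((flags & MODEL_GFP_ZONE_MASK) ^
				 (unsigned)MODEL_ZONE_NORMAL);
}

/* Mirrors the ternary used in the patched allocation call. */
static unsigned model_pick_flags(unsigned movableflags)
{
	return movableflags ? MODEL_GFP_HIGHUSER_MOVABLE : MODEL_GFP_HIGHUSER;
}
```

In this model, ORing the movable flag into the HIGHUSER value still decodes
to ZONE_HIGHMEM, whereas model_pick_flags() with a non-zero argument
decodes to ZONE_MOVABLE, which is the distinction the ternary preserves.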

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
---
 include/linux/highmem.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 776f90f..da34260 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -159,8 +159,8 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
 			struct vm_area_struct *vma,
 			unsigned long vaddr)
 {
-	struct page *page = alloc_page_vma(GFP_HIGHUSER | movableflags,
-			vma, vaddr);
+	struct page *page = alloc_page_vma(movableflags ?
+		GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER, vma, vaddr);
 
 	if (page)
 		clear_user_highpage(page, vaddr);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [RFC PATCH v2 12/12] arch/x86/include/asm/page.h: update usage of movableflags
  2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
                   ` (13 preceding siblings ...)
  2018-05-21 15:20 ` Huaisheng Ye
@ 2018-05-21 15:20 ` Huaisheng Ye
  2018-05-21 15:20 ` Huaisheng Ye
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 72+ messages in thread
From: Huaisheng Ye @ 2018-05-21 15:20 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: mhocko, willy, vbabka, mgorman, kstewart, alexander.levin,
	gregkh, colyli, chengnt, hehy1, linux-kernel, iommu, xen-devel,
	linux-btrfs, Huaisheng Ye, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Philippe Ombredanne

From: Huaisheng Ye <yehs1@lenovo.com>

GFP_HIGHUSER_MOVABLE is no longer equal to GFP_HIGHUSER | __GFP_MOVABLE.
Select GFP_HIGHUSER_MOVABLE explicitly when movableflags is set, to adapt
to the patch that gets rid of GFP_ZONE_TABLE/BAD.

Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: x86@kernel.org <x86@kernel.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
---
 arch/x86/include/asm/page.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index 7555b48..a47f42d 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -35,7 +35,8 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
 }
 
 #define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
-	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
+	alloc_page_vma((movableflags ? GFP_HIGHUSER_MOVABLE : GFP_HIGHUSER) \
+	| __GFP_ZERO, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
 #ifndef __pa
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 05/12] include/linux/dma-mapping: update usage of address zone modifiers
@ 2018-05-21 15:30     ` Christoph Hellwig
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2018-05-21 15:30 UTC (permalink / raw)
  To: Huaisheng Ye
  Cc: akpm, linux-mm, mhocko, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye, Christoph Hellwig,
	Marek Szyprowski, Robin Murphy

On Mon, May 21, 2018 at 11:20:26PM +0800, Huaisheng Ye wrote:
> From: Huaisheng Ye <yehs1@lenovo.com>
> 
> Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32).
> 
> ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from GFP
> bitmasks, the bottom three bits of GFP mask is reserved for storing
> encoded zone number.
> __GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 should not be operated with
> each others by OR.

You have to include me for the whole series, otherwise I have absolutely
no way to properly review your patch.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 02/12] arch/x86/kernel/amd_gart_64: update usage of address zone modifiers
@ 2018-05-22  9:38     ` Christoph Hellwig
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2018-05-22  9:38 UTC (permalink / raw)
  To: Huaisheng Ye
  Cc: akpm, linux-mm, mhocko, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Robin Murphy

This code doesn't exist in current mainline.  What kernel version
is your patch against?

On Mon, May 21, 2018 at 11:20:23PM +0800, Huaisheng Ye wrote:
> From: Huaisheng Ye <yehs1@lenovo.com>
> 
> Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32).
> 
> ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from GFP
> bitmasks, the bottom three bits of GFP mask is reserved for storing
> encoded zone number.
> __GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 should not be operated by OR.

If they have already been deleted the identifier should not exist
anymore, so either your patch has issues, or at least the description.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-22  9:40   ` Christoph Hellwig
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2018-05-22  9:40 UTC (permalink / raw)
  To: Huaisheng Ye
  Cc: akpm, linux-mm, mhocko, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye

This seems to be missing patch 1 and generally be in somewhat odd format.
Can you try to resend it with git-send-email and against current Linus'
tree?

Also I'd suggest you do cleanups like adding and using __GFP_ZONE_MASK
at the beginning of the series before doing any real changes.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [External]  Re: [RFC PATCH v2 02/12] arch/x86/kernel/amd_gart_64: update usage of address zone modifiers
@ 2018-05-22 10:17       ` Huaisheng HS1 Ye
  0 siblings, 0 replies; 72+ messages in thread
From: Huaisheng HS1 Ye @ 2018-05-22 10:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: akpm, linux-mm, mhocko, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, NingTing Cheng, Ocean HY1 He,
	linux-kernel, iommu, xen-devel, linux-btrfs, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Robin Murphy, Huaisheng Ye

From: owner-linux-mm@kvack.org On Behalf Of Christoph Hellwig
> 
> This code doesn't exist in current mainline.  What kernel version
> is your patch against?
> 
> On Mon, May 21, 2018 at 11:20:23PM +0800, Huaisheng Ye wrote:
> > From: Huaisheng Ye <yehs1@lenovo.com>
> >
> > Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32).
> >
> > ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from GFP
> > bitmasks, the bottom three bits of GFP mask is reserved for storing
> > encoded zone number.
> > __GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 should not be operated by OR.
> 
> If they have already been deleted the identifier should not exist
> anymore, so either your patch has issues, or at least the description.

Dear Christoph,

My patches are against Linux 4.16. Most of the modifications are in
include/linux/gfp.h, and I think they should go through linux-mm, so I
followed the maintainers' requirement to base the patches on
mmotm/master.

I just checked current mainline and, yes, the
(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32) expression has been deleted
there. I can rebase my patches onto mainline and resend them to the
mailing list.

Sincerely,
Huaisheng Ye

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [External] Re: [RFC PATCH v2 02/12] arch/x86/kernel/amd_gart_64: update usage of address zone modifiers
  2018-05-22  9:38     ` Christoph Hellwig
  (?)
@ 2018-05-22 10:17     ` Huaisheng HS1 Ye
  -1 siblings, 0 replies; 72+ messages in thread
From: Huaisheng HS1 Ye @ 2018-05-22 10:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: kstewart, mhocko, Huaisheng Ye, Ocean HY1 He, gregkh,
	H. Peter Anvin, linux-kernel, willy, alexander.levin, linux-mm,
	iommu, Ingo Molnar, linux-btrfs, NingTing Cheng, xen-devel, akpm,
	colyli, mgorman@techsingularity.net

From: owner-linux-mm@kvack.org On Behalf Of Christoph Hellwig
> 
> This code doesn't exist in current mainline.  What kernel version
> is your patch against?
> 
> On Mon, May 21, 2018 at 11:20:23PM +0800, Huaisheng Ye wrote:
> > From: Huaisheng Ye <yehs1@lenovo.com>
> >
> > Use __GFP_ZONE_MASK to replace (__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32).
> >
> > ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 have been deleted from GFP
> > bitmasks, the bottom three bits of GFP mask is reserved for storing
> > encoded zone number.
> > __GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 should not be operated by OR.
> 
> If they have already been deleted the identifier should not exist
> anymore, so either your patch has issues, or at least the description.

Dear Christoph,

The kernel version of my patches against is Linux 4.16, the most of
modifications come from include/Linux/gfp.h. I think they should be
pushed to Linux-mm, so I follow the requirement of maintainers to make
patches based on mmotm/master.

I just checked the current mainline, yes,
(__GFP_DMA | __GFP_HIGHMEM | __GFP_DMA32) has been deleted, I can
rebase my patches to mainline, and resend them to mail list.

Sincerely,
Huaisheng Ye

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 10/12] mm/zsmalloc: update usage of address zone modifiers
@ 2018-05-22 11:22     ` Matthew Wilcox
  0 siblings, 0 replies; 72+ messages in thread
From: Matthew Wilcox @ 2018-05-22 11:22 UTC (permalink / raw)
  To: Huaisheng Ye
  Cc: akpm, linux-mm, mhocko, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye, Minchan Kim,
	Nitin Gupta, Sergey Senozhatsky

On Mon, May 21, 2018 at 11:20:31PM +0800, Huaisheng Ye wrote:
> @@ -343,7 +343,7 @@ static void destroy_cache(struct zs_pool *pool)
>  static unsigned long cache_alloc_handle(struct zs_pool *pool, gfp_t gfp)
>  {
>  	return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
> -			gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
> +			gfp & ~__GFP_ZONE_MOVABLE);
>  }

This should be & ~GFP_ZONEMASK

Actually, we should probably have a function to clear those bits rather
than have every driver manipulating the gfp mask like this.  Maybe

#define gfp_normal(gfp)		((gfp) & ~GFP_ZONEMASK)

	return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
-			gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
+			gfp_normal(gfp));
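
A minimal sketch of how such a helper would behave, for illustration only.
The DEMO_* names and bit values are assumptions taken from the cover letter
(encoded zone in bits 0-2, __GFP_MOVABLE in bit 3), not the kernel's real
definitions, and demo_gfp_t is a stand-in for gfp_t:

```c
#include <stdint.h>

typedef uint32_t demo_gfp_t;		/* stand-in for the kernel's gfp_t */

/*
 * Assumed layout, per the cover letter: bits 0-2 hold the encoded zone
 * number and bit 3 is the movable flag. Values are illustrative only.
 */
#define DEMO_GFP_ZONE_BITS	0x07u
#define DEMO_GFP_MOVABLE	0x08u
#define DEMO_GFP_ZONEMASK	(DEMO_GFP_ZONE_BITS | DEMO_GFP_MOVABLE)

/* Same shape as the suggested gfp_normal(): clear all zone selector bits. */
#define demo_gfp_normal(gfp)	((demo_gfp_t)((gfp) & ~DEMO_GFP_ZONEMASK))
```

With a helper like this, a caller such as cache_alloc_handle() no longer
needs to know which individual bits select a zone; the other flags pass
through unchanged.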

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [External]  Re: [RFC PATCH v2 10/12] mm/zsmalloc: update usage of address zone modifiers
@ 2018-05-22 11:51       ` Huaisheng HS1 Ye
  0 siblings, 0 replies; 72+ messages in thread
From: Huaisheng HS1 Ye @ 2018-05-22 11:51 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, linux-mm, mhocko, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, NingTing Cheng, Ocean HY1 He,
	linux-kernel, iommu, xen-devel, linux-btrfs, Minchan Kim,
	Nitin Gupta, Sergey Senozhatsky, Huaisheng Ye, Christoph Hellwig

From: owner-linux-mm@kvack.org On Behalf Of Matthew Wilcox
> 
> On Mon, May 21, 2018 at 11:20:31PM +0800, Huaisheng Ye wrote:
> > @@ -343,7 +343,7 @@ static void destroy_cache(struct zs_pool *pool)
> >  static unsigned long cache_alloc_handle(struct zs_pool *pool, gfp_t gfp)
> >  {
> >  	return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
> > -			gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
> > +			gfp & ~__GFP_ZONE_MOVABLE);
> >  }
> 
> This should be & ~GFP_ZONEMASK
> 
> Actually, we should probably have a function to clear those bits rather
> than have every driver manipulating the gfp mask like this.  Maybe
> 
> #define gfp_normal(gfp)		((gfp) & ~GFP_ZONEMASK)

Good idea!

> 
> 	return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
> -			gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
> +			gfp_normal(gfp));


Sincerely,
Huaisheng Ye

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
  2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
                   ` (17 preceding siblings ...)
  2018-05-22  9:40 ` Christoph Hellwig
@ 2018-05-22 18:37 ` Michal Hocko
  2018-05-23 16:07     ` Huaisheng HS1 Ye
                     ` (3 more replies)
  2018-05-22 18:37 ` Michal Hocko
  19 siblings, 4 replies; 72+ messages in thread
From: Michal Hocko @ 2018-05-22 18:37 UTC (permalink / raw)
  To: Huaisheng Ye
  Cc: akpm, linux-mm, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye

On Mon 21-05-18 23:20:21, Huaisheng Ye wrote:
> From: Huaisheng Ye <yehs1@lenovo.com>
> 
> Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with encoded zone number.
> 
> Delete ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 from GFP bitmasks,
> the bottom three bits of GFP mask is reserved for storing encoded
> zone number.
> 
> The encoding method is XOR. Get zone number from enum zone_type,
> then encode the number with ZONE_NORMAL by XOR operation.
> The goal is to make sure ZONE_NORMAL can be encoded to zero. So,
> the compatibility can be guaranteed, such as GFP_KERNEL and GFP_ATOMIC
> can be used as before.
> 
> Reserve __GFP_MOVABLE in bit 3, so that it can continue to be used as
> a flag. Same as before, __GFP_MOVABLE represents movable migrate type
> for ZONE_DMA, ZONE_DMA32, and ZONE_NORMAL. But when it is enabled with
> __GFP_HIGHMEM, ZONE_MOVABLE shall be returned instead of ZONE_HIGHMEM.
> __GFP_ZONE_MOVABLE is created to realize it.
> 
> With this patch, just enabling __GFP_MOVABLE and __GFP_HIGHMEM is not
> enough to get ZONE_MOVABLE from gfp_zone. All callers should use
> GFP_HIGHUSER_MOVABLE or __GFP_ZONE_MOVABLE directly to achieve that.
> 
> Decode zone number directly from bottom three bits of flags in gfp_zone.
> The theory of encoding and decoding is,
>         A ^ B ^ B = A

So why is this any better than the current code? Sure, I am not a great
fan of GFP_ZONE_TABLE because of how incomprehensible it is, but this
doesn't look much better, and we are losing a check for incompatible
gfp flags. The diffstat looks really sound, but then you look closer and
see that a large part of it is the comment that at least explained the gfp
zone modifiers somehow, plus the debugging code. So what is the selling
point?

> Changes since v1,
> 
> v2: Add __GFP_ZONE_MOVABLE and modify GFP_HIGHUSER_MOVABLE to help
> callers to get ZONE_MOVABLE. Add __GFP_ZONE_MASK to mask lowest 3
> bits of GFP bitmasks.
> Modify some callers' gfp flag to update usage of address zone
> modifiers.
> Modify inline function gfp_zone to get better performance according
> to Matthew's suggestion.
> 
> Link: https://marc.info/?l=linux-mm&m=152596791931266&w=2
> 
> Huaisheng Ye (12):
>   include/linux/gfp.h: get rid of GFP_ZONE_TABLE/BAD
>   arch/x86/kernel/amd_gart_64: update usage of address zone modifiers
>   arch/x86/kernel/pci-calgary_64: update usage of address zone modifiers
>   drivers/iommu/amd_iommu: update usage of address zone modifiers
>   include/linux/dma-mapping: update usage of address zone modifiers
>   drivers/xen/swiotlb-xen: update usage of address zone modifiers
>   fs/btrfs/extent_io: update usage of address zone modifiers
>   drivers/block/zram/zram_drv: update usage of address zone modifiers
>   mm/vmpressure: update usage of address zone modifiers
>   mm/zsmalloc: update usage of address zone modifiers
>   include/linux/highmem: update usage of movableflags
>   arch/x86/include/asm/page.h: update usage of movableflags
> 
>  arch/x86/include/asm/page.h      |  3 +-
>  arch/x86/kernel/amd_gart_64.c    |  2 +-
>  arch/x86/kernel/pci-calgary_64.c |  2 +-
>  drivers/block/zram/zram_drv.c    |  6 +--
>  drivers/iommu/amd_iommu.c        |  2 +-
>  drivers/xen/swiotlb-xen.c        |  2 +-
>  fs/btrfs/extent_io.c             |  2 +-
>  include/linux/dma-mapping.h      |  2 +-
>  include/linux/gfp.h              | 98 +++++-----------------------------------
>  include/linux/highmem.h          |  4 +-
>  mm/vmpressure.c                  |  2 +-
>  mm/zsmalloc.c                    |  4 +-
>  12 files changed, 26 insertions(+), 103 deletions(-)
> 
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [External]  Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-23 16:07     ` Huaisheng HS1 Ye
  0 siblings, 0 replies; 72+ messages in thread
From: Huaisheng HS1 Ye @ 2018-05-23 16:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: akpm, linux-mm, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, NingTing Cheng, Ocean HY1 He,
	linux-kernel, iommu, xen-devel, linux-btrfs, Christoph Hellwig

From: Michal Hocko [mailto:mhocko@kernel.org]
Sent: Wednesday, May 23, 2018 2:37 AM
> 
> On Mon 21-05-18 23:20:21, Huaisheng Ye wrote:
> > From: Huaisheng Ye <yehs1@lenovo.com>
> >
> > Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with encoded zone number.
> >
> > Delete ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 from GFP bitmasks,
> > the bottom three bits of GFP mask is reserved for storing encoded
> > zone number.
> >
> > The encoding method is XOR. Get zone number from enum zone_type,
> > then encode the number with ZONE_NORMAL by XOR operation.
> > The goal is to make sure ZONE_NORMAL can be encoded to zero. So,
> > the compatibility can be guaranteed, such as GFP_KERNEL and GFP_ATOMIC
> > can be used as before.
> >
> > Reserve __GFP_MOVABLE in bit 3, so that it can continue to be used as
> > a flag. Same as before, __GFP_MOVABLE represents movable migrate type
> > for ZONE_DMA, ZONE_DMA32, and ZONE_NORMAL. But when it is enabled with
> > __GFP_HIGHMEM, ZONE_MOVABLE shall be returned instead of ZONE_HIGHMEM.
> > __GFP_ZONE_MOVABLE is created to realize it.
> >
> > With this patch, just enabling __GFP_MOVABLE and __GFP_HIGHMEM is not
> > enough to get ZONE_MOVABLE from gfp_zone. All callers should use
> > GFP_HIGHUSER_MOVABLE or __GFP_ZONE_MOVABLE directly to achieve that.
> >
> > Decode zone number directly from bottom three bits of flags in gfp_zone.
> > The theory of encoding and decoding is,
> >         A ^ B ^ B = A
> 
> So why is this any better than the current code. Sure I am not a great
> fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> doesn't look too much better, yet we are losing a check for incompatible
> gfp flags. The diffstat looks really sound but then you just look and
> see that the large part is the comment that at least explained the gfp
> zone modifiers somehow and the debugging code. So what is the selling
> point?

Dear Michal,

Let me try to answer your questions.
I agree that GFP_ZONE_TABLE is too complicated. I think this patch series
has two advantages.

1. The XOR operation is simple and efficient. GFP_ZONE_TABLE/BAD needs two
shift operations: the first to get the zone_type, and the second to check
whether the type to be returned is valid. With these patches, a single XOR
is enough. Because the bottom 3 bits of the GFP bitmask now hold the encoded
zone number, there is no bad zone number as long as all callers use it
correctly. Of course, the zone type returned by gfp_zone must be no greater
than ZONE_MOVABLE.

2. GFP_ZONE_TABLE limits the number of zone types. The current GFP_ZONE_TABLE
is 32 bits wide. In general, most x86_64 platforms have 4 zone types:
ZONE_DMA, ZONE_DMA32, ZONE_NORMAL and ZONE_MOVABLE. If we want more than 4
zone types, the zone shift has to be 3, which means that a 32-bit zone table
is not enough to store all zone types.
And the most painful thing is that the current GFP bitmask space is quite
constrained: only four ___GFP_XXX zone values can be used, as below,

	#define ___GFP_DMA		0x01u
	#define ___GFP_HIGHMEM	0x02u
	#define ___GFP_DMA32		0x04u
	(___GFP_NORMAL equals to 0x00)
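
The 32-bit limit in point 2 can be checked with a little arithmetic. The
sketch below assumes the table stores one slot of zone-shift bits for each
of the 16 combinations of the four low GFP bits (DMA, HIGHMEM, DMA32,
MOVABLE), which is my reading of the current include/linux/gfp.h. The
demo_* names are hypothetical and exist only for this illustration:

```c
/* Assumed: one table slot per combination of the four low GFP bits. */
#define DEMO_GFP_COMBINATIONS	16u

/* Smallest shift such that (1 << shift) can number nr_zones zone types. */
static unsigned int demo_zones_shift(unsigned int nr_zones)
{
	unsigned int shift = 0;

	while ((1u << shift) < nr_zones)
		shift++;
	return shift;
}

/* Bits needed by a table with one slot per combination. */
static unsigned int demo_zone_table_bits(unsigned int nr_zones)
{
	return DEMO_GFP_COMBINATIONS * demo_zones_shift(nr_zones);
}
```

Under that assumption, 4 zone types need a shift of 2 and 16 * 2 = 32 table
bits, while 5 or more zone types need a shift of 3 and 16 * 3 = 48 table
bits, which no longer fits in 32 bits.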

With the implementation in these patches, up to 8 zone types can be used.
The method of encoding and decoding is quite simple and intuitive, as shown
below, and most importantly, there are no BAD zone types in the end.

	A ^ B ^ B = A
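
As a rough sketch of the XOR scheme described in the cover letter (not the
code from the patches): the zone numbering, the demo_* names and the mask
value below are assumptions for illustration, since the real enum zone_type
depends on the kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative numbering only; the real enum zone_type is config-dependent. */
enum demo_zone {
	DEMO_ZONE_DMA,		/* 0 */
	DEMO_ZONE_DMA32,	/* 1 */
	DEMO_ZONE_NORMAL,	/* 2 */
	DEMO_ZONE_MOVABLE,	/* 3 */
};

#define DEMO_ZONE_MASK	0x07u	/* bottom three bits of the flags word */

/* Encode: XOR with ZONE_NORMAL, so ZONE_NORMAL itself encodes to zero. */
static inline uint32_t demo_encode_zone(enum demo_zone zone)
{
	assert((uint32_t)zone <= DEMO_ZONE_MASK);
	return (uint32_t)zone ^ (uint32_t)DEMO_ZONE_NORMAL;
}

/* Decode: mask the low bits, then XOR with ZONE_NORMAL again. */
static inline enum demo_zone demo_gfp_zone(uint32_t flags)
{
	return (enum demo_zone)((flags & DEMO_ZONE_MASK) ^
				(uint32_t)DEMO_ZONE_NORMAL);
}
```

Flags with no zone bits set decode to the normal zone, which is why existing
masks such as GFP_KERNEL would keep working. Note that the encoded values are
numbers rather than independent bits, so OR-ing two of them does not combine
two zones.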

By the way, our v3 patches are ready, but Gmail's SMTP is quite unstable on
my side for some firewall reason. I will try to resend them ASAP.

Sincerely,
Huaisheng Ye

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [External]  Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-23 16:07     ` Huaisheng HS1 Ye
  0 siblings, 0 replies; 72+ messages in thread
From: Huaisheng HS1 Ye @ 2018-05-23 16:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: kstewart-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Ocean HY1 He,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	willy-wEGCiKHe2LqWVfeAwA7xHQ,
	alexander.levin-H+0wwilmMs1BDgjK7y7TUQ,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-btrfs-u79uwXL29TY76Z2rM5mHXA, NingTing Cheng,
	xen-devel-GuqFBffKawtpuQazS67q72D2FQJk+8+b,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, colyli-l3A5Bk7waGM,
	mgorman-3eNAlZScCAx27rWaFMvyedHuzzzSOjJt, Christoph Hellwig,
	vbabka-AlSwsSmVLrQ

From: Michal Hocko [mailto:mhocko-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org]
Sent: Wednesday, May 23, 2018 2:37 AM
> 
> On Mon 21-05-18 23:20:21, Huaisheng Ye wrote:
> > From: Huaisheng Ye <yehs1-6jq1YtArVR3QT0dZR+AlfA@public.gmane.org>
> >
> > Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with encoded zone number.
> >
> > Delete ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 from GFP bitmasks,
> > the bottom three bits of GFP mask is reserved for storing encoded
> > zone number.
> >
> > The encoding method is XOR. Get zone number from enum zone_type,
> > then encode the number with ZONE_NORMAL by XOR operation.
> > The goal is to make sure ZONE_NORMAL can be encoded to zero. So,
> > the compatibility can be guaranteed, such as GFP_KERNEL and GFP_ATOMIC
> > can be used as before.
> >
> > Reserve __GFP_MOVABLE in bit 3, so that it can continue to be used as
> > a flag. Same as before, __GFP_MOVABLE respresents movable migrate type
> > for ZONE_DMA, ZONE_DMA32, and ZONE_NORMAL. But when it is enabled with
> > __GFP_HIGHMEM, ZONE_MOVABLE shall be returned instead of ZONE_HIGHMEM.
> > __GFP_ZONE_MOVABLE is created to realize it.
> >
> > With this patch, just enabling __GFP_MOVABLE and __GFP_HIGHMEM is not
> > enough to get ZONE_MOVABLE from gfp_zone. All callers should use
> > GFP_HIGHUSER_MOVABLE or __GFP_ZONE_MOVABLE directly to achieve that.
> >
> > Decode zone number directly from bottom three bits of flags in gfp_zone.
> > The theory of encoding and decoding is,
> >         A ^ B ^ B = A
> 
> So why is this any better than the current code. Sure I am not a great
> fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> doesn't look too much better, yet we are losing a check for incompatible
> gfp flags. The diffstat looks really sound but then you just look and
> see that the large part is the comment that at least explained the gfp
> zone modifiers somehow and the debugging code. So what is the selling
> point?

Dear Michal,

Let me try to reply your questions.
Exactly, GFP_ZONE_TABLE is too complicated. I think there are two advantages
from the series of patches.

1. XOR operation is simple and efficient, GFP_ZONE_TABLE/BAD need to do twice
shift operations, the first is for getting a zone_type and the second is for
checking the to be returned type is a correct or not. But with these patch XOR
operation just needs to use once. Because the bottom 3 bits of GFP bitmask have
been used to represent the encoded zone number, we can say there is no bad zone
number if all callers could use it without buggy way. Of course, the returned
zone type in gfp_zone needs to be no more than ZONE_MOVABLE.

2. GFP_ZONE_TABLE has limit with the amount of zone types. Current GFP_ZONE_TABLE
is 32 bits, in general, there are 4 zone types for most ofX86_64 platform, they
are ZONE_DMA, ZONE_DMA32, ZONE_NORMAL and ZONE_MOVABLE. If we want to expand the
amount of zone types to larger than 4, the zone shift should be 3. That is to say,
a 32 bits zone table is not enough to store all zone types.
And the most painful thing is that, current GFP bitmasks' space is quite
space-constrained it only have four ___GFP_XXX could be used as below,

	#define ___GFP_DMA		0x01u
	#define ___GFP_HIGHMEM	0x02u
	#define ___GFP_DMA32		0x04u
	(___GFP_NORMAL equals to 0x00)

If we use the implementation of these patches, there is a maximum of 8 zone types
could be used. The method of encoding and decoding is quite simple and users could
have an intuitive feeling for this as below, and the most important is that, there
is no BAD zone types eventually.

	A ^ B ^ B = A

And by the way, our v3 patches are ready, but the smtp of Gmail is quite unstable
for some firewall reason in my side, I will try to resend them ASAP.

Sincerely,
Huaisheng Ye

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [External] Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
  2018-05-22 18:37 ` Michal Hocko
  2018-05-23 16:07     ` Huaisheng HS1 Ye
@ 2018-05-23 16:07   ` Huaisheng HS1 Ye
  2018-05-24  5:19     ` Matthew Wilcox
  2018-05-24  5:19   ` Matthew Wilcox
  3 siblings, 0 replies; 72+ messages in thread
From: Huaisheng HS1 Ye @ 2018-05-23 16:07 UTC (permalink / raw)
  To: Michal Hocko
  Cc: kstewart, Ocean HY1 He, gregkh, linux-kernel, willy,
	alexander.levin, linux-mm, iommu, linux-btrfs, NingTing Cheng,
	xen-devel, akpm, colyli, mgorman, Christoph Hellwig, vbabka

From: Michal Hocko [mailto:mhocko@kernel.org]
Sent: Wednesday, May 23, 2018 2:37 AM
> 
> On Mon 21-05-18 23:20:21, Huaisheng Ye wrote:
> > From: Huaisheng Ye <yehs1@lenovo.com>
> >
> > Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with encoded zone number.
> >
> > Delete ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 from GFP bitmasks,
> > the bottom three bits of GFP mask is reserved for storing encoded
> > zone number.
> >
> > The encoding method is XOR. Get zone number from enum zone_type,
> > then encode the number with ZONE_NORMAL by XOR operation.
> > The goal is to make sure ZONE_NORMAL can be encoded to zero. So,
> > the compatibility can be guaranteed, such as GFP_KERNEL and GFP_ATOMIC
> > can be used as before.
> >
> > Reserve __GFP_MOVABLE in bit 3, so that it can continue to be used as
> > a flag. Same as before, __GFP_MOVABLE respresents movable migrate type
> > for ZONE_DMA, ZONE_DMA32, and ZONE_NORMAL. But when it is enabled with
> > __GFP_HIGHMEM, ZONE_MOVABLE shall be returned instead of ZONE_HIGHMEM.
> > __GFP_ZONE_MOVABLE is created to realize it.
> >
> > With this patch, just enabling __GFP_MOVABLE and __GFP_HIGHMEM is not
> > enough to get ZONE_MOVABLE from gfp_zone. All callers should use
> > GFP_HIGHUSER_MOVABLE or __GFP_ZONE_MOVABLE directly to achieve that.
> >
> > Decode zone number directly from bottom three bits of flags in gfp_zone.
> > The theory of encoding and decoding is,
> >         A ^ B ^ B = A
> 
> So why is this any better than the current code. Sure I am not a great
> fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> doesn't look too much better, yet we are losing a check for incompatible
> gfp flags. The diffstat looks really sound but then you just look and
> see that the large part is the comment that at least explained the gfp
> zone modifiers somehow and the debugging code. So what is the selling
> point?

Dear Michal,

Let me try to answer your questions.
Exactly, GFP_ZONE_TABLE is too complicated. I think this series of patches
has two advantages.

1. XOR is simple and efficient. GFP_ZONE_TABLE/BAD needs two shift operations:
the first to get a zone_type and the second to check whether the type to be
returned is valid. With these patches a single XOR is enough. Because the
bottom 3 bits of the GFP bitmask hold the encoded zone number, there can be no
bad zone number as long as all callers use it correctly. Of course, the zone
type returned by gfp_zone must be no greater than ZONE_MOVABLE.

2. GFP_ZONE_TABLE limits the number of zone types. The current GFP_ZONE_TABLE
is 32 bits. In general there are 4 zone types on most x86_64 platforms:
ZONE_DMA, ZONE_DMA32, ZONE_NORMAL and ZONE_MOVABLE. If we want more than 4
zone types, the zone shift has to be 3. That is to say, a 32-bit zone table
is not enough to store all zone types.
And the most painful thing is that the current GFP bitmask is quite
space-constrained; it only has four ___GFP_XXX values that can be used, as below,

	#define ___GFP_DMA		0x01u
	#define ___GFP_HIGHMEM	0x02u
	#define ___GFP_DMA32		0x04u
	(___GFP_NORMAL equals to 0x00)
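
The size limit in point 2 can be checked with simple arithmetic. This is a
minimal sketch, assuming the packed table holds one entry for each combination
of the four low GFP bits (DMA, HIGHMEM, DMA32, MOVABLE), so 16 entries, each
"zone shift" bits wide. The names zone_table_bits() and zone_table_fits() are
made up for illustration and are not kernel functions.

```c
#include <assert.h>

/*
 * Assumption: one table entry per combination of the four low GFP bits,
 * i.e. 16 entries, each zones_shift bits wide.
 */
enum { ZONE_TABLE_ENTRIES = 16 };

/* Number of bits a packed zone table needs for a given per-entry width. */
static unsigned int zone_table_bits(unsigned int zones_shift)
{
	assert(zones_shift > 0);
	return ZONE_TABLE_ENTRIES * zones_shift;
}

/* Nonzero if the packed table fits in a word of word_bits bits. */
static int zone_table_fits(unsigned int zones_shift, unsigned int word_bits)
{
	return zone_table_bits(zones_shift) <= word_bits;
}
```

Under that assumption a shift of 2 needs exactly 32 bits, while a shift of 3
needs 48 bits and no longer fits a 32-bit word.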

If we use the implementation of these patches, a maximum of 8 zone types can
be used. The method of encoding and decoding is quite simple and users can get
an intuitive feeling for it, as below. Most importantly, there are no BAD zone
types in the end.

	A ^ B ^ B = A
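
As a rough illustration of the encode/decode idea, here is a userspace sketch.
It is not the patch itself: the zone numbering is modelled on a typical x86_64
configuration, and SKETCH_ZONE_MASK, encode_zone() and decode_zone() are
hypothetical names chosen for this example.

```c
#include <assert.h>

/* Hypothetical zone numbering, modelled on a typical x86_64 configuration. */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE, MAX_NR_ZONES };

/* Lowest three bits of the flags carry the encoded zone number. */
#define SKETCH_ZONE_MASK 0x7u

/* Encode: XOR with ZONE_NORMAL, so that ZONE_NORMAL itself becomes 0. */
static unsigned int encode_zone(enum zone_type z)
{
	return ((unsigned int)z ^ (unsigned int)ZONE_NORMAL) & SKETCH_ZONE_MASK;
}

/* Decode: XOR with ZONE_NORMAL again, because A ^ B ^ B = A. */
static enum zone_type decode_zone(unsigned int gfp_flags)
{
	unsigned int z = (gfp_flags & SKETCH_ZONE_MASK) ^ (unsigned int)ZONE_NORMAL;

	/* The only sanity check left: the decoded number must name a zone. */
	assert(z < MAX_NR_ZONES);
	return (enum zone_type)z;
}
```

With this numbering ZONE_NORMAL encodes to 0, so flag sets without any zone
bits keep selecting ZONE_NORMAL, and bits above the mask do not affect the
decoded zone.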

And by the way, our v3 patches are ready, but Gmail's SMTP is quite unstable
on my side because of a firewall, so I will try to resend them ASAP.

Sincerely,
Huaisheng Ye



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-24  5:19     ` Matthew Wilcox
  0 siblings, 0 replies; 72+ messages in thread
From: Matthew Wilcox @ 2018-05-24  5:19 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Huaisheng Ye, akpm, linux-mm, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye

On Tue, May 22, 2018 at 08:37:28PM +0200, Michal Hocko wrote:
> So why is this any better than the current code. Sure I am not a great
> fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> doesn't look too much better, yet we are losing a check for incompatible
> gfp flags. The diffstat looks really sound but then you just look and
> see that the large part is the comment that at least explained the gfp
> zone modifiers somehow and the debugging code. So what is the selling
> point?

I have a plan, but it's not exactly fully-formed yet.

One of the big problems we have today is that we have a lot of users
who have constraints on the physical memory they want to allocate,
but we have very limited abilities to provide them with what they're
asking for.  The various different ZONEs have different meanings on
different architectures and are generally a mess.

If we had eight ZONEs, we could offer:

ZONE_16M	// 24 bit
ZONE_256M	// 28 bit
ZONE_LOWMEM	// CONFIG_32BIT only
ZONE_4G		// 32 bit
ZONE_64G	// 36 bit
ZONE_1T		// 40 bit
ZONE_ALL	// everything larger
ZONE_MOVABLE	// movable allocations; no physical address guarantees

#ifdef CONFIG_64BIT
#define ZONE_NORMAL	ZONE_ALL
#else
#define ZONE_NORMAL	ZONE_LOWMEM
#endif

This would cover most driver DMA mask allocations; we could tweak the
offered zones based on analysis of what people need.

#define GFP_HIGHUSER		(GFP_USER | ZONE_ALL)
#define GFP_HIGHUSER_MOVABLE	(GFP_USER | ZONE_MOVABLE)
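
One way the zone list above could serve driver DMA masks is sketched below.
This is only an illustration under assumptions: a 64-bit configuration (so
ZONE_LOWMEM is skipped), a conservative rule that only a full 64-bit mask may
use ZONE_ALL, and made-up names (zone_for_dma_bits(), ZONE_NONE). None of this
exists in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Zones in the order proposed above; numbering is illustrative only. */
enum zone_id {
	ZONE_16M, ZONE_256M, ZONE_LOWMEM, ZONE_4G,
	ZONE_64G, ZONE_1T, ZONE_ALL, ZONE_MOVABLE
};

#define ZONE_NONE (-1)	/* hypothetical: no zone satisfies the mask */

struct zone_limit {
	enum zone_id zone;
	unsigned int addr_bits;	/* zone lies entirely below 2^addr_bits */
};

/* Bounded zones from largest to smallest (ZONE_LOWMEM is 32-bit only). */
static const struct zone_limit zone_limits[] = {
	{ ZONE_1T, 40 }, { ZONE_64G, 36 }, { ZONE_4G, 32 },
	{ ZONE_256M, 28 }, { ZONE_16M, 24 },
};

/* Pick the largest zone a device with dma_bits address bits can reach. */
static int zone_for_dma_bits(unsigned int dma_bits)
{
	size_t i;

	if (dma_bits >= 64)
		return ZONE_ALL;	/* no addressing restriction at all */

	for (i = 0; i < sizeof(zone_limits) / sizeof(zone_limits[0]); i++)
		if (dma_bits >= zone_limits[i].addr_bits)
			return zone_limits[i].zone;

	return ZONE_NONE;	/* mask is narrower than the smallest zone */
}
```

A device with a 30-bit mask would land in ZONE_256M here, and one with a
48-bit mask in ZONE_1T, since ZONE_ALL has no upper bound.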

One other thing I want to see is that fallback from zones happens from
highest to lowest normally (ie if you fail to allocate in 1T, then you
try to allocate from 64G), but movable allocations happen from lowest
to highest.  So ZONE_16M ends up full of page cache pages which are
readily evictable for the rare occasions when we need to allocate memory
below 16MB.
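
The two fallback directions can be sketched as follows. This is a toy model
that treats zones as plain indices from 0 (lowest) upwards; fallback_order()
is a hypothetical helper, not how the kernel builds its zonelists.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill order[] with the zone indices to try, in order, and return how many
 * were written. Normal allocations walk from highest_zone down to 0;
 * movable allocations walk from 0 up to highest_zone.
 */
static size_t fallback_order(int highest_zone, int movable,
			     int *order, size_t cap)
{
	size_t n = 0;
	int z;

	assert(highest_zone >= 0);

	if (movable) {
		for (z = 0; z <= highest_zone && n < cap; z++)
			order[n++] = z;
	} else {
		for (z = highest_zone; z >= 0 && n < cap; z--)
			order[n++] = z;
	}
	return n;
}
```

With highest_zone set to 3, a normal allocation tries 3, 2, 1, 0 while a
movable one tries 0, 1, 2, 3, which is what fills the lowest zone with
evictable pages first.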

I'm sure there are lots of good reasons why this won't work, which is
why I've been hesitant to propose it before now.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [External]  Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
  2018-05-23 16:07     ` Huaisheng HS1 Ye
@ 2018-05-24 12:18       ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2018-05-24 12:18 UTC (permalink / raw)
  To: Huaisheng HS1 Ye
  Cc: akpm, linux-mm, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, NingTing Cheng, Ocean HY1 He,
	linux-kernel, iommu, xen-devel, linux-btrfs, Christoph Hellwig

On Wed 23-05-18 16:07:16, Huaisheng HS1 Ye wrote:
> From: Michal Hocko [mailto:mhocko@kernel.org]
> Sent: Wednesday, May 23, 2018 2:37 AM
> > 
> > On Mon 21-05-18 23:20:21, Huaisheng Ye wrote:
> > > From: Huaisheng Ye <yehs1@lenovo.com>
> > >
> > > Replace GFP_ZONE_TABLE and GFP_ZONE_BAD with encoded zone number.
> > >
> > > Delete ___GFP_DMA, ___GFP_HIGHMEM and ___GFP_DMA32 from GFP bitmasks,
> > > the bottom three bits of GFP mask is reserved for storing encoded
> > > zone number.
> > >
> > > The encoding method is XOR. Get zone number from enum zone_type,
> > > then encode the number with ZONE_NORMAL by XOR operation.
> > > The goal is to make sure ZONE_NORMAL can be encoded to zero. So,
> > > the compatibility can be guaranteed, such as GFP_KERNEL and GFP_ATOMIC
> > > can be used as before.
> > >
> > > Reserve __GFP_MOVABLE in bit 3, so that it can continue to be used as
> > > a flag. Same as before, __GFP_MOVABLE represents movable migrate type
> > > for ZONE_DMA, ZONE_DMA32, and ZONE_NORMAL. But when it is enabled with
> > > __GFP_HIGHMEM, ZONE_MOVABLE shall be returned instead of ZONE_HIGHMEM.
> > > __GFP_ZONE_MOVABLE is created to realize it.
> > >
> > > With this patch, just enabling __GFP_MOVABLE and __GFP_HIGHMEM is not
> > > enough to get ZONE_MOVABLE from gfp_zone. All callers should use
> > > GFP_HIGHUSER_MOVABLE or __GFP_ZONE_MOVABLE directly to achieve that.
> > >
> > > Decode zone number directly from bottom three bits of flags in gfp_zone.
> > > The theory of encoding and decoding is,
> > >         A ^ B ^ B = A
> > 
> > So why is this any better than the current code. Sure I am not a great
> > fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> > doesn't look too much better, yet we are losing a check for incompatible
> > gfp flags. The diffstat looks really sound but then you just look and
> > see that the large part is the comment that at least explained the gfp
> > zone modifiers somehow and the debugging code. So what is the selling
> > point?
> 
> Dear Michal,
> 
> Let me try to reply your questions.
> Exactly, GFP_ZONE_TABLE is too complicated. I think there are two advantages
> from the series of patches.
> 
> 1. XOR operation is simple and efficient, GFP_ZONE_TABLE/BAD need to do twice
> shift operations, the first is for getting a zone_type and the second is for
> checking the to be returned type is a correct or not. But with these patch XOR
> operation just needs to use once. Because the bottom 3 bits of GFP bitmask have
> been used to represent the encoded zone number, we can say there is no bad zone
> number if all callers could use it without buggy way. Of course, the returned
> zone type in gfp_zone needs to be no more than ZONE_MOVABLE.

But you are losing the ability to check for wrong usage. And it seems
that the sad reality is that the existing code does screw up.
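
For readers unfamiliar with the check being lost, here is a simplified
userspace model of a "bad combination" bitmap. It assumes the zone modifier
values quoted earlier in the thread (0x01, 0x02, 0x04) plus a movable flag in
bit 3, and treats any request naming more than one of DMA, HIGHMEM and DMA32
as invalid. The F_* names and both functions are invented for this sketch.

```c
#include <assert.h>

/* Zone modifier bits as quoted in the thread, plus MOVABLE in bit 3. */
#define F_DMA      0x01u
#define F_HIGHMEM  0x02u
#define F_DMA32    0x04u
#define F_MOVABLE  0x08u
#define F_ZONEMASK 0x0fu

/*
 * Build a 16-bit map with one bit per combination of the four low flags.
 * A bit is set when the combination names more than one of the mutually
 * exclusive zones (DMA, HIGHMEM, DMA32).
 */
static unsigned int build_bad_bitmap(void)
{
	unsigned int bad = 0, combo;

	for (combo = 0; combo <= F_ZONEMASK; combo++) {
		unsigned int excl = combo & (F_DMA | F_HIGHMEM | F_DMA32);

		if (excl & (excl - 1))	/* true if more than one bit is set */
			bad |= 1u << combo;
	}
	return bad;
}

/* Nonzero if the zone bits in gfp_flags form an invalid combination. */
static int zone_combo_is_bad(unsigned int gfp_flags)
{
	return (build_bad_bitmap() >> (gfp_flags & F_ZONEMASK)) & 1u;
}
```

A caller passing F_DMA | F_DMA32 is flagged by this model, whereas an encoded
zone number in the low bits always decodes to some zone and cannot express
that kind of mistake.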

> 2. GFP_ZONE_TABLE has limit with the amount of zone types. Current GFP_ZONE_TABLE
> is 32 bits, in general, there are 4 zone types for most of X86_64 platform, they
> are ZONE_DMA, ZONE_DMA32, ZONE_NORMAL and ZONE_MOVABLE. If we want to expand the
> amount of zone types to larger than 4, the zone shift should be 3.

But we do not want to expand the number of zones IMHO. The existing zoo
is quite a maint. pain.
 
That being said, I am not saying that I am in love with GFP_ZONE_TABLE.
It always makes my head explode when I look there but it seems to work
with the current code and it is optimized for it. If you want to change
this then you should make sure you describe reasons _why_ this is an
improvement. And I would argue that "we can have more zones" is a
relevant one.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-24 12:23       ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2018-05-24 12:23 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Huaisheng Ye, akpm, linux-mm, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye

On Wed 23-05-18 22:19:19, Matthew Wilcox wrote:
> On Tue, May 22, 2018 at 08:37:28PM +0200, Michal Hocko wrote:
> > So why is this any better than the current code. Sure I am not a great
> > fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> > doesn't look too much better, yet we are losing a check for incompatible
> > gfp flags. The diffstat looks really sound but then you just look and
> > see that the large part is the comment that at least explained the gfp
> > zone modifiers somehow and the debugging code. So what is the selling
> > point?
> 
> I have a plan, but it's not exactly fully-formed yet.
> 
> One of the big problems we have today is that we have a lot of users
> who have constraints on the physical memory they want to allocate,
> but we have very limited abilities to provide them with what they're
> asking for.  The various different ZONEs have different meanings on
> different architectures and are generally a mess.

Agreed.

> If we had eight ZONEs, we could offer:

No, please no more zones. What we have is quite a maint. burden on its
own. Ideally we should only have lowmem, highmem and special/device
zones for directly kernel accessible memory, the one that the kernel
cannot or must not use and completely special memory managed out of
the page allocator. All the remaining constraints should better be
implemented on top.

> ZONE_16M	// 24 bit
> ZONE_256M	// 28 bit
> ZONE_LOWMEM	// CONFIG_32BIT only
> ZONE_4G		// 32 bit
> ZONE_64G	// 36 bit
> ZONE_1T		// 40 bit
> ZONE_ALL	// everything larger
> ZONE_MOVABLE	// movable allocations; no physical address guarantees
> 
> #ifdef CONFIG_64BIT
> #define ZONE_NORMAL	ZONE_ALL
> #else
> #define ZONE_NORMAL	ZONE_LOWMEM
> #endif
> 
> This would cover most driver DMA mask allocations; we could tweak the
> offered zones based on analysis of what people need.

But those already do have a proper API, IIUC. So do we really need to
make our GFP_*/Zone API more complicated than it already is?

> #define GFP_HIGHUSER		(GFP_USER | ZONE_ALL)
> #define GFP_HIGHUSER_MOVABLE	(GFP_USER | ZONE_MOVABLE)
> 
> One other thing I want to see is that fallback from zones happens from
> highest to lowest normally (ie if you fail to allocate in 1T, then you
> try to allocate from 64G), but movable allocations happen from lowest
> to highest.  So ZONE_16M ends up full of page cache pages which are
> readily evictable for the rare occasions when we need to allocate memory
> below 16MB.
> 
> I'm sure there are lots of good reasons why this won't work, which is
> why I've been hesitant to propose it before now.

I am worried you are playing with a can of worms...
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-24 12:23       ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2018-05-24 12:23 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kstewart-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, Huaisheng Ye,
	hehy1-6jq1YtArVR3QT0dZR+AlfA,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	alexander.levin-H+0wwilmMs1BDgjK7y7TUQ,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-btrfs-u79uwXL29TY76Z2rM5mHXA, Huaisheng Ye,
	chengnt-6jq1YtArVR3QT0dZR+AlfA,
	xen-devel-GuqFBffKawtpuQazS67q72D2FQJk+8+b,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, colyli-l3A5Bk7waGM,
	mgorman-3eNAlZScCAx27rWaFMvyedHuzzzSOjJt, vbabka-AlSwsSmVLrQ

On Wed 23-05-18 22:19:19, Matthew Wilcox wrote:
> On Tue, May 22, 2018 at 08:37:28PM +0200, Michal Hocko wrote:
> > So why is this any better than the current code. Sure I am not a great
> > fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> > doesn't look too much better, yet we are losing a check for incompatible
> > gfp flags. The diffstat looks really sound but then you just look and
> > see that the large part is the comment that at least explained the gfp
> > zone modifiers somehow and the debugging code. So what is the selling
> > point?
> 
> I have a plan, but it's not exactly fully-formed yet.
> 
> One of the big problems we have today is that we have a lot of users
> who have constraints on the physical memory they want to allocate,
> but we have very limited abilities to provide them with what they're
> asking for.  The various different ZONEs have different meanings on
> different architectures and are generally a mess.

Agreed.

> If we had eight ZONEs, we could offer:

No, please no more zones. What we have is quite a maint. burden on its
own. Ideally we should only have lowmem, highmem and special/device
zones for directly kernel accessible memory, the one that the kernel
cannot or must not use and completely special memory managed out of
the page allocator. All the remaining constrains should better be
implemented on top.

> ZONE_16M	// 24 bit
> ZONE_256M	// 28 bit
> ZONE_LOWMEM	// CONFIG_32BIT only
> ZONE_4G		// 32 bit
> ZONE_64G	// 36 bit
> ZONE_1T		// 40 bit
> ZONE_ALL	// everything larger
> ZONE_MOVABLE	// movable allocations; no physical address guarantees
> 
> #ifdef CONFIG_64BIT
> #define ZONE_NORMAL	ZONE_ALL
> #else
> #define ZONE_NORMAL	ZONE_LOWMEM
> #endif
> 
> This would cover most driver DMA mask allocations; we could tweak the
> offered zones based on analysis of what people need.

But those already do have aproper API, IIUC. So do we really need to
make our GFP_*/Zone API more complicated than it already is?

> #define GFP_HIGHUSER		(GFP_USER | ZONE_ALL)
> #define GFP_HIGHUSER_MOVABLE	(GFP_USER | ZONE_MOVABLE)
> 
> One other thing I want to see is that fallback from zones happens from
> highest to lowest normally (ie if you fail to allocate in 1T, then you
> try to allocate from 64G), but movable allocations hapen from lowest
> to highest.  So ZONE_16M ends up full of page cache pages which are
> readily evictable for the rare occasions when we need to allocate memory
> below 16MB.
> 
> I'm sure there are lots of good reasons why this won't work, which is
> why I've been hesitant to propose it before now.

I am worried you are playing with a can of worms...
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
  2018-05-24  5:19     ` Matthew Wilcox
  (?)
  (?)
@ 2018-05-24 12:23     ` Michal Hocko
  -1 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2018-05-24 12:23 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: kstewart, Huaisheng Ye, hehy1, gregkh, linux-kernel,
	alexander.levin, linux-mm, iommu, linux-btrfs, Huaisheng Ye,
	chengnt, xen-devel, akpm, colyli, mgorman, vbabka

On Wed 23-05-18 22:19:19, Matthew Wilcox wrote:
> On Tue, May 22, 2018 at 08:37:28PM +0200, Michal Hocko wrote:
> > So why is this any better than the current code. Sure I am not a great
> > fan of GFP_ZONE_TABLE because of how it is incomprehensible but this
> > doesn't look too much better, yet we are losing a check for incompatible
> > gfp flags. The diffstat looks really sound but then you just look and
> > see that the large part is the comment that at least explained the gfp
> > zone modifiers somehow and the debugging code. So what is the selling
> > point?
> 
> I have a plan, but it's not exactly fully-formed yet.
> 
> One of the big problems we have today is that we have a lot of users
> who have constraints on the physical memory they want to allocate,
> but we have very limited abilities to provide them with what they're
> asking for.  The various different ZONEs have different meanings on
> different architectures and are generally a mess.

Agreed.

> If we had eight ZONEs, we could offer:

No, please no more zones. What we have is quite a maint. burden on its
own. Ideally we should only have lowmem, highmem and special/device
zones for directly kernel accessible memory, the one that the kernel
cannot or must not use and completely special memory managed out of
the page allocator. All the remaining constrains should better be
implemented on top.

> ZONE_16M	// 24 bit
> ZONE_256M	// 28 bit
> ZONE_LOWMEM	// CONFIG_32BIT only
> ZONE_4G		// 32 bit
> ZONE_64G	// 36 bit
> ZONE_1T		// 40 bit
> ZONE_ALL	// everything larger
> ZONE_MOVABLE	// movable allocations; no physical address guarantees
> 
> #ifdef CONFIG_64BIT
> #define ZONE_NORMAL	ZONE_ALL
> #else
> #define ZONE_NORMAL	ZONE_LOWMEM
> #endif
> 
> This would cover most driver DMA mask allocations; we could tweak the
> offered zones based on analysis of what people need.

But those already do have aproper API, IIUC. So do we really need to
make our GFP_*/Zone API more complicated than it already is?

> #define GFP_HIGHUSER		(GFP_USER | ZONE_ALL)
> #define GFP_HIGHUSER_MOVABLE	(GFP_USER | ZONE_MOVABLE)
> 
> One other thing I want to see is that fallback from zones happens from
> highest to lowest normally (ie if you fail to allocate in 1T, then you
> try to allocate from 64G), but movable allocations happen from lowest
> to highest.  So ZONE_16M ends up full of page cache pages which are
> readily evictable for the rare occasions when we need to allocate memory
> below 16MB.
> 
> I'm sure there are lots of good reasons why this won't work, which is
> why I've been hesitant to propose it before now.

I am worried you are playing with a can of worms...
-- 
Michal Hocko
SUSE Labs

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
  2018-05-24 12:23       ` Michal Hocko
  (?)
  (?)
@ 2018-05-24 15:18       ` Matthew Wilcox
  2018-05-24 15:29         ` Michal Hocko
  2018-05-24 15:29         ` Michal Hocko
  -1 siblings, 2 replies; 72+ messages in thread
From: Matthew Wilcox @ 2018-05-24 15:18 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Huaisheng Ye, akpm, linux-mm, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye

On Thu, May 24, 2018 at 02:23:23PM +0200, Michal Hocko wrote:
> > If we had eight ZONEs, we could offer:
> 
> No, please no more zones. What we have is quite a maint. burden on its
> own. Ideally we should only have lowmem, highmem and special/device
> zones for directly kernel accessible memory, the one that the kernel
> cannot or must not use and completely special memory managed out of
> the page allocator. All the remaining constraints should better be
> implemented on top.

I believe you when you say that they're a maintenance pain.  Is that
maintenance pain because they're so specialised?  ie if we had more,
could we solve our pain by making them more generic?

> > ZONE_16M	// 24 bit
> > ZONE_256M	// 28 bit
> > ZONE_LOWMEM	// CONFIG_32BIT only
> > ZONE_4G		// 32 bit
> > ZONE_64G	// 36 bit
> > ZONE_1T		// 40 bit
> > ZONE_ALL	// everything larger
> > ZONE_MOVABLE	// movable allocations; no physical address guarantees
> > 
> > #ifdef CONFIG_64BIT
> > #define ZONE_NORMAL	ZONE_ALL
> > #else
> > #define ZONE_NORMAL	ZONE_LOWMEM
> > #endif
> > 
> > This would cover most driver DMA mask allocations; we could tweak the
> > offered zones based on analysis of what people need.
> 
> But those already do have a proper API, IIUC. So do we really need to
> make our GFP_*/Zone API more complicated than it already is?

I don't want to change the driver API (setting the DMA mask, etc),
but we don't actually have a good API to the page allocator for the
implementation of dma_alloc_foo() to request pages.  More or less,
architectures do:

	if (mask < 4GB)
		alloc_page(GFP_DMA)
	else if (mask < 64EB)
		alloc_page(GFP_DMA32)
	else
		alloc_page(GFP_HIGHMEM)

it more-or-less sucks that the devices with 28-bit DMA limits are forced
to allocate from the low 16MB when they're perfectly capable of using the
low 256MB.  Sure, my proposal doesn't help 27 or 26 bit DMA mask devices,
but those are pretty rare.
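
To make the 28-bit case concrete, here is a minimal userspace sketch of
how an allocator front end could pick the largest proposed zone that a
device's DMA mask can fully address. Everything in it is hypothetical:
the enum, the PZ_* names and zone_for_dma_mask() are illustration only
and do not exist in the kernel, and ZONE_LOWMEM is left out because it
only applies to CONFIG_32BIT.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical zone ids mirroring the proposed list; not kernel definitions. */
enum proposed_zone {
	PZ_16M,		/* 24-bit addressable */
	PZ_256M,	/* 28-bit addressable */
	PZ_4G,		/* 32-bit addressable */
	PZ_64G,		/* 36-bit addressable */
	PZ_1T,		/* 40-bit addressable */
	PZ_ALL,		/* no physical address limit */
	PZ_NONE,	/* mask is too small for any offered zone */
};

/*
 * Return the largest zone whose highest address still fits under an
 * inclusive DMA mask (e.g. 0x0fffffff for a 28-bit device).
 */
static enum proposed_zone zone_for_dma_mask(uint64_t mask)
{
	static const struct {
		unsigned int bits;
		enum proposed_zone zone;
	} limits[] = {
		{ 40, PZ_1T }, { 36, PZ_64G }, { 32, PZ_4G },
		{ 28, PZ_256M }, { 24, PZ_16M },
	};
	unsigned int i;

	if (mask == UINT64_MAX)
		return PZ_ALL;

	/* Walk from the largest zone down; stop at the first one that fits. */
	for (i = 0; i < sizeof(limits) / sizeof(limits[0]); i++) {
		uint64_t zone_max = (UINT64_C(1) << limits[i].bits) - 1;

		if (mask >= zone_max)
			return limits[i].zone;
	}
	return PZ_NONE;
}
```

With this sketch a 28-bit mask lands in PZ_256M instead of the low 16MB,
while a 27-bit or 26-bit mask still falls back to PZ_16M, which is the
limitation mentioned above.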

I'm sure you don't need reminding what a mess vmalloc_32 is, and the
implementation of saa7146_vmalloc_build_pgtable() just hurts.

> > #define GFP_HIGHUSER		(GFP_USER | ZONE_ALL)
> > #define GFP_HIGHUSER_MOVABLE	(GFP_USER | ZONE_MOVABLE)
> > 
> > One other thing I want to see is that fallback from zones happens from
> > highest to lowest normally (ie if you fail to allocate in 1T, then you
> > try to allocate from 64G), but movable allocations happen from lowest
> > to highest.  So ZONE_16M ends up full of page cache pages which are
> > readily evictable for the rare occasions when we need to allocate memory
> > below 16MB.
> > 
> > I'm sure there are lots of good reasons why this won't work, which is
> > why I've been hesitant to propose it before now.
> 
> I am worried you are playing with a can of worms...

Yes.  Me too.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
  2018-05-24 15:18       ` Matthew Wilcox
  2018-05-24 15:29         ` Michal Hocko
@ 2018-05-24 15:29         ` Michal Hocko
  2018-05-25 12:00           ` Matthew Wilcox
  2018-05-25 12:00           ` Matthew Wilcox
  1 sibling, 2 replies; 72+ messages in thread
From: Michal Hocko @ 2018-05-24 15:29 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Huaisheng Ye, akpm, linux-mm, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye

On Thu 24-05-18 08:18:18, Matthew Wilcox wrote:
> On Thu, May 24, 2018 at 02:23:23PM +0200, Michal Hocko wrote:
> > > If we had eight ZONEs, we could offer:
> > 
> > No, please no more zones. What we have is quite a maint. burden on its
> > own. Ideally we should only have lowmem, highmem and special/device
> > zones for directly kernel accessible memory, the one that the kernel
> > cannot or must not use and completely special memory managed out of
> > the page allocator. All the remaining constraints should better be
> > implemented on top.
> 
> I believe you when you say that they're a maintenance pain.  Is that
> maintenance pain because they're so specialised?

Well, it used to be LRU balancing which is gone with the node reclaim
but that brings new challenges. Now as you say their meaning is not
really clear to users and that leads to bugs left and right.

> ie if we had more,
> could we solve our pain by making them more generic?

Well, if you have more you will consume more bits in the struct pages,
right?

[...]

> > But those already do have a proper API, IIUC. So do we really need to
> > make our GFP_*/Zone API more complicated than it already is?
> 
> I don't want to change the driver API (setting the DMA mask, etc),
> but we don't actually have a good API to the page allocator for the
> implementation of dma_alloc_foo() to request pages.  More or less,
> architectures do:
> 
> 	if (mask < 4GB)
> 		alloc_page(GFP_DMA)
> 	else if (mask < 64EB)
> 		alloc_page(GFP_DMA32)
> 	else
> 		alloc_page(GFP_HIGHMEM)
> 
> it more-or-less sucks that the devices with 28-bit DMA limits are forced
> to allocate from the low 16MB when they're perfectly capable of using the
> low 256MB.

Do we actually care all that much about those? If yes then we should
probably follow the ZONE_DMA (x86) path and use a CMA region for them.
I mean most devices should be good with very limited addressability or
below 4G, no?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [External]  Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-25  9:43         ` Huaisheng HS1 Ye
  0 siblings, 0 replies; 72+ messages in thread
From: Huaisheng HS1 Ye @ 2018-05-25  9:43 UTC (permalink / raw)
  To: Michal Hocko
  Cc: akpm, linux-mm, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, NingTing Cheng, Ocean HY1 He,
	linux-kernel, iommu, xen-devel, linux-btrfs, Christoph Hellwig

From: Michal Hocko [mailto:mhocko@kernel.org]
Sent: Thursday, May 24, 2018 8:19 PM
>
> > Let me try to answer your questions.
> > Exactly, GFP_ZONE_TABLE is too complicated. I think the patch series has
> > two advantages.
> >
> > 1. The XOR operation is simple and efficient. GFP_ZONE_TABLE/BAD needs two
> > shift operations: the first gets a zone_type and the second checks whether
> > the type to be returned is valid. With these patches a single XOR
> > operation is enough. Because the bottom 3 bits of the GFP bitmask
> > represent the encoded zone number, there is no bad zone number as long as
> > all callers use it correctly. Of course, the zone type returned by
> > gfp_zone must be no higher than ZONE_MOVABLE.
> 
> But you are losing the ability to check for wrong usage. And it seems
> that the sad reality is that the existing code does screw up.

In my opinion, there should never have been so many wrong combinations of these bottom 3 bits in the first place. Every user, whether a driver or a filesystem, should decide which zone it prefers. Matthew's idea is great, because with it the user must supply an unambiguous flag in the GFP zone bits.

Ideally, before modifying the address zone modifier, a user should first clear it and then OR in the GFP zone flag of the zone they prefer.
With these patches, we can loudly announce that the bottom 3 bits of the zone mask do not accept ORing several zone modifiers together.
Operations like __GFP_DMA | __GFP_DMA32 | __GFP_HIGHMEM are illegal. The root of this problem is precisely the current GFP_ZONE_TABLE scheme, in which __GFP_DMA, __GFP_HIGHMEM and __GFP_DMA32 are defined as the separate bits 0x1, 0x2 and 0x4.
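
As a rough illustration of the "clear first, then OR" rule, here is a small userspace sketch. The names GFP_ZONE_MASK, gfp_set_zone() and gfp_encoded_zone() are made up for this example (the series itself adds __GFP_ZONE_MASK for the lowest 3 bits), and gfp_t is redefined locally so the snippet compiles outside the kernel.

```c
#include <assert.h>

typedef unsigned int gfp_t;	/* local stand-in for the kernel type */

/* Assumed layout: the bottom three bits hold exactly one encoded zone number. */
#define GFP_ZONE_MASK	0x7u

/* Replace the zone field; whatever zone was stored before is cleared first. */
static gfp_t gfp_set_zone(gfp_t flags, unsigned int encoded_zone)
{
	assert(encoded_zone <= GFP_ZONE_MASK);
	return (flags & ~GFP_ZONE_MASK) | encoded_zone;
}

/* Read the encoded zone number back out of the flags. */
static unsigned int gfp_encoded_zone(gfp_t flags)
{
	return flags & GFP_ZONE_MASK;
}
```

Once the field is an encoded number rather than a set of independent bits, a plain OR of two encoded values (for example 0x1 | 0x2) silently produces a third value, 0x3, that neither caller asked for. That is why the field has to be cleared before a new zone is written into it.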

> 
> > 2. GFP_ZONE_TABLE limits the number of zone types. The current GFP_ZONE_TABLE
> > is 32 bits wide. In general there are 4 zone types on most x86_64 platforms:
> > ZONE_DMA, ZONE_DMA32, ZONE_NORMAL and ZONE_MOVABLE. If we want more than
> > 4 zone types, the zone shift has to be 3.
> 
> But we do not want to expand the number of zones IMHO. The existing zoo
> is quite a maint. pain.
> 
> That being said. I am not saying that I am in love with GFP_ZONE_TABLE.
> It always makes my head explode when I look there but it seems to work
> with the current code and it is optimized for it. If you want to change
> this then you should make sure you describe reasons _why_ this is an
> improvement. And I would argue that "we can have more zones" is a
> relevant one.

Yes, GFP_ZONE_TABLE is too complicated. The patches have 4 advantages, listed below.

* The address zone modifiers get a new usage model: the user first decides which zone is preferred, then puts the encoded zone number into the bottom 3 bits of the GFP mask. That is much more direct and clearer than before.
* No bad zone combinations, because the user always chooses exactly one address zone modifier.
* Better performance and efficiency: the current gfp_zone() has to shift twice, once for GFP_ZONE_TABLE and once for GFP_ZONE_BAD. With these patches, gfp_zone() needs just one XOR.
* Up to 8 zones can be used. At least it isn't a disadvantage, right?
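
The single-XOR decode described in the list above can be sketched roughly as follows. This is a userspace illustration and not the code from the patches: the zone numbering in the enum is an assumption (the real enum zone_type depends on the kernel configuration), and encode_zone(), decode_zone() and ZONE_BITS_MASK are names invented for the example.

```c
#include <assert.h>

/* Assumed numbering for illustration; the real enum zone_type is config-dependent. */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE };

#define ZONE_BITS_MASK	0x7u	/* bottom three bits of the GFP mask */

/* Encode: XOR with ZONE_NORMAL, so that ZONE_NORMAL itself becomes 0. */
static unsigned int encode_zone(enum zone_type zone)
{
	return ((unsigned int)zone ^ ZONE_NORMAL) & ZONE_BITS_MASK;
}

/* Decode: the same XOR undoes the encoding, because A ^ B ^ B == A. */
static enum zone_type decode_zone(unsigned int gfp_flags)
{
	return (enum zone_type)((gfp_flags & ZONE_BITS_MASK) ^ ZONE_NORMAL);
}
```

Because ZONE_NORMAL encodes to zero, flag combinations that carry no zone bits at all still decode to ZONE_NORMAL, which is how masks such as GFP_KERNEL can stay unchanged.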

Sincerely,
Huaisheng Ye

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
  2018-05-24 15:29         ` Michal Hocko
  2018-05-25 12:00           ` Matthew Wilcox
@ 2018-05-25 12:00           ` Matthew Wilcox
  2018-05-28 13:33               ` Michal Hocko
  1 sibling, 1 reply; 72+ messages in thread
From: Matthew Wilcox @ 2018-05-25 12:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Huaisheng Ye, akpm, linux-mm, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye

On Thu, May 24, 2018 at 05:29:43PM +0200, Michal Hocko wrote:
> > ie if we had more,
> > could we solve our pain by making them more generic?
> 
> Well, if you have more you will consume more bits in the struct pages,
> right?

Not necessarily ... the zone number is stored in the struct page
currently, so either two or three bits are used right now.  In my
proposal, one can infer the zone of a page from its PFN, except for
ZONE_MOVABLE.  So we could trim down to just one bit per struct page
for 32-bit machines while using 3 bits on 64-bit machines, where there
is plenty of space.
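
A rough sketch of what "infer the zone from the PFN" could look like,
assuming 4 KiB pages and the 64-bit zone list from earlier in the thread.
None of these names exist in the kernel; pfn_to_proposed_zone() and the
PZ_* ids are hypothetical, and the "movable" argument stands for the one
bit that would still have to live in struct page.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumes 4 KiB pages */

/* Hypothetical ids; the first five are ordered to match limit_bits[] below. */
enum proposed_zone { PZ_16M, PZ_256M, PZ_4G, PZ_64G, PZ_1T, PZ_ALL, PZ_MOVABLE };

static enum proposed_zone pfn_to_proposed_zone(uint64_t pfn, int movable)
{
	/* Upper address limit of each zone, expressed in address bits. */
	static const unsigned int limit_bits[] = { 24, 28, 32, 36, 40 };
	unsigned int i;

	/* ZONE_MOVABLE has no address guarantee, so it needs its own flag. */
	if (movable)
		return PZ_MOVABLE;

	/* The first limit that lies above the page's address decides the zone. */
	for (i = 0; i < sizeof(limit_bits) / sizeof(limit_bits[0]); i++)
		if (pfn < (UINT64_C(1) << (limit_bits[i] - PAGE_SHIFT)))
			return (enum proposed_zone)i;

	return PZ_ALL;
}
```

This version loops over the limits, so it costs a few compares per lookup
where today's code reads the zone bits directly from page->flags.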

> > it more-or-less sucks that the devices with 28-bit DMA limits are forced
> > to allocate from the low 16MB when they're perfectly capable of using the
> > low 256MB.
> 
> Do we actually care all that much about those? If yes then we should
> probably follow the ZONE_DMA (x86) path and use a CMA region for them.
> I mean most devices should be good with very limited addressability or
> below 4G, no?

Sure.  One other thing I meant to mention was the media devices
(TV capture cards and so on) which want a vmalloc_32() allocation.
On 32-bit machines right now, we allocate from LOWMEM, when we really
should be allocating from the 1GB-4GB region.  32-bit machines generally
don't have a ZONE_DMA32 today.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
  2018-05-25 12:00           ` Matthew Wilcox
@ 2018-05-28 13:33               ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2018-05-28 13:33 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Huaisheng Ye, akpm, linux-mm, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, chengnt, hehy1, linux-kernel,
	iommu, xen-devel, linux-btrfs, Huaisheng Ye

On Fri 25-05-18 05:00:44, Matthew Wilcox wrote:
> On Thu, May 24, 2018 at 05:29:43PM +0200, Michal Hocko wrote:
> > > ie if we had more,
> > > could we solve our pain by making them more generic?
> > 
> > Well, if you have more you will consume more bits in the struct pages,
> > right?
> 
> Not necessarily ... the zone number is stored in the struct page
> currently, so either two or three bits are used right now.  In my
> proposal, one can infer the zone of a page from its PFN, except for
> ZONE_MOVABLE.  So we could trim down to just one bit per struct page
> for 32-bit machines while using 3 bits on 64-bit machines, where there
> is plenty of space.

Just be warned that page_zone is called from many hot paths. I am not
sure adding something more complex there is going to fly.

> > > it more-or-less sucks that the devices with 28-bit DMA limits are forced
> > > to allocate from the low 16MB when they're perfectly capable of using the
> > > low 256MB.
> > 
> > Do we actually care all that much about those? If yes then we should
> > probably follow the ZONE_DMA (x86) path and use a CMA region for them.
> > I mean most devices should be good with very limited addressability or
> > below 4G, no?
> 
> Sure.  One other thing I meant to mention was the media devices
> (TV capture cards and so on) which want a vmalloc_32() allocation.
> On 32-bit machines right now, we allocate from LOWMEM, when we really
> should be allocating from the 1GB-4GB region.  32-bit machines generally
> don't have a ZONE_DMA32 today.

Well, _I_ think that vmalloc on 32b is just a lost case...

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [External]  Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-28 13:37           ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2018-05-28 13:37 UTC (permalink / raw)
  To: Huaisheng HS1 Ye
  Cc: akpm, linux-mm, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, NingTing Cheng, Ocean HY1 He,
	linux-kernel, iommu, xen-devel, linux-btrfs, Christoph Hellwig

On Fri 25-05-18 09:43:09, Huaisheng HS1 Ye wrote:
> From: Michal Hocko [mailto:mhocko@kernel.org]
> Sent: Thursday, May 24, 2018 8:19 PM
> 
> > > Let me try to answer your questions.
> > > Exactly, GFP_ZONE_TABLE is too complicated. I think this patch series
> > > has two advantages.
> > >
> > > 1. The XOR operation is simple and efficient. GFP_ZONE_TABLE/BAD needs two
> > > shift operations: the first to get a zone_type and the second to check
> > > whether the type to be returned is valid. With these patches, a single
> > > XOR is enough. Because the bottom 3 bits of the GFP bitmask now hold the
> > > encoded zone number, there is no bad zone number as long as all callers
> > > use it correctly. Of course, the zone type returned by gfp_zone must be
> > > no greater than ZONE_MOVABLE.
> > 
> > But you are losing the ability to check for wrong usage. And it seems
> > that the sad reality is that the existing code does screw up.
> 
> In my opinion, there shouldn't have been so many wrong combinations of
> these bottom 3 bits in the first place. Any user, whether a driver or a
> filesystem, should decide which zone it prefers. Matthew's idea is
> great, because with it the user must supply an unambiguous flag for the
> gfp zone bits.

Well, I would argue that those shouldn't really care about any zones at
all. All they should care about is whether they really need a low mem
zone (aka directly accessible to the kernel), highmem, or whether the
allocation is generally movable. Mixing zones into the picture just
makes the whole thing more complicated and error prone.
[...]
> > That being said. I am not saying that I am in love with GFP_ZONE_TABLE.
> > It always makes my head explode when I look there but it seems to work
> > with the current code and it is optimized for it. If you want to change
> > this then you should make sure you describe reasons _why_ this is an
> > improvement. And I would argue that "we can have more zones" is a
> > relevant one.
> 
> Yes, GFP_ZONE_TABLE is too complicated. The patches have 4 advantages, as below.
> 
> * The address zone modifiers get a new mode of operation: the user first
>   decides which zone is preferred, then puts the encoded zone number into
>   the bottom 3 bits of the GFP mask. That is much more direct and clearer
>   than before.
> * No bad zone combinations, because the user always chooses exactly one
>   address zone modifier.
> * Better performance and efficiency: the current gfp_zone() has to shift
>   twice, once for GFP_ZONE_TABLE and once for GFP_ZONE_BAD. With these
>   patches, gfp_zone() needs just one XOR.
> * Up to 8 zones can be used. At least it isn't a disadvantage, right?

This should be a part of the changelog. Please note that you should
provide some numbers if you claim performance benefits. The complexity
will always be subjective.
-- 
Michal Hocko
SUSE Labs
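
For readers following the encoding argument, the XOR scheme from the cover
letter (A ^ B ^ B = A, with ZONE_NORMAL encoded as zero) can be modelled in
a few lines of userspace C. This is only a sketch under stated assumptions,
not the code from the patches: the zone numbering, SKETCH_ZONE_MASK,
sketch_encode_zone() and sketch_decode_zone() are hypothetical names chosen
for illustration, and the real enum zone_type depends on kernel
configuration.

```c
#include <stdint.h>

/* Assumed zone numbering for illustration only; the real enum zone_type
 * varies with CONFIG_ZONE_DMA, CONFIG_ZONE_DMA32 and CONFIG_HIGHMEM. */
enum sketch_zone {
	SKETCH_ZONE_DMA,	/* 0 */
	SKETCH_ZONE_DMA32,	/* 1 */
	SKETCH_ZONE_NORMAL,	/* 2 */
	SKETCH_ZONE_HIGHMEM,	/* 3 */
	SKETCH_ZONE_MOVABLE,	/* 4 */
};

/* Hypothetical mask: the bottom three bits carry the encoded zone. */
#define SKETCH_ZONE_MASK 0x7u

/* Encode by XOR with ZONE_NORMAL, so ZONE_NORMAL itself becomes 0. */
static inline uint32_t sketch_encode_zone(enum sketch_zone z)
{
	return ((uint32_t)z ^ SKETCH_ZONE_NORMAL) & SKETCH_ZONE_MASK;
}

/* Decode by XOR with ZONE_NORMAL again; upper flag bits are ignored. */
static inline enum sketch_zone sketch_decode_zone(uint32_t flags)
{
	return (enum sketch_zone)((flags & SKETCH_ZONE_MASK) ^ SKETCH_ZONE_NORMAL);
}
```

Under this numbering every zone round-trips through encode and decode, but
an encoded value such as 7 decodes to 5, which is past the last zone. That
is the "losing the ability to check for wrong usage" concern: the decode
itself cannot reject such a value, so validity would need a separate check.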

^ permalink raw reply	[flat|nested] 72+ messages in thread

* RE: [External]  Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-30  9:02             ` Huaisheng HS1 Ye
  0 siblings, 0 replies; 72+ messages in thread
From: Huaisheng HS1 Ye @ 2018-05-30  9:02 UTC (permalink / raw)
  To: Michal Hocko
  Cc: akpm, linux-mm, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, NingTing Cheng, Ocean HY1 He,
	linux-kernel, iommu, xen-devel, linux-btrfs, Christoph Hellwig

From: owner-linux-mm@kvack.org [mailto:owner-linux-mm@kvack.org] On Behalf Of Michal Hocko
Sent: Monday, May 28, 2018 9:38 PM
> > In my opinion, there shouldn't have been so many wrong combinations of
> > these bottom 3 bits in the first place. Any user, whether a driver or a
> > filesystem, should decide which zone it prefers. Matthew's idea is
> > great, because with it the user must supply an unambiguous flag for the
> > gfp zone bits.
> 
> Well, I would argue that those shouldn't really care about any zones at
> all. All they should care about is whether they really need a low mem
> zone (aka directly accessible to the kernel), highmem, or whether the
> allocation is generally movable. Mixing zones into the picture just
> makes the whole thing more complicated and error prone.

Dear Michal,

I don't quite understand that. I think those users, mostly drivers, need
to get the zone they actually want. ZONE_DMA32 is an example: if drivers
could be satisfied with any low mem zone, why would they mark the gfp flags
as 'GFP_KERNEL|__GFP_DMA32'?
GFP_KERNEL is enough to get directly accessible low memory, but it is
obvious that they want a DMA-accessible zone below 4G.

> This should be a part of the changelog. Please note that you should
> provide some numbers if you claim performance benefits. The complexity
> will always be subjective.

Sure, I will add them to the changelog in the next version of the patches.

Sincerely,
Huaisheng Ye

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [External]  Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-30  9:11               ` Christoph Hellwig
  0 siblings, 0 replies; 72+ messages in thread
From: Christoph Hellwig @ 2018-05-30  9:11 UTC (permalink / raw)
  To: Huaisheng HS1 Ye
  Cc: Michal Hocko, akpm, linux-mm, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, NingTing Cheng, Ocean HY1 He,
	linux-kernel, iommu, xen-devel, linux-btrfs, Christoph Hellwig

On Wed, May 30, 2018 at 09:02:13AM +0000, Huaisheng HS1 Ye wrote:
> 
> I don't quite understand that. I think those users, mostly drivers, need
> to get the zone they actually want. ZONE_DMA32 is an example: if drivers
> could be satisfied with any low mem zone, why would they mark the gfp flags
> as 'GFP_KERNEL|__GFP_DMA32'?

Drivers should never use GFP_DMA32 directly.  The right abstraction is
the DMA API; ZONE_DMA32 is just a helper.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [External]  Re: [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD
@ 2018-05-30  9:12               ` Michal Hocko
  0 siblings, 0 replies; 72+ messages in thread
From: Michal Hocko @ 2018-05-30  9:12 UTC (permalink / raw)
  To: Huaisheng HS1 Ye
  Cc: akpm, linux-mm, willy, vbabka, mgorman, kstewart,
	alexander.levin, gregkh, colyli, NingTing Cheng, Ocean HY1 He,
	linux-kernel, iommu, xen-devel, linux-btrfs, Christoph Hellwig

On Wed 30-05-18 09:02:13, Huaisheng HS1 Ye wrote:
> From: owner-linux-mm@kvack.org [mailto:owner-linux-mm@kvack.org] On Behalf Of Michal Hocko
> Sent: Monday, May 28, 2018 9:38 PM
> > > In my opinion, there shouldn't have been so many wrong combinations of
> > > these bottom 3 bits in the first place. Any user, whether a driver or a
> > > filesystem, should decide which zone it prefers. Matthew's idea is
> > > great, because with it the user must supply an unambiguous flag for the
> > > gfp zone bits.
> > 
> > Well, I would argue that those shouldn't really care about any zones at
> > all. All they should care about is whether they really need a low mem
> > zone (aka directly accessible to the kernel), highmem, or whether the
> > allocation is generally movable. Mixing zones into the picture just
> > makes the whole thing more complicated and error prone.
> 
> Dear Michal,
> 
> I don't quite understand that. I think those users, mostly drivers, need
> to get the zone they actually want. ZONE_DMA32 is an example: if drivers
> could be satisfied with any low mem zone, why would they mark the gfp flags
> as 'GFP_KERNEL|__GFP_DMA32'?
> GFP_KERNEL is enough to get directly accessible low memory, but it is
> obvious that they want a DMA-accessible zone below 4G.

They want a specific pfn range. Not a _zone_. A zone is an MM abstraction
to manage memory. And not a great one, as time has shown. We have
moved away from per-zone reclaim because it just turned out to be
problematic. Leaking this abstraction to users was a mistake IMHO. It
was surely convenient but we can clearly see it was just confusing and
many users just got it wrong.

I do agree with Christoph in the other email that the proper way for DMA
users is to use the existing DMA API, which is closer to what they
need: setting a restriction on DMA-able memory ranges.
-- 
Michal Hocko
SUSE Labs
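
The "pfn range, not a zone" point can be illustrated with a small userspace
model of an addressing restriction. This is a sketch under assumptions, not
kernel code: sketch_dma_bit_mask() mirrors the idea of an n-bit address
mask, and sketch_range_addressable() is a hypothetical helper that checks
whether a physical range fits under that mask. Real drivers would state
their limit through the DMA API rather than compute anything like this
themselves.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n address bits set (all bits for n >= 64). */
static inline uint64_t sketch_dma_bit_mask(unsigned int n)
{
	return n >= 64 ? UINT64_MAX : (((uint64_t)1 << n) - 1);
}

/* True if every byte of [start, start + size) is reachable under mask. */
static inline bool sketch_range_addressable(uint64_t start, uint64_t size,
					     uint64_t mask)
{
	uint64_t last;

	if (size == 0)
		return false;	/* an empty range is treated as invalid */
	last = start + size - 1;
	if (last < start)
		return false;	/* start + size wrapped around */
	return last <= mask;
}
```

With a 28-bit limit, for example, a page ending at 0x0FFFFFFF passes the
check while a page starting at 0x10000000 does not, even though both lie
far above the low 16MB. The restriction is a property of the address range
the device can reach, which need not line up with any zone boundary.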

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread, other threads:[~2018-05-30  9:12 UTC | newest]

Thread overview: 72+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-21 15:20 [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Huaisheng Ye
2018-05-21 15:20 ` [RFC PATCH v2 01/12] include/linux/gfp.h: " Huaisheng Ye
2018-05-21 15:20 ` Huaisheng Ye
2018-05-21 15:20 ` [RFC PATCH v2 02/12] arch/x86/kernel/amd_gart_64: update usage of address zone modifiers Huaisheng Ye
2018-05-22  9:38   ` Christoph Hellwig
2018-05-22  9:38   ` Christoph Hellwig
2018-05-22  9:38     ` Christoph Hellwig
2018-05-22 10:17     ` [External] " Huaisheng HS1 Ye
2018-05-22 10:17     ` Huaisheng HS1 Ye
2018-05-22 10:17       ` Huaisheng HS1 Ye
2018-05-21 15:20 ` Huaisheng Ye
2018-05-21 15:20 ` [RFC PATCH v2 03/12] arch/x86/kernel/pci-calgary_64: " Huaisheng Ye
2018-05-21 15:20 ` Huaisheng Ye
2018-05-21 15:20 ` [RFC PATCH v2 04/12] drivers/iommu/amd_iommu: " Huaisheng Ye
2018-05-21 15:20 ` Huaisheng Ye
2018-05-21 15:20 ` [RFC PATCH v2 05/12] include/linux/dma-mapping: " Huaisheng Ye
2018-05-21 15:30   ` Christoph Hellwig
2018-05-21 15:30   ` Christoph Hellwig
2018-05-21 15:30     ` Christoph Hellwig
2018-05-21 15:20 ` Huaisheng Ye
2018-05-21 15:20 ` [RFC PATCH v2 10/12] mm/zsmalloc: " Huaisheng Ye
2018-05-22 11:22   ` Matthew Wilcox
2018-05-22 11:22     ` Matthew Wilcox
2018-05-22 11:51     ` [External] " Huaisheng HS1 Ye
2018-05-22 11:51     ` Huaisheng HS1 Ye
2018-05-22 11:51       ` Huaisheng HS1 Ye
2018-05-22 11:22   ` Matthew Wilcox
2018-05-21 15:20 ` Huaisheng Ye
2018-05-21 15:20 ` [RFC PATCH v2 11/12] include/linux/highmem: update usage of movableflags Huaisheng Ye
2018-05-21 15:20 ` Huaisheng Ye
2018-05-21 15:20 ` [RFC PATCH v2 12/12] arch/x86/include/asm/page.h: " Huaisheng Ye
2018-05-21 15:20 ` Huaisheng Ye
2018-05-22  9:40 ` [RFC PATCH v2 00/12] get rid of GFP_ZONE_TABLE/BAD Christoph Hellwig
2018-05-22  9:40   ` Christoph Hellwig
2018-05-22  9:40 ` Christoph Hellwig
2018-05-22 18:37 ` Michal Hocko
2018-05-23 16:07   ` [External] " Huaisheng HS1 Ye
2018-05-23 16:07     ` Huaisheng HS1 Ye
2018-05-24 12:18     ` Michal Hocko
2018-05-24 12:18       ` Michal Hocko
2018-05-25  9:43       ` Huaisheng HS1 Ye
2018-05-25  9:43         ` Huaisheng HS1 Ye
2018-05-28 13:37         ` Michal Hocko
2018-05-28 13:37           ` Michal Hocko
2018-05-30  9:02           ` Huaisheng HS1 Ye
2018-05-30  9:02             ` Huaisheng HS1 Ye
2018-05-30  9:11             ` Christoph Hellwig
2018-05-30  9:11               ` Christoph Hellwig
2018-05-30  9:11             ` Christoph Hellwig
2018-05-30  9:12             ` Michal Hocko
2018-05-30  9:12               ` Michal Hocko
2018-05-30  9:12             ` Michal Hocko
2018-05-30  9:02           ` Huaisheng HS1 Ye
2018-05-28 13:37         ` Michal Hocko
2018-05-25  9:43       ` Huaisheng HS1 Ye
2018-05-24 12:18     ` Michal Hocko
2018-05-23 16:07   ` Huaisheng HS1 Ye
2018-05-24  5:19   ` Matthew Wilcox
2018-05-24  5:19     ` Matthew Wilcox
2018-05-24 12:23     ` Michal Hocko
2018-05-24 12:23       ` Michal Hocko
2018-05-24 15:18       ` Matthew Wilcox
2018-05-24 15:18       ` Matthew Wilcox
2018-05-24 15:29         ` Michal Hocko
2018-05-24 15:29         ` Michal Hocko
2018-05-25 12:00           ` Matthew Wilcox
2018-05-25 12:00           ` Matthew Wilcox
2018-05-28 13:33             ` Michal Hocko
2018-05-28 13:33               ` Michal Hocko
2018-05-24 12:23     ` Michal Hocko
2018-05-24  5:19   ` Matthew Wilcox
2018-05-22 18:37 ` Michal Hocko
