linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm,hugetlb: compute page_size_log properly
@ 2017-03-08 17:06 Davidlohr Bueso
  2017-03-08 19:39 ` Andi Kleen
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Davidlohr Bueso @ 2017-03-08 17:06 UTC (permalink / raw)
  To: akpm
  Cc: mhocko, ak, mtk.manpages, dave, linux-mm, linux-kernel, Davidlohr Bueso

The SHM_HUGE_* stuff  was introduced in:

   42d7395feb5 (mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB)

It unnecessarily adds another layer, specific to sysv shm, without
anything special about it: the macros are identical to the MAP_HUGE_*
stuff, which in turn does correctly describe the hugepage subsystem.

One example of the problems with extra layers what this patch fixes:
mmap_pgoff() should never be using SHM_HUGE_* logic. This was
introduced by:

   091d0d55b28 (shm: fix null pointer deref when userspace specifies invalid hugepage size)

It is obviously harmless but lets just rip out the whole thing --
the shmget.2 manpage will need updating, as it should not be
describing kernel internals.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/shm.h                    | 13 -------------
 ipc/shm.c                              |  6 +++---
 mm/mmap.c                              |  2 +-
 tools/testing/selftests/vm/thuge-gen.c |  8 +-------
 4 files changed, 5 insertions(+), 24 deletions(-)

diff --git a/include/linux/shm.h b/include/linux/shm.h
index 429c1995d756..98fc25f9db8a 100644
--- a/include/linux/shm.h
+++ b/include/linux/shm.h
@@ -31,19 +31,6 @@ struct shmid_kernel /* private to the kernel */
 
 /* Bits [26:31] are reserved */
 
-/*
- * When SHM_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
- * This gives us 6 bits, which is enough until someone invents 128 bit address
- * spaces.
- *
- * Assume these are all power of twos.
- * When 0 use the default page size.
- */
-#define SHM_HUGE_SHIFT  26
-#define SHM_HUGE_MASK   0x3f
-#define SHM_HUGE_2MB    (21 << SHM_HUGE_SHIFT)
-#define SHM_HUGE_1GB    (30 << SHM_HUGE_SHIFT)
-
 #ifdef CONFIG_SYSVIPC
 long do_shmat(int shmid, char __user *shmaddr, int shmflg, unsigned long *addr,
 	      unsigned long shmlba);
diff --git a/ipc/shm.c b/ipc/shm.c
index 7e199fa1960f..f21a2338ee39 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -491,8 +491,8 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
 
 	sprintf (name, "SYSV%08x", key);
 	if (shmflg & SHM_HUGETLB) {
-		struct hstate *hs = hstate_sizelog((shmflg >> SHM_HUGE_SHIFT)
-						& SHM_HUGE_MASK);
+		struct hstate *hs = hstate_sizelog((shmflg >> MAP_HUGE_SHIFT)
+						   & MAP_HUGE_MASK);
 		size_t hugesize;
 
 		if (!hs) {
@@ -506,7 +506,7 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
 			acctflag = VM_NORESERVE;
 		file = hugetlb_file_setup(name, hugesize, acctflag,
 				  &shp->mlock_user, HUGETLB_SHMFS_INODE,
-				(shmflg >> SHM_HUGE_SHIFT) & SHM_HUGE_MASK);
+				(shmflg >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
 	} else {
 		/*
 		 * Do not allow no accounting for OVERCOMMIT_NEVER, even
diff --git a/mm/mmap.c b/mm/mmap.c
index 0718c175db8f..a1c4cefc5a38 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1369,7 +1369,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 	} else if (flags & MAP_HUGETLB) {
 		struct user_struct *user = NULL;
 		struct hstate *hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) &
-						   SHM_HUGE_MASK);
+						   MAP_HUGE_MASK);
 
 		if (!hs)
 			return -EINVAL;
diff --git a/tools/testing/selftests/vm/thuge-gen.c b/tools/testing/selftests/vm/thuge-gen.c
index c87957295f74..4479015ec96a 100644
--- a/tools/testing/selftests/vm/thuge-gen.c
+++ b/tools/testing/selftests/vm/thuge-gen.c
@@ -32,12 +32,6 @@
 #define MAP_HUGE_MASK   0x3f
 #define MAP_HUGETLB	0x40000
 
-#define SHM_HUGETLB     04000   /* segment will use huge TLB pages */
-#define SHM_HUGE_SHIFT  26
-#define SHM_HUGE_MASK   0x3f
-#define SHM_HUGE_2MB    (21 << SHM_HUGE_SHIFT)
-#define SHM_HUGE_1GB    (30 << SHM_HUGE_SHIFT)
-
 #define NUM_PAGESIZES   5
 
 #define NUM_PAGES 4
@@ -243,7 +237,7 @@ int main(void)
 
 	for (i = 0; i < num_page_sizes; i++) {
 		unsigned long ps = page_sizes[i];
-		int arg = ilog2(ps) << SHM_HUGE_SHIFT;
+		int arg = ilog2(ps) << MAP_HUGE_SHIFT;
 		printf("Testing %luMB shmget with shift %x\n", ps >> 20, arg);
 		test_shmget(ps, SHM_HUGETLB | arg);
 	}
-- 
2.6.6

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-08 17:06 [PATCH] mm,hugetlb: compute page_size_log properly Davidlohr Bueso
@ 2017-03-08 19:39 ` Andi Kleen
  2017-03-09  3:24   ` Anshuman Khandual
  2017-03-09  8:55 ` Michal Hocko
  2017-03-28 16:53 ` Davidlohr Bueso
  2 siblings, 1 reply; 29+ messages in thread
From: Andi Kleen @ 2017-03-08 19:39 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, mhocko, mtk.manpages, linux-mm, linux-kernel, Davidlohr Bueso

> One example of the problems with extra layers what this patch fixes:
> mmap_pgoff() should never be using SHM_HUGE_* logic. This was
> introduced by:
> 
>    091d0d55b28 (shm: fix null pointer deref when userspace specifies invalid hugepage size)
> 
> It is obviously harmless but lets just rip out the whole thing --
> the shmget.2 manpage will need updating, as it should not be
> describing kernel internals.

The SHM_* defines were supposed to be exported to user space,
but somehow they didn't make it into uapi.

But something like this is useful, it's a much nicer 
interface for users than to hard code the bit position

So I would rather if you move it to uapi instead of 
removing. What the kernel uses internally doesn't
really matter.

-Andi

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-08 19:39 ` Andi Kleen
@ 2017-03-09  3:24   ` Anshuman Khandual
  0 siblings, 0 replies; 29+ messages in thread
From: Anshuman Khandual @ 2017-03-09  3:24 UTC (permalink / raw)
  To: Andi Kleen, Davidlohr Bueso
  Cc: akpm, mhocko, mtk.manpages, linux-mm, linux-kernel, Davidlohr Bueso

On 03/09/2017 01:09 AM, Andi Kleen wrote:
>> One example of the problems with extra layers what this patch fixes:
>> mmap_pgoff() should never be using SHM_HUGE_* logic. This was
>> introduced by:
>>
>>    091d0d55b28 (shm: fix null pointer deref when userspace specifies invalid hugepage size)
>>
>> It is obviously harmless but lets just rip out the whole thing --
>> the shmget.2 manpage will need updating, as it should not be
>> describing kernel internals.
> 
> The SHM_* defines were supposed to be exported to user space,
> but somehow they didn't make it into uapi.

Yeah, its not part of UAPI which it should have been. Now we
need to ilog2(page_size) and shift it before using them in
the user space. BTW, mmap() interface also would want this
encoding should we choose to use non default HugeTLB page
sizes.

> 
> But something like this is useful, it's a much nicer 
> interface for users than to hard code the bit position

Right. But as we need this both for shm and mmap() interface,
we can only have one set of values exported to the UAPI. The
other set needs to be removed IMHO. BTW, we need to add the
encoding for other arch supported HugeTLB supported sizes as
well like 16MB, 16GB etc (on POWER).
 
> 
> So I would rather if you move it to uapi instead of 
> removing. What the kernel uses internally doesn't
> really matter.

Had a sent a clean up patch last year which unfortunately I
forgot to resend though it has got ACK from Michal Hocko
and Balbir Singh.

https://lkml.org/lkml/2016/4/7/43

I had also tried to add POWER HugeTLB size encoding in the
arch specific header files. Probably its time to move all
of them to generic header.

https://lkml.org/lkml/2016/4/7/48

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-08 17:06 [PATCH] mm,hugetlb: compute page_size_log properly Davidlohr Bueso
  2017-03-08 19:39 ` Andi Kleen
@ 2017-03-09  8:55 ` Michal Hocko
  2017-03-28 16:53 ` Davidlohr Bueso
  2 siblings, 0 replies; 29+ messages in thread
From: Michal Hocko @ 2017-03-09  8:55 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, ak, mtk.manpages, linux-mm, linux-kernel, Davidlohr Bueso

On Wed 08-03-17 09:06:01, Davidlohr Bueso wrote:
> The SHM_HUGE_* stuff  was introduced in:
> 
>    42d7395feb5 (mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB)
> 
> It unnecessarily adds another layer, specific to sysv shm, without
> anything special about it: the macros are identical to the MAP_HUGE_*
> stuff, which in turn does correctly describe the hugepage subsystem.
> 
> One example of the problems with extra layers what this patch fixes:
> mmap_pgoff() should never be using SHM_HUGE_* logic. This was
> introduced by:
> 
>    091d0d55b28 (shm: fix null pointer deref when userspace specifies invalid hugepage size)
> 
> It is obviously harmless but lets just rip out the whole thing --
> the shmget.2 manpage will need updating, as it should not be
> describing kernel internals.

Yes, I agree the additional layer just adds confusion and as it turned
out it is error prone. As this has never been exported to the userspace
properly without anybody complaining I would strongly suspect it is not
really needed so just get rid of it.

> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  include/linux/shm.h                    | 13 -------------
>  ipc/shm.c                              |  6 +++---
>  mm/mmap.c                              |  2 +-
>  tools/testing/selftests/vm/thuge-gen.c |  8 +-------
>  4 files changed, 5 insertions(+), 24 deletions(-)
> 
> diff --git a/include/linux/shm.h b/include/linux/shm.h
> index 429c1995d756..98fc25f9db8a 100644
> --- a/include/linux/shm.h
> +++ b/include/linux/shm.h
> @@ -31,19 +31,6 @@ struct shmid_kernel /* private to the kernel */
>  
>  /* Bits [26:31] are reserved */
>  
> -/*
> - * When SHM_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
> - * This gives us 6 bits, which is enough until someone invents 128 bit address
> - * spaces.
> - *
> - * Assume these are all power of twos.
> - * When 0 use the default page size.
> - */
> -#define SHM_HUGE_SHIFT  26
> -#define SHM_HUGE_MASK   0x3f
> -#define SHM_HUGE_2MB    (21 << SHM_HUGE_SHIFT)
> -#define SHM_HUGE_1GB    (30 << SHM_HUGE_SHIFT)
> -
>  #ifdef CONFIG_SYSVIPC
>  long do_shmat(int shmid, char __user *shmaddr, int shmflg, unsigned long *addr,
>  	      unsigned long shmlba);
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 7e199fa1960f..f21a2338ee39 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -491,8 +491,8 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
>  
>  	sprintf (name, "SYSV%08x", key);
>  	if (shmflg & SHM_HUGETLB) {
> -		struct hstate *hs = hstate_sizelog((shmflg >> SHM_HUGE_SHIFT)
> -						& SHM_HUGE_MASK);
> +		struct hstate *hs = hstate_sizelog((shmflg >> MAP_HUGE_SHIFT)
> +						   & MAP_HUGE_MASK);
>  		size_t hugesize;
>  
>  		if (!hs) {
> @@ -506,7 +506,7 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
>  			acctflag = VM_NORESERVE;
>  		file = hugetlb_file_setup(name, hugesize, acctflag,
>  				  &shp->mlock_user, HUGETLB_SHMFS_INODE,
> -				(shmflg >> SHM_HUGE_SHIFT) & SHM_HUGE_MASK);
> +				(shmflg >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
>  	} else {
>  		/*
>  		 * Do not allow no accounting for OVERCOMMIT_NEVER, even
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 0718c175db8f..a1c4cefc5a38 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1369,7 +1369,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
>  	} else if (flags & MAP_HUGETLB) {
>  		struct user_struct *user = NULL;
>  		struct hstate *hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) &
> -						   SHM_HUGE_MASK);
> +						   MAP_HUGE_MASK);
>  
>  		if (!hs)
>  			return -EINVAL;
> diff --git a/tools/testing/selftests/vm/thuge-gen.c b/tools/testing/selftests/vm/thuge-gen.c
> index c87957295f74..4479015ec96a 100644
> --- a/tools/testing/selftests/vm/thuge-gen.c
> +++ b/tools/testing/selftests/vm/thuge-gen.c
> @@ -32,12 +32,6 @@
>  #define MAP_HUGE_MASK   0x3f
>  #define MAP_HUGETLB	0x40000
>  
> -#define SHM_HUGETLB     04000   /* segment will use huge TLB pages */
> -#define SHM_HUGE_SHIFT  26
> -#define SHM_HUGE_MASK   0x3f
> -#define SHM_HUGE_2MB    (21 << SHM_HUGE_SHIFT)
> -#define SHM_HUGE_1GB    (30 << SHM_HUGE_SHIFT)
> -
>  #define NUM_PAGESIZES   5
>  
>  #define NUM_PAGES 4
> @@ -243,7 +237,7 @@ int main(void)
>  
>  	for (i = 0; i < num_page_sizes; i++) {
>  		unsigned long ps = page_sizes[i];
> -		int arg = ilog2(ps) << SHM_HUGE_SHIFT;
> +		int arg = ilog2(ps) << MAP_HUGE_SHIFT;
>  		printf("Testing %luMB shmget with shift %x\n", ps >> 20, arg);
>  		test_shmget(ps, SHM_HUGETLB | arg);
>  	}
> -- 
> 2.6.6

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-08 17:06 [PATCH] mm,hugetlb: compute page_size_log properly Davidlohr Bueso
  2017-03-08 19:39 ` Andi Kleen
  2017-03-09  8:55 ` Michal Hocko
@ 2017-03-28 16:53 ` Davidlohr Bueso
  2017-03-28 16:55   ` Davidlohr Bueso
  2 siblings, 1 reply; 29+ messages in thread
From: Davidlohr Bueso @ 2017-03-28 16:53 UTC (permalink / raw)
  To: akpm; +Cc: mhocko, ak, mtk.manpages, linux-mm, linux-kernel, Davidlohr Bueso

Do we have any consensus here? Keeping SHM_HUGE_* is currently
winning 2-1. If there are in fact users out there computing the
value manually, then I am ok with keeping it and properly exporting
it. Michal?

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-28 16:53 ` Davidlohr Bueso
@ 2017-03-28 16:55   ` Davidlohr Bueso
  2017-03-28 17:54     ` Matthew Wilcox
  0 siblings, 1 reply; 29+ messages in thread
From: Davidlohr Bueso @ 2017-03-28 16:55 UTC (permalink / raw)
  To: akpm, mhocko, ak, mtk.manpages, linux-mm, linux-kernel,
	Davidlohr Bueso, khandual

Sorry, forgot to add Anshuman.

On Tue, 28 Mar 2017, Davidlohr Bueso wrote:

>Do we have any consensus here? Keeping SHM_HUGE_* is currently
>winning 2-1. If there are in fact users out there computing the
>value manually, then I am ok with keeping it and properly exporting
>it. Michal?
>
>Thanks,
>Davidlohr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-28 16:55   ` Davidlohr Bueso
@ 2017-03-28 17:54     ` Matthew Wilcox
  2017-03-29  8:06       ` Michal Hocko
                         ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Matthew Wilcox @ 2017-03-28 17:54 UTC (permalink / raw)
  To: akpm, mhocko, ak, mtk.manpages, linux-mm, linux-kernel,
	Davidlohr Bueso, khandual

On Tue, Mar 28, 2017 at 09:55:13AM -0700, Davidlohr Bueso wrote:
> Do we have any consensus here? Keeping SHM_HUGE_* is currently
> winning 2-1. If there are in fact users out there computing the
> value manually, then I am ok with keeping it and properly exporting
> it. Michal?

Well, let's see what it looks like to do that.  I went down the rabbit
hole trying to understand why some of the SHM_ flags had the same value
as each other until I realised some of them were internal flags, some
were flags to shmat() and others were flags to shmget().  Hopefully I
disambiguated them nicely in this patch.  I also added 8MB and 16GB sizes.
Any more architectures with a pet favourite huge/giant page size we
should add convenience defines for?

diff --git a/include/linux/shm.h b/include/linux/shm.h
index 04e881829625..cd95243efd1a 100644
--- a/include/linux/shm.h
+++ b/include/linux/shm.h
@@ -24,26 +24,13 @@ struct shmid_kernel /* private to the kernel */
 	struct list_head	shm_clist;	/* list by creator */
 };
 
-/* shm_mode upper byte flags */
-#define	SHM_DEST	01000	/* segment will be destroyed on last detach */
-#define SHM_LOCKED      02000   /* segment will not be swapped */
-#define SHM_HUGETLB     04000   /* segment will use huge TLB pages */
-#define SHM_NORESERVE   010000  /* don't check for reservations */
-
-/* Bits [26:31] are reserved */
-
 /*
- * When SHM_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
- * This gives us 6 bits, which is enough until someone invents 128 bit address
- * spaces.
- *
- * Assume these are all power of twos.
- * When 0 use the default page size.
+ * These flags are used internally; they cannot be specified by the user.
+ * They are masked off in newseg().  These values are used by IPC_CREAT
+ * and IPC_EXCL when calling shmget().
  */
-#define SHM_HUGE_SHIFT  26
-#define SHM_HUGE_MASK   0x3f
-#define SHM_HUGE_2MB    (21 << SHM_HUGE_SHIFT)
-#define SHM_HUGE_1GB    (30 << SHM_HUGE_SHIFT)
+#define	SHM_DEST	01000	/* segment will be destroyed on last detach */
+#define SHM_LOCKED      02000   /* segment will not be swapped */
 
 #ifdef CONFIG_SYSVIPC
 struct sysv_shm {
diff --git a/include/uapi/linux/shm.h b/include/uapi/linux/shm.h
index 1fbf24ea37fd..44b36cb228d7 100644
--- a/include/uapi/linux/shm.h
+++ b/include/uapi/linux/shm.h
@@ -40,15 +40,34 @@ struct shmid_ds {
 /* Include the definition of shmid64_ds and shminfo64 */
 #include <asm/shmbuf.h>
 
-/* permission flag for shmget */
+/* shmget() shmflg values. */
+/* The bottom nine bits are the same as open(2) mode flags */
 #define SHM_R		0400	/* or S_IRUGO from <linux/stat.h> */
 #define SHM_W		0200	/* or S_IWUGO from <linux/stat.h> */
+/* Bits 9 & 10 are IPC_CREAT and IPC_EXCL */
+#define SHM_HUGETLB     (1 << 11) /* segment will use huge TLB pages */
+#define SHM_NORESERVE   (1 << 12) /* don't check for reservations */
 
-/* mode for attach */
-#define	SHM_RDONLY	010000	/* read-only access */
-#define	SHM_RND		020000	/* round attach address to SHMLBA boundary */
-#define	SHM_REMAP	040000	/* take-over region on attach */
-#define	SHM_EXEC	0100000	/* execution access */
+/*
+ * When SHM_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
+ * This gives us 6 bits, which is enough until someone invents 128 bit address
+ * spaces.  These match MAP_HUGE_SHIFT and MAP_HUGE_MASK.
+ *
+ * Assume these are all powers of two.
+ * When 0 use the default page size.
+ */
+#define SHM_HUGE_SHIFT	26
+#define SHM_HUGE_MASK	0x3f
+#define SHM_HUGE_2MB	(21 << SHM_HUGE_SHIFT)
+#define SHM_HUGE_8MB	(23 << SHM_HUGE_SHIFT)
+#define SHM_HUGE_1GB	(30 << SHM_HUGE_SHIFT)
+#define SHM_HUGE_16GB	(34 << SHM_HUGE_SHIFT)
+
+/* shmat() shmflg values */
+#define	SHM_RDONLY	(1 << 12) /* read-only access */
+#define	SHM_RND		(1 << 13) /* round attach address to SHMLBA boundary */
+#define	SHM_REMAP	(1 << 14) /* take-over region on attach */
+#define	SHM_EXEC	(1 << 15) /* execution access */
 
 /* super user shmctl commands */
 #define SHM_LOCK 	11
diff --git a/mm/mmap.c b/mm/mmap.c
index 499b988b1639..40b29aca18c1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1479,7 +1479,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 		struct user_struct *user = NULL;
 		struct hstate *hs;
 
-		hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & SHM_HUGE_MASK);
+		hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
 		if (!hs)
 			return -EINVAL;
 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-28 17:54     ` Matthew Wilcox
@ 2017-03-29  8:06       ` Michal Hocko
  2017-03-29 17:45         ` Andi Kleen
  2017-04-13  6:02       ` Aneesh Kumar K.V
  2017-07-17 22:27       ` Mike Kravetz
  2 siblings, 1 reply; 29+ messages in thread
From: Michal Hocko @ 2017-03-29  8:06 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, ak, mtk.manpages, linux-mm, linux-kernel, Davidlohr Bueso,
	khandual

On Tue 28-03-17 10:54:08, Matthew Wilcox wrote:
> On Tue, Mar 28, 2017 at 09:55:13AM -0700, Davidlohr Bueso wrote:
> > Do we have any consensus here? Keeping SHM_HUGE_* is currently
> > winning 2-1. If there are in fact users out there computing the
> > value manually, then I am ok with keeping it and properly exporting
> > it. Michal?
> 
> Well, let's see what it looks like to do that.  I went down the rabbit
> hole trying to understand why some of the SHM_ flags had the same value
> as each other until I realised some of them were internal flags, some
> were flags to shmat() and others were flags to shmget().  Hopefully I
> disambiguated them nicely in this patch.  I also added 8MB and 16GB sizes.
> Any more architectures with a pet favourite huge/giant page size we
> should add convenience defines for?

Do we actually have any users?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-29  8:06       ` Michal Hocko
@ 2017-03-29 17:45         ` Andi Kleen
  2017-03-30  6:12           ` Michal Hocko
  0 siblings, 1 reply; 29+ messages in thread
From: Andi Kleen @ 2017-03-29 17:45 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Matthew Wilcox, akpm, mtk.manpages, linux-mm, linux-kernel,
	Davidlohr Bueso, khandual

On Wed, Mar 29, 2017 at 10:06:25AM +0200, Michal Hocko wrote:
> On Tue 28-03-17 10:54:08, Matthew Wilcox wrote:
> > On Tue, Mar 28, 2017 at 09:55:13AM -0700, Davidlohr Bueso wrote:
> > > Do we have any consensus here? Keeping SHM_HUGE_* is currently
> > > winning 2-1. If there are in fact users out there computing the
> > > value manually, then I am ok with keeping it and properly exporting
> > > it. Michal?
> > 
> > Well, let's see what it looks like to do that.  I went down the rabbit
> > hole trying to understand why some of the SHM_ flags had the same value
> > as each other until I realised some of them were internal flags, some
> > were flags to shmat() and others were flags to shmget().  Hopefully I
> > disambiguated them nicely in this patch.  I also added 8MB and 16GB sizes.
> > Any more architectures with a pet favourite huge/giant page size we
> > should add convenience defines for?
> 
> Do we actually have any users?

Yes this feature is widely used.

-Andi

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-29 17:45         ` Andi Kleen
@ 2017-03-30  6:12           ` Michal Hocko
  2017-04-12 16:18             ` Davidlohr Bueso
  0 siblings, 1 reply; 29+ messages in thread
From: Michal Hocko @ 2017-03-30  6:12 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Matthew Wilcox, akpm, mtk.manpages, linux-mm, linux-kernel,
	Davidlohr Bueso, khandual

On Wed 29-03-17 10:45:14, Andi Kleen wrote:
> On Wed, Mar 29, 2017 at 10:06:25AM +0200, Michal Hocko wrote:
> > On Tue 28-03-17 10:54:08, Matthew Wilcox wrote:
> > > On Tue, Mar 28, 2017 at 09:55:13AM -0700, Davidlohr Bueso wrote:
> > > > Do we have any consensus here? Keeping SHM_HUGE_* is currently
> > > > winning 2-1. If there are in fact users out there computing the
> > > > value manually, then I am ok with keeping it and properly exporting
> > > > it. Michal?
> > > 
> > > Well, let's see what it looks like to do that.  I went down the rabbit
> > > hole trying to understand why some of the SHM_ flags had the same value
> > > as each other until I realised some of them were internal flags, some
> > > were flags to shmat() and others were flags to shmget().  Hopefully I
> > > disambiguated them nicely in this patch.  I also added 8MB and 16GB sizes.
> > > Any more architectures with a pet favourite huge/giant page size we
> > > should add convenience defines for?
> > 
> > Do we actually have any users?
> 
> Yes this feature is widely used.

Considering that none of SHM_HUGE* has been exported to the userspace
headers all the users would have to use the this flag by the value and I
am quite skeptical that application actually do that. Could you point me
to some projects that use this?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-30  6:12           ` Michal Hocko
@ 2017-04-12 16:18             ` Davidlohr Bueso
  2017-04-12 16:30               ` Andi Kleen
  2017-04-12 17:21               ` Matthew Wilcox
  0 siblings, 2 replies; 29+ messages in thread
From: Davidlohr Bueso @ 2017-04-12 16:18 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andi Kleen, Matthew Wilcox, akpm, mtk.manpages, linux-mm,
	linux-kernel, Davidlohr Bueso, khandual

On Thu, 30 Mar 2017, Michal Hocko wrote:

>On Wed 29-03-17 10:45:14, Andi Kleen wrote:
>> On Wed, Mar 29, 2017 at 10:06:25AM +0200, Michal Hocko wrote:
>> >
>> > Do we actually have any users?
>>
>> Yes this feature is widely used.
>
>Considering that none of SHM_HUGE* has been exported to the userspace
>headers all the users would have to use the this flag by the value and I
>am quite skeptical that application actually do that. Could you point me
>to some projects that use this?

Hmm Andrew, if there's not one example, could you please pick up this patch?

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-04-12 16:18             ` Davidlohr Bueso
@ 2017-04-12 16:30               ` Andi Kleen
  2017-04-12 17:21               ` Matthew Wilcox
  1 sibling, 0 replies; 29+ messages in thread
From: Andi Kleen @ 2017-04-12 16:30 UTC (permalink / raw)
  To: Michal Hocko, Matthew Wilcox, akpm, mtk.manpages, linux-mm,
	linux-kernel, Davidlohr Bueso, khandual

On Wed, Apr 12, 2017 at 09:18:29AM -0700, Davidlohr Bueso wrote:
> On Thu, 30 Mar 2017, Michal Hocko wrote:
> 
> > On Wed 29-03-17 10:45:14, Andi Kleen wrote:
> > > On Wed, Mar 29, 2017 at 10:06:25AM +0200, Michal Hocko wrote:
> > > >
> > > > Do we actually have any users?
> > > 
> > > Yes this feature is widely used.
> > 
> > Considering that none of SHM_HUGE* has been exported to the userspace
> > headers all the users would have to use the this flag by the value and I
> > am quite skeptical that application actually do that. Could you point me
> > to some projects that use this?
> 
> Hmm Andrew, if there's not one example, could you please pick up this patch?

?!? We just don't break user ABIs this way in Linux!

Just because you don't like something you cannot simply remove it.

I don't know if there are open source users, but there are known closed
source users for which this was originally implemented.

-Andi

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-04-12 16:18             ` Davidlohr Bueso
  2017-04-12 16:30               ` Andi Kleen
@ 2017-04-12 17:21               ` Matthew Wilcox
  1 sibling, 0 replies; 29+ messages in thread
From: Matthew Wilcox @ 2017-04-12 17:21 UTC (permalink / raw)
  To: Michal Hocko, Andi Kleen, akpm, mtk.manpages, linux-mm,
	linux-kernel, Davidlohr Bueso, khandual

On Wed, Apr 12, 2017 at 09:18:29AM -0700, Davidlohr Bueso wrote:
> On Thu, 30 Mar 2017, Michal Hocko wrote:
> 
> > On Wed 29-03-17 10:45:14, Andi Kleen wrote:
> > > On Wed, Mar 29, 2017 at 10:06:25AM +0200, Michal Hocko wrote:
> > > >
> > > > Do we actually have any users?
> > > 
> > > Yes this feature is widely used.
> > 
> > Considering that none of SHM_HUGE* has been exported to the userspace
> > headers all the users would have to use the this flag by the value and I
> > am quite skeptical that application actually do that. Could you point me
> > to some projects that use this?
> 
> Hmm Andrew, if there's not one example, could you please pick up this patch?

Please comment on the replacement patch I suggested.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-28 17:54     ` Matthew Wilcox
  2017-03-29  8:06       ` Michal Hocko
@ 2017-04-13  6:02       ` Aneesh Kumar K.V
  2017-04-13 12:46         ` Matthew Wilcox
  2017-07-17 22:27       ` Mike Kravetz
  2 siblings, 1 reply; 29+ messages in thread
From: Aneesh Kumar K.V @ 2017-04-13  6:02 UTC (permalink / raw)
  To: Matthew Wilcox, akpm, mhocko, ak, mtk.manpages, linux-mm,
	linux-kernel, Davidlohr Bueso, khandual

Matthew Wilcox <willy@infradead.org> writes:

> On Tue, Mar 28, 2017 at 09:55:13AM -0700, Davidlohr Bueso wrote:
>> Do we have any consensus here? Keeping SHM_HUGE_* is currently
>> winning 2-1. If there are in fact users out there computing the
>> value manually, then I am ok with keeping it and properly exporting
>> it. Michal?
>
> Well, let's see what it looks like to do that.  I went down the rabbit
> hole trying to understand why some of the SHM_ flags had the same value
> as each other until I realised some of them were internal flags, some
> were flags to shmat() and others were flags to shmget().  Hopefully I
> disambiguated them nicely in this patch.  I also added 8MB and 16GB sizes.
> Any more architectures with a pet favourite huge/giant page size we
> should add convenience defines for?
>
> diff --git a/include/linux/shm.h b/include/linux/shm.h
> index 04e881829625..cd95243efd1a 100644
> --- a/include/linux/shm.h
> +++ b/include/linux/shm.h
> @@ -24,26 +24,13 @@ struct shmid_kernel /* private to the kernel */
>  	struct list_head	shm_clist;	/* list by creator */
>  };
>  
> -/* shm_mode upper byte flags */
> -#define	SHM_DEST	01000	/* segment will be destroyed on last detach */
> -#define SHM_LOCKED      02000   /* segment will not be swapped */
> -#define SHM_HUGETLB     04000   /* segment will use huge TLB pages */
> -#define SHM_NORESERVE   010000  /* don't check for reservations */
> -
> -/* Bits [26:31] are reserved */
> -
>  /*
> - * When SHM_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
> - * This gives us 6 bits, which is enough until someone invents 128 bit address
> - * spaces.
> - *
> - * Assume these are all power of twos.
> - * When 0 use the default page size.
> + * These flags are used internally; they cannot be specified by the user.
> + * They are masked off in newseg().  These values are used by IPC_CREAT
> + * and IPC_EXCL when calling shmget().
>   */
> -#define SHM_HUGE_SHIFT  26
> -#define SHM_HUGE_MASK   0x3f
> -#define SHM_HUGE_2MB    (21 << SHM_HUGE_SHIFT)
> -#define SHM_HUGE_1GB    (30 << SHM_HUGE_SHIFT)
> +#define	SHM_DEST	01000	/* segment will be destroyed on last detach */
> +#define SHM_LOCKED      02000   /* segment will not be swapped */
>  
>  #ifdef CONFIG_SYSVIPC
>  struct sysv_shm {
> diff --git a/include/uapi/linux/shm.h b/include/uapi/linux/shm.h
> index 1fbf24ea37fd..44b36cb228d7 100644
> --- a/include/uapi/linux/shm.h
> +++ b/include/uapi/linux/shm.h
> @@ -40,15 +40,34 @@ struct shmid_ds {
>  /* Include the definition of shmid64_ds and shminfo64 */
>  #include <asm/shmbuf.h>
>  
> -/* permission flag for shmget */
> +/* shmget() shmflg values. */
> +/* The bottom nine bits are the same as open(2) mode flags */
>  #define SHM_R		0400	/* or S_IRUGO from <linux/stat.h> */
>  #define SHM_W		0200	/* or S_IWUGO from <linux/stat.h> */
> +/* Bits 9 & 10 are IPC_CREAT and IPC_EXCL */
> +#define SHM_HUGETLB     (1 << 11) /* segment will use huge TLB pages */
> +#define SHM_NORESERVE   (1 << 12) /* don't check for reservations */
>  
> -/* mode for attach */
> -#define	SHM_RDONLY	010000	/* read-only access */
> -#define	SHM_RND		020000	/* round attach address to SHMLBA boundary */
> -#define	SHM_REMAP	040000	/* take-over region on attach */
> -#define	SHM_EXEC	0100000	/* execution access */
> +/*
> + * When SHM_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
> + * This gives us 6 bits, which is enough until someone invents 128 bit address
> + * spaces.  These match MAP_HUGE_SHIFT and MAP_HUGE_MASK.
> + *
> + * Assume these are all powers of two.
> + * When 0 use the default page size.
> + */
> +#define SHM_HUGE_SHIFT	26
> +#define SHM_HUGE_MASK	0x3f
> +#define SHM_HUGE_2MB	(21 << SHM_HUGE_SHIFT)
> +#define SHM_HUGE_8MB	(23 << SHM_HUGE_SHIFT)
> +#define SHM_HUGE_1GB	(30 << SHM_HUGE_SHIFT)
> +#define SHM_HUGE_16GB	(34 << SHM_HUGE_SHIFT)


This should be in arch/uapi like MAP_HUGE_2M ? That will let arch add
#defines based on the hugepae size supported by them.

> +
> +/* shmat() shmflg values */
> +#define	SHM_RDONLY	(1 << 12) /* read-only access */
> +#define	SHM_RND		(1 << 13) /* round attach address to SHMLBA boundary */
> +#define	SHM_REMAP	(1 << 14) /* take-over region on attach */
> +#define	SHM_EXEC	(1 << 15) /* execution access */
>  

-aneesh

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-04-13  6:02       ` Aneesh Kumar K.V
@ 2017-04-13 12:46         ` Matthew Wilcox
  0 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox @ 2017-04-13 12:46 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: akpm, mhocko, ak, mtk.manpages, linux-mm, linux-kernel,
	Davidlohr Bueso, khandual

On Thu, Apr 13, 2017 at 11:32:09AM +0530, Aneesh Kumar K.V wrote:
> > +#define SHM_HUGE_SHIFT	26
> > +#define SHM_HUGE_MASK	0x3f
> > +#define SHM_HUGE_2MB	(21 << SHM_HUGE_SHIFT)
> > +#define SHM_HUGE_8MB	(23 << SHM_HUGE_SHIFT)
> > +#define SHM_HUGE_1GB	(30 << SHM_HUGE_SHIFT)
> > +#define SHM_HUGE_16GB	(34 << SHM_HUGE_SHIFT)
> 
> This should be in arch/uapi like MAP_HUGE_2M ? That will let arch add
> #defines based on the hugepae size supported by them.

Well, what do we want to happen if source code contains SHM_HUGE_2MB?
Do we want it to fail to compile on ppc, or do we want it to request 2MB
pages and get ... hmm, looks like it fails at runtime (size_to_hstate
ends up returning NULL).  So, yeah, looks like a compile-time failure
would be better.

But speaking of MAP_HUGE_, the only definitions so far are in arch/x86.
Are you going to add the ppc versions?

Also, which header file?  I'm reluctant to add a new header, but
asm/shmbuf.h doesn't seem like a great place to put it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mm,hugetlb: compute page_size_log properly
  2017-03-28 17:54     ` Matthew Wilcox
  2017-03-29  8:06       ` Michal Hocko
  2017-04-13  6:02       ` Aneesh Kumar K.V
@ 2017-07-17 22:27       ` Mike Kravetz
  2017-07-17 22:27         ` [RFC PATCH 1/3] mm:hugetlb: Define system call hugetlb size encodings in single file Mike Kravetz
                           ` (2 more replies)
  2 siblings, 3 replies; 29+ messages in thread
From: Mike Kravetz @ 2017-07-17 22:27 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Matthew Wilcox, akpm, mhocko, ak, mtk.manpages, Davidlohr Bueso,
	khandual, aneesh.kumar, aarcange, Mike Kravetz

I hate to resurrect this thread, but I would like to add hugetlb support
to memfd_create.  This is for JVM garbage collection as discussed in
this thread [1].

Adding hugetlb support to memfd_create, means that memfd_create will take
a flag something like MFD_HUGETLB.  And, if a user wants hugetlb pages
they may want a huge page size different than the system default.  So, it
make sense to use the same type of encoding used by mmap and shmget.
However, I would hate to copy/paste the same values used by mmap and shmget
and just give them different names.  So, how about something like the
following:

1) Put all the log2 encoded huge page size definitions in a common header
   file.
2) Arch specific code can use these values, or overwrite as needed.
3) All system calls using this encoding (mmap, shmget and memfd_create in
   the future) will use these common values.

I have also put the shm user space definitions in the uapi file as
previously suggested by Matthew Wilcox.  I did not (yet) move the
shm definitions to arch specific files as suggested by Aneesh Kumar.

[1] https://lkml.org/lkml/2017/7/6/564

Mike Kravetz (3):
  mm:hugetlb:  Define system call hugetlb size encodings in single file
  mm: arch: Use new hugetlb size encoding definitions
  mm: shm: Use new hugetlb size encoding definitions

 arch/alpha/include/uapi/asm/mman.h        | 14 ++++++--------
 arch/mips/include/uapi/asm/mman.h         | 14 ++++++--------
 arch/parisc/include/uapi/asm/mman.h       | 14 ++++++--------
 arch/powerpc/include/uapi/asm/mman.h      | 23 ++++++++++-------------
 arch/x86/include/uapi/asm/mman.h          | 10 ++++++++--
 arch/xtensa/include/uapi/asm/mman.h       | 14 ++++++--------
 include/linux/shm.h                       | 17 -----------------
 include/uapi/asm-generic/hugetlb_encode.h | 30 ++++++++++++++++++++++++++++++
 include/uapi/asm-generic/mman-common.h    |  6 ++++--
 include/uapi/linux/shm.h                  | 23 +++++++++++++++++++++--
 10 files changed, 97 insertions(+), 68 deletions(-)
 create mode 100644 include/uapi/asm-generic/hugetlb_encode.h

-- 
2.7.5

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [RFC PATCH 1/3] mm:hugetlb:  Define system call hugetlb size encodings in single file
  2017-07-17 22:27       ` Mike Kravetz
@ 2017-07-17 22:27         ` Mike Kravetz
  2017-07-18  0:00           ` Matthew Wilcox
  2017-07-26  9:50           ` Michal Hocko
  2017-07-17 22:28         ` [RFC PATCH 2/3] mm: arch: Use new hugetlb size encoding definitions Mike Kravetz
  2017-07-17 22:28         ` [RFC PATCH 3/3] mm: shm: " Mike Kravetz
  2 siblings, 2 replies; 29+ messages in thread
From: Mike Kravetz @ 2017-07-17 22:27 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Matthew Wilcox, akpm, mhocko, ak, mtk.manpages, Davidlohr Bueso,
	khandual, aneesh.kumar, aarcange, Mike Kravetz

If hugetlb pages are requested in mmap or shmget system calls,  a huge
page size other than default can be requested.  This is accomplished by
encoding the log2 of the huge page size in the upper bits of the flag
argument.  asm-generic and arch specific headers all define the same
values for these encodings.

Put common definitions in a single header file.  arch specific code can
still override if desired.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 include/uapi/asm-generic/hugetlb_encode.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)
 create mode 100644 include/uapi/asm-generic/hugetlb_encode.h

diff --git a/include/uapi/asm-generic/hugetlb_encode.h b/include/uapi/asm-generic/hugetlb_encode.h
new file mode 100644
index 0000000..aa09fc0
--- /dev/null
+++ b/include/uapi/asm-generic/hugetlb_encode.h
@@ -0,0 +1,30 @@
+#ifndef _ASM_GENERIC_HUGETLB_ENCODE_H_
+#define _ASM_GENERIC_HUGETLB_ENCODE_H_
+
+/*
+ * Several system calls take a flag to request "hugetlb" huge pages.
+ * Without further specification, these system calls will use the
+ * system's default huge page size.  If a system supports multiple
+ * huge page sizes, the desired huge page size can be specified in
+ * bits [26:31] of the flag arguments.  The value in these 6 bits
+ * will encode the log2 of the huge page size.
+ *
+ * The following definitions are associated with this huge page size
+ * encoding in flag arguments.  System call specific header files
+ * that use this encoding should include this file.  They can then
+ * provide definitions based on these with their own specific prefix.
+ * for example #define MAP_HUGE_SHIFT HUGETLB_FLAG_ENCODE_SHIFT.
+ */
+
+#define HUGETLB_FLAG_ENCODE_SHIFT	26
+#define HUGETLB_FLAG_ENCODE_MASK	0x3f
+
+#define HUGETLB_FLAG_ENCODE_512KB	(19 << MAP_HUGE_SHIFT
+#define HUGETLB_FLAG_ENCODE_1MB		(20 << MAP_HUGE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_2MB		(21 << MAP_HUGE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_8MB		(23 << MAP_HUGE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_16MB	(24 << MAP_HUGE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_1GB		(30 << MAP_HUGE_SHIFT)
+#define HUGETLB_FLAG_ENCODE__16GB	(34 << MAP_HUGE_SHIFT)
+
+#endif /* _ASM_GENERIC_HUGETLB_ENCODE_H_ */
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH 2/3] mm: arch: Use new hugetlb size encoding definitions
  2017-07-17 22:27       ` Mike Kravetz
  2017-07-17 22:27         ` [RFC PATCH 1/3] mm:hugetlb: Define system call hugetlb size encodings in single file Mike Kravetz
@ 2017-07-17 22:28         ` Mike Kravetz
  2017-07-26  9:52           ` Michal Hocko
  2017-07-17 22:28         ` [RFC PATCH 3/3] mm: shm: " Mike Kravetz
  2 siblings, 1 reply; 29+ messages in thread
From: Mike Kravetz @ 2017-07-17 22:28 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Matthew Wilcox, akpm, mhocko, ak, mtk.manpages, Davidlohr Bueso,
	khandual, aneesh.kumar, aarcange, Mike Kravetz

Use the common definitions from hugetlb_encode.h header file for
encoding hugetlb size definitions in mmap system call flags.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/alpha/include/uapi/asm/mman.h     | 14 ++++++--------
 arch/mips/include/uapi/asm/mman.h      | 14 ++++++--------
 arch/parisc/include/uapi/asm/mman.h    | 14 ++++++--------
 arch/powerpc/include/uapi/asm/mman.h   | 23 ++++++++++-------------
 arch/x86/include/uapi/asm/mman.h       | 10 ++++++++--
 arch/xtensa/include/uapi/asm/mman.h    | 14 ++++++--------
 include/uapi/asm-generic/mman-common.h |  6 ++++--
 7 files changed, 46 insertions(+), 49 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 02760f6..bfa5828 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -1,6 +1,8 @@
 #ifndef __ALPHA_MMAN_H__
 #define __ALPHA_MMAN_H__
 
+#include <asm-generic/hugetlb_encode.h>
+
 #define PROT_READ	0x1		/* page can be read */
 #define PROT_WRITE	0x2		/* page can be written */
 #define PROT_EXEC	0x4		/* page can be executed */
@@ -68,15 +70,11 @@
 #define MAP_FILE	0
 
 /*
- * When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
- * This gives us 6 bits, which is enough until someone invents 128 bit address
- * spaces.
- *
- * Assume these are all power of twos.
- * When 0 use the default page size.
+ * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
+ * size other than the default is desired.  See hugetlb_encode.h
  */
-#define MAP_HUGE_SHIFT	26
-#define MAP_HUGE_MASK	0x3f
+#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
+#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
 
 #define PKEY_DISABLE_ACCESS	0x1
 #define PKEY_DISABLE_WRITE	0x2
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index 655e2fb..9e55284 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -8,6 +8,8 @@
 #ifndef _ASM_MMAN_H
 #define _ASM_MMAN_H
 
+#include <asm-generic/hugetlb_encode.h>
+
 /*
  * Protections are chosen from these bits, OR'd together.  The
  * implementation does not necessarily support PROT_EXEC or PROT_WRITE
@@ -95,15 +97,11 @@
 #define MAP_FILE	0
 
 /*
- * When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
- * This gives us 6 bits, which is enough until someone invents 128 bit address
- * spaces.
- *
- * Assume these are all power of twos.
- * When 0 use the default page size.
+ * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
+ * size other than the default is desired.  See hugetlb_encode.h
  */
-#define MAP_HUGE_SHIFT	26
-#define MAP_HUGE_MASK	0x3f
+#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
+#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
 
 #define PKEY_DISABLE_ACCESS	0x1
 #define PKEY_DISABLE_WRITE	0x2
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index 5979745..11c6d86 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -1,6 +1,8 @@
 #ifndef __PARISC_MMAN_H__
 #define __PARISC_MMAN_H__
 
+#include <asm-generic/hugetlb_encode.h>
+
 #define PROT_READ	0x1		/* page can be read */
 #define PROT_WRITE	0x2		/* page can be written */
 #define PROT_EXEC	0x4		/* page can be executed */
@@ -65,15 +67,11 @@
 #define MAP_VARIABLE	0
 
 /*
- * When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
- * This gives us 6 bits, which is enough until someone invents 128 bit address
- * spaces.
- *
- * Assume these are all power of twos.
- * When 0 use the default page size.
+ * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
+ * size other than the default is desired.  See hugetlb_encode.h
  */
-#define MAP_HUGE_SHIFT	26
-#define MAP_HUGE_MASK	0x3f
+#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
+#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
 
 #define PKEY_DISABLE_ACCESS	0x1
 #define PKEY_DISABLE_WRITE	0x2
diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
index ab45cc2..80fd56c 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -8,6 +8,7 @@
 #define _UAPI_ASM_POWERPC_MMAN_H
 
 #include <asm-generic/mman-common.h>
+#include <asm-generic/hugetlb_encode.h>
 
 
 #define PROT_SAO	0x10		/* Strong Access Ordering */
@@ -30,19 +31,15 @@
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
 
 /*
- * When MAP_HUGETLB is set, bits [26:31] of the flags argument to mmap(2),
- * encode the log2 of the huge page size. A value of zero indicates that the
- * default huge page size should be used. To use a non-default huge page size,
- * one of these defines can be used, or the size can be encoded by hand. Note
- * that on most systems only a subset, or possibly none, of these sizes will be
- * available.
+ * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
+ * size other than the default is desired.  See hugetlb_encode.h
  */
-#define MAP_HUGE_512KB	(19 << MAP_HUGE_SHIFT)	/* 512KB HugeTLB Page */
-#define MAP_HUGE_1MB	(20 << MAP_HUGE_SHIFT)	/* 1MB   HugeTLB Page */
-#define MAP_HUGE_2MB	(21 << MAP_HUGE_SHIFT)	/* 2MB   HugeTLB Page */
-#define MAP_HUGE_8MB	(23 << MAP_HUGE_SHIFT)	/* 8MB   HugeTLB Page */
-#define MAP_HUGE_16MB	(24 << MAP_HUGE_SHIFT)	/* 16MB  HugeTLB Page */
-#define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)	/* 1GB   HugeTLB Page */
-#define MAP_HUGE_16GB	(34 << MAP_HUGE_SHIFT)	/* 16GB  HugeTLB Page */
+#define MAP_HUGE_512KB	HUGETLB_FLAG_ENCODE_512KB	/* 512KB HugeTLB Page */
+#define MAP_HUGE_1MB	HUGETLB_FLAG_ENCODE_1MB		/* 1MB   HugeTLB Page */
+#define MAP_HUGE_2MB	HUGETLB_FLAG_ENCODE_2MB		/* 2MB   HugeTLB Page */
+#define MAP_HUGE_8MB	HUGETLB_FLAG_ENCODE_8MB		/* 8MB   HugeTLB Page */
+#define MAP_HUGE_16MB	HUGETLB_FLAG_ENCODE_16MB	/* 16MB  HugeTLB Page */
+#define MAP_HUGE_1GB	HUGETLB_FLAG_ENCODE_1GB		/* 1GB   HugeTLB Page */
+#define MAP_HUGE_16GB	HUGETLB_FLAG_ENCODE__16GB	/* 16GB  HugeTLB Page */
 
 #endif /* _UAPI_ASM_POWERPC_MMAN_H */
diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h
index 39bca7f..5a67293 100644
--- a/arch/x86/include/uapi/asm/mman.h
+++ b/arch/x86/include/uapi/asm/mman.h
@@ -1,10 +1,16 @@
 #ifndef _ASM_X86_MMAN_H
 #define _ASM_X86_MMAN_H
 
+#include <asm-generic/hugetlb_encode.h>
+
 #define MAP_32BIT	0x40		/* only give out 32bit addresses */
 
-#define MAP_HUGE_2MB    (21 << MAP_HUGE_SHIFT)
-#define MAP_HUGE_1GB    (30 << MAP_HUGE_SHIFT)
+/*
+ * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
+ * size other than the default is desired.  See hugetlb_encode.h
+ */
+#define MAP_HUGE_2MB    HUGETLB_FLAG_ENCODE_2MB
+#define MAP_HUGE_1GB    HUGETLB_FLAG_ENCODE_1GB
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
 /*
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index 24365b3..5deb5e4 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -14,6 +14,8 @@
 #ifndef _XTENSA_MMAN_H
 #define _XTENSA_MMAN_H
 
+#include <asm-generic/hugetlb_encode.h>
+
 /*
  * Protections are chosen from these bits, OR'd together.  The
  * implementation does not necessarily support PROT_EXEC or PROT_WRITE
@@ -107,15 +109,11 @@
 #define MAP_FILE	0
 
 /*
- * When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
- * This gives us 6 bits, which is enough until someone invents 128 bit address
- * spaces.
- *
- * Assume these are all power of twos.
- * When 0 use the default page size.
+ * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
+ * size other than the default is desired.  See hugetlb_encode.h
  */
-#define MAP_HUGE_SHIFT	26
-#define MAP_HUGE_MASK	0x3f
+#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
+#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
 
 #define PKEY_DISABLE_ACCESS	0x1
 #define PKEY_DISABLE_WRITE	0x2
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 8c27db0..00d56b8 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -1,6 +1,8 @@
 #ifndef __ASM_GENERIC_MMAN_COMMON_H
 #define __ASM_GENERIC_MMAN_COMMON_H
 
+#include <asm-generic/hugetlb_encode.h>
+
 /*
  Author: Michael S. Tsirkin <mst@mellanox.co.il>, Mellanox Technologies Ltd.
  Based on: asm-xxx/mman.h
@@ -69,8 +71,8 @@
  * Assume these are all power of twos.
  * When 0 use the default page size.
  */
-#define MAP_HUGE_SHIFT	26
-#define MAP_HUGE_MASK	0x3f
+#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
+#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
 
 #define PKEY_DISABLE_ACCESS	0x1
 #define PKEY_DISABLE_WRITE	0x2
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions
  2017-07-17 22:27       ` Mike Kravetz
  2017-07-17 22:27         ` [RFC PATCH 1/3] mm:hugetlb: Define system call hugetlb size encodings in single file Mike Kravetz
  2017-07-17 22:28         ` [RFC PATCH 2/3] mm: arch: Use new hugetlb size encoding definitions Mike Kravetz
@ 2017-07-17 22:28         ` Mike Kravetz
  2017-07-26  9:53           ` Michal Hocko
  2 siblings, 1 reply; 29+ messages in thread
From: Mike Kravetz @ 2017-07-17 22:28 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Matthew Wilcox, akpm, mhocko, ak, mtk.manpages, Davidlohr Bueso,
	khandual, aneesh.kumar, aarcange, Mike Kravetz

Use the common definitions from hugetlb_encode.h header file for
encoding hugetlb size definitions in shmget system call flags.  In
addition, move these definitions to the from the internal to user
(uapi) header file.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 include/linux/shm.h      | 17 -----------------
 include/uapi/linux/shm.h | 23 +++++++++++++++++++++--
 2 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/include/linux/shm.h b/include/linux/shm.h
index 04e8818..d56285a 100644
--- a/include/linux/shm.h
+++ b/include/linux/shm.h
@@ -27,23 +27,6 @@ struct shmid_kernel /* private to the kernel */
 /* shm_mode upper byte flags */
 #define	SHM_DEST	01000	/* segment will be destroyed on last detach */
 #define SHM_LOCKED      02000   /* segment will not be swapped */
-#define SHM_HUGETLB     04000   /* segment will use huge TLB pages */
-#define SHM_NORESERVE   010000  /* don't check for reservations */
-
-/* Bits [26:31] are reserved */
-
-/*
- * When SHM_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
- * This gives us 6 bits, which is enough until someone invents 128 bit address
- * spaces.
- *
- * Assume these are all power of twos.
- * When 0 use the default page size.
- */
-#define SHM_HUGE_SHIFT  26
-#define SHM_HUGE_MASK   0x3f
-#define SHM_HUGE_2MB    (21 << SHM_HUGE_SHIFT)
-#define SHM_HUGE_1GB    (30 << SHM_HUGE_SHIFT)
 
 #ifdef CONFIG_SYSVIPC
 struct sysv_shm {
diff --git a/include/uapi/linux/shm.h b/include/uapi/linux/shm.h
index 1fbf24e..329bc17 100644
--- a/include/uapi/linux/shm.h
+++ b/include/uapi/linux/shm.h
@@ -3,6 +3,7 @@
 
 #include <linux/ipc.h>
 #include <linux/errno.h>
+#include <asm-generic/hugetlb_encode.h>
 #ifndef __KERNEL__
 #include <unistd.h>
 #endif
@@ -40,11 +41,29 @@ struct shmid_ds {
 /* Include the definition of shmid64_ds and shminfo64 */
 #include <asm/shmbuf.h>
 
-/* permission flag for shmget */
+/* shmget() shmflg values. */
+/* The bottom nine bits are the same as open(2) mode flags */
 #define SHM_R		0400	/* or S_IRUGO from <linux/stat.h> */
 #define SHM_W		0200	/* or S_IWUGO from <linux/stat.h> */
+/* Bits 9 & 10 are IPC_CREAT and IPC_EXCL */
+#define SHM_HUGETLB	04000	/* segment will use huge TLB pages */
+#define	SHM_NORESERVE	010000	/* don't check for reservations */
 
-/* mode for attach */
+/*
+ * Huge page size encoding when SHM_HUGETLB is specified, and a huge page
+ * size other than the default is desired.  See hugetlb_encode.h
+ */
+#define SHM_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
+#define SHM_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
+#define MAP_HUGE_512KB	HUGETLB_FLAG_ENCODE_512KB
+#define MAP_HUGE_1MB	HUGETLB_FLAG_ENCODE_1MB
+#define MAP_HUGE_2MB	HUGETLB_FLAG_ENCODE_2MB
+#define MAP_HUGE_8MB	HUGETLB_FLAG_ENCODE_8MB
+#define MAP_HUGE_16MB	HUGETLB_FLAG_ENCODE_16MB
+#define MAP_HUGE_1GB	HUGETLB_FLAG_ENCODE_1GB
+#define MAP_HUGE_16GB	HUGETLB_FLAG_ENCODE__16GB
+
+/* shmat() shmflg values */
 #define	SHM_RDONLY	010000	/* read-only access */
 #define	SHM_RND		020000	/* round attach address to SHMLBA boundary */
 #define	SHM_REMAP	040000	/* take-over region on attach */
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 1/3] mm:hugetlb:  Define system call hugetlb size encodings in single file
  2017-07-17 22:27         ` [RFC PATCH 1/3] mm:hugetlb: Define system call hugetlb size encodings in single file Mike Kravetz
@ 2017-07-18  0:00           ` Matthew Wilcox
  2017-07-26  9:50           ` Michal Hocko
  1 sibling, 0 replies; 29+ messages in thread
From: Matthew Wilcox @ 2017-07-18  0:00 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, akpm, mhocko, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On Mon, Jul 17, 2017 at 03:27:59PM -0700, Mike Kravetz wrote:
> +#define HUGETLB_FLAG_ENCODE_512KB	(19 << MAP_HUGE_SHIFT
> +#define HUGETLB_FLAG_ENCODE_1MB		(20 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE_2MB		(21 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE_8MB		(23 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE_16MB	(24 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE_1GB		(30 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE__16GB	(34 << MAP_HUGE_SHIFT)

The __ seems like a mistake?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 1/3] mm:hugetlb:  Define system call hugetlb size encodings in single file
  2017-07-17 22:27         ` [RFC PATCH 1/3] mm:hugetlb: Define system call hugetlb size encodings in single file Mike Kravetz
  2017-07-18  0:00           ` Matthew Wilcox
@ 2017-07-26  9:50           ` Michal Hocko
  1 sibling, 0 replies; 29+ messages in thread
From: Michal Hocko @ 2017-07-26  9:50 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, Matthew Wilcox, akpm, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On Mon 17-07-17 15:27:59, Mike Kravetz wrote:
> If hugetlb pages are requested in mmap or shmget system calls,  a huge
> page size other than default can be requested.  This is accomplished by
> encoding the log2 of the huge page size in the upper bits of the flag
> argument.  asm-generic and arch specific headers all define the same
> values for these encodings.
> 
> Put common definitions in a single header file.  arch specific code can
> still override if desired.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

with
s@HUGETLB_FLAG_ENCODE__16GB@HUGETLB_FLAG_ENCODE_16GB@

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/uapi/asm-generic/hugetlb_encode.h | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>  create mode 100644 include/uapi/asm-generic/hugetlb_encode.h
> 
> diff --git a/include/uapi/asm-generic/hugetlb_encode.h b/include/uapi/asm-generic/hugetlb_encode.h
> new file mode 100644
> index 0000000..aa09fc0
> --- /dev/null
> +++ b/include/uapi/asm-generic/hugetlb_encode.h
> @@ -0,0 +1,30 @@
> +#ifndef _ASM_GENERIC_HUGETLB_ENCODE_H_
> +#define _ASM_GENERIC_HUGETLB_ENCODE_H_
> +
> +/*
> + * Several system calls take a flag to request "hugetlb" huge pages.
> + * Without further specification, these system calls will use the
> + * system's default huge page size.  If a system supports multiple
> + * huge page sizes, the desired huge page size can be specified in
> + * bits [26:31] of the flag arguments.  The value in these 6 bits
> + * will encode the log2 of the huge page size.
> + *
> + * The following definitions are associated with this huge page size
> + * encoding in flag arguments.  System call specific header files
> + * that use this encoding should include this file.  They can then
> + * provide definitions based on these with their own specific prefix.
> + * for example #define MAP_HUGE_SHIFT HUGETLB_FLAG_ENCODE_SHIFT.
> + */
> +
> +#define HUGETLB_FLAG_ENCODE_SHIFT	26
> +#define HUGETLB_FLAG_ENCODE_MASK	0x3f
> +
> +#define HUGETLB_FLAG_ENCODE_512KB	(19 << MAP_HUGE_SHIFT
> +#define HUGETLB_FLAG_ENCODE_1MB		(20 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE_2MB		(21 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE_8MB		(23 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE_16MB	(24 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE_1GB		(30 << MAP_HUGE_SHIFT)
> +#define HUGETLB_FLAG_ENCODE__16GB	(34 << MAP_HUGE_SHIFT)
> +
> +#endif /* _ASM_GENERIC_HUGETLB_ENCODE_H_ */
> -- 
> 2.7.5
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 2/3] mm: arch: Use new hugetlb size encoding definitions
  2017-07-17 22:28         ` [RFC PATCH 2/3] mm: arch: Use new hugetlb size encoding definitions Mike Kravetz
@ 2017-07-26  9:52           ` Michal Hocko
  0 siblings, 0 replies; 29+ messages in thread
From: Michal Hocko @ 2017-07-26  9:52 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, Matthew Wilcox, akpm, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On Mon 17-07-17 15:28:00, Mike Kravetz wrote:
> Use the common definitions from hugetlb_encode.h header file for
> encoding hugetlb size definitions in mmap system call flags.
> 
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

with s@HUGETLB_FLAG_ENCODE__16GB@HUGETLB_FLAG_ENCODE_16GB@

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  arch/alpha/include/uapi/asm/mman.h     | 14 ++++++--------
>  arch/mips/include/uapi/asm/mman.h      | 14 ++++++--------
>  arch/parisc/include/uapi/asm/mman.h    | 14 ++++++--------
>  arch/powerpc/include/uapi/asm/mman.h   | 23 ++++++++++-------------
>  arch/x86/include/uapi/asm/mman.h       | 10 ++++++++--
>  arch/xtensa/include/uapi/asm/mman.h    | 14 ++++++--------
>  include/uapi/asm-generic/mman-common.h |  6 ++++--
>  7 files changed, 46 insertions(+), 49 deletions(-)
> 
> diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
> index 02760f6..bfa5828 100644
> --- a/arch/alpha/include/uapi/asm/mman.h
> +++ b/arch/alpha/include/uapi/asm/mman.h
> @@ -1,6 +1,8 @@
>  #ifndef __ALPHA_MMAN_H__
>  #define __ALPHA_MMAN_H__
>  
> +#include <asm-generic/hugetlb_encode.h>
> +
>  #define PROT_READ	0x1		/* page can be read */
>  #define PROT_WRITE	0x2		/* page can be written */
>  #define PROT_EXEC	0x4		/* page can be executed */
> @@ -68,15 +70,11 @@
>  #define MAP_FILE	0
>  
>  /*
> - * When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
> - * This gives us 6 bits, which is enough until someone invents 128 bit address
> - * spaces.
> - *
> - * Assume these are all power of twos.
> - * When 0 use the default page size.
> + * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
> + * size other than the default is desired.  See hugetlb_encode.h
>   */
> -#define MAP_HUGE_SHIFT	26
> -#define MAP_HUGE_MASK	0x3f
> +#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
> +#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
>  
>  #define PKEY_DISABLE_ACCESS	0x1
>  #define PKEY_DISABLE_WRITE	0x2
> diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
> index 655e2fb..9e55284 100644
> --- a/arch/mips/include/uapi/asm/mman.h
> +++ b/arch/mips/include/uapi/asm/mman.h
> @@ -8,6 +8,8 @@
>  #ifndef _ASM_MMAN_H
>  #define _ASM_MMAN_H
>  
> +#include <asm-generic/hugetlb_encode.h>
> +
>  /*
>   * Protections are chosen from these bits, OR'd together.  The
>   * implementation does not necessarily support PROT_EXEC or PROT_WRITE
> @@ -95,15 +97,11 @@
>  #define MAP_FILE	0
>  
>  /*
> - * When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
> - * This gives us 6 bits, which is enough until someone invents 128 bit address
> - * spaces.
> - *
> - * Assume these are all power of twos.
> - * When 0 use the default page size.
> + * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
> + * size other than the default is desired.  See hugetlb_encode.h
>   */
> -#define MAP_HUGE_SHIFT	26
> -#define MAP_HUGE_MASK	0x3f
> +#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
> +#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
>  
>  #define PKEY_DISABLE_ACCESS	0x1
>  #define PKEY_DISABLE_WRITE	0x2
> diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
> index 5979745..11c6d86 100644
> --- a/arch/parisc/include/uapi/asm/mman.h
> +++ b/arch/parisc/include/uapi/asm/mman.h
> @@ -1,6 +1,8 @@
>  #ifndef __PARISC_MMAN_H__
>  #define __PARISC_MMAN_H__
>  
> +#include <asm-generic/hugetlb_encode.h>
> +
>  #define PROT_READ	0x1		/* page can be read */
>  #define PROT_WRITE	0x2		/* page can be written */
>  #define PROT_EXEC	0x4		/* page can be executed */
> @@ -65,15 +67,11 @@
>  #define MAP_VARIABLE	0
>  
>  /*
> - * When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
> - * This gives us 6 bits, which is enough until someone invents 128 bit address
> - * spaces.
> - *
> - * Assume these are all power of twos.
> - * When 0 use the default page size.
> + * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
> + * size other than the default is desired.  See hugetlb_encode.h
>   */
> -#define MAP_HUGE_SHIFT	26
> -#define MAP_HUGE_MASK	0x3f
> +#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
> +#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
>  
>  #define PKEY_DISABLE_ACCESS	0x1
>  #define PKEY_DISABLE_WRITE	0x2
> diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
> index ab45cc2..80fd56c 100644
> --- a/arch/powerpc/include/uapi/asm/mman.h
> +++ b/arch/powerpc/include/uapi/asm/mman.h
> @@ -8,6 +8,7 @@
>  #define _UAPI_ASM_POWERPC_MMAN_H
>  
>  #include <asm-generic/mman-common.h>
> +#include <asm-generic/hugetlb_encode.h>
>  
>  
>  #define PROT_SAO	0x10		/* Strong Access Ordering */
> @@ -30,19 +31,15 @@
>  #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
>  
>  /*
> - * When MAP_HUGETLB is set, bits [26:31] of the flags argument to mmap(2),
> - * encode the log2 of the huge page size. A value of zero indicates that the
> - * default huge page size should be used. To use a non-default huge page size,
> - * one of these defines can be used, or the size can be encoded by hand. Note
> - * that on most systems only a subset, or possibly none, of these sizes will be
> - * available.
> + * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
> + * size other than the default is desired.  See hugetlb_encode.h
>   */
> -#define MAP_HUGE_512KB	(19 << MAP_HUGE_SHIFT)	/* 512KB HugeTLB Page */
> -#define MAP_HUGE_1MB	(20 << MAP_HUGE_SHIFT)	/* 1MB   HugeTLB Page */
> -#define MAP_HUGE_2MB	(21 << MAP_HUGE_SHIFT)	/* 2MB   HugeTLB Page */
> -#define MAP_HUGE_8MB	(23 << MAP_HUGE_SHIFT)	/* 8MB   HugeTLB Page */
> -#define MAP_HUGE_16MB	(24 << MAP_HUGE_SHIFT)	/* 16MB  HugeTLB Page */
> -#define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)	/* 1GB   HugeTLB Page */
> -#define MAP_HUGE_16GB	(34 << MAP_HUGE_SHIFT)	/* 16GB  HugeTLB Page */
> +#define MAP_HUGE_512KB	HUGETLB_FLAG_ENCODE_512KB	/* 512KB HugeTLB Page */
> +#define MAP_HUGE_1MB	HUGETLB_FLAG_ENCODE_1MB		/* 1MB   HugeTLB Page */
> +#define MAP_HUGE_2MB	HUGETLB_FLAG_ENCODE_2MB		/* 2MB   HugeTLB Page */
> +#define MAP_HUGE_8MB	HUGETLB_FLAG_ENCODE_8MB		/* 8MB   HugeTLB Page */
> +#define MAP_HUGE_16MB	HUGETLB_FLAG_ENCODE_16MB	/* 16MB  HugeTLB Page */
> +#define MAP_HUGE_1GB	HUGETLB_FLAG_ENCODE_1GB		/* 1GB   HugeTLB Page */
> +#define MAP_HUGE_16GB	HUGETLB_FLAG_ENCODE__16GB	/* 16GB  HugeTLB Page */
>  
>  #endif /* _UAPI_ASM_POWERPC_MMAN_H */
> diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h
> index 39bca7f..5a67293 100644
> --- a/arch/x86/include/uapi/asm/mman.h
> +++ b/arch/x86/include/uapi/asm/mman.h
> @@ -1,10 +1,16 @@
>  #ifndef _ASM_X86_MMAN_H
>  #define _ASM_X86_MMAN_H
>  
> +#include <asm-generic/hugetlb_encode.h>
> +
>  #define MAP_32BIT	0x40		/* only give out 32bit addresses */
>  
> -#define MAP_HUGE_2MB    (21 << MAP_HUGE_SHIFT)
> -#define MAP_HUGE_1GB    (30 << MAP_HUGE_SHIFT)
> +/*
> + * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
> + * size other than the default is desired.  See hugetlb_encode.h
> + */
> +#define MAP_HUGE_2MB    HUGETLB_FLAG_ENCODE_2MB
> +#define MAP_HUGE_1GB    HUGETLB_FLAG_ENCODE_1GB
>  
>  #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
>  /*
> diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
> index 24365b3..5deb5e4 100644
> --- a/arch/xtensa/include/uapi/asm/mman.h
> +++ b/arch/xtensa/include/uapi/asm/mman.h
> @@ -14,6 +14,8 @@
>  #ifndef _XTENSA_MMAN_H
>  #define _XTENSA_MMAN_H
>  
> +#include <asm-generic/hugetlb_encode.h>
> +
>  /*
>   * Protections are chosen from these bits, OR'd together.  The
>   * implementation does not necessarily support PROT_EXEC or PROT_WRITE
> @@ -107,15 +109,11 @@
>  #define MAP_FILE	0
>  
>  /*
> - * When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
> - * This gives us 6 bits, which is enough until someone invents 128 bit address
> - * spaces.
> - *
> - * Assume these are all power of twos.
> - * When 0 use the default page size.
> + * Huge page size encoding when MAP_HUGETLB is specified, and a huge page
> + * size other than the default is desired.  See hugetlb_encode.h
>   */
> -#define MAP_HUGE_SHIFT	26
> -#define MAP_HUGE_MASK	0x3f
> +#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
> +#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
>  
>  #define PKEY_DISABLE_ACCESS	0x1
>  #define PKEY_DISABLE_WRITE	0x2
> diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
> index 8c27db0..00d56b8 100644
> --- a/include/uapi/asm-generic/mman-common.h
> +++ b/include/uapi/asm-generic/mman-common.h
> @@ -1,6 +1,8 @@
>  #ifndef __ASM_GENERIC_MMAN_COMMON_H
>  #define __ASM_GENERIC_MMAN_COMMON_H
>  
> +#include <asm-generic/hugetlb_encode.h>
> +
>  /*
>   Author: Michael S. Tsirkin <mst@mellanox.co.il>, Mellanox Technologies Ltd.
>   Based on: asm-xxx/mman.h
> @@ -69,8 +71,8 @@
>   * Assume these are all power of twos.
>   * When 0 use the default page size.
>   */
> -#define MAP_HUGE_SHIFT	26
> -#define MAP_HUGE_MASK	0x3f
> +#define MAP_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
> +#define MAP_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
>  
>  #define PKEY_DISABLE_ACCESS	0x1
>  #define PKEY_DISABLE_WRITE	0x2
> -- 
> 2.7.5
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions
  2017-07-17 22:28         ` [RFC PATCH 3/3] mm: shm: " Mike Kravetz
@ 2017-07-26  9:53           ` Michal Hocko
  2017-07-26 10:07             ` Michal Hocko
  0 siblings, 1 reply; 29+ messages in thread
From: Michal Hocko @ 2017-07-26  9:53 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, Matthew Wilcox, akpm, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On Mon 17-07-17 15:28:01, Mike Kravetz wrote:
> Use the common definitions from hugetlb_encode.h header file for
> encoding hugetlb size definitions in shmget system call flags.  In
> addition, move these definitions to the from the internal to user
> (uapi) header file.

s@to the from@from@

> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

with s@HUGETLB_FLAG_ENCODE__16GB@HUGETLB_FLAG_ENCODE_16GB@

Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  include/linux/shm.h      | 17 -----------------
>  include/uapi/linux/shm.h | 23 +++++++++++++++++++++--
>  2 files changed, 21 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/shm.h b/include/linux/shm.h
> index 04e8818..d56285a 100644
> --- a/include/linux/shm.h
> +++ b/include/linux/shm.h
> @@ -27,23 +27,6 @@ struct shmid_kernel /* private to the kernel */
>  /* shm_mode upper byte flags */
>  #define	SHM_DEST	01000	/* segment will be destroyed on last detach */
>  #define SHM_LOCKED      02000   /* segment will not be swapped */
> -#define SHM_HUGETLB     04000   /* segment will use huge TLB pages */
> -#define SHM_NORESERVE   010000  /* don't check for reservations */
> -
> -/* Bits [26:31] are reserved */
> -
> -/*
> - * When SHM_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
> - * This gives us 6 bits, which is enough until someone invents 128 bit address
> - * spaces.
> - *
> - * Assume these are all power of twos.
> - * When 0 use the default page size.
> - */
> -#define SHM_HUGE_SHIFT  26
> -#define SHM_HUGE_MASK   0x3f
> -#define SHM_HUGE_2MB    (21 << SHM_HUGE_SHIFT)
> -#define SHM_HUGE_1GB    (30 << SHM_HUGE_SHIFT)
>  
>  #ifdef CONFIG_SYSVIPC
>  struct sysv_shm {
> diff --git a/include/uapi/linux/shm.h b/include/uapi/linux/shm.h
> index 1fbf24e..329bc17 100644
> --- a/include/uapi/linux/shm.h
> +++ b/include/uapi/linux/shm.h
> @@ -3,6 +3,7 @@
>  
>  #include <linux/ipc.h>
>  #include <linux/errno.h>
> +#include <asm-generic/hugetlb_encode.h>
>  #ifndef __KERNEL__
>  #include <unistd.h>
>  #endif
> @@ -40,11 +41,29 @@ struct shmid_ds {
>  /* Include the definition of shmid64_ds and shminfo64 */
>  #include <asm/shmbuf.h>
>  
> -/* permission flag for shmget */
> +/* shmget() shmflg values. */
> +/* The bottom nine bits are the same as open(2) mode flags */
>  #define SHM_R		0400	/* or S_IRUGO from <linux/stat.h> */
>  #define SHM_W		0200	/* or S_IWUGO from <linux/stat.h> */
> +/* Bits 9 & 10 are IPC_CREAT and IPC_EXCL */
> +#define SHM_HUGETLB	04000	/* segment will use huge TLB pages */
> +#define	SHM_NORESERVE	010000	/* don't check for reservations */
>  
> -/* mode for attach */
> +/*
> + * Huge page size encoding when SHM_HUGETLB is specified, and a huge page
> + * size other than the default is desired.  See hugetlb_encode.h
> + */
> +#define SHM_HUGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
> +#define SHM_HUGE_MASK	HUGETLB_FLAG_ENCODE_MASK
> +#define MAP_HUGE_512KB	HUGETLB_FLAG_ENCODE_512KB
> +#define MAP_HUGE_1MB	HUGETLB_FLAG_ENCODE_1MB
> +#define MAP_HUGE_2MB	HUGETLB_FLAG_ENCODE_2MB
> +#define MAP_HUGE_8MB	HUGETLB_FLAG_ENCODE_8MB
> +#define MAP_HUGE_16MB	HUGETLB_FLAG_ENCODE_16MB
> +#define MAP_HUGE_1GB	HUGETLB_FLAG_ENCODE_1GB
> +#define MAP_HUGE_16GB	HUGETLB_FLAG_ENCODE__16GB
> +
> +/* shmat() shmflg values */
>  #define	SHM_RDONLY	010000	/* read-only access */
>  #define	SHM_RND		020000	/* round attach address to SHMLBA boundary */
>  #define	SHM_REMAP	040000	/* take-over region on attach */
> -- 
> 2.7.5
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions
  2017-07-26  9:53           ` Michal Hocko
@ 2017-07-26 10:07             ` Michal Hocko
  2017-07-26 17:39               ` Mike Kravetz
  0 siblings, 1 reply; 29+ messages in thread
From: Michal Hocko @ 2017-07-26 10:07 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, Matthew Wilcox, akpm, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On Wed 26-07-17 11:53:38, Michal Hocko wrote:
> On Mon 17-07-17 15:28:01, Mike Kravetz wrote:
> > Use the common definitions from hugetlb_encode.h header file for
> > encoding hugetlb size definitions in shmget system call flags.  In
> > addition, move these definitions to the from the internal to user
> > (uapi) header file.
> 
> s@to the from@from@
> 
> > 
> > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> 
> with s@HUGETLB_FLAG_ENCODE__16GB@HUGETLB_FLAG_ENCODE_16GB@
> 
> Acked-by: Michal Hocko <mhocko@suse.com>

Btw. man page mentions only 2MB and 1GB, we should document others and
note that each arch might support only subset of them

> > +#define MAP_HUGE_512KB	HUGETLB_FLAG_ENCODE_512KB
> > +#define MAP_HUGE_1MB	HUGETLB_FLAG_ENCODE_1MB
> > +#define MAP_HUGE_2MB	HUGETLB_FLAG_ENCODE_2MB
> > +#define MAP_HUGE_8MB	HUGETLB_FLAG_ENCODE_8MB
> > +#define MAP_HUGE_16MB	HUGETLB_FLAG_ENCODE_16MB
> > +#define MAP_HUGE_1GB	HUGETLB_FLAG_ENCODE_1GB
> > +#define MAP_HUGE_16GB	HUGETLB_FLAG_ENCODE__16GB
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions
  2017-07-26 10:07             ` Michal Hocko
@ 2017-07-26 17:39               ` Mike Kravetz
  2017-07-26 18:48                 ` Matthew Wilcox
  2017-07-27  7:50                 ` Michal Hocko
  0 siblings, 2 replies; 29+ messages in thread
From: Mike Kravetz @ 2017-07-26 17:39 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, linux-kernel, Matthew Wilcox, akpm, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On 07/26/2017 03:07 AM, Michal Hocko wrote:
> On Wed 26-07-17 11:53:38, Michal Hocko wrote:
>> On Mon 17-07-17 15:28:01, Mike Kravetz wrote:
>>> Use the common definitions from hugetlb_encode.h header file for
>>> encoding hugetlb size definitions in shmget system call flags.  In
>>> addition, move these definitions to the from the internal to user
>>> (uapi) header file.
>>
>> s@to the from@from@
>>
>>>
>>> Suggested-by: Matthew Wilcox <willy@infradead.org>
>>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>>
>> with s@HUGETLB_FLAG_ENCODE__16GB@HUGETLB_FLAG_ENCODE_16GB@
>>
>> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> Btw. man page mentions only 2MB and 1GB, we should document others and
> note that each arch might support only subset of them

Thanks for looking at these Michal.
BTW, those definitions below are wrong.  They should be SHM_HUGE_*. :(

In the overview of this RFC, I mentioned still needing to address the
comment from Aneesh about splitting SHM_HUGE_* definitions into arch
specific header files.  This is how it is done for mmap.  If an arch
supports multiple huge page sizes, the 'asm/mman.h' contains definitions
for those sizes.  There will be a bit of churn (such as header file
renaming) to do this for shm as well.  So, I keep going back and forth
asking myself 'is it worth it'?  Some things to consider.

- We should be consistent between mmap and shm.  Also remember, that I
  will propose adding the same type of encoding to memfd_create.  So,
  three system calls will use the encoding.  They should be consistent.
- Adding the arch specific definitions seems the 'most correct', as a
  user can not use a definition not supported by the arch.  However,
  even if an arch supports a huge page size it does not mean that the
  running kernel supports that size.  Therefore, the folllowing is in
  the man page.
  "The  range  of  huge page sizes that are supported by the system
   can be discovered by listing  the  subdirectories  in
   /sys/kernel/mm/hugepages."
- Another alternative is to make all known huge page sizes available
  to all users.  This is 'easier' as the definitions can likely reside
  in a common header file.  The user will  need to determine what
  huge page sizes are supported by the running kernel as mentioned in
  the man page.

Any thoughts/suggestions on these alternatives?  I'll send out another
patch set based on comments.  In any case, I think mmap and shm need to
be the same.
-- 
Mike Kravetz

>>> +#define MAP_HUGE_512KB	HUGETLB_FLAG_ENCODE_512KB
>>> +#define MAP_HUGE_1MB	HUGETLB_FLAG_ENCODE_1MB
>>> +#define MAP_HUGE_2MB	HUGETLB_FLAG_ENCODE_2MB
>>> +#define MAP_HUGE_8MB	HUGETLB_FLAG_ENCODE_8MB
>>> +#define MAP_HUGE_16MB	HUGETLB_FLAG_ENCODE_16MB
>>> +#define MAP_HUGE_1GB	HUGETLB_FLAG_ENCODE_1GB
>>> +#define MAP_HUGE_16GB	HUGETLB_FLAG_ENCODE__16GB

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions
  2017-07-26 17:39               ` Mike Kravetz
@ 2017-07-26 18:48                 ` Matthew Wilcox
  2017-07-27  7:50                 ` Michal Hocko
  1 sibling, 0 replies; 29+ messages in thread
From: Matthew Wilcox @ 2017-07-26 18:48 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Michal Hocko, linux-mm, linux-kernel, akpm, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On Wed, Jul 26, 2017 at 10:39:30AM -0700, Mike Kravetz wrote:
> In the overview of this RFC, I mentioned still needing to address the
> comment from Aneesh about splitting SHM_HUGE_* definitions into arch
> specific header files.  This is how it is done for mmap.  If an arch
> supports multiple huge page sizes, the 'asm/mman.h' contains definitions
> for those sizes.  There will be a bit of churn (such as header file
> renaming) to do this for shm as well.  So, I keep going back and forth
> asking myself 'is it worth it'?  Some things to consider.
> 
> - We should be consistent between mmap and shm.  Also remember, that I
>   will propose adding the same type of encoding to memfd_create.  So,
>   three system calls will use the encoding.  They should be consistent.

I think mmap is wrong here.  User programs are generally not architecture
specific, so they'll have to test with ifdefs or something awful.
For all we know, POWER 14 and whatever x86 CPU comes out in 2030 will
support (nearly) arbitrary page sizes like Itanium does, and a user
program compiled today should be able to take advantage of it.

> - Adding the arch specific definitions seems the 'most correct', as a
>   user can not use a definition not supported by the arch.  However,
>   even if an arch supports a huge page size it does not mean that the
>   running kernel supports that size.  Therefore, the folllowing is in
>   the man page.
>   "The  range  of  huge page sizes that are supported by the system
>    can be discovered by listing  the  subdirectories  in
>    /sys/kernel/mm/hugepages."
> - Another alternative is to make all known huge page sizes available
>   to all users.  This is 'easier' as the definitions can likely reside
>   in a common header file.  The user will  need to determine what
>   huge page sizes are supported by the running kernel as mentioned in
>   the man page.

That's my preference.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions
  2017-07-26 17:39               ` Mike Kravetz
  2017-07-26 18:48                 ` Matthew Wilcox
@ 2017-07-27  7:50                 ` Michal Hocko
  2017-07-27 21:18                   ` Mike Kravetz
  1 sibling, 1 reply; 29+ messages in thread
From: Michal Hocko @ 2017-07-27  7:50 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, Matthew Wilcox, akpm, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On Wed 26-07-17 10:39:30, Mike Kravetz wrote:
> On 07/26/2017 03:07 AM, Michal Hocko wrote:
> > On Wed 26-07-17 11:53:38, Michal Hocko wrote:
> >> On Mon 17-07-17 15:28:01, Mike Kravetz wrote:
> >>> Use the common definitions from hugetlb_encode.h header file for
> >>> encoding hugetlb size definitions in shmget system call flags.  In
> >>> addition, move these definitions to the from the internal to user
> >>> (uapi) header file.
> >>
> >> s@to the from@from@
> >>
> >>>
> >>> Suggested-by: Matthew Wilcox <willy@infradead.org>
> >>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> >>
> >> with s@HUGETLB_FLAG_ENCODE__16GB@HUGETLB_FLAG_ENCODE_16GB@
> >>
> >> Acked-by: Michal Hocko <mhocko@suse.com>
> > 
> > Btw. man page mentions only 2MB and 1GB, we should document others and
> > note that each arch might support only subset of them
> 
> Thanks for looking at these Michal.
> BTW, those definitions below are wrong.  They should be SHM_HUGE_*. :(

Ups, and I completely missed that.

> In the overview of this RFC, I mentioned still needing to address the
> comment from Aneesh about splitting SHM_HUGE_* definitions into arch
> specific header files.  This is how it is done for mmap.  If an arch
> supports multiple huge page sizes, the 'asm/mman.h' contains definitions
> for those sizes.  There will be a bit of churn (such as header file
> renaming) to do this for shm as well.  So, I keep going back and forth
> asking myself 'is it worth it'?

Why cannot we use a generic header? Btw. I think it would be better for
MMAP definitions as well.

> Some things to consider.
> 
> - We should be consistent between mmap and shm.  Also remember, that I
>   will propose adding the same type of encoding to memfd_create.  So,
>   three system calls will use the encoding.  They should be consistent.

agreed

> - Adding the arch specific definitions seems the 'most correct', as a
>   user can not use a definition not supported by the arch.  However,
>   even if an arch supports a huge page size it does not mean that the
>   running kernel supports that size.  Therefore, the folllowing is in
>   the man page.
>   "The  range  of  huge page sizes that are supported by the system
>    can be discovered by listing  the  subdirectories  in
>    /sys/kernel/mm/hugepages."

Doesn't the respective call return -EINVAL on the unsupported hugepage
size?

> - Another alternative is to make all known huge page sizes available
>   to all users.  This is 'easier' as the definitions can likely reside
>   in a common header file.  The user will  need to determine what
>   huge page sizes are supported by the running kernel as mentioned in
>   the man page.

yes I think this makes more sense.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions
  2017-07-27  7:50                 ` Michal Hocko
@ 2017-07-27 21:18                   ` Mike Kravetz
  2017-07-28  6:30                     ` Michal Hocko
  0 siblings, 1 reply; 29+ messages in thread
From: Mike Kravetz @ 2017-07-27 21:18 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, linux-kernel, Matthew Wilcox, akpm, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On 07/27/2017 12:50 AM, Michal Hocko wrote:
> On Wed 26-07-17 10:39:30, Mike Kravetz wrote:
>> On 07/26/2017 03:07 AM, Michal Hocko wrote:
>>> On Wed 26-07-17 11:53:38, Michal Hocko wrote:
>>>> On Mon 17-07-17 15:28:01, Mike Kravetz wrote:
>>>>> Use the common definitions from hugetlb_encode.h header file for
>>>>> encoding hugetlb size definitions in shmget system call flags.  In
>>>>> addition, move these definitions to the from the internal to user
>>>>> (uapi) header file.
>>>>
>>>> s@to the from@from@
>>>>
>>>>>
>>>>> Suggested-by: Matthew Wilcox <willy@infradead.org>
>>>>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>>>>
>>>> with s@HUGETLB_FLAG_ENCODE__16GB@HUGETLB_FLAG_ENCODE_16GB@
>>>>
>>>> Acked-by: Michal Hocko <mhocko@suse.com>
>>>
>>> Btw. man page mentions only 2MB and 1GB, we should document others and
>>> note that each arch might support only subset of them
>>
>> Thanks for looking at these Michal.
>> BTW, those definitions below are wrong.  They should be SHM_HUGE_*. :(
> 
> Ups, and I completely missed that.
> 
>> In the overview of this RFC, I mentioned still needing to address the
>> comment from Aneesh about splitting SHM_HUGE_* definitions into arch
>> specific header files.  This is how it is done for mmap.  If an arch
>> supports multiple huge page sizes, the 'asm/mman.h' contains definitions
>> for those sizes.  There will be a bit of churn (such as header file
>> renaming) to do this for shm as well.  So, I keep going back and forth
>> asking myself 'is it worth it'?
> 
> Why cannot we use a generic header? Btw. I think it would be better for
> MMAP definitions as well.

I assume you are asking about a uapi asm-generic header file?  Currently
mmap has two such files:  mman.h and mman-common.h.  In order to get the
definitions in such files, arch specific header files must #include the
asm-generic headers.  There are arch specific mmap headers today that do
not include either of the asm-generic headers.  And, they have their own
definitions for MAP_HUGE_SHIFT.  So, it seems we can not use one of the
existing mmap asm-generic header files.  Rather, we would need to create
a new one and have that included by all arch specific files.

However, ALL the MAP_HUGE_* definitions in all the arch specific and
asm-generic header files are the same.  It would be possible to just put
all those MAP_HUGE_* definitions in the primary uapi header file
(include/uapi/linux/mman.h).  If there was ever a need for arch specific
values in the future, we could split them out at that time.

>> Some things to consider.
>>
>> - We should be consistent between mmap and shm.  Also remember, that I
>>   will propose adding the same type of encoding to memfd_create.  So,
>>   three system calls will use the encoding.  They should be consistent.
> 
> agreed
> 
>> - Adding the arch specific definitions seems the 'most correct', as a
>>   user can not use a definition not supported by the arch.  However,
>>   even if an arch supports a huge page size it does not mean that the
>>   running kernel supports that size.  Therefore, the folllowing is in
>>   the man page.
>>   "The  range  of  huge page sizes that are supported by the system
>>    can be discovered by listing  the  subdirectories  in
>>    /sys/kernel/mm/hugepages."
> 
> Doesn't the respective call return -EINVAL on the unsupported hugepage
> size?

Yes, it does.

>> - Another alternative is to make all known huge page sizes available
>>   to all users.  This is 'easier' as the definitions can likely reside
>>   in a common header file.  The user will  need to determine what
>>   huge page sizes are supported by the running kernel as mentioned in
>>   the man page.
> 
> yes I think this makes more sense.

Ok, thanks.

The only remaining question is what kind of common header to use:
1) An asm-generic header file in case there may be arch specific differences
   in the future.
2) Use the primary uapi header file in include/uapi/linux/mman|shm.h.

-- 
Mike Kravetz

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [RFC PATCH 3/3] mm: shm: Use new hugetlb size encoding definitions
  2017-07-27 21:18                   ` Mike Kravetz
@ 2017-07-28  6:30                     ` Michal Hocko
  0 siblings, 0 replies; 29+ messages in thread
From: Michal Hocko @ 2017-07-28  6:30 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: linux-mm, linux-kernel, Matthew Wilcox, akpm, ak, mtk.manpages,
	Davidlohr Bueso, khandual, aneesh.kumar, aarcange

On Thu 27-07-17 14:18:11, Mike Kravetz wrote:
> On 07/27/2017 12:50 AM, Michal Hocko wrote:
> > On Wed 26-07-17 10:39:30, Mike Kravetz wrote:
> >> On 07/26/2017 03:07 AM, Michal Hocko wrote:
> >>> On Wed 26-07-17 11:53:38, Michal Hocko wrote:
> >>>> On Mon 17-07-17 15:28:01, Mike Kravetz wrote:
> >>>>> Use the common definitions from hugetlb_encode.h header file for
> >>>>> encoding hugetlb size definitions in shmget system call flags.  In
> >>>>> addition, move these definitions to the from the internal to user
> >>>>> (uapi) header file.
> >>>>
> >>>> s@to the from@from@
> >>>>
> >>>>>
> >>>>> Suggested-by: Matthew Wilcox <willy@infradead.org>
> >>>>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> >>>>
> >>>> with s@HUGETLB_FLAG_ENCODE__16GB@HUGETLB_FLAG_ENCODE_16GB@
> >>>>
> >>>> Acked-by: Michal Hocko <mhocko@suse.com>
> >>>
> >>> Btw. man page mentions only 2MB and 1GB, we should document others and
> >>> note that each arch might support only subset of them
> >>
> >> Thanks for looking at these Michal.
> >> BTW, those definitions below are wrong.  They should be SHM_HUGE_*. :(
> > 
> > Ups, and I completely missed that.
> > 
> >> In the overview of this RFC, I mentioned still needing to address the
> >> comment from Aneesh about splitting SHM_HUGE_* definitions into arch
> >> specific header files.  This is how it is done for mmap.  If an arch
> >> supports multiple huge page sizes, the 'asm/mman.h' contains definitions
> >> for those sizes.  There will be a bit of churn (such as header file
> >> renaming) to do this for shm as well.  So, I keep going back and forth
> >> asking myself 'is it worth it'?
> > 
> > Why cannot we use a generic header? Btw. I think it would be better for
> > MMAP definitions as well.
> 
> I assume you are asking about a uapi asm-generic header file?  Currently
> mmap has two such files:  mman.h and mman-common.h.  In order to get the
> definitions in such files, arch specific header files must #include the
> asm-generic headers.  There are arch specific mmap headers today that do
> not include either of the asm-generic headers.  And, they have their own
> definitions for MAP_HUGE_SHIFT.  So, it seems we can not use one of the
> existing mmap asm-generic header files.  Rather, we would need to create
> a new one and have that included by all arch specific files.

yes, add a new one like you did in your first patch

> However, ALL the MAP_HUGE_* definitions in all the arch specific and
> asm-generic header files are the same.  It would be possible to just put
> all those MAP_HUGE_* definitions in the primary uapi header file
> (include/uapi/linux/mman.h).  If there was ever a need for arch specific
> values in the future, we could split them out at that time.

agreed

[...]

> >> - Another alternative is to make all known huge page sizes available
> >>   to all users.  This is 'easier' as the definitions can likely reside
> >>   in a common header file.  The user will  need to determine what
> >>   huge page sizes are supported by the running kernel as mentioned in
> >>   the man page.
> > 
> > yes I think this makes more sense.
> 
> Ok, thanks.
> 
> The only remaining question is what kind of common header to use:
> 1) An asm-generic header file in case there may be arch specific differences
>    in the future.
> 2) Use the primary uapi header file in include/uapi/linux/mman|shm.h.

I would use the primary one and only got the arch specific if we ever
need to do arch specific thing.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2017-07-28  6:30 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-08 17:06 [PATCH] mm,hugetlb: compute page_size_log properly Davidlohr Bueso
2017-03-08 19:39 ` Andi Kleen
2017-03-09  3:24   ` Anshuman Khandual
2017-03-09  8:55 ` Michal Hocko
2017-03-28 16:53 ` Davidlohr Bueso
2017-03-28 16:55   ` Davidlohr Bueso
2017-03-28 17:54     ` Matthew Wilcox
2017-03-29  8:06       ` Michal Hocko
2017-03-29 17:45         ` Andi Kleen
2017-03-30  6:12           ` Michal Hocko
2017-04-12 16:18             ` Davidlohr Bueso
2017-04-12 16:30               ` Andi Kleen
2017-04-12 17:21               ` Matthew Wilcox
2017-04-13  6:02       ` Aneesh Kumar K.V
2017-04-13 12:46         ` Matthew Wilcox
2017-07-17 22:27       ` Mike Kravetz
2017-07-17 22:27         ` [RFC PATCH 1/3] mm:hugetlb: Define system call hugetlb size encodings in single file Mike Kravetz
2017-07-18  0:00           ` Matthew Wilcox
2017-07-26  9:50           ` Michal Hocko
2017-07-17 22:28         ` [RFC PATCH 2/3] mm: arch: Use new hugetlb size encoding definitions Mike Kravetz
2017-07-26  9:52           ` Michal Hocko
2017-07-17 22:28         ` [RFC PATCH 3/3] mm: shm: " Mike Kravetz
2017-07-26  9:53           ` Michal Hocko
2017-07-26 10:07             ` Michal Hocko
2017-07-26 17:39               ` Mike Kravetz
2017-07-26 18:48                 ` Matthew Wilcox
2017-07-27  7:50                 ` Michal Hocko
2017-07-27 21:18                   ` Mike Kravetz
2017-07-28  6:30                     ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).