linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit
@ 2018-06-18 10:28 Waiman Long
  2018-06-18 10:28 ` [PATCH v8 1/5] ipc: IPCMNI limit check for msgmni and shmmni Waiman Long
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Waiman Long @ 2018-06-18 10:28 UTC (permalink / raw)
  To: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet
  Cc: linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, Waiman Long

v7->v8:
 - Remove the __read_mostly tag for ipc_mni and related variables as their
   accesses are not really in performance critical path.
 - Add a new ipcmni_compat sysctl parameter that can be set to restore old
   range check behavior if desired.

v6->v7:
 - Drop the range clamping code and just return error instead for now
   until there is user request for clamping support.
 - Fix compilation error when CONFIG_SYSVIPC_SYSCTL isn't defined.

v5->v6:
 - Consolidate the 3 ctl_table flags into 2.
 - Make similar changes to proc_doulongvec_minmax() and its associates
   to complete the clamping change.
 - Remove the sysctl registration failure test patch for now for later
   consideration.
 - Add extra braces to patch 1 to reduce code diff in a later patch.

v4->v5:
 - Revert the flags back to 16-bit so that there will be no change to
   the size of ctl_table.
 - Enhance the sysctl_check_flags() as requested by Luis to perform more
   checks to spot incorrect ctl_table entries.
 - Change the sysctl selftest to use dummy sysctls instead of production
   ones & enhance it to do more checks.
 - Add one more sysctl selftest for registration failure.
 - Add 2 ipc patches to add an extended mode to increase IPCMNI from
   32k to 2M.
 - Miscellaneous change to incorporate feedback comments from
   reviewers.

v3->v4:
 - Remove v3 patches 1 & 2 as they have been merged into the mm tree.
 - Change flags from uint16_t to unsigned int.
 - Remove CTL_FLAGS_OOR_WARNED and use pr_warn_ratelimited() instead.
 - Simplify the warning message code.
 - Add a new patch to fail the ctl_table registration with invalid flag.
 - Add a test case for range clamping in sysctl selftest.

v2->v3:
 - Fix kdoc comment errors.
 - Incorporate comments and suggestions from Luis R. Rodriguez.
 - Add a patch to fix a typo error in fs/proc/proc_sysctl.c.

v1->v2:
 - Add kdoc comments to the do_proc_do{u}intvec_minmax_conv_param
   structures.
 - Add a new flags field to the ctl_table structure for specifying
   whether range clamping should be activated instead of adding new
   sysctl parameter handlers.
 - Clamp the semmni value embedded in the multi-values sem parameter.

v5 patch: https://lkml.org/lkml/2018/3/16/1106
v6 patch: https://lkml.org/lkml/2018/4/27/1094
v7 patch: https://lkml.org/lkml/2018/5/7/666

The sysctl parameters msgmni, shmmni and semmni have an inherent limit
of IPC_MNI (32k). However, users may not be aware of that because they
can write a value much higher than that without getting any error or
notification. Reading the parameters back will show the newly written
values which are not real.

The real IPCMNI limit is now enforced to make sure that users won't
put in an unrealistic value. The first 2 patches enforce the limits.

There are also users out there requesting increase in the IPCMNI value.
The last 2 patches attempt to do that by using a boot kernel parameter
"ipcmni_extend" to increase the IPCMNI limit from 32k to 2M if the users
really want the extended value.

Enforcing the range limit check may cause some existing applications to break
if they unwittingly set a value higher than 32k. To allow system administrators
to work around this issue, a new ipcmni_compat sysctl parameter can now be set
to restore the old behavior. This compatibility mode can only be set if the
ipcmni_extend boot parameter is not specified. Patch 5 implements this new
sysctl parameter.

Waiman Long (5):
  ipc: IPCMNI limit check for msgmni and shmmni
  ipc: IPCMNI limit check for semmni
  ipc: Allow boot time extension of IPCMNI from 32k to 2M
  ipc: Conserve sequence numbers in extended IPCMNI mode
  ipc: Add a new ipcmni_compat sysctl to fall back to old behavior

 Documentation/admin-guide/kernel-parameters.txt |  3 +
 Documentation/sysctl/kernel.txt                 | 15 +++++
 include/linux/ipc_namespace.h                   |  1 +
 ipc/ipc_sysctl.c                                | 78 ++++++++++++++++++++++++-
 ipc/util.c                                      | 41 ++++++++-----
 ipc/util.h                                      | 50 +++++++++++++---
 6 files changed, 164 insertions(+), 24 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v8 1/5] ipc: IPCMNI limit check for msgmni and shmmni
  2018-06-18 10:28 [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Waiman Long
@ 2018-06-18 10:28 ` Waiman Long
  2018-06-28  3:16   ` Luis R. Rodriguez
  2018-08-17 16:51   ` Davidlohr Bueso
  2018-06-18 10:28 ` [PATCH v8 2/5] ipc: IPCMNI limit check for semmni Waiman Long
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 19+ messages in thread
From: Waiman Long @ 2018-06-18 10:28 UTC (permalink / raw)
  To: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet
  Cc: linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, Waiman Long

A user can write arbitrary integer values to msgmni and shmmni sysctl
parameters without getting error, but the actual limit is really
IPCMNI (32k). This can mislead users as they think they can get a
value that is not real.

The right limits are now set for msgmni and shmmni so that the users
will become aware if they set a value outside of the acceptable range.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 ipc/ipc_sysctl.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index 8ad93c2..f87cb29 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -99,6 +99,7 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
 static int zero;
 static int one = 1;
 static int int_max = INT_MAX;
+static int ipc_mni = IPCMNI;
 
 static struct ctl_table ipc_kern_table[] = {
 	{
@@ -120,7 +121,9 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
 		.data		= &init_ipc_ns.shm_ctlmni,
 		.maxlen		= sizeof(init_ipc_ns.shm_ctlmni),
 		.mode		= 0644,
-		.proc_handler	= proc_ipc_dointvec,
+		.proc_handler	= proc_ipc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &ipc_mni,
 	},
 	{
 		.procname	= "shm_rmid_forced",
@@ -147,7 +150,7 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
 		.mode		= 0644,
 		.proc_handler	= proc_ipc_dointvec_minmax,
 		.extra1		= &zero,
-		.extra2		= &int_max,
+		.extra2		= &ipc_mni,
 	},
 	{
 		.procname	= "auto_msgmni",
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v8 2/5] ipc: IPCMNI limit check for semmni
  2018-06-18 10:28 [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Waiman Long
  2018-06-18 10:28 ` [PATCH v8 1/5] ipc: IPCMNI limit check for msgmni and shmmni Waiman Long
@ 2018-06-18 10:28 ` Waiman Long
  2018-06-28 22:39   ` Luis R. Rodriguez
  2018-08-17 16:53   ` Davidlohr Bueso
  2018-06-18 10:28 ` [PATCH v8 3/5] ipc: Allow boot time extension of IPCMNI from 32k to 2M Waiman Long
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 19+ messages in thread
From: Waiman Long @ 2018-06-18 10:28 UTC (permalink / raw)
  To: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet
  Cc: linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, Waiman Long

For SysV semaphores, the semmni value is the last part of the 4-element
sem number array. To make semmni behave in a similar way to msgmni and
shmmni, we can't directly use the _minmax handler. Instead, a special
sem specific handler is added to check the last argument to make sure
that it is limited to the [0, IPCMNI] range. An error will be returned
if this is not the case.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 ipc/ipc_sysctl.c | 23 ++++++++++++++++++++++-
 ipc/util.h       |  9 +++++++++
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index f87cb29..49f9bf4 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -88,12 +88,33 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
 	return proc_dointvec_minmax(&ipc_table, write, buffer, lenp, ppos);
 }
 
+static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
+	void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	int ret, semmni;
+	struct ipc_namespace *ns = current->nsproxy->ipc_ns;
+
+	semmni = ns->sem_ctls[3];
+	ret = proc_ipc_dointvec(table, write, buffer, lenp, ppos);
+
+	if (!ret)
+		ret = sem_check_semmni(current->nsproxy->ipc_ns);
+
+	/*
+	 * Reset the semmni value if an error happens.
+	 */
+	if (ret)
+		ns->sem_ctls[3] = semmni;
+	return ret;
+}
+
 #else
 #define proc_ipc_doulongvec_minmax NULL
 #define proc_ipc_dointvec	   NULL
 #define proc_ipc_dointvec_minmax   NULL
 #define proc_ipc_dointvec_minmax_orphans   NULL
 #define proc_ipc_auto_msgmni	   NULL
+#define proc_ipc_sem_dointvec	   NULL
 #endif
 
 static int zero;
@@ -175,7 +196,7 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
 		.data		= &init_ipc_ns.sem_ctls,
 		.maxlen		= 4*sizeof(int),
 		.mode		= 0644,
-		.proc_handler	= proc_ipc_dointvec,
+		.proc_handler	= proc_ipc_sem_dointvec,
 	},
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	{
diff --git a/ipc/util.h b/ipc/util.h
index 0aba323..8e9c52c 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -218,6 +218,15 @@ int ipcget(struct ipc_namespace *ns, struct ipc_ids *ids,
 void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
 		void (*free)(struct ipc_namespace *, struct kern_ipc_perm *));
 
+static inline int sem_check_semmni(struct ipc_namespace *ns) {
+	/*
+	 * Check semmni range [0, IPCMNI]
+	 * semmni is the last element of sem_ctls[4] array
+	 */
+	return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > IPCMNI))
+		? -ERANGE : 0;
+}
+
 #ifdef CONFIG_COMPAT
 #include <linux/compat.h>
 struct compat_ipc_perm {
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v8 3/5] ipc: Allow boot time extension of IPCMNI from 32k to 2M
  2018-06-18 10:28 [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Waiman Long
  2018-06-18 10:28 ` [PATCH v8 1/5] ipc: IPCMNI limit check for msgmni and shmmni Waiman Long
  2018-06-18 10:28 ` [PATCH v8 2/5] ipc: IPCMNI limit check for semmni Waiman Long
@ 2018-06-18 10:28 ` Waiman Long
  2018-08-17 16:45   ` Davidlohr Bueso
  2018-06-18 10:28 ` [PATCH v8 4/5] ipc: Conserve sequence numbers in extended IPCMNI mode Waiman Long
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Waiman Long @ 2018-06-18 10:28 UTC (permalink / raw)
  To: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet
  Cc: linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, Waiman Long

The maximum number of unique System V IPC identifiers was limited to
32k.  That limit should be big enough for most use cases.

However, there are some users out there requesting for more. To satisfy
the need of those users, a new boot time kernel option "ipcmni_extend"
is added to extend the IPCMNI value to 2M. This is a 64X increase which
hopefully is big enough for them.

This new option does have the side effect of reducing the maximum
number of unique sequence numbers from 64k down to 1k. So it is
a trade-off.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  3 ++
 ipc/ipc_sysctl.c                                | 12 ++++++-
 ipc/util.c                                      | 12 +++----
 ipc/util.h                                      | 42 +++++++++++++++++++------
 4 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index efc7aa7..6712a42 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1744,6 +1744,9 @@
 	ip=		[IP_PNP]
 			See Documentation/filesystems/nfs/nfsroot.txt.
 
+	ipcmni_extend	[KNL] Extend the maximum number of unique System V
+			IPC identifiers from 32768 to 2097152.
+
 	irqaffinity=	[SMP] Set the default irq affinity mask
 			The argument is a cpu list, as described above.
 
diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index 49f9bf4..73b7782 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -120,7 +120,8 @@ static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
 static int zero;
 static int one = 1;
 static int int_max = INT_MAX;
-static int ipc_mni = IPCMNI;
+int ipc_mni = IPCMNI;
+int ipc_mni_shift = IPCMNI_SHIFT;
 
 static struct ctl_table ipc_kern_table[] = {
 	{
@@ -246,3 +247,12 @@ static int __init ipc_sysctl_init(void)
 }
 
 device_initcall(ipc_sysctl_init);
+
+static int __init ipc_mni_extend(char *str)
+{
+	ipc_mni = IPCMNI_EXTEND;
+	ipc_mni_shift = IPCMNI_EXTEND_SHIFT;
+	pr_info("IPCMNI extended to %d.\n", ipc_mni);
+	return 0;
+}
+early_param("ipcmni_extend", ipc_mni_extend);
diff --git a/ipc/util.c b/ipc/util.c
index 4e81182..782a8d0 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -113,7 +113,7 @@ static int __init ipc_init(void)
  * @ids: ipc identifier set
  *
  * Set up the sequence range to use for the ipc identifier range (limited
- * below IPCMNI) then initialise the keys hashtable and ids idr.
+ * below ipc_mni) then initialise the keys hashtable and ids idr.
  */
 int ipc_init_ids(struct ipc_ids *ids)
 {
@@ -214,7 +214,7 @@ static inline int ipc_buildid(int id, struct ipc_ids *ids,
 		ids->next_id = -1;
 	}
 
-	return SEQ_MULTIPLIER * new->seq + id;
+	return (new->seq << SEQ_SHIFT) + id;
 }
 
 #else
@@ -228,7 +228,7 @@ static inline int ipc_buildid(int id, struct ipc_ids *ids,
 	if (ids->seq > IPCID_SEQ_MAX)
 		ids->seq = 0;
 
-	return SEQ_MULTIPLIER * new->seq + id;
+	return (new->seq << SEQ_SHIFT) + id;
 }
 
 #endif /* CONFIG_CHECKPOINT_RESTORE */
@@ -252,8 +252,8 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
 	kgid_t egid;
 	int id, err;
 
-	if (limit > IPCMNI)
-		limit = IPCMNI;
+	if (limit > ipc_mni)
+		limit = ipc_mni;
 
 	if (!ids->tables_initialized || ids->in_use >= limit)
 		return -ENOSPC;
@@ -777,7 +777,7 @@ static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos,
 	if (total >= ids->in_use)
 		return NULL;
 
-	for (; pos < IPCMNI; pos++) {
+	for (; pos < ipc_mni; pos++) {
 		ipc = idr_find(&ids->ipcs_idr, pos);
 		if (ipc != NULL) {
 			*new_pos = pos + 1;
diff --git a/ipc/util.h b/ipc/util.h
index 8e9c52c..d103630 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -15,8 +15,30 @@
 #include <linux/err.h>
 #include <linux/ipc_namespace.h>
 
-#define IPCMNI 32768  /* <= MAX_INT limit for ipc arrays (including sysctl changes) */
-#define SEQ_MULTIPLIER	(IPCMNI)
+/*
+ * By default, the ipc arrays can have up to 32k (15 bits) entries.
+ * When IPCMNI extension mode is turned on, the ipc arrays can have up
+ * to 2M (21 bits) entries. However, the space for sequence number will
+ * be shrunk from 16 bits to 10 bits.
+ */
+#define IPCMNI_SHIFT		15
+#define IPCMNI_EXTEND_SHIFT	21
+#define IPCMNI			(1 << IPCMNI_SHIFT)
+#define IPCMNI_EXTEND		(1 << IPCMNI_EXTEND_SHIFT)
+
+#ifdef CONFIG_SYSVIPC_SYSCTL
+extern int ipc_mni;
+extern int ipc_mni_shift;
+
+#define SEQ_SHIFT		ipc_mni_shift
+#define SEQ_MASK		((1 << ipc_mni_shift) - 1)
+
+#else /* CONFIG_SYSVIPC_SYSCTL */
+
+#define ipc_mni 		IPCMNI
+#define SEQ_SHIFT		IPCMNI_SHIFT
+#define SEQ_MASK		((1 << IPCMNI_SHIFT) - 1)
+#endif /* CONFIG_SYSVIPC_SYSCTL */
 
 int sem_init(void);
 int msg_init(void);
@@ -96,9 +118,9 @@ void __init ipc_init_proc_interface(const char *path, const char *header,
 #define IPC_MSG_IDS	1
 #define IPC_SHM_IDS	2
 
-#define ipcid_to_idx(id) ((id) % SEQ_MULTIPLIER)
-#define ipcid_to_seqx(id) ((id) / SEQ_MULTIPLIER)
-#define IPCID_SEQ_MAX min_t(int, INT_MAX/SEQ_MULTIPLIER, USHRT_MAX)
+#define ipcid_to_idx(id)  ((id) & SEQ_MASK)
+#define ipcid_to_seqx(id) ((id) >> SEQ_SHIFT)
+#define IPCID_SEQ_MAX	  (INT_MAX >> SEQ_SHIFT)
 
 /* must be called with ids->rwsem acquired for writing */
 int ipc_addid(struct ipc_ids *, struct kern_ipc_perm *, int);
@@ -123,8 +145,8 @@ static inline int ipc_get_maxid(struct ipc_ids *ids)
 	if (ids->in_use == 0)
 		return -1;
 
-	if (ids->in_use == IPCMNI)
-		return IPCMNI - 1;
+	if (ids->in_use == ipc_mni)
+		return ipc_mni - 1;
 
 	return ids->max_id;
 }
@@ -175,7 +197,7 @@ static inline void ipc_update_pid(struct pid **pos, struct pid *pid)
 
 static inline int ipc_checkid(struct kern_ipc_perm *ipcp, int uid)
 {
-	return uid / SEQ_MULTIPLIER != ipcp->seq;
+	return (uid >> SEQ_SHIFT) != ipcp->seq;
 }
 
 static inline void ipc_lock_object(struct kern_ipc_perm *perm)
@@ -220,10 +242,10 @@ void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
 
 static inline int sem_check_semmni(struct ipc_namespace *ns) {
 	/*
-	 * Check semmni range [0, IPCMNI]
+	 * Check semmni range [0, ipc_mni]
 	 * semmni is the last element of sem_ctls[4] array
 	 */
-	return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > IPCMNI))
+	return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > ipc_mni))
 		? -ERANGE : 0;
 }
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v8 4/5] ipc: Conserve sequence numbers in extended IPCMNI mode
  2018-06-18 10:28 [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Waiman Long
                   ` (2 preceding siblings ...)
  2018-06-18 10:28 ` [PATCH v8 3/5] ipc: Allow boot time extension of IPCMNI from 32k to 2M Waiman Long
@ 2018-06-18 10:28 ` Waiman Long
  2018-06-18 10:28 ` [PATCH v8 5/5] ipc: Add a new ipcmni_compat sysctl to fall back to old behavior Waiman Long
  2018-08-17 16:50 ` [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Davidlohr Bueso
  5 siblings, 0 replies; 19+ messages in thread
From: Waiman Long @ 2018-06-18 10:28 UTC (permalink / raw)
  To: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet
  Cc: linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, Waiman Long

The mixing in of a sequence number into the IPC IDs is probably to
avoid ID reuse in userspace as much as possible. With extended IPCMNI
mode, the number of usable sequence numbers is greatly reduced leading
to higher chance of ID reuse.

To address this issue, we need to conserve the sequence number space
as much as possible. Right now, the sequence number is incremented
for every new ID created. In reality, we only need to increment the
sequence number when one or more IDs have been removed previously to
make sure that those IDs will not be reused when a new one is built.
This is being done in the extended IPCMNI mode,

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/ipc_namespace.h |  1 +
 ipc/ipc_sysctl.c              |  2 ++
 ipc/util.c                    | 29 ++++++++++++++++++++++-------
 ipc/util.h                    |  2 ++
 4 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index b5630c8..9c86fd9 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -16,6 +16,7 @@
 struct ipc_ids {
 	int in_use;
 	unsigned short seq;
+	unsigned short deleted;
 	bool tables_initialized;
 	struct rw_semaphore rwsem;
 	struct idr ipcs_idr;
diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index 73b7782..d9ac6ca 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -122,6 +122,7 @@ static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
 static int int_max = INT_MAX;
 int ipc_mni = IPCMNI;
 int ipc_mni_shift = IPCMNI_SHIFT;
+bool ipc_mni_extended;
 
 static struct ctl_table ipc_kern_table[] = {
 	{
@@ -252,6 +253,7 @@ static int __init ipc_mni_extend(char *str)
 {
 	ipc_mni = IPCMNI_EXTEND;
 	ipc_mni_shift = IPCMNI_EXTEND_SHIFT;
+	ipc_mni_extended = true;
 	pr_info("IPCMNI extended to %d.\n", ipc_mni);
 	return 0;
 }
diff --git a/ipc/util.c b/ipc/util.c
index 782a8d0..7c8e733 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -119,7 +119,8 @@ int ipc_init_ids(struct ipc_ids *ids)
 {
 	int err;
 	ids->in_use = 0;
-	ids->seq = 0;
+	ids->deleted = false;
+	ids->seq = ipc_mni_extended ? 0 : -1; /* seq # is pre-incremented */
 	init_rwsem(&ids->rwsem);
 	err = rhashtable_init(&ids->key_ht, &ipc_kht_params);
 	if (err)
@@ -193,6 +194,11 @@ static struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key)
 	return NULL;
 }
 
+/*
+ * To conserve sequence number space with extended ipc_mni when new ID
+ * is built, the sequence number is incremented only when one or more
+ * IDs have been removed previously.
+ */
 #ifdef CONFIG_CHECKPOINT_RESTORE
 /*
  * Specify desired id for next allocated IPC object.
@@ -206,9 +212,13 @@ static inline int ipc_buildid(int id, struct ipc_ids *ids,
 			      struct kern_ipc_perm *new)
 {
 	if (ids->next_id < 0) { /* default, behave as !CHECKPOINT_RESTORE */
-		new->seq = ids->seq++;
-		if (ids->seq > IPCID_SEQ_MAX)
-			ids->seq = 0;
+		if (!ipc_mni_extended || ids->deleted) {
+			ids->seq++;
+			if (ids->seq > IPCID_SEQ_MAX)
+				ids->seq = 0;
+			ids->deleted = false;
+		}
+		new->seq = ids->seq;
 	} else {
 		new->seq = ipcid_to_seqx(ids->next_id);
 		ids->next_id = -1;
@@ -224,9 +234,13 @@ static inline int ipc_buildid(int id, struct ipc_ids *ids,
 static inline int ipc_buildid(int id, struct ipc_ids *ids,
 			      struct kern_ipc_perm *new)
 {
-	new->seq = ids->seq++;
-	if (ids->seq > IPCID_SEQ_MAX)
-		ids->seq = 0;
+	if (!ipc_mni_extended || ids->deleted) {
+		ids->seq++;
+		if (ids->seq > IPCID_SEQ_MAX)
+			ids->seq = 0;
+		ids->deleted = false;
+	}
+	new->seq = ids->seq;
 
 	return (new->seq << SEQ_SHIFT) + id;
 }
@@ -436,6 +450,7 @@ void ipc_rmid(struct ipc_ids *ids, struct kern_ipc_perm *ipcp)
 	idr_remove(&ids->ipcs_idr, lid);
 	ipc_kht_remove(ids, ipcp);
 	ids->in_use--;
+	ids->deleted = true;
 	ipcp->deleted = true;
 
 	if (unlikely(lid == ids->max_id)) {
diff --git a/ipc/util.h b/ipc/util.h
index d103630..62b6247 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -29,6 +29,7 @@
 #ifdef CONFIG_SYSVIPC_SYSCTL
 extern int ipc_mni;
 extern int ipc_mni_shift;
+extern bool ipc_mni_extended;
 
 #define SEQ_SHIFT		ipc_mni_shift
 #define SEQ_MASK		((1 << ipc_mni_shift) - 1)
@@ -36,6 +37,7 @@
 #else /* CONFIG_SYSVIPC_SYSCTL */
 
 #define ipc_mni 		IPCMNI
+#define ipc_mni_extended	false
 #define SEQ_SHIFT		IPCMNI_SHIFT
 #define SEQ_MASK		((1 << IPCMNI_SHIFT) - 1)
 #endif /* CONFIG_SYSVIPC_SYSCTL */
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v8 5/5] ipc: Add a new ipcmni_compat sysctl to fall back to old behavior
  2018-06-18 10:28 [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Waiman Long
                   ` (3 preceding siblings ...)
  2018-06-18 10:28 ` [PATCH v8 4/5] ipc: Conserve sequence numbers in extended IPCMNI mode Waiman Long
@ 2018-06-18 10:28 ` Waiman Long
  2018-06-18 11:36   ` kbuild test robot
  2018-06-18 14:27   ` kbuild test robot
  2018-08-17 16:50 ` [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Davidlohr Bueso
  5 siblings, 2 replies; 19+ messages in thread
From: Waiman Long @ 2018-06-18 10:28 UTC (permalink / raw)
  To: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet
  Cc: linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, Waiman Long

With strict range limit enforcement of msgmni, shmmni and sem, it is
possible that some existing applications that set those values to above
32k may fail. To help users to work around this potential problem, a new
boolean ipcmni_compat sysctl is added to provide the old beahavior for
compatibility when it is set to 1. In other word, the limit will then be
enforced internally but no error will be reported.

This compatibility mode can only be enabled if the ipcmni_extend kernel
boot parameter is not specified.

The sysctl documentation is also updated accordingly.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 Documentation/sysctl/kernel.txt | 15 +++++++++++++++
 ipc/ipc_sysctl.c                | 42 ++++++++++++++++++++++++++++++++++++++---
 ipc/util.h                      |  5 +++--
 3 files changed, 57 insertions(+), 5 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index eded671d..e98d967 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -39,6 +39,7 @@ show up in /proc/sys/kernel:
 - hung_task_check_count
 - hung_task_timeout_secs
 - hung_task_warnings
+- ipcmni_compat
 - kexec_load_disabled
 - kptr_restrict
 - l2cr                        [ PPC only ]
@@ -374,6 +375,20 @@ This file shows up if CONFIG_DETECT_HUNG_TASK is enabled.
 
 ==============================================================
 
+ipcmni_compat:
+
+A boolean flag to control range checking behavior of msgmni, shmmni
+and the mni portion of sem.
+
+0: Range limits will be strictly enforced and error will be returned
+   if limits are exceeded.
+1: Range limits will only be enforced internally and no error will be
+   returned if the upper limit is exceeded. This compatibility behavior
+   can only be selected if the ipcmni_extend kernel boot parameter is
+   not specified.
+
+==============================================================
+
 kexec_load_disabled:
 
 A toggle indicating if the kexec_load syscall has been disabled. This
diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index d9ac6ca..5c0eac4 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -18,6 +18,8 @@
 #include <linux/msg.h>
 #include "util.h"
 
+static int ipcmni_compat;
+
 static void *get_ipc(struct ctl_table *table)
 {
 	char *which = table->data;
@@ -108,6 +110,25 @@ static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
 	return ret;
 }
 
+static int proc_ipcmni_compat_minmax(struct ctl_table *table, int write,
+	void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	int ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+	if (ret)
+		return ret;
+
+	/*
+	 * ipcmni_compat can only be set if !ipcmni_extend.
+	 */
+	if (ipcmni_compat && ipc_mni_extended) {
+		ipcmni_compat = 0;
+		return -EINVAL;
+	}
+	ipcmni_max = ipcmni_compat ? INT_MAX : ipc_mni;
+	return 0;
+}
+
 #else
 #define proc_ipc_doulongvec_minmax NULL
 #define proc_ipc_dointvec	   NULL
@@ -115,11 +136,13 @@ static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
 #define proc_ipc_dointvec_minmax_orphans   NULL
 #define proc_ipc_auto_msgmni	   NULL
 #define proc_ipc_sem_dointvec	   NULL
+#define proc_ipcmni_compat_minmax  NULL
 #endif
 
 static int zero;
 static int one = 1;
 static int int_max = INT_MAX;
+int ipcmni_max = IPCMNI;
 int ipc_mni = IPCMNI;
 int ipc_mni_shift = IPCMNI_SHIFT;
 bool ipc_mni_extended;
@@ -146,7 +169,7 @@ static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
 		.mode		= 0644,
 		.proc_handler	= proc_ipc_dointvec_minmax,
 		.extra1		= &zero,
-		.extra2		= &ipc_mni,
+		.extra2		= &ipcmni_max,
 	},
 	{
 		.procname	= "shm_rmid_forced",
@@ -173,7 +196,7 @@ static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
 		.mode		= 0644,
 		.proc_handler	= proc_ipc_dointvec_minmax,
 		.extra1		= &zero,
-		.extra2		= &ipc_mni,
+		.extra2		= &ipcmni_max,
 	},
 	{
 		.procname	= "auto_msgmni",
@@ -229,6 +252,19 @@ static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
 		.extra2		= &int_max,
 	},
 #endif
+	/*
+	 * Unlike other IPC sysctl parameters above, the following sysctl
+	 * parameter is global and affect behavior for all the namespaces.
+	 */
+	{
+		.procname	= "ipcmni_compat",
+		.data		= &ipcmni_compat,
+		.maxlen		= sizeof(ipcmni_compat),
+		.mode		= 0644,
+		.proc_handler	= proc_ipcmni_compat_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
 	{}
 };
 
@@ -251,7 +287,7 @@ static int __init ipc_sysctl_init(void)
 
 static int __init ipc_mni_extend(char *str)
 {
-	ipc_mni = IPCMNI_EXTEND;
+	ipc_mni = ipcmni_max = IPCMNI_EXTEND;
 	ipc_mni_shift = IPCMNI_EXTEND_SHIFT;
 	ipc_mni_extended = true;
 	pr_info("IPCMNI extended to %d.\n", ipc_mni);
diff --git a/ipc/util.h b/ipc/util.h
index 62b6247..dde5014 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -29,6 +29,7 @@
 #ifdef CONFIG_SYSVIPC_SYSCTL
 extern int ipc_mni;
 extern int ipc_mni_shift;
+extern int ipcmni_max;
 extern bool ipc_mni_extended;
 
 #define SEQ_SHIFT		ipc_mni_shift
@@ -244,10 +245,10 @@ void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
 
 static inline int sem_check_semmni(struct ipc_namespace *ns) {
 	/*
-	 * Check semmni range [0, ipc_mni]
+	 * Check semmni range [0, ipcmni_max]
 	 * semmni is the last element of sem_ctls[4] array
 	 */
-	return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > ipc_mni))
+	return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > ipcmni_max))
 		? -ERANGE : 0;
 }
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 5/5] ipc: Add a new ipcmni_compat sysctl to fall back to old behavior
  2018-06-18 10:28 ` [PATCH v8 5/5] ipc: Add a new ipcmni_compat sysctl to fall back to old behavior Waiman Long
@ 2018-06-18 11:36   ` kbuild test robot
  2018-06-18 14:27   ` kbuild test robot
  1 sibling, 0 replies; 19+ messages in thread
From: kbuild test robot @ 2018-06-18 11:36 UTC (permalink / raw)
  To: Waiman Long
  Cc: kbuild-all, Luis R. Rodriguez, Kees Cook, Andrew Morton,
	Jonathan Corbet, linux-kernel, linux-fsdevel, linux-doc, Al Viro,
	Matthew Wilcox, Eric W. Biederman, Takashi Iwai, Davidlohr Bueso,
	Waiman Long

[-- Attachment #1: Type: text/plain, Size: 2507 bytes --]

Hi Waiman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.18-rc1 next-20180618]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Waiman-Long/ipc-IPCMNI-limit-check-for-mni-increase-that-limit/20180618-183206
config: i386-randconfig-x013-201824 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   In file included from ipc/mqueue.c:43:0:
   ipc/util.h: In function 'sem_check_semmni':
>> ipc/util.h:251:54: error: 'ipcmni_max' undeclared (first use in this function); did you mean 'icmp_mib'?
     return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > ipcmni_max))
                                                         ^~~~~~~~~~
                                                         icmp_mib
   ipc/util.h:251:54: note: each undeclared identifier is reported only once for each function it appears in
--
   In file included from ipc/msgutil.c:22:0:
   ipc/util.h: In function 'sem_check_semmni':
>> ipc/util.h:251:54: error: 'ipcmni_max' undeclared (first use in this function); did you mean 'pci_iomap'?
     return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > ipcmni_max))
                                                         ^~~~~~~~~~
                                                         pci_iomap
   ipc/util.h:251:54: note: each undeclared identifier is reported only once for each function it appears in

vim +251 ipc/util.h

   239	
   240	struct kern_ipc_perm *ipc_obtain_object_check(struct ipc_ids *ids, int id);
   241	int ipcget(struct ipc_namespace *ns, struct ipc_ids *ids,
   242				const struct ipc_ops *ops, struct ipc_params *params);
   243	void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
   244			void (*free)(struct ipc_namespace *, struct kern_ipc_perm *));
   245	
   246	static inline int sem_check_semmni(struct ipc_namespace *ns) {
   247		/*
   248		 * Check semmni range [0, ipcmni_max]
   249		 * semmni is the last element of sem_ctls[4] array
   250		 */
 > 251		return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > ipcmni_max))
   252			? -ERANGE : 0;
   253	}
   254	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28855 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 5/5] ipc: Add a new ipcmni_compat sysctl to fall back to old behavior
  2018-06-18 10:28 ` [PATCH v8 5/5] ipc: Add a new ipcmni_compat sysctl to fall back to old behavior Waiman Long
  2018-06-18 11:36   ` kbuild test robot
@ 2018-06-18 14:27   ` kbuild test robot
  1 sibling, 0 replies; 19+ messages in thread
From: kbuild test robot @ 2018-06-18 14:27 UTC (permalink / raw)
  To: Waiman Long
  Cc: kbuild-all, Luis R. Rodriguez, Kees Cook, Andrew Morton,
	Jonathan Corbet, linux-kernel, linux-fsdevel, linux-doc, Al Viro,
	Matthew Wilcox, Eric W. Biederman, Takashi Iwai, Davidlohr Bueso,
	Waiman Long

[-- Attachment #1: Type: text/plain, Size: 1902 bytes --]

Hi Waiman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.18-rc1 next-20180618]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Waiman-Long/ipc-IPCMNI-limit-check-for-mni-increase-that-limit/20180618-183206
config: i386-randconfig-a0-06181851 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   In file included from ipc/mqueue.c:43:0:
   ipc/util.h: In function 'sem_check_semmni':
>> ipc/util.h:251:54: error: 'ipcmni_max' undeclared (first use in this function)
     return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > ipcmni_max))
                                                         ^
   ipc/util.h:251:54: note: each undeclared identifier is reported only once for each function it appears in

vim +/ipcmni_max +251 ipc/util.h

   239	
   240	struct kern_ipc_perm *ipc_obtain_object_check(struct ipc_ids *ids, int id);
   241	int ipcget(struct ipc_namespace *ns, struct ipc_ids *ids,
   242				const struct ipc_ops *ops, struct ipc_params *params);
   243	void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
   244			void (*free)(struct ipc_namespace *, struct kern_ipc_perm *));
   245	
   246	static inline int sem_check_semmni(struct ipc_namespace *ns) {
   247		/*
   248		 * Check semmni range [0, ipcmni_max]
   249		 * semmni is the last element of sem_ctls[4] array
   250		 */
 > 251		return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > ipcmni_max))
   252			? -ERANGE : 0;
   253	}
   254	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25790 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 1/5] ipc: IPCMNI limit check for msgmni and shmmni
  2018-06-18 10:28 ` [PATCH v8 1/5] ipc: IPCMNI limit check for msgmni and shmmni Waiman Long
@ 2018-06-28  3:16   ` Luis R. Rodriguez
  2018-08-17 16:51   ` Davidlohr Bueso
  1 sibling, 0 replies; 19+ messages in thread
From: Luis R. Rodriguez @ 2018-06-28  3:16 UTC (permalink / raw)
  To: Waiman Long
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso

On Mon, Jun 18, 2018 at 06:28:14PM +0800, Waiman Long wrote:
> A user can write arbitrary integer values to msgmni and shmmni sysctl
> parameters without getting error, but the actual limit is really
> IPCMNI (32k). This can mislead users as they think they can get a
> value that is not real.
> 
> The right limits are now set for msgmni and shmmni so that the users
> will become aware if they set a value outside of the acceptable range.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>

Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>

  Luis

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 2/5] ipc: IPCMNI limit check for semmni
  2018-06-18 10:28 ` [PATCH v8 2/5] ipc: IPCMNI limit check for semmni Waiman Long
@ 2018-06-28 22:39   ` Luis R. Rodriguez
  2018-06-29  7:26     ` Waiman Long
  2018-08-17 16:53   ` Davidlohr Bueso
  1 sibling, 1 reply; 19+ messages in thread
From: Luis R. Rodriguez @ 2018-06-28 22:39 UTC (permalink / raw)
  To: Waiman Long
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso

On Mon, Jun 18, 2018 at 06:28:15PM +0800, Waiman Long wrote:
> For SysV semaphores, the semmni value is the last part of the 4-element
> sem number array. To make semmni behave in a similar way to msgmni and
> shmmni, we can't directly use the _minmax handler. Instead, a special
> sem specific handler is added to check the last argument to make sure
> that it is limited to the [0, IPCMNI] range. An error will be returned
> if this is not the case.

Sorry my time has run out, and will be on vacation until July 10th. If
this can wait until then then great, otherwise hopefully Kees can take
a look. It would be wonderful to get Eric's review on this as well.

  Luis

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 2/5] ipc: IPCMNI limit check for semmni
  2018-06-28 22:39   ` Luis R. Rodriguez
@ 2018-06-29  7:26     ` Waiman Long
  0 siblings, 0 replies; 19+ messages in thread
From: Waiman Long @ 2018-06-29  7:26 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Kees Cook, Andrew Morton, Jonathan Corbet, linux-kernel,
	linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso

On 06/29/2018 06:39 AM, Luis R. Rodriguez wrote:
> On Mon, Jun 18, 2018 at 06:28:15PM +0800, Waiman Long wrote:
>> For SysV semaphores, the semmni value is the last part of the 4-element
>> sem number array. To make semmni behave in a similar way to msgmni and
>> shmmni, we can't directly use the _minmax handler. Instead, a special
>> sem specific handler is added to check the last argument to make sure
>> that it is limited to the [0, IPCMNI] range. An error will be returned
>> if this is not the case.
> Sorry my time has run out, and will be on vacation until July 10th. If
> this can wait until then then great, otherwise hopefully Kees can take
> a look. It would be wonderful to get Eric's review on this as well.
>
>   Luis

Thanks for the review. I will try to get other reviewer to chime in.

Kees and Eric, do you have comments on this patchset?

Cheers,
Longman

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 3/5] ipc: Allow boot time extension of IPCMNI from 32k to 2M
  2018-06-18 10:28 ` [PATCH v8 3/5] ipc: Allow boot time extension of IPCMNI from 32k to 2M Waiman Long
@ 2018-08-17 16:45   ` Davidlohr Bueso
  2018-08-18  1:15     ` Waiman Long
  0 siblings, 1 reply; 19+ messages in thread
From: Davidlohr Bueso @ 2018-08-17 16:45 UTC (permalink / raw)
  To: Waiman Long
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, manfred

Cc'ing Manfred.

On Mon, 18 Jun 2018, Waiman Long wrote:

>The maximum number of unique System V IPC identifiers was limited to
>32k.  That limit should be big enough for most use cases.
>
>However, there are some users out there requesting for more. To satisfy
>the need of those users, a new boot time kernel option "ipcmni_extend"
>is added to extend the IPCMNI value to 2M. This is a 64X increase which
>hopefully is big enough for them.

Could you please provide more info on the need of these users and how
you came up with this new value (which just seems quite arbitrary)?

Thanks,
Davidlohr

>
>This new option does have the side effect of reducing the maximum
>number of unique sequence numbers from 64k down to 1k. So it is
>a trade-off.
>
>Signed-off-by: Waiman Long <longman@redhat.com>
>---
> Documentation/admin-guide/kernel-parameters.txt |  3 ++
> ipc/ipc_sysctl.c                                | 12 ++++++-
> ipc/util.c                                      | 12 +++----
> ipc/util.h                                      | 42 +++++++++++++++++++------
> 4 files changed, 52 insertions(+), 17 deletions(-)
>
>diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>index efc7aa7..6712a42 100644
>--- a/Documentation/admin-guide/kernel-parameters.txt
>+++ b/Documentation/admin-guide/kernel-parameters.txt
>@@ -1744,6 +1744,9 @@
> 	ip=		[IP_PNP]
> 			See Documentation/filesystems/nfs/nfsroot.txt.
>
>+	ipcmni_extend	[KNL] Extend the maximum number of unique System V
>+			IPC identifiers from 32768 to 2097152.
>+
> 	irqaffinity=	[SMP] Set the default irq affinity mask
> 			The argument is a cpu list, as described above.
>
>diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
>index 49f9bf4..73b7782 100644
>--- a/ipc/ipc_sysctl.c
>+++ b/ipc/ipc_sysctl.c
>@@ -120,7 +120,8 @@ static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
> static int zero;
> static int one = 1;
> static int int_max = INT_MAX;
>-static int ipc_mni = IPCMNI;
>+int ipc_mni = IPCMNI;
>+int ipc_mni_shift = IPCMNI_SHIFT;
>
> static struct ctl_table ipc_kern_table[] = {
> 	{
>@@ -246,3 +247,12 @@ static int __init ipc_sysctl_init(void)
> }
>
> device_initcall(ipc_sysctl_init);
>+
>+static int __init ipc_mni_extend(char *str)
>+{
>+	ipc_mni = IPCMNI_EXTEND;
>+	ipc_mni_shift = IPCMNI_EXTEND_SHIFT;
>+	pr_info("IPCMNI extended to %d.\n", ipc_mni);
>+	return 0;
>+}
>+early_param("ipcmni_extend", ipc_mni_extend);
>diff --git a/ipc/util.c b/ipc/util.c
>index 4e81182..782a8d0 100644
>--- a/ipc/util.c
>+++ b/ipc/util.c
>@@ -113,7 +113,7 @@ static int __init ipc_init(void)
>  * @ids: ipc identifier set
>  *
>  * Set up the sequence range to use for the ipc identifier range (limited
>- * below IPCMNI) then initialise the keys hashtable and ids idr.
>+ * below ipc_mni) then initialise the keys hashtable and ids idr.
>  */
> int ipc_init_ids(struct ipc_ids *ids)
> {
>@@ -214,7 +214,7 @@ static inline int ipc_buildid(int id, struct ipc_ids *ids,
> 		ids->next_id = -1;
> 	}
>
>-	return SEQ_MULTIPLIER * new->seq + id;
>+	return (new->seq << SEQ_SHIFT) + id;
> }
>
> #else
>@@ -228,7 +228,7 @@ static inline int ipc_buildid(int id, struct ipc_ids *ids,
> 	if (ids->seq > IPCID_SEQ_MAX)
> 		ids->seq = 0;
>
>-	return SEQ_MULTIPLIER * new->seq + id;
>+	return (new->seq << SEQ_SHIFT) + id;
> }
>
> #endif /* CONFIG_CHECKPOINT_RESTORE */
>@@ -252,8 +252,8 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
> 	kgid_t egid;
> 	int id, err;
>
>-	if (limit > IPCMNI)
>-		limit = IPCMNI;
>+	if (limit > ipc_mni)
>+		limit = ipc_mni;
>
> 	if (!ids->tables_initialized || ids->in_use >= limit)
> 		return -ENOSPC;
>@@ -777,7 +777,7 @@ static struct kern_ipc_perm *sysvipc_find_ipc(struct ipc_ids *ids, loff_t pos,
> 	if (total >= ids->in_use)
> 		return NULL;
>
>-	for (; pos < IPCMNI; pos++) {
>+	for (; pos < ipc_mni; pos++) {
> 		ipc = idr_find(&ids->ipcs_idr, pos);
> 		if (ipc != NULL) {
> 			*new_pos = pos + 1;
>diff --git a/ipc/util.h b/ipc/util.h
>index 8e9c52c..d103630 100644
>--- a/ipc/util.h
>+++ b/ipc/util.h
>@@ -15,8 +15,30 @@
> #include <linux/err.h>
> #include <linux/ipc_namespace.h>
>
>-#define IPCMNI 32768  /* <= MAX_INT limit for ipc arrays (including sysctl changes) */
>-#define SEQ_MULTIPLIER	(IPCMNI)
>+/*
>+ * By default, the ipc arrays can have up to 32k (15 bits) entries.
>+ * When IPCMNI extension mode is turned on, the ipc arrays can have up
>+ * to 2M (21 bits) entries. However, the space for sequence number will
>+ * be shrunk from 16 bits to 10 bits.
>+ */
>+#define IPCMNI_SHIFT		15
>+#define IPCMNI_EXTEND_SHIFT	21
>+#define IPCMNI			(1 << IPCMNI_SHIFT)
>+#define IPCMNI_EXTEND		(1 << IPCMNI_EXTEND_SHIFT)
>+
>+#ifdef CONFIG_SYSVIPC_SYSCTL
>+extern int ipc_mni;
>+extern int ipc_mni_shift;
>+
>+#define SEQ_SHIFT		ipc_mni_shift
>+#define SEQ_MASK		((1 << ipc_mni_shift) - 1)
>+
>+#else /* CONFIG_SYSVIPC_SYSCTL */
>+
>+#define ipc_mni 		IPCMNI
>+#define SEQ_SHIFT		IPCMNI_SHIFT
>+#define SEQ_MASK		((1 << IPCMNI_SHIFT) - 1)
>+#endif /* CONFIG_SYSVIPC_SYSCTL */
>
> int sem_init(void);
> int msg_init(void);
>@@ -96,9 +118,9 @@ void __init ipc_init_proc_interface(const char *path, const char *header,
> #define IPC_MSG_IDS	1
> #define IPC_SHM_IDS	2
>
>-#define ipcid_to_idx(id) ((id) % SEQ_MULTIPLIER)
>-#define ipcid_to_seqx(id) ((id) / SEQ_MULTIPLIER)
>-#define IPCID_SEQ_MAX min_t(int, INT_MAX/SEQ_MULTIPLIER, USHRT_MAX)
>+#define ipcid_to_idx(id)  ((id) & SEQ_MASK)
>+#define ipcid_to_seqx(id) ((id) >> SEQ_SHIFT)
>+#define IPCID_SEQ_MAX	  (INT_MAX >> SEQ_SHIFT)
>
> /* must be called with ids->rwsem acquired for writing */
> int ipc_addid(struct ipc_ids *, struct kern_ipc_perm *, int);
>@@ -123,8 +145,8 @@ static inline int ipc_get_maxid(struct ipc_ids *ids)
> 	if (ids->in_use == 0)
> 		return -1;
>
>-	if (ids->in_use == IPCMNI)
>-		return IPCMNI - 1;
>+	if (ids->in_use == ipc_mni)
>+		return ipc_mni - 1;
>
> 	return ids->max_id;
> }
>@@ -175,7 +197,7 @@ static inline void ipc_update_pid(struct pid **pos, struct pid *pid)
>
> static inline int ipc_checkid(struct kern_ipc_perm *ipcp, int uid)
> {
>-	return uid / SEQ_MULTIPLIER != ipcp->seq;
>+	return (uid >> SEQ_SHIFT) != ipcp->seq;
> }
>
> static inline void ipc_lock_object(struct kern_ipc_perm *perm)
>@@ -220,10 +242,10 @@ void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
>
> static inline int sem_check_semmni(struct ipc_namespace *ns) {
> 	/*
>-	 * Check semmni range [0, IPCMNI]
>+	 * Check semmni range [0, ipc_mni]
> 	 * semmni is the last element of sem_ctls[4] array
> 	 */
>-	return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > IPCMNI))
>+	return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > ipc_mni))
> 		? -ERANGE : 0;
> }
>
>-- 
>1.8.3.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit
  2018-06-18 10:28 [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Waiman Long
                   ` (4 preceding siblings ...)
  2018-06-18 10:28 ` [PATCH v8 5/5] ipc: Add a new ipcmni_compat sysctl to fall back to old behavior Waiman Long
@ 2018-08-17 16:50 ` Davidlohr Bueso
  2018-09-06 22:24   ` Andrew Morton
  5 siblings, 1 reply; 19+ messages in thread
From: Davidlohr Bueso @ 2018-08-17 16:50 UTC (permalink / raw)
  To: Waiman Long
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, manfred

On Mon, 18 Jun 2018, Waiman Long wrote:

>v7->v8:
> - Remove the __read_mostly tag for ipc_mni and related variables as their
>   accesses are not really in performance critical path.
> - Add a new ipcmni_compat sysctl parameter that can be set to restore old
>   range check behavior if desired.
>
>v6->v7:
> - Drop the range clamping code and just return error instead for now
>   until there is user request for clamping support.
> - Fix compilation error when CONFIG_SYSVIPC_SYSCTL isn't defined.
>
>v5->v6:
> - Consolidate the 3 ctl_table flags into 2.
> - Make similar changes to proc_doulongvec_minmax() and its associates
>   to complete the clamping change.
> - Remove the sysctl registration failure test patch for now for later
>   consideration.
> - Add extra braces to patch 1 to reduce code diff in a later patch.
>
>v4->v5:
> - Revert the flags back to 16-bit so that there will be no change to
>   the size of ctl_table.
> - Enhance the sysctl_check_flags() as requested by Luis to perform more
>   checks to spot incorrect ctl_table entries.
> - Change the sysctl selftest to use dummy sysctls instead of production
>   ones & enhance it to do more checks.
> - Add one more sysctl selftest for registration failure.
> - Add 2 ipc patches to add an extended mode to increase IPCMNI from
>   32k to 2M.
> - Miscellaneous change to incorporate feedback comments from
>   reviewers.
>
>v3->v4:
> - Remove v3 patches 1 & 2 as they have been merged into the mm tree.
> - Change flags from uint16_t to unsigned int.
> - Remove CTL_FLAGS_OOR_WARNED and use pr_warn_ratelimited() instead.
> - Simplify the warning message code.
> - Add a new patch to fail the ctl_table registration with invalid flag.
> - Add a test case for range clamping in sysctl selftest.
>
>v2->v3:
> - Fix kdoc comment errors.
> - Incorporate comments and suggestions from Luis R. Rodriguez.
> - Add a patch to fix a typo error in fs/proc/proc_sysctl.c.
>
>v1->v2:
> - Add kdoc comments to the do_proc_do{u}intvec_minmax_conv_param
>   structures.
> - Add a new flags field to the ctl_table structure for specifying
>   whether range clamping should be activated instead of adding new
>   sysctl parameter handlers.
> - Clamp the semmni value embedded in the multi-values sem parameter.
>
>v5 patch: https://lkml.org/lkml/2018/3/16/1106
>v6 patch: https://lkml.org/lkml/2018/4/27/1094
>v7 patch: https://lkml.org/lkml/2018/5/7/666
>
>The sysctl parameters msgmni, shmmni and semmni have an inherent limit
>of IPC_MNI (32k). However, users may not be aware of that because they
>can write a value much higher than that without getting any error or
>notification. Reading the parameters back will show the newly written
>values which are not real.
>
>The real IPCMNI limit is now enforced to make sure that users won't
>put in an unrealistic value. The first 2 patches enforce the limits.
>
>There are also users out there requesting increase in the IPCMNI value.
>The last 2 patches attempt to do that by using a boot kernel parameter
>"ipcmni_extend" to increase the IPCMNI limit from 32k to 2M if the users
>really want the extended value.
>
>Enforcing the range limit check may cause some existing applications to break
>if they unwittingly set a value higher than 32k. To allow system administrators
>to work around this issue, a new ipcmni_compat sysctl parameter can now be set
>to restore the old behavior. This compatibility mode can only be set if the
>ipcmni_extend boot parameter is not specified. Patch 5 implements this new
>sysctl parameter.
>
>Waiman Long (5):
>  ipc: IPCMNI limit check for msgmni and shmmni
>  ipc: IPCMNI limit check for semmni

I've reviewed the first two which look good and are actual immediate fixes
to the bogus user input. I haven't gotten around yet the rest of the patches
but are at least a bit more controversial than the first two. As such, could
patch 1 and 2 be picked up once the merge window closes, for v4.20? Although
-stable might want in, this situation is quite historic, so it's not that
urgent.

Thanks,
Davidlohr

>  ipc: Allow boot time extension of IPCMNI from 32k to 2M
>  ipc: Conserve sequence numbers in extended IPCMNI mode
>  ipc: Add a new ipcmni_compat sysctl to fall back to old behavior
>
> Documentation/admin-guide/kernel-parameters.txt |  3 +
> Documentation/sysctl/kernel.txt                 | 15 +++++
> include/linux/ipc_namespace.h                   |  1 +
> ipc/ipc_sysctl.c                                | 78 ++++++++++++++++++++++++-
> ipc/util.c                                      | 41 ++++++++-----
> ipc/util.h                                      | 50 +++++++++++++---
> 6 files changed, 164 insertions(+), 24 deletions(-)
>
>-- 
>1.8.3.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 1/5] ipc: IPCMNI limit check for msgmni and shmmni
  2018-06-18 10:28 ` [PATCH v8 1/5] ipc: IPCMNI limit check for msgmni and shmmni Waiman Long
  2018-06-28  3:16   ` Luis R. Rodriguez
@ 2018-08-17 16:51   ` Davidlohr Bueso
  1 sibling, 0 replies; 19+ messages in thread
From: Davidlohr Bueso @ 2018-08-17 16:51 UTC (permalink / raw)
  To: Waiman Long
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, manfred

On Mon, 18 Jun 2018, Waiman Long wrote:

>A user can write arbitrary integer values to msgmni and shmmni sysctl
>parameters without getting error, but the actual limit is really
>IPCMNI (32k). This can mislead users as they think they can get a
>value that is not real.
>
>The right limits are now set for msgmni and shmmni so that the users
>will become aware if they set a value outside of the acceptable range.
>
>Signed-off-by: Waiman Long <longman@redhat.com>

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

>---
> ipc/ipc_sysctl.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
>diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
>index 8ad93c2..f87cb29 100644
>--- a/ipc/ipc_sysctl.c
>+++ b/ipc/ipc_sysctl.c
>@@ -99,6 +99,7 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
> static int zero;
> static int one = 1;
> static int int_max = INT_MAX;
>+static int ipc_mni = IPCMNI;
>
> static struct ctl_table ipc_kern_table[] = {
> 	{
>@@ -120,7 +121,9 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
> 		.data		= &init_ipc_ns.shm_ctlmni,
> 		.maxlen		= sizeof(init_ipc_ns.shm_ctlmni),
> 		.mode		= 0644,
>-		.proc_handler	= proc_ipc_dointvec,
>+		.proc_handler	= proc_ipc_dointvec_minmax,
>+		.extra1		= &zero,
>+		.extra2		= &ipc_mni,
> 	},
> 	{
> 		.procname	= "shm_rmid_forced",
>@@ -147,7 +150,7 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
> 		.mode		= 0644,
> 		.proc_handler	= proc_ipc_dointvec_minmax,
> 		.extra1		= &zero,
>-		.extra2		= &int_max,
>+		.extra2		= &ipc_mni,
> 	},
> 	{
> 		.procname	= "auto_msgmni",
>-- 
>1.8.3.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 2/5] ipc: IPCMNI limit check for semmni
  2018-06-18 10:28 ` [PATCH v8 2/5] ipc: IPCMNI limit check for semmni Waiman Long
  2018-06-28 22:39   ` Luis R. Rodriguez
@ 2018-08-17 16:53   ` Davidlohr Bueso
  1 sibling, 0 replies; 19+ messages in thread
From: Davidlohr Bueso @ 2018-08-17 16:53 UTC (permalink / raw)
  To: Waiman Long
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, manfred

On Mon, 18 Jun 2018, Waiman Long wrote:

>For SysV semaphores, the semmni value is the last part of the 4-element
>sem number array. To make semmni behave in a similar way to msgmni and
>shmmni, we can't directly use the _minmax handler. Instead, a special
>sem specific handler is added to check the last argument to make sure
>that it is limited to the [0, IPCMNI] range. An error will be returned
>if this is not the case.
>
>Signed-off-by: Waiman Long <longman@redhat.com>

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

>---
> ipc/ipc_sysctl.c | 23 ++++++++++++++++++++++-
> ipc/util.h       |  9 +++++++++
> 2 files changed, 31 insertions(+), 1 deletion(-)
>
>diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
>index f87cb29..49f9bf4 100644
>--- a/ipc/ipc_sysctl.c
>+++ b/ipc/ipc_sysctl.c
>@@ -88,12 +88,33 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
> 	return proc_dointvec_minmax(&ipc_table, write, buffer, lenp, ppos);
> }
>
>+static int proc_ipc_sem_dointvec(struct ctl_table *table, int write,
>+	void __user *buffer, size_t *lenp, loff_t *ppos)
>+{
>+	int ret, semmni;
>+	struct ipc_namespace *ns = current->nsproxy->ipc_ns;
>+
>+	semmni = ns->sem_ctls[3];
>+	ret = proc_ipc_dointvec(table, write, buffer, lenp, ppos);
>+
>+	if (!ret)
>+		ret = sem_check_semmni(current->nsproxy->ipc_ns);
>+
>+	/*
>+	 * Reset the semmni value if an error happens.
>+	 */
>+	if (ret)
>+		ns->sem_ctls[3] = semmni;
>+	return ret;
>+}
>+
> #else
> #define proc_ipc_doulongvec_minmax NULL
> #define proc_ipc_dointvec	   NULL
> #define proc_ipc_dointvec_minmax   NULL
> #define proc_ipc_dointvec_minmax_orphans   NULL
> #define proc_ipc_auto_msgmni	   NULL
>+#define proc_ipc_sem_dointvec	   NULL
> #endif
>
> static int zero;
>@@ -175,7 +196,7 @@ static int proc_ipc_auto_msgmni(struct ctl_table *table, int write,
> 		.data		= &init_ipc_ns.sem_ctls,
> 		.maxlen		= 4*sizeof(int),
> 		.mode		= 0644,
>-		.proc_handler	= proc_ipc_dointvec,
>+		.proc_handler	= proc_ipc_sem_dointvec,
> 	},
> #ifdef CONFIG_CHECKPOINT_RESTORE
> 	{
>diff --git a/ipc/util.h b/ipc/util.h
>index 0aba323..8e9c52c 100644
>--- a/ipc/util.h
>+++ b/ipc/util.h
>@@ -218,6 +218,15 @@ int ipcget(struct ipc_namespace *ns, struct ipc_ids *ids,
> void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
> 		void (*free)(struct ipc_namespace *, struct kern_ipc_perm *));
>
>+static inline int sem_check_semmni(struct ipc_namespace *ns) {
>+	/*
>+	 * Check semmni range [0, IPCMNI]
>+	 * semmni is the last element of sem_ctls[4] array
>+	 */
>+	return ((ns->sem_ctls[3] < 0) || (ns->sem_ctls[3] > IPCMNI))
>+		? -ERANGE : 0;
>+}
>+
> #ifdef CONFIG_COMPAT
> #include <linux/compat.h>
> struct compat_ipc_perm {
>-- 
>1.8.3.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 3/5] ipc: Allow boot time extension of IPCMNI from 32k to 2M
  2018-08-17 16:45   ` Davidlohr Bueso
@ 2018-08-18  1:15     ` Waiman Long
  2018-10-02 16:32       ` Manfred Spraul
  0 siblings, 1 reply; 19+ messages in thread
From: Waiman Long @ 2018-08-18  1:15 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, manfred

On 08/17/2018 12:45 PM, Davidlohr Bueso wrote:
> Cc'ing Manfred.
>
> On Mon, 18 Jun 2018, Waiman Long wrote:
>
>> The maximum number of unique System V IPC identifiers was limited to
>> 32k.  That limit should be big enough for most use cases.
>>
>> However, there are some users out there requesting for more. To satisfy
>> the need of those users, a new boot time kernel option "ipcmni_extend"
>> is added to extend the IPCMNI value to 2M. This is a 64X increase which
>> hopefully is big enough for them.
>
> Could you please provide more info on the need of these users and how
> you came up with this new value (which just seems quite arbitrary)?
>
> Thanks,
> Davidlohr 

Red Hat has a customer that is migrating from Solaris to Linux. Some of
their applications just happen to use more than 32k of shared memory
segments. I think Solaris allows up to 16M unique ID.

Yes, the amount of increase is a bit arbitrary. I was trying to balance
how many bits should be left for sequence number. Maybe I should just
take 8 more bits for ID and leave 8 bits for sequence number to match
Solaris.

-Longman

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit
  2018-08-17 16:50 ` [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Davidlohr Bueso
@ 2018-09-06 22:24   ` Andrew Morton
  0 siblings, 0 replies; 19+ messages in thread
From: Andrew Morton @ 2018-09-06 22:24 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Waiman Long, Luis R. Rodriguez, Kees Cook, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso, manfred

On Fri, 17 Aug 2018 09:50:08 -0700 Davidlohr Bueso <dave@stgolabs.net> wrote:

> >Enforcing the range limit check may cause some existing applications to break
> >if they unwittingly set a value higher than 32k. To allow system administrators
> >to work around this issue, a new ipcmni_compat sysctl parameter can now be set
> >to restore the old behavior. This compatibility mode can only be set if the
> >ipcmni_extend boot parameter is not specified. Patch 5 implements this new
> >sysctl parameter.
> >
> >Waiman Long (5):
> >  ipc: IPCMNI limit check for msgmni and shmmni
> >  ipc: IPCMNI limit check for semmni
> 
> I've reviewed the first two which look good and are actual immediate fixes
> to the bogus user input. I haven't gotten around yet the rest of the patches
> but are at least a bit more controversial than the first two. As such, could
> patch 1 and 2 be picked up once the merge window closes, for v4.20? Although
> -stable might want in, this situation is quite historic, so it's not that
> urgent.

Thanks.  Could we please have a refresh and resend?  Hopefully Luis
will have time for another pass.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 3/5] ipc: Allow boot time extension of IPCMNI from 32k to 2M
  2018-08-18  1:15     ` Waiman Long
@ 2018-10-02 16:32       ` Manfred Spraul
  2018-10-02 17:43         ` Waiman Long
  0 siblings, 1 reply; 19+ messages in thread
From: Manfred Spraul @ 2018-10-02 16:32 UTC (permalink / raw)
  To: Waiman Long, Davidlohr Bueso
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso

Hello together,

On 8/18/18 3:15 AM, Waiman Long wrote:
> On 08/17/2018 12:45 PM, Davidlohr Bueso wrote:
>> Cc'ing Manfred.
>>
>> On Mon, 18 Jun 2018, Waiman Long wrote:
>>
>>> The maximum number of unique System V IPC identifiers was limited to
>>> 32k.  That limit should be big enough for most use cases.
>>>
>>> However, there are some users out there requesting for more. To satisfy
>>> the need of those users, a new boot time kernel option "ipcmni_extend"
>>> is added to extend the IPCMNI value to 2M. This is a 64X increase which
>>> hopefully is big enough for them.
>> Could you please provide more info on the need of these users and how
>> you came up with this new value (which just seems quite arbitrary)?
>>
>> Thanks,
>> Davidlohr
> Red Hat has a customer that is migrating from Solaris to Linux. Some of
> their applications just happen to use more than 32k of shared memory
> segments. I think Solaris allows up to 16M unique ID.
>
> Yes, the amount of increase is a bit arbitrary. I was trying to balance
> how many bits should be left for sequence number. Maybe I should just
> take 8 more bits for ID and leave 8 bits for sequence number to match
> Solaris.

- I think we should use the same numbers as Solaris.
Otherwise we later have to touch it again.

- What is the performance when using shmget() with already 10M segments 
present?

- I like the new logic for updating the sequence counter.

Is there a reason why you only enable it for extended mode?

You create a rarely used codepath, and I don't understand what speaks 
against switching to the 'deleted' approach for all systems.


--

     Manfred

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v8 3/5] ipc: Allow boot time extension of IPCMNI from 32k to 2M
  2018-10-02 16:32       ` Manfred Spraul
@ 2018-10-02 17:43         ` Waiman Long
  0 siblings, 0 replies; 19+ messages in thread
From: Waiman Long @ 2018-10-02 17:43 UTC (permalink / raw)
  To: Manfred Spraul, Davidlohr Bueso
  Cc: Luis R. Rodriguez, Kees Cook, Andrew Morton, Jonathan Corbet,
	linux-kernel, linux-fsdevel, linux-doc, Al Viro, Matthew Wilcox,
	Eric W. Biederman, Takashi Iwai, Davidlohr Bueso

On 10/02/2018 12:32 PM, Manfred Spraul wrote:
> Hello together,
>
> On 8/18/18 3:15 AM, Waiman Long wrote:
>> On 08/17/2018 12:45 PM, Davidlohr Bueso wrote:
>>> Cc'ing Manfred.
>>>
>>> On Mon, 18 Jun 2018, Waiman Long wrote:
>>>
>>>> The maximum number of unique System V IPC identifiers was limited to
>>>> 32k.  That limit should be big enough for most use cases.
>>>>
>>>> However, there are some users out there requesting for more. To
>>>> satisfy
>>>> the need of those users, a new boot time kernel option "ipcmni_extend"
>>>> is added to extend the IPCMNI value to 2M. This is a 64X increase
>>>> which
>>>> hopefully is big enough for them.
>>> Could you please provide more info on the need of these users and how
>>> you came up with this new value (which just seems quite arbitrary)?
>>>
>>> Thanks,
>>> Davidlohr
>> Red Hat has a customer that is migrating from Solaris to Linux. Some of
>> their applications just happen to use more than 32k of shared memory
>> segments. I think Solaris allows up to 16M unique ID.
>>
>> Yes, the amount of increase is a bit arbitrary. I was trying to balance
>> how many bits should be left for sequence number. Maybe I should just
>> take 8 more bits for ID and leave 8 bits for sequence number to match
>> Solaris.
>
> - I think we should use the same numbers as Solaris.
> Otherwise we later have to touch it again.

As said in my patch, it is a trade-off between # of uniq identifiers
versus the chance of id reuse. I don't have an objection to increase it
further, but I don't see the customers to really need such a large value.

>
> - What is the performance when using shmget() with already 10M
> segments present?

I am not sure the performance impact as I had not measure it myself. The
shmget() function is considered in slowpath. We are generally less
concern about its performance than other code paths that are in a
performance critical path.

>
> - I like the new logic for updating the sequence counter.
>
> Is there a reason why you only enable it for extended mode?

I tried not to disturb the existing logic for backward compatibility
concern. I don't mind switching it all over to use the new "deleted"
approach if other people have no objection.

Cheers,
Longman

> You create a rarely used codepath, and I don't understand what speaks
> against switching to the 'deleted' approach for all systems.
>
>
> -- 
>
>     Manfred
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-10-03  0:27 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-18 10:28 [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Waiman Long
2018-06-18 10:28 ` [PATCH v8 1/5] ipc: IPCMNI limit check for msgmni and shmmni Waiman Long
2018-06-28  3:16   ` Luis R. Rodriguez
2018-08-17 16:51   ` Davidlohr Bueso
2018-06-18 10:28 ` [PATCH v8 2/5] ipc: IPCMNI limit check for semmni Waiman Long
2018-06-28 22:39   ` Luis R. Rodriguez
2018-06-29  7:26     ` Waiman Long
2018-08-17 16:53   ` Davidlohr Bueso
2018-06-18 10:28 ` [PATCH v8 3/5] ipc: Allow boot time extension of IPCMNI from 32k to 2M Waiman Long
2018-08-17 16:45   ` Davidlohr Bueso
2018-08-18  1:15     ` Waiman Long
2018-10-02 16:32       ` Manfred Spraul
2018-10-02 17:43         ` Waiman Long
2018-06-18 10:28 ` [PATCH v8 4/5] ipc: Conserve sequence numbers in extended IPCMNI mode Waiman Long
2018-06-18 10:28 ` [PATCH v8 5/5] ipc: Add a new ipcmni_compat sysctl to fall back to old behavior Waiman Long
2018-06-18 11:36   ` kbuild test robot
2018-06-18 14:27   ` kbuild test robot
2018-08-17 16:50 ` [PATCH v8 0/5] ipc: IPCMNI limit check for *mni & increase that limit Davidlohr Bueso
2018-09-06 22:24   ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).