From: Manfred Spraul <manfred@colorfullife.com>
To: Waiman Long <longman@redhat.com>,
	"Luis R. Rodriguez" <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jonathan Corbet <corbet@lwn.net>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-doc@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
	Matthew Wilcox <willy@infradead.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Takashi Iwai <tiwai@suse.de>, Davidlohr Bueso <dbueso@suse.de>,
	1vier1@web.de
Subject: Re: [PATCH v12 2/3] ipc: Conserve sequence numbers in ipcmni_extend mode
Date: Sat, 16 Mar 2019 19:52:39 +0100
Message-ID: <398a8bcb-7568-0a5b-c6cb-77420de445b9@colorfullife.com>
In-Reply-To: <1551379645-819-3-git-send-email-longman@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 1021 bytes --]

Hi,

On 2/28/19 7:47 PM, Waiman Long wrote:
> @@ -216,10 +221,11 @@ static inline int ipc_idr_alloc(struct ipc_ids *ids, struct kern_ipc_perm *new)
>   	 */
>   
>   	if (next_id < 0) { /* !CHECKPOINT_RESTORE or next_id is unset */
> -		new->seq = ids->seq++;
> -		if (ids->seq > IPCID_SEQ_MAX)
> -			ids->seq = 0;
>   		idx = idr_alloc(&ids->ipcs_idr, new, 0, 0, GFP_NOWAIT);
> +		if ((idx <= ids->last_idx) && (++ids->seq > IPCID_SEQ_MAX))
> +			ids->seq = 0;

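I'm always impressed by lines like these: everything packed into just
two lines, using "++a", etc.

But how did you test it?

idr_alloc() can fail, and the code doesn't handle that :-(
On failure idx is a negative error code, so "idx <= ids->last_idx" is
true and ids->seq is incremented although nothing was allocated.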


> +		new->seq = ids->seq;

As written this morning:

Writing new->seq after inserting "new" into the idr creates races 
without any good reason.

I could not spot a bug, even find_alloc_undo() appears to be safe, but 
why should we take this risk?


Attached is:

- proposed replacement for this patch.

- the test patch that I have used to check the error handling.


--

     Manfred
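
To illustrate the failure case raised above, here is a minimal userspace sketch (not kernel code). The names `ids_model`, `MODEL_SEQ_MAX`, `update_unchecked()` and `update_checked()` are hypothetical and exist only for this illustration; the small limit of 7 is an assumption chosen to keep the numbers readable. The first helper mirrors the quoted two-line update and is run even when the allocator returned an error; the second only touches the state after a successful allocation.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model of the relevant ipc_ids fields. */
struct ids_model {
	int seq;
	int last_idx;
};

#define MODEL_SEQ_MAX 7	/* assumed small limit, for illustration only */

/* Mirrors the quoted two-line update: it runs even if idx is an error. */
static void update_unchecked(struct ids_model *ids, int idx)
{
	assert(ids);
	if ((idx <= ids->last_idx) && (++ids->seq > MODEL_SEQ_MAX))
		ids->seq = 0;
}

/* Guarded variant: state changes only after a successful allocation. */
static void update_checked(struct ids_model *ids, int idx)
{
	assert(ids);
	if (idx < 0)
		return;		/* allocation failed, leave seq alone */
	if (idx <= ids->last_idx) {
		ids->seq++;
		if (ids->seq >= MODEL_SEQ_MAX)
			ids->seq = 0;
	}
	ids->last_idx = idx;
}
```

Feeding `-ENOSPC` into `update_unchecked()` bumps `seq`, because any negative error code compares as not greater than `last_idx`; `update_checked()` leaves both fields untouched in that case.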


[-- Attachment #2: patch-debug-idr_alloc_failure --]
[-- Type: text/plain, Size: 871 bytes --]

diff --git a/ipc/util.c b/ipc/util.c
index 6e0fe3410423..5dafe4bc78a1 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -309,6 +309,7 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
 		}
 	}
 	if (idx < 0) {
+pr_info("failed allocation.\n");
 		new->deleted = true;
 		spin_unlock(&new->lock);
 		rcu_read_unlock();
diff --git a/lib/idr.c b/lib/idr.c
index cb1db9b8d3f6..ba274baa87e3 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -83,6 +83,17 @@ int idr_alloc(struct idr *idr, void *ptr, int start, int end, gfp_t gfp)
 	if (WARN_ON_ONCE(start < 0))
 		return -EINVAL;
 
+	{
+		u64 a = get_jiffies_64();
+
+		if (time_after64(a, (u64)INITIAL_JIFFIES+40*HZ)) {
+			if (a%5 < 2) {
+				pr_info("idr_alloc:Failing.\n");
+				return -ENOSPC;
+			}
+		}
+	}
+
 	ret = idr_alloc_u32(idr, ptr, &id, end > 0 ? end - 1 : INT_MAX, gfp);
 	if (ret)
 		return ret;
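
The failure pattern injected by the debug patch above can be modelled in userspace roughly as follows. This is only a sketch: `MODEL_HZ` and `MODEL_INITIAL_JIFFIES` are assumed placeholder values (the kernel's real HZ and INITIAL_JIFFIES differ by configuration), and `should_fail()` and `model_alloc()` are hypothetical names. It shows that once the 40-second grace period has passed, two out of every five consecutive tick values make the allocation fail with -ENOSPC.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed stand-ins for kernel values; not the real definitions. */
#define MODEL_HZ		250
#define MODEL_INITIAL_JIFFIES	0ULL

/* Returns 1 if an allocation at tick "now" would be forced to fail. */
static int should_fail(uint64_t now)
{
	/* grace period: no injected failures during the first 40 seconds */
	if (now <= MODEL_INITIAL_JIFFIES + 40ULL * MODEL_HZ)
		return 0;
	return (now % 5) < 2;
}

/* Hypothetical allocator wrapper: returns next_free or -ENOSPC. */
static int model_alloc(uint64_t now, int next_free)
{
	assert(next_free >= 0);
	return should_fail(now) ? -ENOSPC : next_free;
}
```

Because the pattern depends only on the tick counter, roughly 40% of allocations fail after the grace period, which exercises the error path without making the system unusable.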

[-- Attachment #3: 0001-ipc-Conserve-sequence-numbers-in-ipcmni_extend-mode.patch --]
[-- Type: text/x-patch, Size: 5210 bytes --]

From edee319b2d5c96af14b8b8899e5dde324861e4e4 Mon Sep 17 00:00:00 2001
From: Manfred Spraul <manfred@colorfullife.com>
Date: Sat, 16 Mar 2019 10:18:53 +0100
Subject: [PATCH] ipc: Conserve sequence numbers in ipcmni_extend mode

Rewrite, based on the patch from Waiman Long:

The sequence number is probably mixed into the IPC IDs to avoid ID
reuse in userspace as much as possible. With ipcmni_extend mode, the
number of usable sequence numbers is greatly reduced, leading to a
higher chance of ID reuse.

To address this issue, we need to conserve the sequence number space
as much as possible. Right now, the sequence number is incremented for
every new ID created. In reality, we only need to increment the sequence
number when the newly allocated ID is not greater than the last one
allocated. Only in that case may the new ID collide with an existing one.
This is done irrespective of the ipcmni mode.

In order to avoid any races, the index is first allocated and
then the pointer is replaced.

Changes compared to the initial patch:
- Handle failures from idr_alloc().
- Prevent concurrent operations from seeing the wrong sequence
  number. (This is achieved by using idr_replace().)
- IPCMNI_SEQ_SHIFT is not a constant, thus renamed to
  ipcmni_seq_shift().
- IPCID_SEQ_MAX is not a constant, thus renamed to
  ipcid_seq_max().

Suggested-by: Matthew Wilcox <willy@infradead.org>
Original-patch-from: Waiman Long <longman@redhat.com>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
 include/linux/ipc_namespace.h |  1 +
 ipc/util.c                    | 35 ++++++++++++++++++++++++++++++-----
 ipc/util.h                    |  8 ++++----
 3 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index 6ab8c1bada3f..c309f43bde45 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -19,6 +19,7 @@ struct ipc_ids {
 	struct rw_semaphore rwsem;
 	struct idr ipcs_idr;
 	int max_idx;
+	int last_idx;	/* For wrap around detection */
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	int next_id;
 #endif
diff --git a/ipc/util.c b/ipc/util.c
index 07ae117ccdc0..6e0fe3410423 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -120,6 +120,7 @@ void ipc_init_ids(struct ipc_ids *ids)
 	rhashtable_init(&ids->key_ht, &ipc_kht_params);
 	idr_init(&ids->ipcs_idr);
 	ids->max_idx = -1;
+	ids->last_idx = -1;
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	ids->next_id = -1;
 #endif
@@ -193,6 +194,10 @@ static struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key)
  *
  * The caller must own kern_ipc_perm.lock.of the new object.
  * On error, the function returns a (negative) error code.
+ *
+ * To conserve sequence number space, especially with extended ipc_mni,
+ * the sequence number is incremented only when the returned ID is not
+ * greater than the last one.
  */
 static inline int ipc_idr_alloc(struct ipc_ids *ids, struct kern_ipc_perm *new)
 {
@@ -216,17 +221,37 @@ static inline int ipc_idr_alloc(struct ipc_ids *ids, struct kern_ipc_perm *new)
 	 */
 
 	if (next_id < 0) { /* !CHECKPOINT_RESTORE or next_id is unset */
-		new->seq = ids->seq++;
-		if (ids->seq > IPCID_SEQ_MAX)
-			ids->seq = 0;
-		idx = idr_alloc(&ids->ipcs_idr, new, 0, 0, GFP_NOWAIT);
+
+		/* allocate the idx, with a NULL struct kern_ipc_perm */
+		idx = idr_alloc(&ids->ipcs_idr, NULL, 0, 0, GFP_NOWAIT);
+
+		if (idx >= 0) {
+			/*
+			 * idx got allocated successfully.
+			 * Now calculate the sequence number and set the
+			 * pointer for real.
+			 */
+			if (idx <= ids->last_idx) {
+				ids->seq++;
+				if (ids->seq >= ipcid_seq_max())
+					ids->seq = 0;
+			}
+			ids->last_idx = idx;
+
+			new->seq = ids->seq;
+			/* no need for smp_wmb(), this is done
+			 * inside idr_replace, as part of
+			 * rcu_assign_pointer
+			 */
+			idr_replace(&ids->ipcs_idr, new, idx);
+		}
 	} else {
 		new->seq = ipcid_to_seqx(next_id);
 		idx = idr_alloc(&ids->ipcs_idr, new, ipcid_to_idx(next_id),
 				0, GFP_NOWAIT);
 	}
 	if (idx >= 0)
-		new->id = (new->seq << IPCMNI_SEQ_SHIFT) + idx;
+		new->id = (new->seq << ipcmni_seq_shift()) + idx;
 	return idx;
 }
 
diff --git a/ipc/util.h b/ipc/util.h
index 9746886757de..8c834ed39012 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -34,13 +34,13 @@
 extern int ipc_mni;
 extern int ipc_mni_shift;
 
-#define IPCMNI_SEQ_SHIFT	ipc_mni_shift
+#define ipcmni_seq_shift()	ipc_mni_shift
 #define IPCMNI_IDX_MASK		((1 << ipc_mni_shift) - 1)
 
 #else /* CONFIG_SYSVIPC_SYSCTL */
 
 #define ipc_mni			IPCMNI
-#define IPCMNI_SEQ_SHIFT	IPCMNI_SHIFT
+#define ipcmni_seq_shift()	IPCMNI_SHIFT
 #define IPCMNI_IDX_MASK		((1 << IPCMNI_SHIFT) - 1)
 #endif /* CONFIG_SYSVIPC_SYSCTL */
 
@@ -123,8 +123,8 @@ struct pid_namespace *ipc_seq_pid_ns(struct seq_file *);
 #define IPC_SHM_IDS	2
 
 #define ipcid_to_idx(id)  ((id) & IPCMNI_IDX_MASK)
-#define ipcid_to_seqx(id) ((id) >> IPCMNI_SEQ_SHIFT)
-#define IPCID_SEQ_MAX	  (INT_MAX >> IPCMNI_SEQ_SHIFT)
+#define ipcid_to_seqx(id) ((id) >> ipcmni_seq_shift())
+#define ipcid_seq_max()	  (INT_MAX >> ipcmni_seq_shift())
 
 /* must be called with ids->rwsem acquired for writing */
 int ipc_addid(struct ipc_ids *, struct kern_ipc_perm *, int);
-- 
2.17.2
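
As a rough userspace illustration of the scheme in the patch above (not kernel code): the ID is built as (seq << shift) + idx, and the sequence number is bumped only when the newly allocated index is not greater than the previous one. `MODEL_SHIFT` is an assumed fixed value of 15, whereas in the patch the shift comes from ipcmni_seq_shift() and may vary; `seq_state`, `model_assign()` and the other `model_*` helpers are hypothetical names. The sketch also assumes a 32-bit int.

```c
#include <assert.h>
#include <limits.h>

#define MODEL_SHIFT 15	/* assumed shift; the kernel value is configurable */

struct seq_state {
	int seq;	/* current sequence number */
	int last_idx;	/* last allocated index, -1 if none yet */
};

static int model_seq_max(void)		{ return INT_MAX >> MODEL_SHIFT; }
static int model_make_id(int seq, int idx) { return (seq << MODEL_SHIFT) + idx; }
static int model_id_to_idx(int id)	{ return id & ((1 << MODEL_SHIFT) - 1); }
static int model_id_to_seq(int id)	{ return id >> MODEL_SHIFT; }

/*
 * Compute the ID for a freshly allocated index. The sequence number
 * advances only on wrap-around, i.e. when idx <= last_idx.
 */
static int model_assign(struct seq_state *st, int idx)
{
	assert(st && idx >= 0 && idx < (1 << MODEL_SHIFT));
	if (idx <= st->last_idx) {
		st->seq++;
		if (st->seq >= model_seq_max())
			st->seq = 0;
	}
	st->last_idx = idx;
	return model_make_id(st->seq, idx);
}
```

Allocating indices 0, 1, 2 in order keeps the sequence number at 0; reusing index 1 afterwards is the first point where an old ID could collide, so only then does the sequence number advance.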


Thread overview: 12+ messages
2019-02-28 18:47 [PATCH v12 0/3] ipc: Increase IPCMNI limit Waiman Long
2019-02-28 18:47 ` [PATCH v12 1/3] ipc: Allow boot time extension of IPCMNI from 32k to 16M Waiman Long
2019-02-28 18:47 ` [PATCH v12 2/3] ipc: Conserve sequence numbers in ipcmni_extend mode Waiman Long
2019-03-16 18:52   ` Manfred Spraul [this message]
2019-03-18 18:57     ` Waiman Long
2019-03-18 19:00     ` Waiman Long
2019-02-28 18:47 ` [PATCH v12 3/3] ipc: Do cyclic id allocation with " Waiman Long
2019-03-17 18:27   ` Manfred Spraul
2019-03-18 18:37     ` Waiman Long
2019-03-18 18:53       ` Waiman Long
     [not found]     ` <728b5e85-3129-9707-3802-306f66093c78@redhat.com>
2019-03-19 18:18       ` Manfred Spraul
2019-03-19 18:46         ` Waiman Long
