All of lore.kernel.org
 help / color / mirror / Atom feed
From: Manfred Spraul <manfred@colorfullife.com>
To: Waiman Long <longman@redhat.com>,
	"Luis R. Rodriguez" <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jonathan Corbet <corbet@lwn.net>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-doc@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
	Matthew Wilcox <willy@infradead.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Takashi Iwai <tiwai@suse.de>, Davidlohr Bueso <dbueso@suse.de>
Subject: Re: [PATCH v12 3/3] ipc: Do cyclic id allocation with ipcmni_extend mode
Date: Tue, 19 Mar 2019 19:18:39 +0100	[thread overview]
Message-ID: <28571549-344f-8423-a20d-aeccff0e838a@colorfullife.com> (raw)
In-Reply-To: <728b5e85-3129-9707-3802-306f66093c78@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 1373 bytes --]

Hi Waiman,


On 3/18/19 7:46 PM, Waiman Long wrote:
> --- a/ipc/util.c
>> +++ b/ipc/util.c
>> @@ -221,9 +221,17 @@ static inline int ipc_idr_alloc(struct ipc_ids *ids, struct kern_ipc_perm *new)
>>   	 */
>>   
>>   	if (next_id < 0) { /* !CHECKPOINT_RESTORE or next_id is unset */
>> +		int max_idx;
>> +
>> +		max_idx = ids->in_use*3/2;
>> +		if (max_idx > ipc_mni)
>> +			max_idx = ipc_mni;
>> +		if (max_idx < ipc_min_cycle)
>> +			max_idx = ipc_min_cycle;
>
> Why don't you use the min() and max() macros which will make it easier 
> to read?
>
Changed.
>>   
>>   		/* allocate the idx, with a NULL struct kern_ipc_perm */
>> -		idx = idr_alloc(&ids->ipcs_idr, NULL, 0, 0, GFP_NOWAIT);
>> +		idx = idr_alloc_cyclic(&ids->ipcs_idr, NULL, 0, max_idx,
>> +					GFP_NOWAIT);
>>   
>>   		if (idx >= 0) {
>>   			/*
>> diff --git a/ipc/util.h b/ipc/util.h
>> index 8c834ed39012..ef4e86bb2db8 100644
>> --- a/ipc/util.h
>> +++ b/ipc/util.h
>> @@ -27,12 +27,14 @@
>>    */
>>   #define IPCMNI_SHIFT		15
>>   #define IPCMNI_EXTEND_SHIFT	24
>> +#define IPCMNI_EXTEND_MIN_CYCLE	(2 << 12)
>
> How about
>
> #define IPCMNI_EXTEND_MIN_CYCLE    (RADIX_TREE_MAP_SIZE * 
> RADIX_TREE_MAP_SIZE)
>
Good idea.
Actually, "2<<12" was the initial guess.

And then I noticed that this ends up as a two level radix tree during 
testing :-)


Updated patch attached.

--

     Manfred


[-- Attachment #2: 0002-ipc-Do-cyclic-id-allocation-for-the-ipc-object.patch --]
[-- Type: text/x-patch, Size: 4890 bytes --]

From 844c9d78cea41983a89c820bd5265ceded59883b Mon Sep 17 00:00:00 2001
From: Manfred Spraul <manfred@colorfullife.com>
Date: Sun, 17 Mar 2019 06:29:00 +0100
Subject: [PATCH 2/2] ipc: Do cyclic id allocation for the ipc object.

For ipcmni_extend mode, the sequence number space is only 7 bits. So
the chance of id reuse is relatively high compared with the non-extended
mode.

To alleviate this id reuse problem, this patch enables cyclic allocation
for the index to the radix tree (idx).
The disadvantage is that this can cause a slight slow-down of the fast
path, as the radix tree could be higher than necessary.

To limit the radix tree height, I have chosen the following limits:
- 1) The cycling is done over in_use*1.5.
- 2) At least, the cycling is done over
   "normal" ipcnmi mode: RADIX_TREE_MAP_SIZE elements
   "ipcmni_extended": 4096 elements

Result:
- for normal mode:
	No change for <= 42 active ipc elements. With more than 42
	active ipc elements, a 2nd level would be added to the radix
	tree.
	Without cyclic allocation, a 2nd level would be added only with
	more than 63 active elements.

- for extended mode:
	Cycling creates always at least a 2-level radix tree.
	With more than 2730 active objects, a 3rd level would be
	added, instead of > 4095 active objects until the 3rd level
	is added without cyclic allocation.

For a 2-level radix tree compared to a 1-level radix tree, I have
observed < 1% performance impact.

Notes:
1) Normal "x=semget();y=semget();" is unaffected: Then the idx
  is e.g. a and a+1, regardless if idr_alloc() or idr_alloc_cyclic()
  is used.

2) The -1% happens in a microbenchmark after this situation:
	x=semget();
	for(i=0;i<4000;i++) {t=semget();semctl(t,0,IPC_RMID);}
	y=semget();
	Now perform semget calls on x and y that do not sleep.

3) The worst-case reuse cycle time is unfortunately unaffected:
   If you have 2^24-1 ipc objects allocated, and get/remove the last
   possible element in a loop, then the id is reused after 128
   get/remove pairs.

Performance check:
A microbenchmark that performes no-op semop() randomly on two IDs,
with only these two IDs allocated.
The IDs were set using /proc/sys/kernel/sem_next_id.
The test was run 5 times, averages are shown.

1 & 2: Base (6.22 seconds for 10.000.000 semops)
1 & 40: -0.2%
1 & 3348: - 0.8%
1 & 27348: - 1.6%
1 & 15777204: - 3.2%

Or: ~12.6 cpu cycles per additional radix tree level.
The cpu is an Intel I3-5010U. ~1300 cpu cycles/syscall is slower
than what I remember (spectre impact?).

V2 of the patch:
- use "min" and "max"
- use RADIX_TREE_MAP_SIZE * RADIX_TREE_MAP_SIZE instead of
	(2<<12).

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
 ipc/ipc_sysctl.c | 2 ++
 ipc/util.c       | 7 ++++++-
 ipc/util.h       | 3 +++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index 73b7782eccf4..bfaae457810c 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -122,6 +122,7 @@ static int one = 1;
 static int int_max = INT_MAX;
 int ipc_mni = IPCMNI;
 int ipc_mni_shift = IPCMNI_SHIFT;
+int ipc_min_cycle = RADIX_TREE_MAP_SIZE;
 
 static struct ctl_table ipc_kern_table[] = {
 	{
@@ -252,6 +253,7 @@ static int __init ipc_mni_extend(char *str)
 {
 	ipc_mni = IPCMNI_EXTEND;
 	ipc_mni_shift = IPCMNI_EXTEND_SHIFT;
+	ipc_min_cycle = IPCMNI_EXTEND_MIN_CYCLE;
 	pr_info("IPCMNI extended to %d.\n", ipc_mni);
 	return 0;
 }
diff --git a/ipc/util.c b/ipc/util.c
index 6e0fe3410423..1a492afb1d8b 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -221,9 +221,14 @@ static inline int ipc_idr_alloc(struct ipc_ids *ids, struct kern_ipc_perm *new)
 	 */
 
 	if (next_id < 0) { /* !CHECKPOINT_RESTORE or next_id is unset */
+		int max_idx;
+
+		max_idx = max(ids->in_use*3/2, ipc_min_cycle);
+		max_idx = min(max_idx, ipc_mni);
 
 		/* allocate the idx, with a NULL struct kern_ipc_perm */
-		idx = idr_alloc(&ids->ipcs_idr, NULL, 0, 0, GFP_NOWAIT);
+		idx = idr_alloc_cyclic(&ids->ipcs_idr, NULL, 0, max_idx,
+					GFP_NOWAIT);
 
 		if (idx >= 0) {
 			/*
diff --git a/ipc/util.h b/ipc/util.h
index 8c834ed39012..d316399f0c32 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -27,12 +27,14 @@
  */
 #define IPCMNI_SHIFT		15
 #define IPCMNI_EXTEND_SHIFT	24
+#define IPCMNI_EXTEND_MIN_CYCLE	(RADIX_TREE_MAP_SIZE * RADIX_TREE_MAP_SIZE)
 #define IPCMNI			(1 << IPCMNI_SHIFT)
 #define IPCMNI_EXTEND		(1 << IPCMNI_EXTEND_SHIFT)
 
 #ifdef CONFIG_SYSVIPC_SYSCTL
 extern int ipc_mni;
 extern int ipc_mni_shift;
+extern int ipc_min_cycle;
 
 #define ipcmni_seq_shift()	ipc_mni_shift
 #define IPCMNI_IDX_MASK		((1 << ipc_mni_shift) - 1)
@@ -40,6 +42,7 @@ extern int ipc_mni_shift;
 #else /* CONFIG_SYSVIPC_SYSCTL */
 
 #define ipc_mni			IPCMNI
+#define ipc_min_cycle		RADIX_TREE_MAP_SIZE
 #define ipcmni_seq_shift()	IPCMNI_SHIFT
 #define IPCMNI_IDX_MASK		((1 << IPCMNI_SHIFT) - 1)
 #endif /* CONFIG_SYSVIPC_SYSCTL */
-- 
2.17.2


  parent reply	other threads:[~2019-03-19 18:18 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-28 18:47 [PATCH v12 0/3] ipc: Increase IPCMNI limit Waiman Long
2019-02-28 18:47 ` [PATCH v12 1/3] ipc: Allow boot time extension of IPCMNI from 32k to 16M Waiman Long
2019-02-28 18:47 ` [PATCH v12 2/3] ipc: Conserve sequence numbers in ipcmni_extend mode Waiman Long
2019-03-16 18:52   ` Manfred Spraul
2019-03-18 18:57     ` Waiman Long
2019-03-18 19:00     ` Waiman Long
2019-02-28 18:47 ` [PATCH v12 3/3] ipc: Do cyclic id allocation with " Waiman Long
2019-03-17 18:27   ` Manfred Spraul
2019-03-18 18:37     ` Waiman Long
2019-03-18 18:53       ` Waiman Long
     [not found]     ` <728b5e85-3129-9707-3802-306f66093c78@redhat.com>
2019-03-19 18:18       ` Manfred Spraul [this message]
2019-03-19 18:46         ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=28571549-344f-8423-a20d-aeccff0e838a@colorfullife.com \
    --to=manfred@colorfullife.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=dbueso@suse.de \
    --cc=ebiederm@xmission.com \
    --cc=keescook@chromium.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mcgrof@kernel.org \
    --cc=tiwai@suse.de \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.