linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 00/10] implement alternative and much simpler id allocator
@ 2016-12-08  1:22 Rasmus Villemoes
  2016-12-08  1:22 ` [RFC 01/10] lib/idr.c: reused free bitmaps are already clear Rasmus Villemoes
                   ` (11 more replies)
  0 siblings, 12 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:22 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes, linux-block, dri-devel

TL;DR: these patches save 250 KB of memory, with more low-hanging
fruit ready to pick.

While browsing through the lib/idr.c code, I noticed that the code at
the end of ida_get_new_above() probably doesn't work as intended: Most
users of ida use it via ida_simple_get(), and that starts by
unconditionally calling ida_pre_get(), ensuring that ida->idr has
8==MAX_IDR_FREE idr_layers in its free list id_free. In the common
case, none (or at most one) of these get used during
ida_get_new_above(), and we only free one, leaving at least 6 (usually
7) idr_layers in the free list.

The comment suggests that the idea is that subsequent allocations
would get rid of these one by one, but this doesn't work due to the
unconditional filling done by ida_simple_get - and callers who do
their own locking presumably call ida_pre_get themselves, with the
same effect.

To confirm my suspicion, and to see how many idas are actually in use
and how much they're utilized, I hacked together a /proc/ida file to
provide some stats on the currently active idas. Some examples of
lines from that on my laptop:

file:line	gets	puts	diff	layers	id_free_cnt
block/blk-core.c:49	27	0	27	1	7
drivers/base/platform.c:35	1	0	1	1	6
drivers/gpu/drm/drm_crtc.c:201	0	0	0	0	0
drivers/gpu/drm/drm_crtc.c:201	0	0	0	0	0
...
drivers/gpu/drm/drm_crtc.c:201	1	0	1	1	6
drivers/gpu/drm/drm_crtc.c:201	2	0	2	1	7
drivers/gpu/drm/drm_crtc.c:5700	5	0	5	1	7
...
drivers/scsi/hosts.c:45	6	0	6	1	7
drivers/scsi/sd.c:123	1	0	1	1	6
fs/devpts/inode.c:386	57	40	17	1	7
fs/kernfs/dir.c:918	10	0	10	1	7
fs/kernfs/dir.c:918	11	0	11	1	7
...
fs/kernfs/dir.c:918	24799	1609	23190	1	7
...
fs/super.c:882	49	5	44	1	7
kernel/workqueue.c:3198	10	8	2	1	7
kernel/workqueue.c:3198	11	9	2	1	7
kernel/workqueue.c:3198	2	0	2	1	7
kernel/workqueue.c:3198	2	0	2	1	7
kernel/workqueue.c:3198	2	0	2	1	7
kernel/workqueue.c:3198	94	91	3	1	7

file:line is the location of the ida_init or DEFINE_IDA, gets and puts
are the lifetime number of allocations/frees - their difference is
thus the current number of allocated ids. layers and id_free_cnt are
the members of ida->idr.

As expected, there are 0, 6 or 7 idr_layers cached and unused in each
ida, depending on whether there have been 0, 1 or more allocations
done. Add the idr_layer which is in use, and we see that e.g. the
workqueue instances use more than 6*8*sizeof(struct idr_layer) ~ 100K
of memory, or almost 1K per allocated id.

Patches 1 and 2 are minor optimization opportunities, while patch 3 is
an attempt at making the 'free the extra idr_layers one at a time'
actually work. But it's not a very good solution, since it doesn't
help the users who never do more than a handful of allocations, nor
does it help those who call ida_pre_get/ida_get_new
directly. Moreover, even if we somehow had perfect heuristics and got
rid of all excess idr_layers and ida_bitmap (another 128 bytes we
usually have hanging around), the minimum overhead of sizeof(struct
idr_layer)+sizeof(struct ida_bitmap) ~ 2200 bytes is quite a lot for
the many ida users who never allocate more than 100 ids.

So instead/in addition, I suggest we introduce a much simpler
allocator based on a dynamically growing bitmap. Patches 4-10
introduce struct tida and has a few examples of replacing ida with
tida. [Yeah, tida is not a good name, and probably _get and _put are also
bad.]

I've booted a 4.8.12 kernel with these backported, and as expected
they save around 250KB of memory. There's more to gain, but this is
just an RFC.



Rasmus Villemoes (10):
  lib/idr.c: reused free bitmaps are already clear
  lib/idr.c: delete useless condition
  lib/idr.c: only fill ida->idr when needed
  lib/tida.c: a very simple integer id allocator
  kernel/workqueue.c: replace id allocator ida with tida
  block: use tida as small id allocator
  drivers/base/platform.c: use simpler id allocator
  lib/tida.c: introduce tida_get_above
  drm: use simpler id allocator
  fs/devpts: use tida for id allocation

 block/blk-core.c                |   6 +--
 block/blk-sysfs.c               |   2 +-
 block/blk.h                     |   4 +-
 drivers/base/platform.c         |  10 ++--
 drivers/gpu/drm/drm_connector.c |  21 ++++----
 drivers/gpu/drm/drm_crtc.c      |   4 +-
 fs/devpts/inode.c               |  28 ++++-------
 include/drm/drm_crtc.h          |   3 +-
 include/linux/tida.h            |  32 ++++++++++++
 kernel/workqueue.c              |  14 +++---
 lib/Makefile                    |   2 +-
 lib/idr.c                       |  13 +++--
 lib/tida.c                      | 107 ++++++++++++++++++++++++++++++++++++++++
 13 files changed, 188 insertions(+), 58 deletions(-)
 create mode 100644 include/linux/tida.h
 create mode 100644 lib/tida.c

-- 
2.1.4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC 01/10] lib/idr.c: reused free bitmaps are already clear
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
@ 2016-12-08  1:22 ` Rasmus Villemoes
  2016-12-08  1:22 ` [RFC 02/10] lib/idr.c: delete useless condition Rasmus Villemoes
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:22 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes

If we're recycling a previously used struct ida_bitmap, it ended up in
->free_bitmap precisely because it was all zeroes. Thus, by using
kzalloc in ida_pre_get we can avoid doing the memset while holding
whatever locks protects the ida (which, judging by the number of
callers of ida_simple_get vs ida_get_new_above, is usually always the
global simple_ida_lock).

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 lib/idr.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/idr.c b/lib/idr.c
index 6098336df267..9cbfae251d77 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -903,7 +903,7 @@ int ida_pre_get(struct ida *ida, gfp_t gfp_mask)
 	if (!ida->free_bitmap) {
 		struct ida_bitmap *bitmap;
 
-		bitmap = kmalloc(sizeof(struct ida_bitmap), gfp_mask);
+		bitmap = kzalloc(sizeof(struct ida_bitmap), gfp_mask);
 		if (!bitmap)
 			return 0;
 
@@ -962,7 +962,6 @@ int ida_get_new_above(struct ida *ida, int starting_id, int *p_id)
 		if (!bitmap)
 			return -EAGAIN;
 
-		memset(bitmap, 0, sizeof(struct ida_bitmap));
 		rcu_assign_pointer(pa[0]->ary[idr_id & IDR_MASK],
 				(void *)bitmap);
 		pa[0]->count++;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC 02/10] lib/idr.c: delete useless condition
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
  2016-12-08  1:22 ` [RFC 01/10] lib/idr.c: reused free bitmaps are already clear Rasmus Villemoes
@ 2016-12-08  1:22 ` Rasmus Villemoes
  2016-12-08  1:22 ` [RFC 03/10] lib/idr.c: only fill ida->idr when needed Rasmus Villemoes
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:22 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes

Whether or not we have a spare ida_bitmap hanging off ida->free_bitmap
doesn't seem related to whether the embedded struct idr may have a
spare struct idr_layer in its free list. So the only thing this
condition does is increase the chance that we end up calling
get_from_free_list in vain.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 lib/idr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/idr.c b/lib/idr.c
index 9cbfae251d77..1e786f817e66 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -991,7 +991,7 @@ int ida_get_new_above(struct ida *ida, int starting_id, int *p_id)
 	 * Throw away extra resources one by one after each successful
 	 * allocation.
 	 */
-	if (ida->idr.id_free_cnt || ida->free_bitmap) {
+	if (ida->idr.id_free_cnt) {
 		struct idr_layer *p = get_from_free_list(&ida->idr);
 		if (p)
 			kmem_cache_free(idr_layer_cache, p);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC 03/10] lib/idr.c: only fill ida->idr when needed
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
  2016-12-08  1:22 ` [RFC 01/10] lib/idr.c: reused free bitmaps are already clear Rasmus Villemoes
  2016-12-08  1:22 ` [RFC 02/10] lib/idr.c: delete useless condition Rasmus Villemoes
@ 2016-12-08  1:22 ` Rasmus Villemoes
  2016-12-08  1:22 ` [RFC 04/10] lib/tida.c: a very simple integer id allocator Rasmus Villemoes
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:22 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes

If I'm reading the code correctly, every call of ida_simple_get (and
almost all users of ida use that) starts by calling

  ida_pre_get -> __idr_pre_get

which fills up the free list of &ida->idr with MAX_IDR_FREE (aka 8)
struct idr_layers. After a successful id allocation, ida_get_new_above
gets rid of up to one of these (maybe some were used during the
allocation, but in the common case all 8 will still be there). On the
next call of ida_simple_get, we again fill up &ida->idr, and
deallocate at most one, which at least contradicts the comment at the
bottom of ida_get_new_above.

So rather than unconditionally calling ida_pre_get in ida_simple_get,
only do that if we get -EAGAIN.

But I'm not really convinced that this is a good way of doing
things. Many users of ida probably only ever allocates one or two ids,
so those would not hit the "free the excess idr_layers" often enough
to actually get rid of the extra memory. In practice, those two bits
could cost around 16KB of memory. But maybe the users shouldn't be
using ida in the first place, then. (For example, the watchdog code
always passes max=MAX_DOGS==32, so they'd be much better served with a
small fixed bitmap).

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 lib/idr.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/idr.c b/lib/idr.c
index 1e786f817e66..6393feb4b087 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -1092,9 +1092,6 @@ int ida_simple_get(struct ida *ida, unsigned int start, unsigned int end,
 	}
 
 again:
-	if (!ida_pre_get(ida, gfp_mask))
-		return -ENOMEM;
-
 	spin_lock_irqsave(&simple_ida_lock, flags);
 	ret = ida_get_new_above(ida, start, &id);
 	if (!ret) {
@@ -1107,8 +1104,11 @@ int ida_simple_get(struct ida *ida, unsigned int start, unsigned int end,
 	}
 	spin_unlock_irqrestore(&simple_ida_lock, flags);
 
-	if (unlikely(ret == -EAGAIN))
+	if (unlikely(ret == -EAGAIN)) {
+		if (!ida_pre_get(ida, gfp_mask))
+			return -ENOMEM;
 		goto again;
+	}
 
 	return ret;
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC 04/10] lib/tida.c: a very simple integer id allocator
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
                   ` (2 preceding siblings ...)
  2016-12-08  1:22 ` [RFC 03/10] lib/idr.c: only fill ida->idr when needed Rasmus Villemoes
@ 2016-12-08  1:22 ` Rasmus Villemoes
  2016-12-08  1:23 ` [RFC 05/10] kernel/workqueue.c: replace id allocator ida with tida Rasmus Villemoes
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:22 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes

The idr-based id allocator ida has a rather large memory footprint
(despite claims to the contrary). For example, on my laptop, each of
the six idas associated to the workqueues uses more than 16 KB of
memory (most of which sits unused in the embedded idr structure), even
though _combined_ they have only handed out a little over 100 ids in
their lifetime. That's about 1KB of overhead per id, which is rather
far from "each id only occupies a bit".

This introduces tida (t for tiny, trivial, whatever). Initially we
only support allocating the smallest available id, but this can
already replace many uses of ida throughout the tree.

It shouldn't be that hard to implement e.g. tida_get_min() or
tida_get_exact() which might make this capable of replacing more ida
instances.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 include/linux/tida.h |  26 +++++++++++++
 lib/Makefile         |   2 +-
 lib/tida.c           | 103 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 130 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/tida.h
 create mode 100644 lib/tida.c

diff --git a/include/linux/tida.h b/include/linux/tida.h
new file mode 100644
index 000000000000..9aa3ad96a632
--- /dev/null
+++ b/include/linux/tida.h
@@ -0,0 +1,26 @@
+#ifndef __LINUX_TIDA_H__
+#define __LINUX_TIDA_H__
+
+#include <linux/types.h>
+#include <linux/spinlock.h>
+
+struct tida {
+	unsigned long	*bits;
+	unsigned long	alloc;
+	spinlock_t	lock;
+	int		hint;
+};
+
+#define TIDA_INIT(name)				\
+	{ .lock = __SPIN_LOCK_UNLOCKED(name.lock), }
+
+#define DEFINE_TIDA(name) struct tida name = TIDA_INIT(name)
+
+void tida_init(struct tida *tida);
+void tida_destroy(struct tida *tida);
+
+int tida_get(struct tida *tida, gfp_t gfp);
+void tida_put(struct tida *tida, int id);
+
+
+#endif /* __LINUX_TIDA_H__ */
diff --git a/lib/Makefile b/lib/Makefile
index 50144a3aeebd..e6f5b9896bbd 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -18,7 +18,7 @@ KCOV_INSTRUMENT_dynamic_debug.o := n
 
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 rbtree.o radix-tree.o dump_stack.o timerqueue.o\
-	 idr.o int_sqrt.o extable.o \
+	 idr.o int_sqrt.o extable.o tida.o \
 	 sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
 	 flex_proportions.o ratelimit.o show_mem.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
diff --git a/lib/tida.c b/lib/tida.c
new file mode 100644
index 000000000000..1ea0deb6fa64
--- /dev/null
+++ b/lib/tida.c
@@ -0,0 +1,103 @@
+#include <linux/bitmap.h>
+#include <linux/bitops.h>
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/tida.h>
+
+/*
+ * An extremely simple integer id allocator with small memory
+ * footprint, mostly useful for cases where up to a few hundred ids
+ * get allocated.
+ *
+ * Note that the backing bitmap is never shrunk.
+ */
+
+/*
+ * Invariant:
+ *
+ *   0 <= tida->hint
+ *     <= find_next_zero_bit(tida->bits, tida->alloc, 0)
+ *     <= tida->alloc.
+ */
+
+static int
+tida_expand(struct tida *tida, gfp_t gfp, unsigned long *flags)
+	__releases(tida->lock)
+	__acquires(tida->lock)
+{
+	unsigned long newalloc, oldalloc = tida->alloc;
+	unsigned long *bits;
+
+	newalloc = oldalloc ? 2 * oldalloc : BITS_PER_LONG;
+
+	spin_unlock_irqrestore(&tida->lock, *flags);
+	bits = kcalloc(BITS_TO_LONGS(newalloc), sizeof(*bits), gfp);
+	spin_lock_irqsave(&tida->lock, *flags);
+
+	if (!bits)
+		return -ENOMEM;
+
+	if (tida->alloc < newalloc) {
+		memcpy(bits, tida->bits, tida->alloc/8);
+		tida->alloc = newalloc;
+		swap(tida->bits, bits);
+	}
+	kfree(bits);
+
+	return 0;
+}
+
+int
+tida_get(struct tida *tida, gfp_t gfp)
+{
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&tida->lock, flags);
+	while (1) {
+		/* find_next_zero_bit is fine with a NULL bitmap as long as size is 0 */
+		ret = find_next_zero_bit(tida->bits, tida->alloc, tida->hint);
+		if (ret < tida->alloc)
+			break;
+		ret = tida_expand(tida, gfp, &flags);
+		if (ret < 0)
+			goto out;
+	}
+
+	__set_bit(ret, tida->bits);
+	tida->hint = ret+1;
+out:
+	spin_unlock_irqrestore(&tida->lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(tida_get);
+
+void
+tida_put(struct tida *tida, int id)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&tida->lock, flags);
+	__clear_bit(id, tida->bits);
+	if (id < tida->hint)
+		tida->hint = id;
+	spin_unlock_irqrestore(&tida->lock, flags);
+}
+EXPORT_SYMBOL_GPL(tida_put);
+
+void
+tida_init(struct tida *tida)
+{
+	memset(tida, 0, sizeof(*tida));
+	spin_lock_init(&tida->lock);
+}
+EXPORT_SYMBOL_GPL(tida_init);
+
+void
+tida_destroy(struct tida *tida)
+{
+	kfree(tida->bits);
+}
+EXPORT_SYMBOL_GPL(tida_destroy);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC 05/10] kernel/workqueue.c: replace id allocator ida with tida
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
                   ` (3 preceding siblings ...)
  2016-12-08  1:22 ` [RFC 04/10] lib/tida.c: a very simple integer id allocator Rasmus Villemoes
@ 2016-12-08  1:23 ` Rasmus Villemoes
  2016-12-08  1:23 ` [RFC 06/10] block: use tida as small id allocator Rasmus Villemoes
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:23 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes

The implementation of the id allocator ida causes about 14 KB of
unused memory to sit around in the embedded struct idr. Even if one
could get rid of that, ida would still use at least one idr_layer
worth 2 KB of memory as well as one idr_bitmap worth another 128
bytes, which is all a bit much if we only create, say, a few 100
workers.

Using the much simpler tida saves around 100 KB of memory.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 kernel/workqueue.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 479d840db286..6fef51ac1ad7 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -41,7 +41,7 @@
 #include <linux/kallsyms.h>
 #include <linux/debug_locks.h>
 #include <linux/lockdep.h>
-#include <linux/idr.h>
+#include <linux/tida.h>
 #include <linux/jhash.h>
 #include <linux/hashtable.h>
 #include <linux/rculist.h>
@@ -171,7 +171,7 @@ struct worker_pool {
 	struct list_head	workers;	/* A: attached workers */
 	struct completion	*detach_completion; /* all workers detached */
 
-	struct ida		worker_ida;	/* worker IDs for task name */
+	struct tida		worker_ida;	/* worker IDs for task name */
 
 	struct workqueue_attrs	*attrs;		/* I: worker attributes */
 	struct hlist_node	hash_node;	/* PL: unbound_pool_hash node */
@@ -1757,7 +1757,7 @@ static struct worker *create_worker(struct worker_pool *pool)
 	char id_buf[16];
 
 	/* ID is needed to determine kthread name */
-	id = ida_simple_get(&pool->worker_ida, 0, 0, GFP_KERNEL);
+	id = tida_get(&pool->worker_ida, GFP_KERNEL);
 	if (id < 0)
 		goto fail;
 
@@ -1796,7 +1796,7 @@ static struct worker *create_worker(struct worker_pool *pool)
 
 fail:
 	if (id >= 0)
-		ida_simple_remove(&pool->worker_ida, id);
+		tida_put(&pool->worker_ida, id);
 	kfree(worker);
 	return NULL;
 }
@@ -2186,7 +2186,7 @@ static int worker_thread(void *__worker)
 		worker->task->flags &= ~PF_WQ_WORKER;
 
 		set_task_comm(worker->task, "kworker/dying");
-		ida_simple_remove(&pool->worker_ida, worker->id);
+		tida_put(&pool->worker_ida, worker->id);
 		worker_detach_from_pool(worker, pool);
 		kfree(worker);
 		return 0;
@@ -3207,7 +3207,7 @@ static int init_worker_pool(struct worker_pool *pool)
 	mutex_init(&pool->attach_mutex);
 	INIT_LIST_HEAD(&pool->workers);
 
-	ida_init(&pool->worker_ida);
+	tida_init(&pool->worker_ida);
 	INIT_HLIST_NODE(&pool->hash_node);
 	pool->refcnt = 1;
 
@@ -3236,7 +3236,7 @@ static void rcu_free_pool(struct rcu_head *rcu)
 {
 	struct worker_pool *pool = container_of(rcu, struct worker_pool, rcu);
 
-	ida_destroy(&pool->worker_ida);
+	tida_destroy(&pool->worker_ida);
 	free_workqueue_attrs(pool->attrs);
 	kfree(pool);
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC 06/10] block: use tida as small id allocator
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
                   ` (4 preceding siblings ...)
  2016-12-08  1:23 ` [RFC 05/10] kernel/workqueue.c: replace id allocator ida with tida Rasmus Villemoes
@ 2016-12-08  1:23 ` Rasmus Villemoes
  2016-12-08  3:56   ` Jens Axboe
  2016-12-08  1:23 ` [RFC 07/10] drivers/base/platform.c: use simpler " Rasmus Villemoes
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:23 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes, linux-block

A struct ida ends up costing > 16 KB of runtime memory, which is quite
a lot for something which on my laptop as of this writing has handed
out 27 ids in its lifetime. So use the simpler and lighter-weight
struct tida.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 block/blk-core.c  | 6 +++---
 block/blk-sysfs.c | 2 +-
 block/blk.h       | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 14d7c0740dc0..6919a519d797 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -46,7 +46,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_complete);
 EXPORT_TRACEPOINT_SYMBOL_GPL(block_split);
 EXPORT_TRACEPOINT_SYMBOL_GPL(block_unplug);
 
-DEFINE_IDA(blk_queue_ida);
+DEFINE_TIDA(blk_queue_ida);
 
 /*
  * For the allocated request tables
@@ -699,7 +699,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 	if (!q)
 		return NULL;
 
-	q->id = ida_simple_get(&blk_queue_ida, 0, 0, gfp_mask);
+	q->id = tida_get(&blk_queue_ida, gfp_mask);
 	if (q->id < 0)
 		goto fail_q;
 
@@ -771,7 +771,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 fail_split:
 	bioset_free(q->bio_split);
 fail_id:
-	ida_simple_remove(&blk_queue_ida, q->id);
+	tida_put(&blk_queue_ida, q->id);
 fail_q:
 	kmem_cache_free(blk_requestq_cachep, q);
 	return NULL;
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 9cc8d7c5439a..5a04dd6cb20d 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -652,7 +652,7 @@ static void blk_release_queue(struct kobject *kobj)
 	if (q->bio_split)
 		bioset_free(q->bio_split);
 
-	ida_simple_remove(&blk_queue_ida, q->id);
+	tida_put(&blk_queue_ida, q->id);
 	call_rcu(&q->rcu_head, blk_free_queue_rcu);
 }
 
diff --git a/block/blk.h b/block/blk.h
index 74444c49078f..a0a606efed91 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -1,7 +1,7 @@
 #ifndef BLK_INTERNAL_H
 #define BLK_INTERNAL_H
 
-#include <linux/idr.h>
+#include <linux/tida.h>
 #include <linux/blk-mq.h>
 #include "blk-mq.h"
 
@@ -34,7 +34,7 @@ struct blk_flush_queue {
 extern struct kmem_cache *blk_requestq_cachep;
 extern struct kmem_cache *request_cachep;
 extern struct kobj_type blk_queue_ktype;
-extern struct ida blk_queue_ida;
+extern struct tida blk_queue_ida;
 
 static inline struct blk_flush_queue *blk_get_flush_queue(
 		struct request_queue *q, struct blk_mq_ctx *ctx)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC 07/10] drivers/base/platform.c: use simpler id allocator
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
                   ` (5 preceding siblings ...)
  2016-12-08  1:23 ` [RFC 06/10] block: use tida as small id allocator Rasmus Villemoes
@ 2016-12-08  1:23 ` Rasmus Villemoes
  2016-12-08  1:23 ` [RFC 08/10] lib/tida.c: introduce tida_get_above Rasmus Villemoes
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:23 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes

Use the newly introduced tida allocator, which saves about 16 KB of memory.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/base/platform.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index c4af00385502..3177222b28dc 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -22,7 +22,7 @@
 #include <linux/slab.h>
 #include <linux/pm_runtime.h>
 #include <linux/pm_domain.h>
-#include <linux/idr.h>
+#include <linux/tida.h>
 #include <linux/acpi.h>
 #include <linux/clk/clk-conf.h>
 #include <linux/limits.h>
@@ -32,7 +32,7 @@
 #include "power/power.h"
 
 /* For automatically allocated device IDs */
-static DEFINE_IDA(platform_devid_ida);
+static DEFINE_TIDA(platform_devid_ida);
 
 struct device platform_bus = {
 	.init_name	= "platform",
@@ -372,7 +372,7 @@ int platform_device_add(struct platform_device *pdev)
 		 * that we remember it must be freed, and we append a suffix
 		 * to avoid namespace collision with explicit IDs.
 		 */
-		ret = ida_simple_get(&platform_devid_ida, 0, 0, GFP_KERNEL);
+		ret = tida_get(&platform_devid_ida, GFP_KERNEL);
 		if (ret < 0)
 			goto err_out;
 		pdev->id = ret;
@@ -411,7 +411,7 @@ int platform_device_add(struct platform_device *pdev)
 
  failed:
 	if (pdev->id_auto) {
-		ida_simple_remove(&platform_devid_ida, pdev->id);
+		tida_put(&platform_devid_ida, pdev->id);
 		pdev->id = PLATFORM_DEVID_AUTO;
 	}
 
@@ -443,7 +443,7 @@ void platform_device_del(struct platform_device *pdev)
 		device_del(&pdev->dev);
 
 		if (pdev->id_auto) {
-			ida_simple_remove(&platform_devid_ida, pdev->id);
+			tida_put(&platform_devid_ida, pdev->id);
 			pdev->id = PLATFORM_DEVID_AUTO;
 		}
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC 08/10] lib/tida.c: introduce tida_get_above
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
                   ` (6 preceding siblings ...)
  2016-12-08  1:23 ` [RFC 07/10] drivers/base/platform.c: use simpler " Rasmus Villemoes
@ 2016-12-08  1:23 ` Rasmus Villemoes
  2016-12-08  1:23 ` [RFC 09/10] drm: use simpler id allocator Rasmus Villemoes
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:23 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes

Some potential users want to impose a minimum on the returned
id. Extend tida_get to accept a start parameter, renaming it to
tida_get_above, and make tida_get a trivial wrapper.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 include/linux/tida.h |  8 +++++++-
 lib/tida.c           | 22 +++++++++++++---------
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/tida.h b/include/linux/tida.h
index 9aa3ad96a632..a76fe01bee0b 100644
--- a/include/linux/tida.h
+++ b/include/linux/tida.h
@@ -19,8 +19,14 @@ struct tida {
 void tida_init(struct tida *tida);
 void tida_destroy(struct tida *tida);
 
-int tida_get(struct tida *tida, gfp_t gfp);
+int tida_get_above(struct tida *tida, int start, gfp_t gfp);
 void tida_put(struct tida *tida, int id);
 
+static inline int
+tida_get(struct tida *tida, gfp_t gfp)
+{
+	return tida_get_above(tida, 0, gfp);
+}
+
 
 #endif /* __LINUX_TIDA_H__ */
diff --git a/lib/tida.c b/lib/tida.c
index 1ea0deb6fa64..0d43b207325a 100644
--- a/lib/tida.c
+++ b/lib/tida.c
@@ -23,16 +23,15 @@
  */
 
 static int
-tida_expand(struct tida *tida, gfp_t gfp, unsigned long *flags)
+tida_expand(struct tida *tida, gfp_t gfp, unsigned long *flags, unsigned long minalloc)
 	__releases(tida->lock)
 	__acquires(tida->lock)
 {
 	unsigned long newalloc, oldalloc = tida->alloc;
 	unsigned long *bits;
 
-	newalloc = oldalloc ? 2 * oldalloc : BITS_PER_LONG;
-
 	spin_unlock_irqrestore(&tida->lock, *flags);
+	newalloc = max(2*oldalloc, round_up(minalloc, BITS_PER_LONG));
 	bits = kcalloc(BITS_TO_LONGS(newalloc), sizeof(*bits), gfp);
 	spin_lock_irqsave(&tida->lock, *flags);
 
@@ -50,29 +49,34 @@ tida_expand(struct tida *tida, gfp_t gfp, unsigned long *flags)
 }
 
 int
-tida_get(struct tida *tida, gfp_t gfp)
+tida_get_above(struct tida *tida, int start, gfp_t gfp)
 {
 	unsigned long flags;
-	int ret;
+	int ret, from;
+
+	if (WARN_ON_ONCE(start < 0))
+		return -EINVAL;
 
 	spin_lock_irqsave(&tida->lock, flags);
 	while (1) {
 		/* find_next_zero_bit is fine with a NULL bitmap as long as size is 0 */
-		ret = find_next_zero_bit(tida->bits, tida->alloc, tida->hint);
+		from = max(start, tida->hint);
+		ret = find_next_zero_bit(tida->bits, tida->alloc, from);
 		if (ret < tida->alloc)
 			break;
-		ret = tida_expand(tida, gfp, &flags);
+		ret = tida_expand(tida, gfp, &flags, from + 1);
 		if (ret < 0)
 			goto out;
 	}
 
 	__set_bit(ret, tida->bits);
-	tida->hint = ret+1;
+	if (start <= tida->hint)
+		tida->hint = ret + 1;
 out:
 	spin_unlock_irqrestore(&tida->lock, flags);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(tida_get);
+EXPORT_SYMBOL_GPL(tida_get_above);
 
 void
 tida_put(struct tida *tida, int id)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC 09/10] drm: use simpler id allocator
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
                   ` (7 preceding siblings ...)
  2016-12-08  1:23 ` [RFC 08/10] lib/tida.c: introduce tida_get_above Rasmus Villemoes
@ 2016-12-08  1:23 ` Rasmus Villemoes
  2016-12-08  1:23 ` [RFC 10/10] fs/devpts: use tida for id allocation Rasmus Villemoes
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:23 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes, dri-devel

Using the recently introduced "tida" allocator for small integer ids
saves about 100 KB of memory on my laptop - every struct ida from
which a single id has been allocated uses at least 16 KB of memory due
to the pre-allocation/caching of struct idr_layers (each worth a
little over 2K).

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/gpu/drm/drm_connector.c | 21 ++++++++++-----------
 drivers/gpu/drm/drm_crtc.c      |  4 ++--
 include/drm/drm_crtc.h          |  3 ++-
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 2db7fb510b6c..70e5f3b84f2a 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -61,7 +61,7 @@
 struct drm_conn_prop_enum_list {
 	int type;
 	const char *name;
-	struct ida ida;
+	struct tida ida;
 };
 
 /*
@@ -93,7 +93,7 @@ void drm_connector_ida_init(void)
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(drm_connector_enum_list); i++)
-		ida_init(&drm_connector_enum_list[i].ida);
+		tida_init(&drm_connector_enum_list[i].ida);
 }
 
 void drm_connector_ida_destroy(void)
@@ -101,7 +101,7 @@ void drm_connector_ida_destroy(void)
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(drm_connector_enum_list); i++)
-		ida_destroy(&drm_connector_enum_list[i].ida);
+		tida_destroy(&drm_connector_enum_list[i].ida);
 }
 
 /**
@@ -186,7 +186,7 @@ int drm_connector_init(struct drm_device *dev,
 {
 	struct drm_mode_config *config = &dev->mode_config;
 	int ret;
-	struct ida *connector_ida =
+	struct tida *connector_ida =
 		&drm_connector_enum_list[connector_type].ida;
 
 	drm_modeset_lock_all(dev);
@@ -201,7 +201,7 @@ int drm_connector_init(struct drm_device *dev,
 	connector->dev = dev;
 	connector->funcs = funcs;
 
-	ret = ida_simple_get(&config->connector_ida, 0, 0, GFP_KERNEL);
+	ret = tida_get(&config->connector_ida, GFP_KERNEL);
 	if (ret < 0)
 		goto out_put;
 	connector->index = ret;
@@ -209,7 +209,7 @@ int drm_connector_init(struct drm_device *dev,
 
 	connector->connector_type = connector_type;
 	connector->connector_type_id =
-		ida_simple_get(connector_ida, 1, 0, GFP_KERNEL);
+		tida_get_above(connector_ida, 1, GFP_KERNEL);
 	if (connector->connector_type_id < 0) {
 		ret = connector->connector_type_id;
 		goto out_put_id;
@@ -250,10 +250,10 @@ int drm_connector_init(struct drm_device *dev,
 	connector->debugfs_entry = NULL;
 out_put_type_id:
 	if (ret)
-		ida_simple_remove(connector_ida, connector->connector_type_id);
+		tida_put(connector_ida, connector->connector_type_id);
 out_put_id:
 	if (ret)
-		ida_simple_remove(&config->connector_ida, connector->index);
+		tida_put(&config->connector_ida, connector->index);
 out_put:
 	if (ret)
 		drm_mode_object_unregister(dev, &connector->base);
@@ -341,11 +341,10 @@ void drm_connector_cleanup(struct drm_connector *connector)
 	list_for_each_entry_safe(mode, t, &connector->modes, head)
 		drm_mode_remove(connector, mode);
 
-	ida_simple_remove(&drm_connector_enum_list[connector->connector_type].ida,
+	tida_put(&drm_connector_enum_list[connector->connector_type].ida,
 			  connector->connector_type_id);
 
-	ida_simple_remove(&dev->mode_config.connector_ida,
-			  connector->index);
+	tida_put(&dev->mode_config.connector_ida, connector->index);
 
 	kfree(connector->display_info.bus_formats);
 	drm_mode_object_unregister(dev, &connector->base);
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index 2d7bedf28647..c38cda9bdf09 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -1075,7 +1075,7 @@ void drm_mode_config_init(struct drm_device *dev)
 	INIT_LIST_HEAD(&dev->mode_config.plane_list);
 	idr_init(&dev->mode_config.crtc_idr);
 	idr_init(&dev->mode_config.tile_idr);
-	ida_init(&dev->mode_config.connector_ida);
+	tida_init(&dev->mode_config.connector_ida);
 
 	drm_modeset_lock_all(dev);
 	drm_mode_create_standard_properties(dev);
@@ -1156,7 +1156,7 @@ void drm_mode_config_cleanup(struct drm_device *dev)
 		drm_framebuffer_free(&fb->base.refcount);
 	}
 
-	ida_destroy(&dev->mode_config.connector_ida);
+	tida_destroy(&dev->mode_config.connector_ida);
 	idr_destroy(&dev->mode_config.tile_idr);
 	idr_destroy(&dev->mode_config.crtc_idr);
 	drm_modeset_lock_fini(&dev->mode_config.connection_mutex);
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
index 0aa292526567..3f5255b801e2 100644
--- a/include/drm/drm_crtc.h
+++ b/include/drm/drm_crtc.h
@@ -29,6 +29,7 @@
 #include <linux/spinlock.h>
 #include <linux/types.h>
 #include <linux/idr.h>
+#include <linux/tida.h>
 #include <linux/fb.h>
 #include <linux/hdmi.h>
 #include <linux/media-bus-format.h>
@@ -1045,7 +1046,7 @@ struct drm_mode_config {
 	/**
 	 * @connector_ida: ID allocator for connector indices.
 	 */
-	struct ida connector_ida;
+	struct tida connector_ida;
 	/**
 	 * @connector_list: List of connector objects.
 	 */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [RFC 10/10] fs/devpts: use tida for id allocation
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
                   ` (8 preceding siblings ...)
  2016-12-08  1:23 ` [RFC 09/10] drm: use simpler id allocator Rasmus Villemoes
@ 2016-12-08  1:23 ` Rasmus Villemoes
  2016-12-09 13:49 ` [RFC 00/10] implement alternative and much simpler id allocator Tejun Heo
  2016-12-09 22:01 ` Andrew Morton
  11 siblings, 0 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-08  1:23 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Jens Axboe, Greg Kroah-Hartman,
	Rasmus Villemoes

Using the newly introduced tida for allocating pty indexes saves
around 16KB of runtime memory. Since tida does it's own internal
locking, we don't need to protect it with the &allocated_ptys_lock and
do the whole "preallocate outside lock, loop if we're extremely
unlucky", thus allowing us simplify devpts_new_index somewhat.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 fs/devpts/inode.c | 28 ++++++++++------------------
 1 file changed, 10 insertions(+), 18 deletions(-)

diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index 108df2e3602c..ea40437f246a 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -22,7 +22,7 @@
 #include <linux/tty.h>
 #include <linux/mutex.h>
 #include <linux/magic.h>
-#include <linux/idr.h>
+#include <linux/tida.h>
 #include <linux/devpts_fs.h>
 #include <linux/parser.h>
 #include <linux/fsnotify.h>
@@ -122,7 +122,7 @@ static const match_table_t tokens = {
 };
 
 struct pts_fs_info {
-	struct ida allocated_ptys;
+	struct tida allocated_ptys;
 	struct pts_mount_opts mount_opts;
 	struct super_block *sb;
 	struct dentry *ptmx_dentry;
@@ -377,7 +377,7 @@ static void *new_pts_fs_info(struct super_block *sb)
 	if (!fsi)
 		return NULL;
 
-	ida_init(&fsi->allocated_ptys);
+	tida_init(&fsi->allocated_ptys);
 	fsi->mount_opts.mode = DEVPTS_DEFAULT_MODE;
 	fsi->mount_opts.ptmxmode = DEVPTS_DEFAULT_PTMX_MODE;
 	fsi->sb = sb;
@@ -453,7 +453,7 @@ static void devpts_kill_sb(struct super_block *sb)
 	struct pts_fs_info *fsi = DEVPTS_SB(sb);
 
 	if (fsi)
-		ida_destroy(&fsi->allocated_ptys);
+		tida_destroy(&fsi->allocated_ptys);
 	kfree(fsi);
 	kill_litter_super(sb);
 }
@@ -473,29 +473,21 @@ static struct file_system_type devpts_fs_type = {
 int devpts_new_index(struct pts_fs_info *fsi)
 {
 	int index;
-	int ida_ret;
 
-retry:
-	if (!ida_pre_get(&fsi->allocated_ptys, GFP_KERNEL))
-		return -ENOMEM;
+	index = tida_get(&fsi->allocated_ptys, GFP_KERNEL);
+	if (index < 0)
+		return index;
 
 	mutex_lock(&allocated_ptys_lock);
 	if (pty_count >= (pty_limit -
 			  (fsi->mount_opts.reserve ? 0 : pty_reserve))) {
+		tida_put(&fsi->allocated_ptys, index);
 		mutex_unlock(&allocated_ptys_lock);
 		return -ENOSPC;
 	}
 
-	ida_ret = ida_get_new(&fsi->allocated_ptys, &index);
-	if (ida_ret < 0) {
-		mutex_unlock(&allocated_ptys_lock);
-		if (ida_ret == -EAGAIN)
-			goto retry;
-		return -EIO;
-	}
-
 	if (index >= fsi->mount_opts.max) {
-		ida_remove(&fsi->allocated_ptys, index);
+		tida_put(&fsi->allocated_ptys, index);
 		mutex_unlock(&allocated_ptys_lock);
 		return -ENOSPC;
 	}
@@ -507,7 +499,7 @@ int devpts_new_index(struct pts_fs_info *fsi)
 void devpts_kill_index(struct pts_fs_info *fsi, int idx)
 {
 	mutex_lock(&allocated_ptys_lock);
-	ida_remove(&fsi->allocated_ptys, idx);
+	tida_put(&fsi->allocated_ptys, idx);
 	pty_count--;
 	mutex_unlock(&allocated_ptys_lock);
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [RFC 06/10] block: use tida as small id allocator
  2016-12-08  1:23 ` [RFC 06/10] block: use tida as small id allocator Rasmus Villemoes
@ 2016-12-08  3:56   ` Jens Axboe
  2016-12-08 11:02     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 25+ messages in thread
From: Jens Axboe @ 2016-12-08  3:56 UTC (permalink / raw)
  To: Rasmus Villemoes, Tejun Heo, Andrew Morton
  Cc: linux-kernel, Lai Jiangshan, Greg Kroah-Hartman, linux-block

On 12/07/2016 06:23 PM, Rasmus Villemoes wrote:
> A struct ida ends up costing > 16 KB of runtime memory, which is quite
> a lot for something which on my laptop as of this writing has handed
> out 27 ids in its lifetime. So use the simpler and lighter-weight
> struct tida.

I'm worried that your example of your laptop isn't an all encompassing
test case. How well does the simplified ida allocator work for tens of
thousands of disks, at scan time? SCSI is notorious for setting up and
tearing down a ton of queues at probe time.

Unless we have more testing than 'it works on my laptop and saves 16k',
I'm not super intereted in the patch.


-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC 06/10] block: use tida as small id allocator
  2016-12-08  3:56   ` Jens Axboe
@ 2016-12-08 11:02     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 25+ messages in thread
From: Greg Kroah-Hartman @ 2016-12-08 11:02 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Jens Axboe, Tejun Heo, Andrew Morton, linux-kernel,
	Lai Jiangshan, linux-block

On Wed, Dec 07, 2016 at 08:56:27PM -0700, Jens Axboe wrote:
> On 12/07/2016 06:23 PM, Rasmus Villemoes wrote:
> > A struct ida ends up costing > 16 KB of runtime memory, which is quite
> > a lot for something which on my laptop as of this writing has handed
> > out 27 ids in its lifetime. So use the simpler and lighter-weight
> > struct tida.
> 
> I'm worried that your example of your laptop isn't an all encompassing
> test case. How well does the simplified ida allocator work for tens of
> thousands of disks, at scan time? SCSI is notorious for setting up and
> tearing down a ton of queues at probe time.
> 
> Unless we have more testing than 'it works on my laptop and saves 16k',
> I'm not super intereted in the patch.

Rasmus, you can create 30k virtual scsi devices on your laptop to test
if this really does work or saves any real time or memory.  I'd
recommend doing that if you want to get patches like this ever accepted.

good luck!

greg k-h

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
                   ` (9 preceding siblings ...)
  2016-12-08  1:23 ` [RFC 10/10] fs/devpts: use tida for id allocation Rasmus Villemoes
@ 2016-12-09 13:49 ` Tejun Heo
  2016-12-09 22:01 ` Andrew Morton
  11 siblings, 0 replies; 25+ messages in thread
From: Tejun Heo @ 2016-12-09 13:49 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Andrew Morton, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel

Hello, Rasmus.

On Thu, Dec 08, 2016 at 02:22:55AM +0100, Rasmus Villemoes wrote:
> TL;DR: these patches save 250 KB of memory, with more low-hanging
> fruit ready to pick.
> 
> While browsing through the lib/idr.c code, I noticed that the code at
> the end of ida_get_new_above() probably doesn't work as intended: Most
> users of ida use it via ida_simple_get(), and that starts by
> unconditionally calling ida_pre_get(), ensuring that ida->idr has
> 8==MAX_IDR_FREE idr_layers in its free list id_free. In the common
> case, none (or at most one) of these get used during
> ida_get_new_above(), and we only free one, leaving at least 6 (usually
> 7) idr_layers in the free list.

So, if you take a look at idr_layer_alloc(), there are two alternative
preloading mechanisms.  The one based on per-idr freelist -
get_from_free_list() - and the one using per-cpu preload cache.  idr
currently uses the new percpu path and ida uses the old path.  There
isn't anything fundamental to this difference.  It's just that we
introduced the new perpcu path to idr and haven't converted ida over
to it yet.

> Patches 1 and 2 are minor optimization opportunities, while patch 3 is
> an attempt at making the 'free the extra idr_layers one at a time'
> actually work. But it's not a very good solution, since it doesn't
> help the users who never do more than a handful of allocations, nor
> does it help those who call ida_pre_get/ida_get_new
> directly. Moreover, even if we somehow had perfect heuristics and got
> rid of all excess idr_layers and ida_bitmap (another 128 bytes we
> usually have hanging around), the minimum overhead of sizeof(struct
> idr_layer)+sizeof(struct ida_bitmap) ~ 2200 bytes is quite a lot for
> the many ida users who never allocate more than 100 ids.
> 
> So instead/in addition, I suggest we introduce a much simpler
> allocator based on a dynamically growing bitmap. Patches 4-10
> introduce struct tida and has a few examples of replacing ida with
> tida. [Yeah, tida is not a good name, and probably _get and _put are also
> bad.]

So, instead of creating something new, it'd be a lot better to
implement the same per-cpu preload behavior for ida so that caching is
per-cpu instead of per-ida.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
                   ` (10 preceding siblings ...)
  2016-12-09 13:49 ` [RFC 00/10] implement alternative and much simpler id allocator Tejun Heo
@ 2016-12-09 22:01 ` Andrew Morton
  2016-12-12 17:09   ` Tejun Heo
  2016-12-16 19:14   ` Matthew Wilcox
  11 siblings, 2 replies; 25+ messages in thread
From: Andrew Morton @ 2016-12-09 22:01 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Tejun Heo, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, Matthew Wilcox

On Thu,  8 Dec 2016 02:22:55 +0100 Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:

> TL;DR: these patches save 250 KB of memory, with more low-hanging
> fruit ready to pick.
> 
> While browsing through the lib/idr.c code, I noticed that the code at
> the end of ida_get_new_above() probably doesn't work as intended: Most
> users of ida use it via ida_simple_get(), and that starts by
> unconditionally calling ida_pre_get(), ensuring that ida->idr has
> 8==MAX_IDR_FREE idr_layers in its free list id_free. In the common
> case, none (or at most one) of these get used during
> ida_get_new_above(), and we only free one, leaving at least 6 (usually
> 7) idr_layers in the free list.

Please be aware of

http://ozlabs.org/~akpm/mmots/broken-out/reimplement-idr-and-ida-using-the-radix-tree.patch
http://lkml.kernel.org/r/1480369871-5271-68-git-send-email-mawilcox@linuxonhyperv.com

I expect we'll be merging patches 1-32 of that series into 4.10-rc1 and
the above patch (#33) into 4.11-rc1.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-09 22:01 ` Andrew Morton
@ 2016-12-12 17:09   ` Tejun Heo
  2016-12-12 17:35     ` Matthew Wilcox
  2016-12-16 19:14   ` Matthew Wilcox
  1 sibling, 1 reply; 25+ messages in thread
From: Tejun Heo @ 2016-12-12 17:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rasmus Villemoes, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, Matthew Wilcox

Hello,

On Fri, Dec 09, 2016 at 02:01:40PM -0800, Andrew Morton wrote:
> On Thu,  8 Dec 2016 02:22:55 +0100 Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> 
> > TL;DR: these patches save 250 KB of memory, with more low-hanging
> > fruit ready to pick.
> > 
> > While browsing through the lib/idr.c code, I noticed that the code at
> > the end of ida_get_new_above() probably doesn't work as intended: Most
> > users of ida use it via ida_simple_get(), and that starts by
> > unconditionally calling ida_pre_get(), ensuring that ida->idr has
> > 8==MAX_IDR_FREE idr_layers in its free list id_free. In the common
> > case, none (or at most one) of these get used during
> > ida_get_new_above(), and we only free one, leaving at least 6 (usually
> > 7) idr_layers in the free list.
> 
> Please be aware of
> 
> http://ozlabs.org/~akpm/mmots/broken-out/reimplement-idr-and-ida-using-the-radix-tree.patch
> http://lkml.kernel.org/r/1480369871-5271-68-git-send-email-mawilcox@linuxonhyperv.com
> 
> I expect we'll be merging patches 1-32 of that series into 4.10-rc1 and
> the above patch (#33) into 4.11-rc1.

Ah, yeah, great to see the silly implementation being replaced the
radix tree.  ida_pre_get() looks suspicious tho.  idr_preload()
immedicately being followed by idr_preload_end() probably is broken.
Maybe what we need is moving ida to idr like preload interface and
then convert it to radix based interface?  ida currently assumes
per-ida preloading.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-12 17:09   ` Tejun Heo
@ 2016-12-12 17:35     ` Matthew Wilcox
  2016-12-12 18:05       ` Tejun Heo
  0 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2016-12-12 17:35 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: Rasmus Villemoes, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, kent.overstreet

From: Tejun Heo [mailto:htejun@gmail.com] On Behalf Of Tejun Heo
> Ah, yeah, great to see the silly implementation being replaced the
> radix tree.  ida_pre_get() looks suspicious tho.  idr_preload()
> immedicately being followed by idr_preload_end() probably is broken.
> Maybe what we need is moving ida to idr like preload interface and
> then convert it to radix based interface?  ida currently assumes
> per-ida preloading.

Hey Tejun!  Great to have your comments on this reimplementation.

I know the preload followed by preload_end looks wrong.  I don't think it's broken though.  If we get preempted, then the worst situation is that we'll end up with the memory we preallocated being allocated to somebody else.  Then the attempt to allocate memory can fail, and we'll return -EAGAIN, at which point all callers are supposed to return to the pre_get() stage.  Certainly that's what ida_simple_get() does.

Hmm ... looking at the implementation again, we might return -ENOMEM when we should return -EAGAIN.  Let me fix that (and the test suite ...)

I'd definitely be open to changing the IDA API.  I know Kent had some thoughts on that including splitting the simple lock into a per-IDA lock.  His last work on it was here, I believe:

https://evilpiepirate.org/git/linux-bcache.git/log/?h=idr

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-12 17:35     ` Matthew Wilcox
@ 2016-12-12 18:05       ` Tejun Heo
  0 siblings, 0 replies; 25+ messages in thread
From: Tejun Heo @ 2016-12-12 18:05 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Andrew Morton, Rasmus Villemoes, linux-kernel, Lai Jiangshan,
	Jens Axboe, Greg Kroah-Hartman, linux-block, dri-devel,
	kent.overstreet

Hello, Matthew.

On Mon, Dec 12, 2016 at 05:35:17PM +0000, Matthew Wilcox wrote:
> I know the preload followed by preload_end looks wrong.  I don't
> think it's broken though.  If we get preempted, then the worst
> situation is that we'll end up with the memory we preallocated being
> allocated to somebody else.  Then the attempt to allocate memory can
> fail, and we'll return -EAGAIN, at which point all callers are
> supposed to return to the pre_get() stage.  Certainly that's what
> ida_simple_get() does.

Ah, right, ida_pre_get() doesn't have any protection against other
task allocating inbetween pre_get and the actual allocation, so it
should retry on failure.  Yeah, then the proposed preloading wouldn't
be wrong.  It'd be nice to explain what's going on tho.

> I'd definitely be open to changing the IDA API.  I know Kent had
> some thoughts on that including splitting the simple lock into a
> per-IDA lock.  His last work on it was here, I believe:
> 
> https://evilpiepirate.org/git/linux-bcache.git/log/?h=idr

Yeah, that was a big re-write, but for now I think it'd be nice to
replace ida's pre_get mechanism with something similar to idr's
preload so that they're more consistent.  There aren't that many
direct users of ida_pre_get(), so it shouldn't be too difficult to
change.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-09 22:01 ` Andrew Morton
  2016-12-12 17:09   ` Tejun Heo
@ 2016-12-16 19:14   ` Matthew Wilcox
  2016-12-16 20:32     ` Rasmus Villemoes
  1 sibling, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2016-12-16 19:14 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Tejun Heo, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, Andrew Morton

From: Andrew Morton [mailto:akpm@linux-foundation.org]
> On Thu,  8 Dec 2016 02:22:55 +0100 Rasmus Villemoes
> <linux@rasmusvillemoes.dk> wrote:
> > TL;DR: these patches save 250 KB of memory, with more low-hanging
> > fruit ready to pick.
> >
> > While browsing through the lib/idr.c code, I noticed that the code at
> > the end of ida_get_new_above() probably doesn't work as intended: Most
> > users of ida use it via ida_simple_get(), and that starts by
> > unconditionally calling ida_pre_get(), ensuring that ida->idr has
> > 8==MAX_IDR_FREE idr_layers in its free list id_free. In the common
> > case, none (or at most one) of these get used during
> > ida_get_new_above(), and we only free one, leaving at least 6 (usually
> > 7) idr_layers in the free list.
> 
> I expect we'll be merging patches 1-32 of that series into 4.10-rc1 and
> the above patch (#33) into 4.11-rc1.

Hi Rasmus,

Thanks for your work on this; you've really put some effort into proving your work has value.  My motivation was purely aesthetic, but you've got some genuine savings here (admittedly it's about a quarter of a cent's worth of memory with DRAM selling for $10/GB).  Nevertheless, that adds up over a billion devices, and there are still people trying to fit Linux into 4MB embedded devices.

I think my reimplementation of the IDA on top of the radix tree is close enough to your tIDA in memory consumption that it doesn't warrant a new data structure.

On a 64-bit machine, your tIDA root is 24 bytes; my new IDA root is 16 bytes.  If you allocate only one entry, you'll allocate 8 bytes.  Thanks to the slab allocator, that gets rounded up to 32 bytes.  I allocate the full 128 byte leaf, but I store the pointer to it in the root (unlike the IDR, the radix tree doesn't need to allocate a layer for a single entry).  So tIDA wins on memory consumption between 1 and 511 IDs, and newIDA is slightly ahead between 512 and 1023 IDs.  Above 1024 IDs, I allocate a layer (576 bytes), and a second leaf (832 bytes total), while you just double to 256 bytes.  I think tIDA's memory consumption then stays ahead of new IDA.  But performance of 'allocate new ID' should be better for newIDA than tIDA as newIDA can skip over all the cachelines of full bitmaps.

Yesterday, I found a new problem with the IDA allocator that you hadn't mentioned -- about half of the users of the IDA data structure never call destroy_ida().  Which means that they're leaking the preloaded bitmap.  I have a patch which moves the preloaded IDA bitmap from being stored in the IDA to being stored in a percpu variable.  You can find it here: http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-12-16 I'd welcome more testing and code review.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-16 19:14   ` Matthew Wilcox
@ 2016-12-16 20:32     ` Rasmus Villemoes
  2016-12-16 21:09       ` Matthew Wilcox
                         ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-16 20:32 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Tejun Heo, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, Andrew Morton

On Fri, Dec 16 2016, Matthew Wilcox <mawilcox@microsoft.com> wrote:

> From: Andrew Morton [mailto:akpm@linux-foundation.org]
>> On Thu,  8 Dec 2016 02:22:55 +0100 Rasmus Villemoes
>> <linux@rasmusvillemoes.dk> wrote:
>> > TL;DR: these patches save 250 KB of memory, with more low-hanging
>> > fruit ready to pick.
>> >
>> > While browsing through the lib/idr.c code, I noticed that the code at
>> > the end of ida_get_new_above() probably doesn't work as intended: Most
>> > users of ida use it via ida_simple_get(), and that starts by
>> > unconditionally calling ida_pre_get(), ensuring that ida->idr has
>> > 8==MAX_IDR_FREE idr_layers in its free list id_free. In the common
>> > case, none (or at most one) of these get used during
>> > ida_get_new_above(), and we only free one, leaving at least 6 (usually
>> > 7) idr_layers in the free list.
>> 
>> I expect we'll be merging patches 1-32 of that series into 4.10-rc1 and
>> the above patch (#33) into 4.11-rc1.
>
> Hi Rasmus,
>
> Thanks for your work on this; you've really put some effort into
> proving your work has value.  My motivation was purely aesthetic, but
> you've got some genuine savings here (admittedly it's about a quarter
> of a cent's worth of memory with DRAM selling for $10/GB).
> Nevertheless, that adds up over a billion devices, and there are still
> people trying to fit Linux into 4MB embedded devices.
>

Yeah, my main motivation was embedded devices which don't have the
luxury of measuring their RAM in GB. E.g., it's crazy that the
watchdog_ida effectively use more memory than the .text of the watchdog
subsystem, and similarly for the kthread workers, etc., etc.. I didn't
mean for my patches to go in as is, more to provoke some discussion. I
wasn't aware of your reimplementation, but it seems that may make the
problem go away.

> I think my reimplementation of the IDA on top of the radix tree is
> close enough to your tIDA in memory consumption that it doesn't
> warrant a new data structure.
>
> On a 64-bit machine, your tIDA root is 24 bytes; my new IDA root is 16
> bytes.  If you allocate only one entry, you'll allocate 8 bytes.
> Thanks to the slab allocator, that gets rounded up to 32 bytes.  I
> allocate the full 128 byte leaf, but I store the pointer to it in the
> root (unlike the IDR, the radix tree doesn't need to allocate a layer
> for a single entry).  So tIDA wins on memory consumption between 1 and
> 511 IDs, and newIDA is slightly ahead between 512 and 1023 IDs.

This sounds good. I think there may still be a lot of users that never
allocate more than a handful of IDAs, making a 128 byte allocation still
somewhat excessive. One thing I considered was (exactly as it's done for
file descriptor tables) to embed a single word in the struct ida and
use that initially; I haven't looked closely at newIDA, so I don't know
how easy that would be or if its worth the complexity.

Rasmus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-16 20:32     ` Rasmus Villemoes
@ 2016-12-16 21:09       ` Matthew Wilcox
  2016-12-17 13:28       ` Matthew Wilcox
  2016-12-18  2:42       ` Matthew Wilcox
  2 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox @ 2016-12-16 21:09 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Tejun Heo, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, Andrew Morton

From: Rasmus Villemoes [mailto:linux@rasmusvillemoes.dk]
> On Fri, Dec 16 2016, Matthew Wilcox <mawilcox@microsoft.com> wrote:
> > Thanks for your work on this; you've really put some effort into
> > proving your work has value.  My motivation was purely aesthetic, but
> > you've got some genuine savings here (admittedly it's about a quarter
> > of a cent's worth of memory with DRAM selling for $10/GB).
> > Nevertheless, that adds up over a billion devices, and there are still
> > people trying to fit Linux into 4MB embedded devices.
> >
> 
> Yeah, my main motivation was embedded devices which don't have the
> luxury of measuring their RAM in GB. E.g., it's crazy that the
> watchdog_ida effectively use more memory than the .text of the watchdog
> subsystem, and similarly for the kthread workers, etc., etc.. I didn't
> mean for my patches to go in as is, more to provoke some discussion. I
> wasn't aware of your reimplementation, but it seems that may make the
> problem go away.

It certainly shrinks the problem down to a size where it may not be worth introducing another implementation.

> > On a 64-bit machine, your tIDA root is 24 bytes; my new IDA root is 16
> > bytes.  If you allocate only one entry, you'll allocate 8 bytes.
> > Thanks to the slab allocator, that gets rounded up to 32 bytes.  I
> > allocate the full 128 byte leaf, but I store the pointer to it in the
> > root (unlike the IDR, the radix tree doesn't need to allocate a layer
> > for a single entry).  So tIDA wins on memory consumption between 1 and
> > 511 IDs, and newIDA is slightly ahead between 512 and 1023 IDs.
> 
> This sounds good. I think there may still be a lot of users that never
> allocate more than a handful of IDAs, making a 128 byte allocation still
> somewhat excessive. One thing I considered was (exactly as it's done for
> file descriptor tables) to embed a single word in the struct ida and
> use that initially; I haven't looked closely at newIDA, so I don't know
> how easy that would be or if its worth the complexity.

Heh, I was thinking about that too.  The radix tree supports "exceptional entries" which have the bottom bit set.  On a 64-bit machine, we could use 62 of the bits in the radix tree root to store the ID bitmap.  I'm a little wary of the potential complexity, but we should try it out.

Did you come up with any fun tests that could be added to the test-suite?  It feels a little slender right now.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-16 20:32     ` Rasmus Villemoes
  2016-12-16 21:09       ` Matthew Wilcox
@ 2016-12-17 13:28       ` Matthew Wilcox
  2016-12-22 23:46         ` Rasmus Villemoes
  2016-12-18  2:42       ` Matthew Wilcox
  2 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2016-12-17 13:28 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Tejun Heo, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, Andrew Morton

From: Matthew Wilcox
> From: Rasmus Villemoes [mailto:linux@rasmusvillemoes.dk]
> > This sounds good. I think there may still be a lot of users that never
> > allocate more than a handful of IDAs, making a 128 byte allocation still
> > somewhat excessive. One thing I considered was (exactly as it's done for
> > file descriptor tables) to embed a single word in the struct ida and
> > use that initially; I haven't looked closely at newIDA, so I don't know
> > how easy that would be or if its worth the complexity.
> 
> Heh, I was thinking about that too.  The radix tree supports "exceptional
> entries" which have the bottom bit set.  On a 64-bit machine, we could use 62
> of the bits in the radix tree root to store the ID bitmap.  I'm a little wary of the
> potential complexity, but we should try it out.

Test patch here: http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-12-16
It passes the test suite ... which I actually had to adjust because it now succeeds in cases where it hadn't (allocating ID 0 without preallocating), and it will now fail in cases where it hadn't previously (assuming a single preallocation would be enough).  There shouldn't be any examples of that in the kernel proper; it was simply me being lazy when I wrote the test suite.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-16 20:32     ` Rasmus Villemoes
  2016-12-16 21:09       ` Matthew Wilcox
  2016-12-17 13:28       ` Matthew Wilcox
@ 2016-12-18  2:42       ` Matthew Wilcox
  2 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox @ 2016-12-18  2:42 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Tejun Heo, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, Andrew Morton

From: Matthew Wilcox
> From: Matthew Wilcox
> > Heh, I was thinking about that too.  The radix tree supports "exceptional
> > entries" which have the bottom bit set.  On a 64-bit machine, we could use
> 62
> > of the bits in the radix tree root to store the ID bitmap.  I'm a little wary of
> the
> > potential complexity, but we should try it out.
> 
> Test patch here: http://git.infradead.org/users/willy/linux-
> dax.git/shortlog/refs/heads/idr-2016-12-16
> It passes the test suite ... which I actually had to adjust because it now succeeds
> in cases where it hadn't (allocating ID 0 without preallocating), and it will now
> fail in cases where it hadn't previously (assuming a single preallocation would
> be enough).  There shouldn't be any examples of that in the kernel proper; it
> was simply me being lazy when I wrote the test suite.

Found a bug, committed a fix (and another test case).  It now no longer returns -EAGAIN in situations where it wouldn't have, so I've reverted that bit of the test suite change.  Still succeeds when it wouldn't have before, but I feel no pressure to change that ;-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-17 13:28       ` Matthew Wilcox
@ 2016-12-22 23:46         ` Rasmus Villemoes
  2016-12-23 17:03           ` Matthew Wilcox
  0 siblings, 1 reply; 25+ messages in thread
From: Rasmus Villemoes @ 2016-12-22 23:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Tejun Heo, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, Andrew Morton

On Sat, Dec 17 2016, Matthew Wilcox <mawilcox@microsoft.com> wrote:

> From: Matthew Wilcox
>> From: Rasmus Villemoes [mailto:linux@rasmusvillemoes.dk]
>> > This sounds good. I think there may still be a lot of users that never
>> > allocate more than a handful of IDAs, making a 128 byte allocation still
>> > somewhat excessive. One thing I considered was (exactly as it's done for
>> > file descriptor tables) to embed a single word in the struct ida and
>> > use that initially; I haven't looked closely at newIDA, so I don't know
>> > how easy that would be or if its worth the complexity.
>> 
>> Heh, I was thinking about that too.  The radix tree supports "exceptional
>> entries" which have the bottom bit set.  On a 64-bit machine, we could use 62
>> of the bits in the radix tree root to store the ID bitmap.  I'm a little wary of the
>> potential complexity, but we should try it out.
>
> Test patch here: http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-12-16
> It passes the test suite ... which I actually had to adjust because it
> now succeeds in cases where it hadn't (allocating ID 0 without
> preallocating), and it will now fail in cases where it hadn't
> previously (assuming a single preallocation would be enough).  There
> shouldn't be any examples of that in the kernel proper; it was simply
> me being lazy when I wrote the test suite.

Nice work! A few random comments/questions:

- It does add some complexity, but I think a few comments would make it
  more digestable.

- Hm, maybe I'm confused, and I certainly don't understand how the whole
  radix tree works. But do you use every leaf node as an exceptional
  entry initially, to allocate the first 62 ids from that level? This
  code

	if ((bit + RADIX_TREE_EXCEPTIONAL_SHIFT) <
					BITS_PER_LONG) {
		bit += RADIX_TREE_EXCEPTIONAL_SHIFT;
		radix_tree_iter_replace(root, &iter, slot,
				(void *)((1UL << bit) |
				RADIX_TREE_EXCEPTIONAL_ENTRY));
		*id = new;
		return 0;
	}

   operates on bit, which is the offset from index*IDA_BITMAP_BITS, and
   it seems to create an exceptional entry somewhere down the tree
   (which may of course be the root).

   But we don't seem to allocate another bit from that exceptional entry
   ever unless it happened to sit at index 0; the code

	unsigned long tmp = (unsigned long)bitmap;
	if (start < BITS_PER_LONG) {
		unsigned tbit = find_next_zero_bit(&tmp,
					BITS_PER_LONG, start);
		if (tbit < BITS_PER_LONG) {
			tmp |= 1UL << tbit;
			rcu_assign_pointer(*slot, (void *)tmp);
			*id = new + tbit -
				RADIX_TREE_EXCEPTIONAL_SHIFT;
			return 0;
		}
	}

   is only used for small values of start (though it does seem to
   account for a non-zero value of new == iter.index * IDA_BITMAP_BITS).
   
- In the micro-optimization department, I think we should avoid
  find_next_zero_bit on a single word, and instead use __ffs. Something
  like (assuming we can use bit instead of start)

  if (bit + RADIX_TREE_EXCEPTIONAL_SHIFT < BITS_PER_LONG) {
    tmp = (~(unsigned long)bitmap) >> (bit + RADIX_TREE_EXCEPTIONAL_SHIFT);
    if (tmp) {
      tbit = __ffs(tmp) + bit + RADIX_TREE_EXCEPTIONAL_SHIFT;
      tmp = (unsigned long)bitmap | (1UL << tbit);
      rcu_assign_pointer(*slot, (void *)tmp);
      *id = new + tbit - RADIX_TREE_EXCEPTIONAL_SHIFT;
      return 0;
    }
  }

Rasmus

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [RFC 00/10] implement alternative and much simpler id allocator
  2016-12-22 23:46         ` Rasmus Villemoes
@ 2016-12-23 17:03           ` Matthew Wilcox
  0 siblings, 0 replies; 25+ messages in thread
From: Matthew Wilcox @ 2016-12-23 17:03 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Tejun Heo, linux-kernel, Lai Jiangshan, Jens Axboe,
	Greg Kroah-Hartman, linux-block, dri-devel, Andrew Morton

From: Rasmus Villemoes [mailto:linux@rasmusvillemoes.dk]
> Nice work! A few random comments/questions:
> 
> - It does add some complexity, but I think a few comments would make it
>   more digestable.

I'm open to adding some comments ... I need some time between writing the code and writing the comments to be sure what comments are useful.

> - Hm, maybe I'm confused, and I certainly don't understand how the whole
>   radix tree works. But do you use every leaf node as an exceptional
>   entry initially, to allocate the first 62 ids from that level? This
>   code

I do!  And that question tells me one useful comment to add!

> 	if ((bit + RADIX_TREE_EXCEPTIONAL_SHIFT) <
> 					BITS_PER_LONG) {
> 		bit += RADIX_TREE_EXCEPTIONAL_SHIFT;
> 		radix_tree_iter_replace(root, &iter, slot,
> 				(void *)((1UL << bit) |
> 				RADIX_TREE_EXCEPTIONAL_ENTRY));
> 		*id = new;
> 		return 0;
> 	}
> 
>    operates on bit, which is the offset from index*IDA_BITMAP_BITS, and
>    it seems to create an exceptional entry somewhere down the tree
>    (which may of course be the root).
> 
>    But we don't seem to allocate another bit from that exceptional entry
>    ever unless it happened to sit at index 0; the code
> 
> 	unsigned long tmp = (unsigned long)bitmap;
> 	if (start < BITS_PER_LONG) {
> 		unsigned tbit = find_next_zero_bit(&tmp,
> 					BITS_PER_LONG, start);
> 		if (tbit < BITS_PER_LONG) {
> 			tmp |= 1UL << tbit;
> 			rcu_assign_pointer(*slot, (void *)tmp);
> 			*id = new + tbit -
> 				RADIX_TREE_EXCEPTIONAL_SHIFT;
> 			return 0;
> 		}
> 	}
> 
>    is only used for small values of start (though it does seem to
>    account for a non-zero value of new == iter.index * IDA_BITMAP_BITS).

Ahh.  You're reading the code wrong, and I wrote the code wrongly too, so it's clearly overly complex.  I was testing with 'start' set to 0, allocating N entries, and then freeing them.  If I'd set start to 1024 and allocated two entries, I'd've seen the failure.

Please see the top commit here ("Improve IDA exceptional entry handling"): http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-12-20


> - In the micro-optimization department, I think we should avoid
>   find_next_zero_bit on a single word, and instead use __ffs. Something
>   like (assuming we can use bit instead of start)
> 
>   if (bit + RADIX_TREE_EXCEPTIONAL_SHIFT < BITS_PER_LONG) {
>     tmp = (~(unsigned long)bitmap) >> (bit + RADIX_TREE_EXCEPTIONAL_SHIFT);
>     if (tmp) {
>       tbit = __ffs(tmp) + bit + RADIX_TREE_EXCEPTIONAL_SHIFT;
>       tmp = (unsigned long)bitmap | (1UL << tbit);
>       rcu_assign_pointer(*slot, (void *)tmp);
>       *id = new + tbit - RADIX_TREE_EXCEPTIONAL_SHIFT;
>       return 0;
>     }
>   }

I'm certainly open to microoptimisations, but I'll have to think about this one for a bit.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2016-12-23 17:19 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-08  1:22 [RFC 00/10] implement alternative and much simpler id allocator Rasmus Villemoes
2016-12-08  1:22 ` [RFC 01/10] lib/idr.c: reused free bitmaps are already clear Rasmus Villemoes
2016-12-08  1:22 ` [RFC 02/10] lib/idr.c: delete useless condition Rasmus Villemoes
2016-12-08  1:22 ` [RFC 03/10] lib/idr.c: only fill ida->idr when needed Rasmus Villemoes
2016-12-08  1:22 ` [RFC 04/10] lib/tida.c: a very simple integer id allocator Rasmus Villemoes
2016-12-08  1:23 ` [RFC 05/10] kernel/workqueue.c: replace id allocator ida with tida Rasmus Villemoes
2016-12-08  1:23 ` [RFC 06/10] block: use tida as small id allocator Rasmus Villemoes
2016-12-08  3:56   ` Jens Axboe
2016-12-08 11:02     ` Greg Kroah-Hartman
2016-12-08  1:23 ` [RFC 07/10] drivers/base/platform.c: use simpler " Rasmus Villemoes
2016-12-08  1:23 ` [RFC 08/10] lib/tida.c: introduce tida_get_above Rasmus Villemoes
2016-12-08  1:23 ` [RFC 09/10] drm: use simpler id allocator Rasmus Villemoes
2016-12-08  1:23 ` [RFC 10/10] fs/devpts: use tida for id allocation Rasmus Villemoes
2016-12-09 13:49 ` [RFC 00/10] implement alternative and much simpler id allocator Tejun Heo
2016-12-09 22:01 ` Andrew Morton
2016-12-12 17:09   ` Tejun Heo
2016-12-12 17:35     ` Matthew Wilcox
2016-12-12 18:05       ` Tejun Heo
2016-12-16 19:14   ` Matthew Wilcox
2016-12-16 20:32     ` Rasmus Villemoes
2016-12-16 21:09       ` Matthew Wilcox
2016-12-17 13:28       ` Matthew Wilcox
2016-12-22 23:46         ` Rasmus Villemoes
2016-12-23 17:03           ` Matthew Wilcox
2016-12-18  2:42       ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).