linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/6] idr: fix top layer handling
@ 2013-02-08 21:00 Tejun Heo
  2013-02-08 21:01 ` [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c Tejun Heo
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Tejun Heo @ 2013-02-08 21:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Rusty Russell

Most functions in idr fail to deal with the high bits when the idr
tree grows to the maximum height.

* idr_get_empty_slot() stops growing idr tree once the depth reaches
  MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to
  cover the whole range.  The function doesn't even notice that it
  didn't grow the tree enough and ends up allocating the wrong ID
  given sufficiently high @starting_id.

  For example, on 64 bit, if the starting id is 0x7fffff01,
  idr_get_empty_slot() will grow the tree 5 layer deep, which only
  covers the 30 bits and then proceed to allocate as if the bit 30
  wasn't specified.  It ends up allocating 0x3fffff01 without the bit
  30 but still returns 0x7fffff01.

* __idr_remove_all() will not remove anything if the tree is fully
  grown.

* idr_find() can't find anything if the tree is fully grown.

* idr_for_each() and idr_get_next() can't iterate anything if the tree
  is fully grown.

Fix it by introducing idr_max() which returns the maximum possible ID
given the depth of tree and replacing the id limit checks in all
affected places.

As the idr_layer pointer array pa[] needs to be 1 larger than the
maximum depth, enlarge pa[] arrays by one.

While this plugs the discovered issues, the whole code base is
horrible and in desparate need of rewrite.  It's fragile like hell,
difficult to read and maintain, and plain ugly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
---
Andrew, these are further IDR fixes / improvements on top of -mm.
Probably best to route together.

Thanks.

 lib/idr.c |   38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

--- a/lib/idr.c
+++ b/lib/idr.c
@@ -43,6 +43,14 @@ static DEFINE_PER_CPU(struct idr_layer *
 static DEFINE_PER_CPU(int, idr_preload_cnt);
 static DEFINE_SPINLOCK(simple_ida_lock);
 
+/* the maximum ID which can be allocated given idr->layers */
+static int idr_max(int layers)
+{
+	int bits = min_t(int, layers * IDR_BITS, MAX_IDR_SHIFT);
+
+	return (1 << bits) - 1;
+}
+
 static struct idr_layer *get_from_free_list(struct idr *idp)
 {
 	struct idr_layer *p;
@@ -290,7 +298,7 @@ build_up:
 	 * Add a new layer to the top of the tree if the requested
 	 * id is larger than the currently allocated space.
 	 */
-	while ((layers < (MAX_IDR_LEVEL - 1)) && (id >= (1 << (layers*IDR_BITS)))) {
+	while (id > idr_max(layers)) {
 		layers++;
 		if (!p->count) {
 			/* special case: if the tree is currently empty,
@@ -361,7 +369,7 @@ static void idr_fill_slot(void *ptr, int
  */
 int idr_get_new_above(struct idr *idp, void *ptr, int starting_id, int *id)
 {
-	struct idr_layer *pa[MAX_IDR_LEVEL];
+	struct idr_layer *pa[MAX_IDR_LEVEL + 1];
 	int rv;
 
 	rv = idr_get_empty_slot(idp, starting_id, pa, 0, idp);
@@ -454,7 +462,7 @@ EXPORT_SYMBOL(idr_preload);
 int idr_alloc(struct idr *idr, void *ptr, int start, int end, gfp_t gfp_mask)
 {
 	int max = end ? end - 1 : INT_MAX;	/* inclusive upper limit */
-	struct idr_layer *pa[MAX_IDR_LEVEL];
+	struct idr_layer *pa[MAX_IDR_LEVEL + 1];
 	int id;
 
 	might_sleep_if(gfp_mask & __GFP_WAIT);
@@ -487,7 +495,7 @@ static void idr_remove_warning(int id)
 static void sub_remove(struct idr *idp, int shift, int id)
 {
 	struct idr_layer *p = idp->top;
-	struct idr_layer **pa[MAX_IDR_LEVEL];
+	struct idr_layer **pa[MAX_IDR_LEVEL + 1];
 	struct idr_layer ***paa = &pa[0];
 	struct idr_layer *to_free;
 	int n;
@@ -568,16 +576,16 @@ void __idr_remove_all(struct idr *idp)
 	int n, id, max;
 	int bt_mask;
 	struct idr_layer *p;
-	struct idr_layer *pa[MAX_IDR_LEVEL];
+	struct idr_layer *pa[MAX_IDR_LEVEL + 1];
 	struct idr_layer **paa = &pa[0];
 
 	n = idp->layers * IDR_BITS;
 	p = idp->top;
 	rcu_assign_pointer(idp->top, NULL);
-	max = 1 << n;
+	max = idr_max(idp->layers);
 
 	id = 0;
-	while (id < max) {
+	while (id >= 0 && id <= max) {
 		while (n > IDR_BITS && p) {
 			n -= IDR_BITS;
 			*paa++ = p;
@@ -647,7 +655,7 @@ void *idr_find(struct idr *idp, int id)
 	/* Mask off upper bits we don't use for the search. */
 	id &= MAX_IDR_MASK;
 
-	if (id >= (1 << n))
+	if (id > idr_max(p->layer + 1))
 		return NULL;
 	BUG_ON(n == 0);
 
@@ -683,15 +691,15 @@ int idr_for_each(struct idr *idp,
 {
 	int n, id, max, error = 0;
 	struct idr_layer *p;
-	struct idr_layer *pa[MAX_IDR_LEVEL];
+	struct idr_layer *pa[MAX_IDR_LEVEL + 1];
 	struct idr_layer **paa = &pa[0];
 
 	n = idp->layers * IDR_BITS;
 	p = rcu_dereference_raw(idp->top);
-	max = 1 << n;
+	max = idr_max(idp->layers);
 
 	id = 0;
-	while (id < max) {
+	while (id >= 0 && id <= max) {
 		while (n > 0 && p) {
 			n -= IDR_BITS;
 			*paa++ = p;
@@ -729,7 +737,7 @@ EXPORT_SYMBOL(idr_for_each);
  */
 void *idr_get_next(struct idr *idp, int *nextidp)
 {
-	struct idr_layer *p, *pa[MAX_IDR_LEVEL];
+	struct idr_layer *p, *pa[MAX_IDR_LEVEL + 1];
 	struct idr_layer **paa = &pa[0];
 	int id = *nextidp;
 	int n, max;
@@ -739,9 +747,9 @@ void *idr_get_next(struct idr *idp, int
 	if (!p)
 		return NULL;
 	n = (p->layer + 1) * IDR_BITS;
-	max = 1 << n;
+	max = idr_max(p->layer + 1);
 
-	while (id < max) {
+	while (id >= 0 && id <= max) {
 		while (n > 0 && p) {
 			n -= IDR_BITS;
 			*paa++ = p;
@@ -915,7 +923,7 @@ EXPORT_SYMBOL(ida_pre_get);
  */
 int ida_get_new_above(struct ida *ida, int starting_id, int *p_id)
 {
-	struct idr_layer *pa[MAX_IDR_LEVEL];
+	struct idr_layer *pa[MAX_IDR_LEVEL + 1];
 	struct ida_bitmap *bitmap;
 	unsigned long flags;
 	int idr_id = starting_id / IDA_BITMAP_BITS;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c
  2013-02-08 21:00 [PATCH 1/6] idr: fix top layer handling Tejun Heo
@ 2013-02-08 21:01 ` Tejun Heo
  2013-02-08 22:09   ` Hefty, Sean
  2013-02-10 11:52   ` Wolfram Sang
  2013-02-08 21:02 ` [PATCH 3/6] idr: remove length restriction from idr_layer->bitmap Tejun Heo
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 13+ messages in thread
From: Tejun Heo @ 2013-02-08 21:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, Rusty Russell, Jean Delvare, linux-i2c,
	Roland Dreier, Sean Hefty, Hal Rosenstock, Marciniszyn, Mike,
	Jack Morgenstein, Or Gerlitz, linux-rdma, Al Viro

MAX_IDR_MASK is another weirdness in the idr interface.  As idr covers
whole positive integer range, it's defined as 0x7fffffff or INT_MAX.

Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
They basically mask off the sign bit and operate on the rest, so if
the caller, by accident, passes in a negative number, the sign bit
will be masked off and the remaining part will be used as if that was
the input, which is worse than crashing.

The constant is visible in idr.h and there are several users in the
kernel.

* drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()

  Basically used to test if adap->nr is a negative number which isn't
  -1 and returns -EINVAL if so.  idr_alloc() already has negative
  @start checking (w/ WARN_ON_ONCE), so this can go away.

* drivers/infiniband/core/cm.c:cm_alloc_id()
  drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()

  Used to wrap cyclic @start.  Can be replaced with max(next, 0).
  Note that this type of cyclic allocation using idr is buggy.  These
  are prone to spurious -ENOSPC failure after the first wraparound.

* fs/super.c:get_anon_bdev()

  The ID allocated from ida is masked off before being tested whether
  it's inside valid range.  ida allocated ID can never be a negative
  number and the masking is unnecessary.

Update idr_*() functions to fail with -EINVAL when negative @id is
specified and update other MAX_IDR_MASK users as described above.

This leaves MAX_IDR_MASK without any user, remove it and relocate
other MAX_IDR_* constants to lib/idr.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: linux-i2c@vger.kernel.org
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: linux-rdma@vger.kernel.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/i2c/i2c-core.c          |    2 --
 drivers/infiniband/core/cm.c    |    2 +-
 drivers/infiniband/hw/mlx4/cm.c |    2 +-
 fs/super.c                      |    2 +-
 include/linux/idr.h             |   10 ----------
 lib/idr.c                       |   24 +++++++++++++++++-------
 6 files changed, 20 insertions(+), 22 deletions(-)

--- a/drivers/i2c/i2c-core.c
+++ b/drivers/i2c/i2c-core.c
@@ -978,8 +978,6 @@ int i2c_add_numbered_adapter(struct i2c_
 
 	if (adap->nr == -1) /* -1 means dynamically assign bus id */
 		return i2c_add_adapter(adap);
-	if (adap->nr & ~MAX_IDR_MASK)
-		return -EINVAL;
 
 	mutex_lock(&core_lock);
 	id = idr_alloc(&i2c_adapter_idr, adap, adap->nr, adap->nr + 1,
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -390,7 +390,7 @@ static int cm_alloc_id(struct cm_id_priv
 
 	id = idr_alloc(&cm.local_id_table, cm_id_priv, next_id, 0, GFP_NOWAIT);
 	if (id >= 0)
-		next_id = ((unsigned) id + 1) & MAX_IDR_MASK;
+		next_id = max(id + 1, 0);
 
 	spin_unlock_irqrestore(&cm.lock, flags);
 	idr_preload_end();
--- a/drivers/infiniband/hw/mlx4/cm.c
+++ b/drivers/infiniband/hw/mlx4/cm.c
@@ -225,7 +225,7 @@ id_map_alloc(struct ib_device *ibdev, in
 
 	ret = idr_alloc(&sriov->pv_id_table, ent, next_id, 0, GFP_NOWAIT);
 	if (ret >= 0) {
-		next_id = ((unsigned)ret + 1) & MAX_IDR_MASK;
+		next_id = max(ret + 1, 0);
 		ent->pv_cm_id = (u32)ret;
 		sl_id_map_add(ibdev, ent);
 		list_add_tail(&ent->list, &sriov->cm_list);
--- a/fs/super.c
+++ b/fs/super.c
@@ -842,7 +842,7 @@ int get_anon_bdev(dev_t *p)
 	else if (error)
 		return -EAGAIN;
 
-	if ((dev & MAX_IDR_MASK) == (1 << MINORBITS)) {
+	if (dev == (1 << MINORBITS)) {
 		spin_lock(&unnamed_dev_lock);
 		ida_remove(&unnamed_dev_ida, dev);
 		if (unnamed_dev_start > dev)
--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -38,16 +38,6 @@
 #define IDR_SIZE (1 << IDR_BITS)
 #define IDR_MASK ((1 << IDR_BITS)-1)
 
-#define MAX_IDR_SHIFT (sizeof(int)*8 - 1)
-#define MAX_IDR_BIT (1U << MAX_IDR_SHIFT)
-#define MAX_IDR_MASK (MAX_IDR_BIT - 1)
-
-/* Leave the possibility of an incomplete final layer */
-#define MAX_IDR_LEVEL ((MAX_IDR_SHIFT + IDR_BITS - 1) / IDR_BITS)
-
-/* Number of id_layer structs to leave in free list */
-#define MAX_IDR_FREE (MAX_IDR_LEVEL * 2)
-
 struct idr_layer {
 	unsigned long		bitmap;	/* A zero bit means "space here" */
 	struct idr_layer __rcu	*ary[1<<IDR_BITS];
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -38,6 +38,15 @@
 #include <linux/percpu.h>
 #include <linux/hardirq.h>
 
+#define MAX_IDR_SHIFT		(sizeof(int) * 8 - 1)
+#define MAX_IDR_BIT		(1U << MAX_IDR_SHIFT)
+
+/* Leave the possibility of an incomplete final layer */
+#define MAX_IDR_LEVEL ((MAX_IDR_SHIFT + IDR_BITS - 1) / IDR_BITS)
+
+/* Number of id_layer structs to leave in free list */
+#define MAX_IDR_FREE (MAX_IDR_LEVEL * 2)
+
 static struct kmem_cache *idr_layer_cache;
 static DEFINE_PER_CPU(struct idr_layer *, idr_preload_head);
 static DEFINE_PER_CPU(int, idr_preload_cnt);
@@ -539,8 +548,8 @@ void idr_remove(struct idr *idp, int id)
 	struct idr_layer *p;
 	struct idr_layer *to_free;
 
-	/* Mask off upper bits we don't use for the search. */
-	id &= MAX_IDR_MASK;
+	if (WARN_ON_ONCE(id < 0))
+		return;
 
 	sub_remove(idp, (idp->layers - 1) * IDR_BITS, id);
 	if (idp->top && idp->top->count == 1 && (idp->layers > 1) &&
@@ -647,14 +656,14 @@ void *idr_find(struct idr *idp, int id)
 	int n;
 	struct idr_layer *p;
 
+	if (WARN_ON_ONCE(id < 0))
+		return NULL;
+
 	p = rcu_dereference_raw(idp->top);
 	if (!p)
 		return NULL;
 	n = (p->layer+1) * IDR_BITS;
 
-	/* Mask off upper bits we don't use for the search. */
-	id &= MAX_IDR_MASK;
-
 	if (id > idr_max(p->layer + 1))
 		return NULL;
 	BUG_ON(n == 0);
@@ -796,14 +805,15 @@ void *idr_replace(struct idr *idp, void
 	int n;
 	struct idr_layer *p, *old_p;
 
+	if (WARN_ON_ONCE(id < 0))
+		return ERR_PTR(-EINVAL);
+
 	p = idp->top;
 	if (!p)
 		return ERR_PTR(-EINVAL);
 
 	n = (p->layer+1) * IDR_BITS;
 
-	id &= MAX_IDR_MASK;
-
 	if (id >= (1 << n))
 		return ERR_PTR(-EINVAL);
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/6] idr: remove length restriction from idr_layer->bitmap
  2013-02-08 21:00 [PATCH 1/6] idr: fix top layer handling Tejun Heo
  2013-02-08 21:01 ` [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c Tejun Heo
@ 2013-02-08 21:02 ` Tejun Heo
  2013-02-08 21:03 ` [PATCH 4/6] idr: make idr_layer larger Tejun Heo
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2013-02-08 21:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Rusty Russell

Currently, idr->bitmap is declared as an unsigned long which restricts
the number of bits an idr_layer can contain.  All bitops can handle
arbitrary positive integer bit number and there's no reason for this
restriction.

Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single
unsigned long.

* idr_layer->bitmap is now an array.  '&' dropped from params to
  bitops.

* Replaced "== IDR_FULL" tests with bitmap_full() and removed
  IDR_FULL.

* Replaced find_next_bit() on ~bitmap with find_next_zero_bit().

* Replaced "bitmap = 0" with bitmap_clear().

This patch doesn't (or at least shouldn't) introduce any behavior
changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/idr.h |   12 +-----------
 lib/idr.c           |   34 +++++++++++++++++-----------------
 2 files changed, 18 insertions(+), 28 deletions(-)

--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -19,18 +19,8 @@
 
 #if BITS_PER_LONG == 32
 # define IDR_BITS 5
-# define IDR_FULL 0xfffffffful
-/* We can only use two of the bits in the top level because there is
-   only one possible bit in the top level (5 bits * 7 levels = 35
-   bits, but you only use 31 bits in the id). */
-# define TOP_LEVEL_FULL (IDR_FULL >> 30)
 #elif BITS_PER_LONG == 64
 # define IDR_BITS 6
-# define IDR_FULL 0xfffffffffffffffful
-/* We can only use two of the bits in the top level because there is
-   only one possible bit in the top level (6 bits * 6 levels = 36
-   bits, but you only use 31 bits in the id). */
-# define TOP_LEVEL_FULL (IDR_FULL >> 62)
 #else
 # error "BITS_PER_LONG is not 32 or 64"
 #endif
@@ -39,7 +29,7 @@
 #define IDR_MASK ((1 << IDR_BITS)-1)
 
 struct idr_layer {
-	unsigned long		bitmap;	/* A zero bit means "space here" */
+	DECLARE_BITMAP(bitmap, IDR_SIZE); /* A zero bit means "space here" */
 	struct idr_layer __rcu	*ary[1<<IDR_BITS];
 	int			count;	/* When zero, we can release it */
 	int			layer;	/* distance from leaf */
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -149,18 +149,18 @@ static void idr_mark_full(struct idr_lay
 	struct idr_layer *p = pa[0];
 	int l = 0;
 
-	__set_bit(id & IDR_MASK, &p->bitmap);
+	__set_bit(id & IDR_MASK, p->bitmap);
 	/*
 	 * If this layer is full mark the bit in the layer above to
 	 * show that this part of the radix tree is full.  This may
 	 * complete the layer above and require walking up the radix
 	 * tree.
 	 */
-	while (p->bitmap == IDR_FULL) {
+	while (bitmap_full(p->bitmap, IDR_SIZE)) {
 		if (!(p = pa[++l]))
 			break;
 		id = id >> IDR_BITS;
-		__set_bit((id & IDR_MASK), &p->bitmap);
+		__set_bit((id & IDR_MASK), p->bitmap);
 	}
 }
 
@@ -213,7 +213,6 @@ static int sub_alloc(struct idr *idp, in
 	int n, m, sh;
 	struct idr_layer *p, *new;
 	int l, id, oid;
-	unsigned long bm;
 
 	id = *starting_id;
  restart:
@@ -225,8 +224,7 @@ static int sub_alloc(struct idr *idp, in
 		 * We run around this while until we reach the leaf node...
 		 */
 		n = (id >> (IDR_BITS*l)) & IDR_MASK;
-		bm = ~p->bitmap;
-		m = find_next_bit(&bm, IDR_SIZE, n);
+		m = find_next_zero_bit(p->bitmap, IDR_SIZE, n);
 		if (m == IDR_SIZE) {
 			/* no space available go back to previous layer. */
 			l++;
@@ -318,7 +316,8 @@ build_up:
 			for (new = p; p && p != idp->top; new = p) {
 				p = p->ary[0];
 				new->ary[0] = NULL;
-				new->bitmap = new->count = 0;
+				new->count = 0;
+				bitmap_clear(new->bitmap, 0, IDR_SIZE);
 				__move_to_free_list(idp, new);
 			}
 			spin_unlock_irqrestore(&idp->lock, flags);
@@ -327,8 +326,8 @@ build_up:
 		new->ary[0] = p;
 		new->count = 1;
 		new->layer = layers-1;
-		if (p->bitmap == IDR_FULL)
-			__set_bit(0, &new->bitmap);
+		if (bitmap_full(p->bitmap, IDR_SIZE))
+			__set_bit(0, new->bitmap);
 		p = new;
 	}
 	rcu_assign_pointer(idp->top, p);
@@ -506,14 +505,14 @@ static void sub_remove(struct idr *idp,
 
 	while ((shift > 0) && p) {
 		n = (id >> shift) & IDR_MASK;
-		__clear_bit(n, &p->bitmap);
+		__clear_bit(n, p->bitmap);
 		*++paa = &p->ary[n];
 		p = p->ary[n];
 		shift -= IDR_BITS;
 	}
 	n = id & IDR_MASK;
-	if (likely(p != NULL && test_bit(n, &p->bitmap))){
-		__clear_bit(n, &p->bitmap);
+	if (likely(p != NULL && test_bit(n, p->bitmap))){
+		__clear_bit(n, p->bitmap);
 		rcu_assign_pointer(p->ary[n], NULL);
 		to_free = NULL;
 		while(*paa && ! --((**paa)->count)){
@@ -556,7 +555,8 @@ void idr_remove(struct idr *idp, int id)
 		p = idp->top->ary[0];
 		rcu_assign_pointer(idp->top, p);
 		--idp->layers;
-		to_free->bitmap = to_free->count = 0;
+		to_free->count = 0;
+		bitmap_clear(to_free->bitmap, 0, IDR_SIZE);
 		free_layer(to_free);
 	}
 	while (idp->id_free_cnt >= MAX_IDR_FREE) {
@@ -816,7 +816,7 @@ void *idr_replace(struct idr *idp, void
 	}
 
 	n = id & IDR_MASK;
-	if (unlikely(p == NULL || !test_bit(n, &p->bitmap)))
+	if (unlikely(p == NULL || !test_bit(n, p->bitmap)))
 		return ERR_PTR(-ENOENT);
 
 	old_p = p->ary[n];
@@ -1013,7 +1013,7 @@ void ida_remove(struct ida *ida, int id)
 	/* clear full bits while looking up the leaf idr_layer */
 	while ((shift > 0) && p) {
 		n = (idr_id >> shift) & IDR_MASK;
-		__clear_bit(n, &p->bitmap);
+		__clear_bit(n, p->bitmap);
 		p = p->ary[n];
 		shift -= IDR_BITS;
 	}
@@ -1022,7 +1022,7 @@ void ida_remove(struct ida *ida, int id)
 		goto err;
 
 	n = idr_id & IDR_MASK;
-	__clear_bit(n, &p->bitmap);
+	__clear_bit(n, p->bitmap);
 
 	bitmap = (void *)p->ary[n];
 	if (!test_bit(offset, bitmap->bitmap))
@@ -1031,7 +1031,7 @@ void ida_remove(struct ida *ida, int id)
 	/* update bitmap and remove it if empty */
 	__clear_bit(offset, bitmap->bitmap);
 	if (--bitmap->nr_busy == 0) {
-		__set_bit(n, &p->bitmap);	/* to please idr_remove() */
+		__set_bit(n, p->bitmap);	/* to please idr_remove() */
 		idr_remove(&ida->idr, idr_id);
 		free_bitmap(ida, bitmap);
 	}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 4/6] idr: make idr_layer larger
  2013-02-08 21:00 [PATCH 1/6] idr: fix top layer handling Tejun Heo
  2013-02-08 21:01 ` [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c Tejun Heo
  2013-02-08 21:02 ` [PATCH 3/6] idr: remove length restriction from idr_layer->bitmap Tejun Heo
@ 2013-02-08 21:03 ` Tejun Heo
  2013-02-08 21:03 ` [PATCH 5/6] idr: add idr_layer->prefix Tejun Heo
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2013-02-08 21:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Rusty Russell

With recent preloading changes, idr no longer keeps full layer cache
per each idr instance (used to be ~6.5k per idr on 64bit) and the
previous patch removed restriction on the bitmap size.  Both now allow
us to have larger layers.

Increase IDR_BITS to 8 regardless of BITS_PER_LONG.  Each layer is
slightly larger than 2k on 64bit and 1k on 32bit and carries 256
entries.  The size isn't too large, especially compared to what we
used to waste on per-idr caches, and 256 entries should be able to
serve most use cases with single layer.  The max tree depth is 4 which
is much better than the previous 6 on 64bit and 7 on 32bit.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/idr.h |   15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -17,14 +17,13 @@
 #include <linux/init.h>
 #include <linux/rcupdate.h>
 
-#if BITS_PER_LONG == 32
-# define IDR_BITS 5
-#elif BITS_PER_LONG == 64
-# define IDR_BITS 6
-#else
-# error "BITS_PER_LONG is not 32 or 64"
-#endif
-
+/*
+ * We want shallower trees and thus more bits covered at each layer.  8
+ * bits gives us large enough first layer for most use cases and maximum
+ * tree depth of 4.  Each idr_layer is slightly larger than 2k on 64bit and
+ * 1k on 32bit.
+ */
+#define IDR_BITS 8
 #define IDR_SIZE (1 << IDR_BITS)
 #define IDR_MASK ((1 << IDR_BITS)-1)
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 5/6] idr: add idr_layer->prefix
  2013-02-08 21:00 [PATCH 1/6] idr: fix top layer handling Tejun Heo
                   ` (2 preceding siblings ...)
  2013-02-08 21:03 ` [PATCH 4/6] idr: make idr_layer larger Tejun Heo
@ 2013-02-08 21:03 ` Tejun Heo
  2013-02-08 21:03 ` [PATCH 6/6] idr: implement lookup hint Tejun Heo
  2013-02-11 23:39 ` [PATCH 1/6] idr: fix top layer handling Andrew Morton
  5 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2013-02-08 21:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Rusty Russell

Add a field which carries the prefix of ID the idr_layer covers.  This
will be used to implement lookup hint.

This patch doesn't make use of the new field and doesn't introduce any
behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/idr.h |    1 +
 lib/idr.c           |   13 +++++++++++++
 2 files changed, 14 insertions(+)

--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -28,6 +28,7 @@
 #define IDR_MASK ((1 << IDR_BITS)-1)
 
 struct idr_layer {
+	int			prefix;	/* the ID prefix of this idr_layer */
 	DECLARE_BITMAP(bitmap, IDR_SIZE); /* A zero bit means "space here" */
 	struct idr_layer __rcu	*ary[1<<IDR_BITS];
 	int			count;	/* When zero, we can release it */
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -60,6 +60,16 @@ static int idr_max(int layers)
 	return (1 << bits) - 1;
 }
 
+/*
+ * Prefix mask for an idr_layer at @layer.  For layer 0, the prefix mask is
+ * all bits except for the lower IDR_BITS.  For layer 1, 2 * IDR_BITS, and
+ * so on.
+ */
+static int idr_layer_prefix_mask(int layer)
+{
+	return ~idr_max(layer + 1);
+}
+
 static struct idr_layer *get_from_free_list(struct idr *idp)
 {
 	struct idr_layer *p;
@@ -272,6 +282,7 @@ static int sub_alloc(struct idr *idp, in
 			if (!new)
 				return -ENOMEM;
 			new->layer = l-1;
+			new->prefix = id & idr_layer_prefix_mask(new->layer);
 			rcu_assign_pointer(p->ary[m], new);
 			p->count++;
 		}
@@ -313,6 +324,7 @@ build_up:
 			 * upwards.
 			 */
 			p->layer++;
+			WARN_ON_ONCE(p->prefix);
 			continue;
 		}
 		if (!(new = idr_layer_alloc(gfp_mask, layer_idr))) {
@@ -334,6 +346,7 @@ build_up:
 		new->ary[0] = p;
 		new->count = 1;
 		new->layer = layers-1;
+		new->prefix = id & idr_layer_prefix_mask(new->layer);
 		if (bitmap_full(p->bitmap, IDR_SIZE))
 			__set_bit(0, new->bitmap);
 		p = new;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 6/6] idr: implement lookup hint
  2013-02-08 21:00 [PATCH 1/6] idr: fix top layer handling Tejun Heo
                   ` (3 preceding siblings ...)
  2013-02-08 21:03 ` [PATCH 5/6] idr: add idr_layer->prefix Tejun Heo
@ 2013-02-08 21:03 ` Tejun Heo
  2013-02-11 23:39 ` [PATCH 1/6] idr: fix top layer handling Andrew Morton
  5 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2013-02-08 21:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Rusty Russell

While idr lookup isn't a particularly heavy operation, it still is too
substantial to use in hot paths without worrying about the performance
implications.  With recent changes, each idr_layer covers 256 slots
which should be enough to cover most use cases with single idr_layer
making lookup hint very attractive.

This patch adds idr->hint which points to the idr_layer which
allocated an ID most recently and the fast path lookup becomes

	if (look up target's prefix matches that of the hinted layer)
		return hint->ary[ID's offset in the leaf layer];

which can be inlined.

idr->hint is set to the leaf node on idr_fill_slot() and cleared from
free_layer().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
 include/linux/idr.h |   25 ++++++++++++++++++++++++-
 lib/idr.c           |   38 ++++++++++++++++----------------------
 2 files changed, 40 insertions(+), 23 deletions(-)

--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -37,6 +37,7 @@ struct idr_layer {
 };
 
 struct idr {
+	struct idr_layer __rcu	*hint;	/* the last layer allocated from */
 	struct idr_layer __rcu	*top;
 	struct idr_layer	*id_free;
 	int			layers;	/* only valid w/o concurrent changes */
@@ -71,7 +72,7 @@ struct idr {
  * This is what we export.
  */
 
-void *idr_find(struct idr *idp, int id);
+void *idr_find_slowpath(struct idr *idp, int id);
 int idr_pre_get(struct idr *idp, gfp_t gfp_mask);
 int idr_get_new_above(struct idr *idp, void *ptr, int starting_id, int *id);
 void idr_preload(gfp_t gfp_mask);
@@ -97,6 +98,28 @@ static inline void idr_preload_end(void)
 }
 
 /**
+ * idr_find - return pointer for given id
+ * @idp: idr handle
+ * @id: lookup key
+ *
+ * Return the pointer given the id it has been registered with.  A %NULL
+ * return indicates that @id is not valid or you passed %NULL in
+ * idr_get_new().
+ *
+ * This function can be called under rcu_read_lock(), given that the leaf
+ * pointers lifetimes are correctly managed.
+ */
+static inline void *idr_find(struct idr *idr, int id)
+{
+	struct idr_layer *hint = rcu_dereference_raw(idr->hint);
+
+	if ((id & ~IDR_MASK) == hint->prefix)
+		return rcu_dereference_raw(hint->ary[id & IDR_MASK]);
+
+	return idr_find_slowpath(idr, id);
+}
+
+/**
  * idr_get_new - allocate new idr entry
  * @idp: idr handle
  * @ptr: pointer you want associated with the id
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -137,8 +137,10 @@ static void idr_layer_rcu_free(struct rc
 	kmem_cache_free(idr_layer_cache, layer);
 }
 
-static inline void free_layer(struct idr_layer *p)
+static inline void free_layer(struct idr *idr, struct idr_layer *p)
 {
+	if (idr->hint == p)
+		RCU_INIT_POINTER(idr->hint, NULL);
 	call_rcu(&p->rcu_head, idr_layer_rcu_free);
 }
 
@@ -363,8 +365,12 @@ build_up:
  * @id and @pa are from a successful allocation from idr_get_empty_slot().
  * Install the user pointer @ptr and mark the slot full.
  */
-static void idr_fill_slot(void *ptr, int id, struct idr_layer **pa)
+static void idr_fill_slot(struct idr *idr, void *ptr, int id,
+			  struct idr_layer **pa)
 {
+	/* update hint used for lookup, cleared from free_layer() */
+	rcu_assign_pointer(idr->hint, pa[0]);
+
 	rcu_assign_pointer(pa[0]->ary[id & IDR_MASK], (struct idr_layer *)ptr);
 	pa[0]->count++;
 	idr_mark_full(pa, id);
@@ -397,7 +403,7 @@ int idr_get_new_above(struct idr *idp, v
 	if (rv < 0)
 		return rv == -ENOMEM ? -EAGAIN : rv;
 
-	idr_fill_slot(ptr, rv, pa);
+	idr_fill_slot(idp, ptr, rv, pa);
 	*id = rv;
 	return 0;
 }
@@ -501,7 +507,7 @@ int idr_alloc(struct idr *idr, void *ptr
 	if (unlikely(id > max))
 		return -ENOSPC;
 
-	idr_fill_slot(ptr, id, pa);
+	idr_fill_slot(idr, ptr, id, pa);
 	return id;
 }
 EXPORT_SYMBOL_GPL(idr_alloc);
@@ -538,14 +544,14 @@ static void sub_remove(struct idr *idp,
 		to_free = NULL;
 		while(*paa && ! --((**paa)->count)){
 			if (to_free)
-				free_layer(to_free);
+				free_layer(idp, to_free);
 			to_free = **paa;
 			**paa-- = NULL;
 		}
 		if (!*paa)
 			idp->layers = 0;
 		if (to_free)
-			free_layer(to_free);
+			free_layer(idp, to_free);
 	} else
 		idr_remove_warning(id);
 }
@@ -578,7 +584,7 @@ void idr_remove(struct idr *idp, int id)
 		--idp->layers;
 		to_free->count = 0;
 		bitmap_clear(to_free->bitmap, 0, IDR_SIZE);
-		free_layer(to_free);
+		free_layer(idp, to_free);
 	}
 	while (idp->id_free_cnt >= MAX_IDR_FREE) {
 		p = get_from_free_list(idp);
@@ -619,7 +625,7 @@ void __idr_remove_all(struct idr *idp)
 		/* Get the highest bit that the above add changed from 0->1. */
 		while (n < fls(id ^ bt_mask)) {
 			if (p)
-				free_layer(p);
+				free_layer(idp, p);
 			n += IDR_BITS;
 			p = *--paa;
 		}
@@ -652,19 +658,7 @@ void idr_destroy(struct idr *idp)
 }
 EXPORT_SYMBOL(idr_destroy);
 
-/**
- * idr_find - return pointer for given id
- * @idp: idr handle
- * @id: lookup key
- *
- * Return the pointer given the id it has been registered with.  A %NULL
- * return indicates that @id is not valid or you passed %NULL in
- * idr_get_new().
- *
- * This function can be called under rcu_read_lock(), given that the leaf
- * pointers lifetimes are correctly managed.
- */
-void *idr_find(struct idr *idp, int id)
+void *idr_find_slowpath(struct idr *idp, int id)
 {
 	int n;
 	struct idr_layer *p;
@@ -688,7 +682,7 @@ void *idr_find(struct idr *idp, int id)
 	}
 	return((void *)p);
 }
-EXPORT_SYMBOL(idr_find);
+EXPORT_SYMBOL(idr_find_slowpath);
 
 /**
  * idr_for_each - iterate through all stored pointers

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c
  2013-02-08 21:01 ` [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c Tejun Heo
@ 2013-02-08 22:09   ` Hefty, Sean
  2013-02-09 19:00     ` Tejun Heo
  2013-02-10 11:52   ` Wolfram Sang
  1 sibling, 1 reply; 13+ messages in thread
From: Hefty, Sean @ 2013-02-08 22:09 UTC (permalink / raw)
  To: Tejun Heo, Andrew Morton
  Cc: linux-kernel, Rusty Russell, Jean Delvare, linux-i2c,
	Roland Dreier, Hal Rosenstock, Marciniszyn, Mike,
	Jack Morgenstein, Or Gerlitz, linux-rdma, Al Viro

> * drivers/infiniband/core/cm.c:cm_alloc_id()
>   drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()
> 
>   Used to wrap cyclic @start.  Can be replaced with max(next, 0).
>   Note that this type of cyclic allocation using idr is buggy.  These
>   are prone to spurious -ENOSPC failure after the first wraparound.

The replacement code looks fine, but can you explain why the use is buggy?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c
  2013-02-08 22:09   ` Hefty, Sean
@ 2013-02-09 19:00     ` Tejun Heo
  2013-02-10 21:19       ` Hefty, Sean
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2013-02-09 19:00 UTC (permalink / raw)
  To: Hefty, Sean
  Cc: Andrew Morton, linux-kernel, Rusty Russell, Jean Delvare,
	linux-i2c, Roland Dreier, Hal Rosenstock, Marciniszyn, Mike,
	Jack Morgenstein, Or Gerlitz, linux-rdma, Al Viro

Hello,

On Fri, Feb 08, 2013 at 10:09:13PM +0000, Hefty, Sean wrote:
> >   Used to wrap cyclic @start.  Can be replaced with max(next, 0).
> >   Note that this type of cyclic allocation using idr is buggy.  These
> >   are prone to spurious -ENOSPC failure after the first wraparound.
> 
> The replacement code looks fine, but can you explain why the use is buggy?

So, if you want a cyclic allocation, the allocation should be tried in
[start, END) and then [0, start); otherwise, after the allocation
wraps for the first time, as the closer the starting point gets to
END, the chance of not finding a vacant slot in [start, END) goes
higher.  When @start equals END - 1 for the second time, if the first
END - 1 allocation is still around, you'll get -ENOSPC.

In practice, I don't think anyone is hitting this.  idr has always
been horribly broken when it reaches higher range (> 1<<30 on 64bit)
so things would have broken even before the first wraparound.  It
still is a theoretical possibility which may trigger if idr is used
for, say, ipc messages or storage commands.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c
  2013-02-08 21:01 ` [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c Tejun Heo
  2013-02-08 22:09   ` Hefty, Sean
@ 2013-02-10 11:52   ` Wolfram Sang
  1 sibling, 0 replies; 13+ messages in thread
From: Wolfram Sang @ 2013-02-10 11:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, linux-kernel, Rusty Russell, Jean Delvare,
	linux-i2c, Roland Dreier, Sean Hefty, Hal Rosenstock,
	Marciniszyn, Mike, Jack Morgenstein, Or Gerlitz, linux-rdma,
	Al Viro

On Fri, Feb 08, 2013 at 01:01:49PM -0800, Tejun Heo wrote:
> MAX_IDR_MASK is another weirdness in the idr interface.  As idr covers
> whole positive integer range, it's defined as 0x7fffffff or INT_MAX.
> 
> Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
> They basically mask off the sign bit and operate on the rest, so if
> the caller, by accident, passes in a negative number, the sign bit
> will be masked off and the remaining part will be used as if that was
> the input, which is worse than crashing.
> 
> The constant is visible in idr.h and there are several users in the
> kernel.
> 
> * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()
> 
>   Basically used to test if adap->nr is a negative number which isn't
>   -1 and returns -EINVAL if so.  idr_alloc() already has negative
>   @start checking (w/ WARN_ON_ONCE), so this can go away.
> 
> * drivers/infiniband/core/cm.c:cm_alloc_id()
>   drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()
> 
>   Used to wrap cyclic @start.  Can be replaced with max(next, 0).
>   Note that this type of cyclic allocation using idr is buggy.  These
>   are prone to spurious -ENOSPC failure after the first wraparound.
> 
> * fs/super.c:get_anon_bdev()
> 
>   The ID allocated from ida is masked off before being tested whether
>   it's inside valid range.  ida allocated ID can never be a negative
>   number and the masking is unnecessary.
> 
> Update idr_*() functions to fail with -EINVAL when negative @id is
> specified and update other MAX_IDR_MASK users as described above.
> 
> This leaves MAX_IDR_MASK without any user, remove it and relocate
> other MAX_IDR_* constants to lib/idr.c.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>

For the i2c-part:

Acked-by: Wolfram Sang <wolfram@the-dreams.de>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c
  2013-02-09 19:00     ` Tejun Heo
@ 2013-02-10 21:19       ` Hefty, Sean
  0 siblings, 0 replies; 13+ messages in thread
From: Hefty, Sean @ 2013-02-10 21:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, linux-kernel, Rusty Russell, Jean Delvare,
	linux-i2c, Roland Dreier, Hal Rosenstock, Marciniszyn, Mike,
	Jack Morgenstein, Or Gerlitz, linux-rdma, Al Viro

> So, if you want a cyclic allocation, the allocation should be tried in
> [start, END) and then [0, start); otherwise, after the allocation
> wraps for the first time, as the closer the starting point gets to
> END, the chance of not finding a vacant slot in [start, END) goes
> higher.  When @start equals END - 1 for the second time, if the first
> END - 1 allocation is still around, you'll get -ENOSPC.

Got it - thanks.  I'll make a note to fix this.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/6] idr: fix top layer handling
  2013-02-08 21:00 [PATCH 1/6] idr: fix top layer handling Tejun Heo
                   ` (4 preceding siblings ...)
  2013-02-08 21:03 ` [PATCH 6/6] idr: implement lookup hint Tejun Heo
@ 2013-02-11 23:39 ` Andrew Morton
  2013-02-12 17:10   ` Tejun Heo
  5 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2013-02-11 23:39 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Rusty Russell, stable

On Fri, 8 Feb 2013 13:00:50 -0800
Tejun Heo <tj@kernel.org> wrote:

> Most functions in idr fail to deal with the high bits when the idr
> tree grows to the maximum height.
> 
> * idr_get_empty_slot() stops growing idr tree once the depth reaches
>   MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to
>   cover the whole range.  The function doesn't even notice that it
>   didn't grow the tree enough and ends up allocating the wrong ID
>   given sufficiently high @starting_id.
> 
>   For example, on 64 bit, if the starting id is 0x7fffff01,
>   idr_get_empty_slot() will grow the tree 5 layer deep, which only
>   covers the 30 bits and then proceed to allocate as if the bit 30
>   wasn't specified.  It ends up allocating 0x3fffff01 without the bit
>   30 but still returns 0x7fffff01.
> 
> * __idr_remove_all() will not remove anything if the tree is fully
>   grown.
> 
> * idr_find() can't find anything if the tree is fully grown.
> 
> * idr_for_each() and idr_get_next() can't iterate anything if the tree
>   is fully grown.
> 
> Fix it by introducing idr_max() which returns the maximum possible ID
> given the depth of tree and replacing the id limit checks in all
> affected places.
> 
> As the idr_layer pointer array pa[] needs to be 1 larger than the
> maximum depth, enlarge pa[] arrays by one.
> 
> While this plugs the discovered issues, the whole code base is
> horrible and in desparate need of rewrite.  It's fragile like hell,
> difficult to read and maintain, and plain ugly.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: stable@vger.kernel.org

This doesn't apply happily to 3.7, so Greg will be needing a redone
version when the time arrives.

But does it really need backporting?  Is anyone likely to hit this in
practice?

Also, I assume you have some sort of IDR test harness over there.  Is
it something we can get into the tree in some fashion to help with
ongoing maintenance?


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/6] idr: fix top layer handling
  2013-02-11 23:39 ` [PATCH 1/6] idr: fix top layer handling Andrew Morton
@ 2013-02-12 17:10   ` Tejun Heo
  2013-02-12 21:23     ` Andrew Morton
  0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2013-02-12 17:10 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Rusty Russell, stable

Hey, Andrew.

On Mon, Feb 11, 2013 at 03:39:55PM -0800, Andrew Morton wrote:
> This doesn't apply happily to 3.7, so Greg will be needing a redone
> version when the time arrives.
> 
> But does it really need backporting?  Is anyone likely to hit this in
> practice?

For most cases, probably not.  IDR_BITS being 5 and 6 on 32 and 64bit
respectively, the only misbehaving bit is bit 30, so unless ID goes
over 1G, which should be close to non-existent if the IDs are being
allocated from zero, it shouldn't be a problem; however, we do have
users where IDR is used to allocate cyclic IDs.  They maintain and
progress the last allocated ID so that IDs don't get recycled without
wrapping around.  I have no idea whether any of them would be
allocating IDs fast enough to go over 1G limit in any reasonable
amount of time or whether there are such ID allocations which can be
exploited from userland without root priviliedges, in which case it
would probably directly lead to oops.

So, to sum up, at least I can't rule out the issue happening or being
exploited in the wild with older kernels.  It isn't too likely to
happen naturally but if there's a userland exploitable cyclic
alloction, going over 1G wouldn't be too difficult.

> Also, I assume you have some sort of IDR test harness over there.  Is
> it something we can get into the tree in some fashion to help with
> ongoing maintenance?

Right now, it's just a messy test module with ad-hoc loops and manual
alloc/frees with a lot of printks sprinkled everywhere, so I'm afraid
it wouldn't be suitable for any form of automated test.  It sure would
be great to have a testing harness for this tho.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/6] idr: fix top layer handling
  2013-02-12 17:10   ` Tejun Heo
@ 2013-02-12 21:23     ` Andrew Morton
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2013-02-12 21:23 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Rusty Russell, stable

On Tue, 12 Feb 2013 09:10:49 -0800
Tejun Heo <tj@kernel.org> wrote:

> Hey, Andrew.
> 
> On Mon, Feb 11, 2013 at 03:39:55PM -0800, Andrew Morton wrote:
> > This doesn't apply happily to 3.7, so Greg will be needing a redone
> > version when the time arrives.
> > 
> > But does it really need backporting?  Is anyone likely to hit this in
> > practice?
> 
> For most cases, probably not.  IDR_BITS being 5 and 6 on 32 and 64bit
> respectively, the only misbehaving bit is bit 30, so unless ID goes
> over 1G, which should be close to non-existent if the IDs are being
> allocated from zero, it shouldn't be a problem; however, we do have
> users where IDR is used to allocate cyclic IDs.  They maintain and
> progress the last allocated ID so that IDs don't get recycled without
> wrapping around.  I have no idea whether any of them would be
> allocating IDs fast enough to go over 1G limit in any reasonable
> amount of time or whether there are such ID allocations which can be
> exploited from userland without root priviliedges, in which case it
> would probably directly lead to oops.
> 
> So, to sum up, at least I can't rule out the issue happening or being
> exploited in the wild with older kernels.  It isn't too likely to
> happen naturally but if there's a userland exploitable cyclic
> alloction, going over 1G wouldn't be too difficult.

OK.  The changelog has the cc:stable tag so I'll let you and Greg duke
it out ;)

> > Also, I assume you have some sort of IDR test harness over there.  Is
> > it something we can get into the tree in some fashion to help with
> > ongoing maintenance?
> 
> Right now, it's just a messy test module with ad-hoc loops and manual
> alloc/frees with a lot of printks sprinkled everywhere, so I'm afraid
> it wouldn't be suitable for any form of automated test.  It sure would
> be great to have a testing harness for this tho.

I slapped together a userspace test harness for lib/radix-tree.c many
years ago.  It's at http://ozlabs.org/~akpm/rtth.tar.gz.  Whenever
anyone does any significant radix-tree work, they download that, update
it to test their new stuff then send me an rtth diff.

I suppose we could move that code into the kernel tree and use/maintain
it there.  That would create a framework and precedent for new things
like the idr test harness.  iirc the test harness uses the kernel's
radix-tree.c and radix-tree.h unaltered - it provides a pile of hacky
header files sufficient to make the kernel code compilable into the
userspace test app.



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2013-02-12 21:23 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-08 21:00 [PATCH 1/6] idr: fix top layer handling Tejun Heo
2013-02-08 21:01 ` [PATCH 2/6] idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c Tejun Heo
2013-02-08 22:09   ` Hefty, Sean
2013-02-09 19:00     ` Tejun Heo
2013-02-10 21:19       ` Hefty, Sean
2013-02-10 11:52   ` Wolfram Sang
2013-02-08 21:02 ` [PATCH 3/6] idr: remove length restriction from idr_layer->bitmap Tejun Heo
2013-02-08 21:03 ` [PATCH 4/6] idr: make idr_layer larger Tejun Heo
2013-02-08 21:03 ` [PATCH 5/6] idr: add idr_layer->prefix Tejun Heo
2013-02-08 21:03 ` [PATCH 6/6] idr: implement lookup hint Tejun Heo
2013-02-11 23:39 ` [PATCH 1/6] idr: fix top layer handling Andrew Morton
2013-02-12 17:10   ` Tejun Heo
2013-02-12 21:23     ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).