linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
@ 2012-01-13 15:52 Dimitri Sivanich
  2012-01-13 16:15 ` Eric Dumazet
  2012-01-13 16:22 ` Al Viro
  0 siblings, 2 replies; 12+ messages in thread
From: Dimitri Sivanich @ 2012-01-13 15:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alexander Viro, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Paul E. McKenney,
	Paul Gortmaker, Andrew Morton, Jiri Kosina, Avi Kivity,
	linux-fsdevel, netdev

When the number of dentry cache hash table entries gets too high
(2147483648 entries), use of a signed integer in the initialization
loop prevents the dentry_hashtable from getting initialized, resulting
in a panic in __d_lookup.  Fixing this in dcache_init and a few other
spots for consistency.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
---
 fs/dcache.c    |    8 ++++----
 fs/inode.c     |    8 ++++----
 kernel/pid.c   |    4 ++--
 net/ipv4/tcp.c |    3 ++-
 4 files changed, 12 insertions(+), 11 deletions(-)

Index: linux/fs/dcache.c
===================================================================
--- linux.orig/fs/dcache.c
+++ linux/fs/dcache.c
@@ -2968,7 +2968,7 @@ __setup("dhash_entries=", set_dhash_entr
 
 static void __init dcache_init_early(void)
 {
-	int loop;
+	long loop;
 
 	/* If hashes are distributed across NUMA nodes, defer
 	 * hash allocation until vmalloc space is available.
@@ -2986,13 +2986,13 @@ static void __init dcache_init_early(voi
 					&d_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << d_hash_shift); loop++)
+	for (loop = 0; loop < (1L << d_hash_shift); loop++)
 		INIT_HLIST_BL_HEAD(dentry_hashtable + loop);
 }
 
 static void __init dcache_init(void)
 {
-	int loop;
+	long loop;
 
 	/* 
 	 * A constructor could be added for stable state like the lists,
@@ -3016,7 +3016,7 @@ static void __init dcache_init(void)
 					&d_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << d_hash_shift); loop++)
+	for (loop = 0; loop < (1L << d_hash_shift); loop++)
 		INIT_HLIST_BL_HEAD(dentry_hashtable + loop);
 }
 
Index: linux/fs/inode.c
===================================================================
--- linux.orig/fs/inode.c
+++ linux/fs/inode.c
@@ -1654,7 +1654,7 @@ __setup("ihash_entries=", set_ihash_entr
  */
 void __init inode_init_early(void)
 {
-	int loop;
+	long loop;
 
 	/* If hashes are distributed across NUMA nodes, defer
 	 * hash allocation until vmalloc space is available.
@@ -1672,13 +1672,13 @@ void __init inode_init_early(void)
 					&i_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << i_hash_shift); loop++)
+	for (loop = 0; loop < (1L << i_hash_shift); loop++)
 		INIT_HLIST_HEAD(&inode_hashtable[loop]);
 }
 
 void __init inode_init(void)
 {
-	int loop;
+	long loop;
 
 	/* inode slab cache */
 	inode_cachep = kmem_cache_create("inode_cache",
@@ -1702,7 +1702,7 @@ void __init inode_init(void)
 					&i_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << i_hash_shift); loop++)
+	for (loop = 0; loop < (1L << i_hash_shift); loop++)
 		INIT_HLIST_HEAD(&inode_hashtable[loop]);
 }
 
Index: linux/kernel/pid.c
===================================================================
--- linux.orig/kernel/pid.c
+++ linux/kernel/pid.c
@@ -543,12 +543,12 @@ struct pid *find_ge_pid(int nr, struct p
  */
 void __init pidhash_init(void)
 {
-	int i, pidhash_size;
+	long i, pidhash_size;
 
 	pid_hash = alloc_large_system_hash("PID", sizeof(*pid_hash), 0, 18,
 					   HASH_EARLY | HASH_SMALL,
 					   &pidhash_shift, NULL, 4096);
-	pidhash_size = 1 << pidhash_shift;
+	pidhash_size = 1L << pidhash_shift;
 
 	for (i = 0; i < pidhash_size; i++)
 		INIT_HLIST_HEAD(&pid_hash[i]);
Index: linux/net/ipv4/tcp.c
===================================================================
--- linux.orig/net/ipv4/tcp.c
+++ linux/net/ipv4/tcp.c
@@ -3220,7 +3220,8 @@ void __init tcp_init(void)
 {
 	struct sk_buff *skb = NULL;
 	unsigned long limit;
-	int i, max_share, cnt;
+	long i;
+	int max_share, cnt;
 	unsigned long jiffy = jiffies;
 
 	BUILD_BUG_ON(sizeof(struct tcp_skb_cb) > sizeof(skb->cb));

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-13 15:52 [PATCH] Fix panic in __d_lookup with high dentry hashtable counts Dimitri Sivanich
@ 2012-01-13 16:15 ` Eric Dumazet
  2012-01-13 16:22 ` Al Viro
  1 sibling, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2012-01-13 16:15 UTC (permalink / raw)
  To: Dimitri Sivanich
  Cc: linux-kernel, Alexander Viro, David S. Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy,
	Paul E. McKenney, Paul Gortmaker, Andrew Morton, Jiri Kosina,
	Avi Kivity, linux-fsdevel, netdev

Le vendredi 13 janvier 2012 à 09:52 -0600, Dimitri Sivanich a écrit :
> When the number of dentry cache hash table entries gets too high
> (2147483648 entries), use of a signed integer in the initialization
> loop prevents the dentry_hashtable from getting initialized, resulting
> in a panic in __d_lookup.  Fixing this in dcache_init and a few other
> spots for consistency.

Well...

nr_dentry being an int, I dont think having a so big hash table is
needed/possible. Its probably a waste of memory ?

Maybe we should limit alloc_large_system_hash() to at most 2^30 slots.

[ And later, convert it to unsigned long *_hash_shift and unsigned long
*_hash_mask ]




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-13 15:52 [PATCH] Fix panic in __d_lookup with high dentry hashtable counts Dimitri Sivanich
  2012-01-13 16:15 ` Eric Dumazet
@ 2012-01-13 16:22 ` Al Viro
  2012-01-13 16:36   ` Dimitri Sivanich
  2012-01-17 17:13   ` Dimitri Sivanich
  1 sibling, 2 replies; 12+ messages in thread
From: Al Viro @ 2012-01-13 16:22 UTC (permalink / raw)
  To: Dimitri Sivanich
  Cc: linux-kernel, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Paul E. McKenney,
	Paul Gortmaker, Andrew Morton, Jiri Kosina, Avi Kivity,
	linux-fsdevel, netdev

On Fri, Jan 13, 2012 at 09:52:37AM -0600, Dimitri Sivanich wrote:
> When the number of dentry cache hash table entries gets too high
> (2147483648 entries), use of a signed integer in the initialization
> loop prevents the dentry_hashtable from getting initialized, resulting
> in a panic in __d_lookup.  Fixing this in dcache_init and a few other
> spots for consistency.

>  static void __init dcache_init(void)
>  {
> -	int loop;
> +	long loop;

You've got to be kidding.  Note that D_HASHMASK is at most 32bit.  Use
of long here is an overkill and so's 2^31 hash buckets (that's what,
16Gb in hash list heads alone?  What kind of average chain length do
you expect, BTW?)

Can alloc_large_system_hash() produce the horrors that large, anyway?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-13 16:22 ` Al Viro
@ 2012-01-13 16:36   ` Dimitri Sivanich
  2012-01-13 16:39     ` Dimitri Sivanich
  2012-01-17 17:13   ` Dimitri Sivanich
  1 sibling, 1 reply; 12+ messages in thread
From: Dimitri Sivanich @ 2012-01-13 16:36 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-kernel, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Paul E. McKenney,
	Paul Gortmaker, Andrew Morton, Jiri Kosina, Avi Kivity,
	linux-fsdevel, netdev

On Fri, Jan 13, 2012 at 04:22:36PM +0000, Al Viro wrote:
> On Fri, Jan 13, 2012 at 09:52:37AM -0600, Dimitri Sivanich wrote:
> > When the number of dentry cache hash table entries gets too high
> > (2147483648 entries), use of a signed integer in the initialization
> > loop prevents the dentry_hashtable from getting initialized, resulting
> > in a panic in __d_lookup.  Fixing this in dcache_init and a few other
> > spots for consistency.
> 
> >  static void __init dcache_init(void)
> >  {
> > -	int loop;
> > +	long loop;
> 
> You've got to be kidding.  Note that D_HASHMASK is at most 32bit.  Use
> of long here is an overkill and so's 2^31 hash buckets (that's what,
> 16Gb in hash list heads alone?  What kind of average chain length do
> you expect, BTW?)

Yes, long might be overkill right now, but the code is all __init time code.
I don't have numbers showing average chain length at this point, I was
simply fixing this one end case

> 
> Can alloc_large_system_hash() produce the horrors that large, anyway?

On a 16TB system, alloc_large_system_hash() produces 2^31 hash buckets, yes.

Would simply capping the value in alloc_large_system_hash() be more palatable?

Something like the following?

Index: linux/mm/page_alloc.c
===================================================================
--- linux.orig/mm/page_alloc.c
+++ linux/mm/page_alloc.c
@@ -5257,6 +5257,7 @@ void *__init alloc_large_system_hash(con
        if (max == 0) {
                max = ((unsigned long long)nr_all_pages << PAGE_SHIFT) >> 4;
                do_div(max, bucketsize);
+               max = min(max, 1ULL << 30);
        }
 
        if (numentries > max)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-13 16:36   ` Dimitri Sivanich
@ 2012-01-13 16:39     ` Dimitri Sivanich
  0 siblings, 0 replies; 12+ messages in thread
From: Dimitri Sivanich @ 2012-01-13 16:39 UTC (permalink / raw)
  To: Al Viro, linux-kernel, David S. Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy,
	Paul E. McKenney, Paul Gortmaker, Andrew Morton, Jiri Kosina,
	Avi Kivity, linux-fsdevel, netdev

On Fri, Jan 13, 2012 at 10:36:42AM -0600, Dimitri Sivanich wrote:
> On Fri, Jan 13, 2012 at 04:22:36PM +0000, Al Viro wrote:
> > On Fri, Jan 13, 2012 at 09:52:37AM -0600, Dimitri Sivanich wrote:
> > > When the number of dentry cache hash table entries gets too high
> > > (2147483648 entries), use of a signed integer in the initialization
> > > loop prevents the dentry_hashtable from getting initialized, resulting
> > > in a panic in __d_lookup.  Fixing this in dcache_init and a few other
> > > spots for consistency.
> > 
> > >  static void __init dcache_init(void)
> > >  {
> > > -	int loop;
> > > +	long loop;
> > 
> > You've got to be kidding.  Note that D_HASHMASK is at most 32bit.  Use
> > of long here is an overkill and so's 2^31 hash buckets (that's what,
> > 16Gb in hash list heads alone?  What kind of average chain length do
> > you expect, BTW?)
> 
> Yes, long might be overkill right now, but the code is all __init time code.
> I don't have numbers showing average chain length at this point, I was
> simply fixing this one end case
> 
> > 
> > Can alloc_large_system_hash() produce the horrors that large, anyway?
> 
> On a 16TB system, alloc_large_system_hash() produces 2^31 hash buckets, yes.
> 
> Would simply capping the value in alloc_large_system_hash() be more palatable?
> 
> Something like the following?
> 
> Index: linux/mm/page_alloc.c
> ===================================================================
> --- linux.orig/mm/page_alloc.c
> +++ linux/mm/page_alloc.c
> @@ -5257,6 +5257,7 @@ void *__init alloc_large_system_hash(con
>         if (max == 0) {
>                 max = ((unsigned long long)nr_all_pages << PAGE_SHIFT) >> 4;
>                 do_div(max, bucketsize);
> +               max = min(max, 1ULL << 30);
>         }
>  
>         if (numentries > max)

Sorry, you'd probably want something more like this:

Index: linux/mm/page_alloc.c
===================================================================
--- linux.orig/mm/page_alloc.c
+++ linux/mm/page_alloc.c
@@ -5258,6 +5258,7 @@ void *__init alloc_large_system_hash(con
                max = ((unsigned long long)nr_all_pages << PAGE_SHIFT) >> 4;
                do_div(max, bucketsize);
        }
+       max = min(max, 1ULL << 30);
 
        if (numentries > max)
                numentries = max;


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-13 16:22 ` Al Viro
  2012-01-13 16:36   ` Dimitri Sivanich
@ 2012-01-17 17:13   ` Dimitri Sivanich
  2012-01-17 17:22     ` David Miller
  2012-01-17 17:25     ` Al Viro
  1 sibling, 2 replies; 12+ messages in thread
From: Dimitri Sivanich @ 2012-01-17 17:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: Al Viro, Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy,
	Paul E. McKenney, Paul Gortmaker, Andrew Morton, Jiri Kosina,
	Avi Kivity, linux-fsdevel, netdev

When the number of dentry cache hash table entries gets too high
(2147483648 entries), as happens by default on a 16TB system, use
of a signed integer in the dcache_init() initialization loop prevents
the dentry_hashtable from getting initialized, causing a panic in
__d_lookup().

In addition, the _hash_mask returned from alloc_large_system_hash() does
not support more than a 32 bit hash table size.

Changing the _hash_mask size returned from alloc_large_system_hash() to
support larger hash table sizes in the future, and changing loop counter
sizes appropriately.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
---
 fs/dcache.c             |   10 +++++-----
 fs/inode.c              |   10 +++++-----
 include/linux/bootmem.h |    2 +-
 mm/page_alloc.c         |    2 +-
 net/ipv4/route.c        |    8 ++++++--
 net/ipv4/tcp.c          |   13 +++++++++----
 net/ipv4/udp.c          |    5 ++++-
 7 files changed, 31 insertions(+), 19 deletions(-)

Index: linux/fs/dcache.c
===================================================================
--- linux.orig/fs/dcache.c
+++ linux/fs/dcache.c
@@ -99,7 +99,7 @@ static struct kmem_cache *dentry_cache _
 #define D_HASHBITS     d_hash_shift
 #define D_HASHMASK     d_hash_mask
 
-static unsigned int d_hash_mask __read_mostly;
+static unsigned long d_hash_mask __read_mostly;
 static unsigned int d_hash_shift __read_mostly;
 
 static struct hlist_bl_head *dentry_hashtable __read_mostly;
@@ -2968,7 +2968,7 @@ __setup("dhash_entries=", set_dhash_entr
 
 static void __init dcache_init_early(void)
 {
-	int loop;
+	unsigned long loop;
 
 	/* If hashes are distributed across NUMA nodes, defer
 	 * hash allocation until vmalloc space is available.
@@ -2986,13 +2986,13 @@ static void __init dcache_init_early(voi
 					&d_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << d_hash_shift); loop++)
+	for (loop = 0; loop < (1UL << d_hash_shift); loop++)
 		INIT_HLIST_BL_HEAD(dentry_hashtable + loop);
 }
 
 static void __init dcache_init(void)
 {
-	int loop;
+	unsigned long loop;
 
 	/* 
 	 * A constructor could be added for stable state like the lists,
@@ -3016,7 +3016,7 @@ static void __init dcache_init(void)
 					&d_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << d_hash_shift); loop++)
+	for (loop = 0; loop < (1UL << d_hash_shift); loop++)
 		INIT_HLIST_BL_HEAD(dentry_hashtable + loop);
 }
 
Index: linux/fs/inode.c
===================================================================
--- linux.orig/fs/inode.c
+++ linux/fs/inode.c
@@ -60,7 +60,7 @@
  *   inode_hash_lock
  */
 
-static unsigned int i_hash_mask __read_mostly;
+static unsigned long i_hash_mask __read_mostly;
 static unsigned int i_hash_shift __read_mostly;
 static struct hlist_head *inode_hashtable __read_mostly;
 static __cacheline_aligned_in_smp DEFINE_SPINLOCK(inode_hash_lock);
@@ -1654,7 +1654,7 @@ __setup("ihash_entries=", set_ihash_entr
  */
 void __init inode_init_early(void)
 {
-	int loop;
+	unsigned long loop;
 
 	/* If hashes are distributed across NUMA nodes, defer
 	 * hash allocation until vmalloc space is available.
@@ -1672,13 +1672,13 @@ void __init inode_init_early(void)
 					&i_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << i_hash_shift); loop++)
+	for (loop = 0; loop < (1UL << i_hash_shift); loop++)
 		INIT_HLIST_HEAD(&inode_hashtable[loop]);
 }
 
 void __init inode_init(void)
 {
-	int loop;
+	unsigned long loop;
 
 	/* inode slab cache */
 	inode_cachep = kmem_cache_create("inode_cache",
@@ -1702,7 +1702,7 @@ void __init inode_init(void)
 					&i_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << i_hash_shift); loop++)
+	for (loop = 0; loop < (1UL << i_hash_shift); loop++)
 		INIT_HLIST_HEAD(&inode_hashtable[loop]);
 }
 
Index: linux/include/linux/bootmem.h
===================================================================
--- linux.orig/include/linux/bootmem.h
+++ linux/include/linux/bootmem.h
@@ -153,7 +153,7 @@ extern void *alloc_large_system_hash(con
 				     int scale,
 				     int flags,
 				     unsigned int *_hash_shift,
-				     unsigned int *_hash_mask,
+				     unsigned long *_hash_mask,
 				     unsigned long limit);
 
 #define HASH_EARLY	0x00000001	/* Allocating during early boot? */
Index: linux/mm/page_alloc.c
===================================================================
--- linux.orig/mm/page_alloc.c
+++ linux/mm/page_alloc.c
@@ -5219,7 +5219,7 @@ void *__init alloc_large_system_hash(con
 				     int scale,
 				     int flags,
 				     unsigned int *_hash_shift,
-				     unsigned int *_hash_mask,
+				     unsigned long *_hash_mask,
 				     unsigned long limit)
 {
 	unsigned long long max = limit;
Index: linux/net/ipv4/route.c
===================================================================
--- linux.orig/net/ipv4/route.c
+++ linux/net/ipv4/route.c
@@ -3446,6 +3446,7 @@ __setup("rhash_entries=", set_rhash_entr
 
 int __init ip_rt_init(void)
 {
+	unsigned long hash_mask;
 	int rc = 0;
 
 #ifdef CONFIG_IP_ROUTE_CLASSID
@@ -3474,8 +3475,11 @@ int __init ip_rt_init(void)
 					15 : 17,
 					0,
 					&rt_hash_log,
-					&rt_hash_mask,
-					rhash_entries ? 0 : 512 * 1024);
+					&hash_mask,
+					rhash_entries ? 0x80000000 :
+								512 * 1024);
+	/* FIXME: Above limit value (0x80000000) allows the following cast. */
+	rt_hash_mask = (unsigned int) hash_mask;
 	memset(rt_hash_table, 0, (rt_hash_mask + 1) * sizeof(struct rt_hash_bucket));
 	rt_hash_lock_init();
 
Index: linux/net/ipv4/tcp.c
===================================================================
--- linux.orig/net/ipv4/tcp.c
+++ linux/net/ipv4/tcp.c
@@ -3219,8 +3219,9 @@ __setup("thash_entries=", set_thash_entr
 void __init tcp_init(void)
 {
 	struct sk_buff *skb = NULL;
-	unsigned long limit;
-	int i, max_share, cnt;
+	unsigned long limit, hash_mask;
+	unsigned i;
+	int max_share, cnt;
 	unsigned long jiffy = jiffies;
 
 	BUILD_BUG_ON(sizeof(struct tcp_skb_cb) > sizeof(skb->cb));
@@ -3245,8 +3246,12 @@ void __init tcp_init(void)
 					13 : 15,
 					0,
 					NULL,
-					&tcp_hashinfo.ehash_mask,
-					thash_entries ? 0 : 512 * 1024);
+					&hash_mask,
+					thash_entries ? 0x80000000 :
+								512 * 1024);
+	/* FIXME: Above limit value (0x80000000) allows the following cast. */
+	tcp_hashinfo.ehash_mask = (unsigned int)hash_mask;
+
 	for (i = 0; i <= tcp_hashinfo.ehash_mask; i++) {
 		INIT_HLIST_NULLS_HEAD(&tcp_hashinfo.ehash[i].chain, i);
 		INIT_HLIST_NULLS_HEAD(&tcp_hashinfo.ehash[i].twchain, i);
Index: linux/net/ipv4/udp.c
===================================================================
--- linux.orig/net/ipv4/udp.c
+++ linux/net/ipv4/udp.c
@@ -2180,6 +2180,7 @@ __setup("uhash_entries=", set_uhash_entr
 
 void __init udp_table_init(struct udp_table *table, const char *name)
 {
+	unsigned long hash_mask;
 	unsigned int i;
 
 	if (!CONFIG_BASE_SMALL)
@@ -2189,8 +2190,10 @@ void __init udp_table_init(struct udp_ta
 			21, /* one slot per 2 MB */
 			0,
 			&table->log,
-			&table->mask,
+			&hash_mask,
 			64 * 1024);
+	/* FIXME: Above limit value (64 * 1024) allows the following cast. */
+	table->mask = (unsigned int)hash_mask;
 	/*
 	 * Make sure hash table has the minimum size
 	 */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-17 17:13   ` Dimitri Sivanich
@ 2012-01-17 17:22     ` David Miller
  2012-01-17 17:41       ` Dimitri Sivanich
  2012-01-17 21:05       ` Dimitri Sivanich
  2012-01-17 17:25     ` Al Viro
  1 sibling, 2 replies; 12+ messages in thread
From: David Miller @ 2012-01-17 17:22 UTC (permalink / raw)
  To: sivanich
  Cc: linux-kernel, viro, eric.dumazet, kuznet, jmorris, yoshfuji,
	kaber, paulmck, paul.gortmaker, akpm, jkosina, avi,
	linux-fsdevel, netdev

From: Dimitri Sivanich <sivanich@sgi.com>
Date: Tue, 17 Jan 2012 11:13:52 -0600

> When the number of dentry cache hash table entries gets too high
> (2147483648 entries), as happens by default on a 16TB system, use
> of a signed integer in the dcache_init() initialization loop prevents
> the dentry_hashtable from getting initialized, causing a panic in
> __d_lookup().
> 
> In addition, the _hash_mask returned from alloc_large_system_hash() does
> not support more than a 32 bit hash table size.
> 
> Changing the _hash_mask size returned from alloc_large_system_hash() to
> support larger hash table sizes in the future, and changing loop counter
> sizes appropriately.
> 
> Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>

To be honest I think this is overkill.

Supporting anything larger than a 32-bit hash mask is not even close
to being reasonable.  Nobody needs a 4GB hash table, not for anything.

Instead I would just make sure everything is "unsigned int" or "u32"
and calculations use things like "((u32) 1) << shift", and enforce an
upper bounds of 0x80000000 or similar unconditionally in the hash
allocator itself (rather than conditionally in the networking code).

All of this "long" stuff is madness, what the heck is a long?  It's a
non-fixed type, yet you put constants in your code (0x80000000) which
depend upon that type's size.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-17 17:13   ` Dimitri Sivanich
  2012-01-17 17:22     ` David Miller
@ 2012-01-17 17:25     ` Al Viro
  2012-01-17 17:28       ` David Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Al Viro @ 2012-01-17 17:25 UTC (permalink / raw)
  To: Dimitri Sivanich
  Cc: linux-kernel, Eric Dumazet, David S. Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy,
	Paul E. McKenney, Paul Gortmaker, Andrew Morton, Jiri Kosina,
	Avi Kivity, linux-fsdevel, netdev

On Tue, Jan 17, 2012 at 11:13:52AM -0600, Dimitri Sivanich wrote:
> When the number of dentry cache hash table entries gets too high
> (2147483648 entries), as happens by default on a 16TB system, use
> of a signed integer in the dcache_init() initialization loop prevents
> the dentry_hashtable from getting initialized, causing a panic in
> __d_lookup().
> 
> In addition, the _hash_mask returned from alloc_large_system_hash() does
> not support more than a 32 bit hash table size.
> 
> Changing the _hash_mask size returned from alloc_large_system_hash() to
> support larger hash table sizes in the future, and changing loop counter
> sizes appropriately.

... and I still would like to see somebody familiar with uses of other
hashes to comment on the desirability of such monsters.  For dcache and
icache it's absolutely certain to be worse than useless.  We are talking
about 4Gbuckets here...

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-17 17:25     ` Al Viro
@ 2012-01-17 17:28       ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2012-01-17 17:28 UTC (permalink / raw)
  To: viro
  Cc: sivanich, linux-kernel, eric.dumazet, kuznet, jmorris, yoshfuji,
	kaber, paulmck, paul.gortmaker, akpm, jkosina, avi,
	linux-fsdevel, netdev

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 17 Jan 2012 17:25:27 +0000

> On Tue, Jan 17, 2012 at 11:13:52AM -0600, Dimitri Sivanich wrote:
>> When the number of dentry cache hash table entries gets too high
>> (2147483648 entries), as happens by default on a 16TB system, use
>> of a signed integer in the dcache_init() initialization loop prevents
>> the dentry_hashtable from getting initialized, causing a panic in
>> __d_lookup().
>> 
>> In addition, the _hash_mask returned from alloc_large_system_hash() does
>> not support more than a 32 bit hash table size.
>> 
>> Changing the _hash_mask size returned from alloc_large_system_hash() to
>> support larger hash table sizes in the future, and changing loop counter
>> sizes appropriately.
> 
> ... and I still would like to see somebody familiar with uses of other
> hashes to comment on the desirability of such monsters.  For dcache and
> icache it's absolutely certain to be worse than useless.  We are talking
> about 4Gbuckets here...

It's wrong in networking too.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-17 17:22     ` David Miller
@ 2012-01-17 17:41       ` Dimitri Sivanich
  2012-01-17 21:05       ` Dimitri Sivanich
  1 sibling, 0 replies; 12+ messages in thread
From: Dimitri Sivanich @ 2012-01-17 17:41 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, viro, eric.dumazet, kuznet, jmorris, yoshfuji,
	kaber, paulmck, paul.gortmaker, akpm, jkosina, avi,
	linux-fsdevel, netdev

On Tue, Jan 17, 2012 at 12:22:29PM -0500, David Miller wrote:
> From: Dimitri Sivanich <sivanich@sgi.com>
> Date: Tue, 17 Jan 2012 11:13:52 -0600
> 
> > When the number of dentry cache hash table entries gets too high
> > (2147483648 entries), as happens by default on a 16TB system, use
> > of a signed integer in the dcache_init() initialization loop prevents
> > the dentry_hashtable from getting initialized, causing a panic in
> > __d_lookup().
> > 
> > In addition, the _hash_mask returned from alloc_large_system_hash() does
> > not support more than a 32 bit hash table size.
> > 
> > Changing the _hash_mask size returned from alloc_large_system_hash() to
> > support larger hash table sizes in the future, and changing loop counter
> > sizes appropriately.
> > 
> > Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
> 
> To be honest I think this is overkill.

I'm not going to flat-out disagree with you.  These would be huge hash
tables.  The thought was to make this __init code as flexible as possible.

> 
> Supporting anything larger than a 32-bit hash mask is not even close
> to being reasonable.  Nobody needs a 4GB hash table, not for anything.

Yes, at this point that is likely true.

> 
> Instead I would just make sure everything is "unsigned int" or "u32"
> and calculations use things like "((u32) 1) << shift", and enforce an
> upper bounds of 0x80000000 or similar unconditionally in the hash
> allocator itself (rather than conditionally in the networking code).

OK.  I had mentioned capping the value in alloc_large_system_hash() to
32 bits, but got no response to that proposal.  I'll create a proper
patch.

> 
> All of this "long" stuff is madness, what the heck is a long?  It's a
> non-fixed type, yet you put constants in your code (0x80000000) which
> depend upon that type's size.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-17 17:22     ` David Miller
  2012-01-17 17:41       ` Dimitri Sivanich
@ 2012-01-17 21:05       ` Dimitri Sivanich
  2012-01-18  4:57         ` David Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Dimitri Sivanich @ 2012-01-17 21:05 UTC (permalink / raw)
  To: David Miller
  Cc: linux-kernel, viro, eric.dumazet, kuznet, jmorris, yoshfuji,
	kaber, paulmck, paul.gortmaker, akpm, jkosina, avi,
	linux-fsdevel, netdev

On Tue, Jan 17, 2012 at 12:22:29PM -0500, David Miller wrote:
> To be honest I think this is overkill.
> 
> Supporting anything larger than a 32-bit hash mask is not even close
> to being reasonable.  Nobody needs a 4GB hash table, not for anything.
>
Here is a patch that keeps the 32-bit hash mask.


 
When the number of dentry cache hash table entries gets too high
(2147483648 entries), as happens by default on a 16TB system, use
of a signed integer in the dcache_init() initialization loop prevents
the dentry_hashtable from getting initialized, causing a panic in
__d_lookup().  Fix this in dcache_init() and similar areas.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
---
 fs/dcache.c     |    8 ++++----
 fs/inode.c      |    8 ++++----
 kernel/pid.c    |    4 ++--
 mm/page_alloc.c |    1 +
 net/ipv4/tcp.c  |    5 +++--
 5 files changed, 14 insertions(+), 12 deletions(-)

Index: linux/fs/dcache.c
===================================================================
--- linux.orig/fs/dcache.c
+++ linux/fs/dcache.c
@@ -2968,7 +2968,7 @@ __setup("dhash_entries=", set_dhash_entr
 
 static void __init dcache_init_early(void)
 {
-	int loop;
+	unsigned int loop;
 
 	/* If hashes are distributed across NUMA nodes, defer
 	 * hash allocation until vmalloc space is available.
@@ -2986,13 +2986,13 @@ static void __init dcache_init_early(voi
 					&d_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << d_hash_shift); loop++)
+	for (loop = 0; loop < (1U << d_hash_shift); loop++)
 		INIT_HLIST_BL_HEAD(dentry_hashtable + loop);
 }
 
 static void __init dcache_init(void)
 {
-	int loop;
+	unsigned int loop;
 
 	/* 
 	 * A constructor could be added for stable state like the lists,
@@ -3016,7 +3016,7 @@ static void __init dcache_init(void)
 					&d_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << d_hash_shift); loop++)
+	for (loop = 0; loop < (1U << d_hash_shift); loop++)
 		INIT_HLIST_BL_HEAD(dentry_hashtable + loop);
 }
 
Index: linux/fs/inode.c
===================================================================
--- linux.orig/fs/inode.c
+++ linux/fs/inode.c
@@ -1654,7 +1654,7 @@ __setup("ihash_entries=", set_ihash_entr
  */
 void __init inode_init_early(void)
 {
-	int loop;
+	unsigned int loop;
 
 	/* If hashes are distributed across NUMA nodes, defer
 	 * hash allocation until vmalloc space is available.
@@ -1672,13 +1672,13 @@ void __init inode_init_early(void)
 					&i_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << i_hash_shift); loop++)
+	for (loop = 0; loop < (1U << i_hash_shift); loop++)
 		INIT_HLIST_HEAD(&inode_hashtable[loop]);
 }
 
 void __init inode_init(void)
 {
-	int loop;
+	unsigned int loop;
 
 	/* inode slab cache */
 	inode_cachep = kmem_cache_create("inode_cache",
@@ -1702,7 +1702,7 @@ void __init inode_init(void)
 					&i_hash_mask,
 					0);
 
-	for (loop = 0; loop < (1 << i_hash_shift); loop++)
+	for (loop = 0; loop < (1U << i_hash_shift); loop++)
 		INIT_HLIST_HEAD(&inode_hashtable[loop]);
 }
 
Index: linux/mm/page_alloc.c
===================================================================
--- linux.orig/mm/page_alloc.c
+++ linux/mm/page_alloc.c
@@ -5258,6 +5258,7 @@ void *__init alloc_large_system_hash(con
 		max = ((unsigned long long)nr_all_pages << PAGE_SHIFT) >> 4;
 		do_div(max, bucketsize);
 	}
+	max = min(max, 0x80000000ULL);
 
 	if (numentries > max)
 		numentries = max;
Index: linux/kernel/pid.c
===================================================================
--- linux.orig/kernel/pid.c
+++ linux/kernel/pid.c
@@ -543,12 +543,12 @@ struct pid *find_ge_pid(int nr, struct p
  */
 void __init pidhash_init(void)
 {
-	int i, pidhash_size;
+	unsigned int i, pidhash_size;
 
 	pid_hash = alloc_large_system_hash("PID", sizeof(*pid_hash), 0, 18,
 					   HASH_EARLY | HASH_SMALL,
 					   &pidhash_shift, NULL, 4096);
-	pidhash_size = 1 << pidhash_shift;
+	pidhash_size = 1U << pidhash_shift;
 
 	for (i = 0; i < pidhash_size; i++)
 		INIT_HLIST_HEAD(&pid_hash[i]);
Index: linux/net/ipv4/tcp.c
===================================================================
--- linux.orig/net/ipv4/tcp.c
+++ linux/net/ipv4/tcp.c
@@ -3220,7 +3220,8 @@ void __init tcp_init(void)
 {
 	struct sk_buff *skb = NULL;
 	unsigned long limit;
-	int i, max_share, cnt;
+	int max_share, cnt;
+	unsigned int i;
 	unsigned long jiffy = jiffies;
 
 	BUILD_BUG_ON(sizeof(struct tcp_skb_cb) > sizeof(skb->cb));
@@ -3263,7 +3264,7 @@ void __init tcp_init(void)
 					&tcp_hashinfo.bhash_size,
 					NULL,
 					64 * 1024);
-	tcp_hashinfo.bhash_size = 1 << tcp_hashinfo.bhash_size;
+	tcp_hashinfo.bhash_size = 1U << tcp_hashinfo.bhash_size;
 	for (i = 0; i < tcp_hashinfo.bhash_size; i++) {
 		spin_lock_init(&tcp_hashinfo.bhash[i].lock);
 		INIT_HLIST_HEAD(&tcp_hashinfo.bhash[i].chain);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Fix panic in __d_lookup with high dentry hashtable counts
  2012-01-17 21:05       ` Dimitri Sivanich
@ 2012-01-18  4:57         ` David Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2012-01-18  4:57 UTC (permalink / raw)
  To: sivanich
  Cc: linux-kernel, viro, eric.dumazet, kuznet, jmorris, yoshfuji,
	kaber, paulmck, paul.gortmaker, akpm, jkosina, avi,
	linux-fsdevel, netdev

From: Dimitri Sivanich <sivanich@sgi.com>
Date: Tue, 17 Jan 2012 15:05:41 -0600

> On Tue, Jan 17, 2012 at 12:22:29PM -0500, David Miller wrote:
>> To be honest I think this is overkill.
>> 
>> Supporting anything larger than a 32-bit hash mask is not even close
>> to being reasonable.  Nobody needs a 4GB hash table, not for anything.
>>
> Here is a patch that keeps the 32-bit hash mask.
> 
> 
>  
> When the number of dentry cache hash table entries gets too high
> (2147483648 entries), as happens by default on a 16TB system, use
> of a signed integer in the dcache_init() initialization loop prevents
> the dentry_hashtable from getting initialized, causing a panic in
> __d_lookup().  Fix this in dcache_init() and similar areas.
> 
> Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>

This looks good to me, thanks Dimitri:

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-01-18  5:00 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-13 15:52 [PATCH] Fix panic in __d_lookup with high dentry hashtable counts Dimitri Sivanich
2012-01-13 16:15 ` Eric Dumazet
2012-01-13 16:22 ` Al Viro
2012-01-13 16:36   ` Dimitri Sivanich
2012-01-13 16:39     ` Dimitri Sivanich
2012-01-17 17:13   ` Dimitri Sivanich
2012-01-17 17:22     ` David Miller
2012-01-17 17:41       ` Dimitri Sivanich
2012-01-17 21:05       ` Dimitri Sivanich
2012-01-18  4:57         ` David Miller
2012-01-17 17:25     ` Al Viro
2012-01-17 17:28       ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).