All of lore.kernel.org
* [PATCH v2 0/3] Zeroing hash tables in allocator
@ 2017-03-01  0:14 ` Pavel Tatashin
  0 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2017-03-01  0:14 UTC (permalink / raw)
  To: linux-mm, sparclinux

Changes:
v1 -> v2: Reverted NG4memcpy() changes

On large machines hash tables can be many gigabytes in size, and it is
inefficient to zero them in a loop without platform-specific optimizations.

Using memset() provides a standard, platform-optimized way to zero the
memory.

Pavel Tatashin (3):
  sparc64: NG4 memset 32 bits overflow
  mm: Zeroing hash tables in allocator
  mm: Updated callers to use HASH_ZERO flag

 arch/sparc/lib/NG4memset.S          |   26 +++++++++++++-------------
 fs/dcache.c                         |   18 ++++--------------
 fs/inode.c                          |   14 ++------------
 fs/namespace.c                      |   10 ++--------
 include/linux/bootmem.h             |    1 +
 kernel/locking/qspinlock_paravirt.h |    3 ++-
 kernel/pid.c                        |    7 ++-----
 mm/page_alloc.c                     |   12 +++++++++---
 8 files changed, 35 insertions(+), 56 deletions(-)


^ permalink raw reply	[flat|nested] 26+ messages in thread


* [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01  0:14 ` Pavel Tatashin
@ 2017-03-01  0:14   ` Pavel Tatashin
  -1 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2017-03-01  0:14 UTC (permalink / raw)
  To: linux-mm, sparclinux

Early in boot, Linux patches memset and memcpy to branch to platform
optimized versions of these routines. The NG4 (Niagara 4) versions are
currently used on all platforms starting from T4. Recently, M7 optimized
routines were added to UEK4, but they are not in mainline yet. So even
with the M7 optimized routines, the NG4 versions will still be used on
T4, T5, M5, and M6 processors.

While investigating how to improve the initialization time of
dentry_hashtable, which is 8G in size on an M6 ldom with 7T of main
memory, I noticed that memset() does not reset all of the memory in this
array. After studying the code, I realized that the NG4memset() branches
use the %icc condition codes instead of %xcc, so when the length does not
fit in 32 bits, as is the case for an 8G array, these routines fail to
work properly.

The fix is to replace every %icc with %xcc in these routines. (An
alternative would be %ncc, but that is misleading: the code already
contains sparcv9-only instructions and cannot be compiled for 32-bit.)

It is important to fix this bug because even an older T4-4 can have 2T of
memory, and the kernel contains data structures that grow proportionally
with memory and can exceed 4G in size. The memset() failure is silent,
and the resulting corruption is hard to detect.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
---
 arch/sparc/lib/NG4memset.S |   26 +++++++++++++-------------
 1 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/sparc/lib/NG4memset.S b/arch/sparc/lib/NG4memset.S
index 41da4bd..e7c2e70 100644
--- a/arch/sparc/lib/NG4memset.S
+++ b/arch/sparc/lib/NG4memset.S
@@ -13,14 +13,14 @@
 	.globl		NG4memset
 NG4memset:
 	andcc		%o1, 0xff, %o4
-	be,pt		%icc, 1f
+	be,pt		%xcc, 1f
 	 mov		%o2, %o1
 	sllx		%o4, 8, %g1
 	or		%g1, %o4, %o2
 	sllx		%o2, 16, %g1
 	or		%g1, %o2, %o2
 	sllx		%o2, 32, %g1
-	ba,pt		%icc, 1f
+	ba,pt		%xcc, 1f
 	 or		%g1, %o2, %o4
 	.size		NG4memset,.-NG4memset
 
@@ -29,7 +29,7 @@ NG4memset:
 NG4bzero:
 	clr		%o4
 1:	cmp		%o1, 16
-	ble		%icc, .Ltiny
+	ble		%xcc, .Ltiny
 	 mov		%o0, %o3
 	sub		%g0, %o0, %g1
 	and		%g1, 0x7, %g1
@@ -37,7 +37,7 @@ NG4bzero:
 	 sub		%o1, %g1, %o1
 1:	stb		%o4, [%o0 + 0x00]
 	subcc		%g1, 1, %g1
-	bne,pt		%icc, 1b
+	bne,pt		%xcc, 1b
 	 add		%o0, 1, %o0
 .Laligned8:
 	cmp		%o1, 64 + (64 - 8)
@@ -48,7 +48,7 @@ NG4bzero:
 	 sub		%o1, %g1, %o1
 1:	stx		%o4, [%o0 + 0x00]
 	subcc		%g1, 8, %g1
-	bne,pt		%icc, 1b
+	bne,pt		%xcc, 1b
 	 add		%o0, 0x8, %o0
 .Laligned64:
 	andn		%o1, 64 - 1, %g1
@@ -58,30 +58,30 @@ NG4bzero:
 1:	stxa		%o4, [%o0 + %g0] ASI_BLK_INIT_QUAD_LDD_P
 	subcc		%g1, 0x40, %g1
 	stxa		%o4, [%o0 + %g2] ASI_BLK_INIT_QUAD_LDD_P
-	bne,pt		%icc, 1b
+	bne,pt		%xcc, 1b
 	 add		%o0, 0x40, %o0
 .Lpostloop:
 	cmp		%o1, 8
-	bl,pn		%icc, .Ltiny
+	bl,pn		%xcc, .Ltiny
 	 membar		#StoreStore|#StoreLoad
 .Lmedium:
 	andn		%o1, 0x7, %g1
 	sub		%o1, %g1, %o1
 1:	stx		%o4, [%o0 + 0x00]
 	subcc		%g1, 0x8, %g1
-	bne,pt		%icc, 1b
+	bne,pt		%xcc, 1b
 	 add		%o0, 0x08, %o0
 	andcc		%o1, 0x4, %g1
-	be,pt		%icc, .Ltiny
+	be,pt		%xcc, .Ltiny
 	 sub		%o1, %g1, %o1
 	stw		%o4, [%o0 + 0x00]
 	add		%o0, 0x4, %o0
 .Ltiny:
 	cmp		%o1, 0
-	be,pn		%icc, .Lexit
+	be,pn		%xcc, .Lexit
 1:	 subcc		%o1, 1, %o1
 	stb		%o4, [%o0 + 0x00]
-	bne,pt		%icc, 1b
+	bne,pt		%xcc, 1b
 	 add		%o0, 1, %o0
 .Lexit:
 	retl
@@ -99,7 +99,7 @@ NG4bzero:
 	stxa		%o4, [%o0 + %g2] ASI_BLK_INIT_QUAD_LDD_P
 	stxa		%o4, [%o0 + %g3] ASI_BLK_INIT_QUAD_LDD_P
 	stxa		%o4, [%o0 + %o5] ASI_BLK_INIT_QUAD_LDD_P
-	bne,pt		%icc, 1b
+	bne,pt		%xcc, 1b
 	 add		%o0, 0x30, %o0
-	ba,a,pt		%icc, .Lpostloop
+	ba,a,pt		%xcc, .Lpostloop
 	.size		NG4bzero,.-NG4bzero
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread


* [PATCH v2 2/3] mm: Zeroing hash tables in allocator
  2017-03-01  0:14 ` Pavel Tatashin
@ 2017-03-01  0:14   ` Pavel Tatashin
  -1 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2017-03-01  0:14 UTC (permalink / raw)
  To: linux-mm, sparclinux

Add a new flag, HASH_ZERO, which when provided guarantees that the hash
table returned by alloc_large_system_hash() is zeroed. In most cases that
is what the caller needs. Use the page-level allocator's __GFP_ZERO flag
to zero the memory; it uses memset(), which is an efficient way to zero
memory and is optimized for most platforms.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
---
 include/linux/bootmem.h |    1 +
 mm/page_alloc.c         |   12 +++++++++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index 962164d..e223d91 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -358,6 +358,7 @@ static inline void __init memblock_free_late(
 #define HASH_EARLY	0x00000001	/* Allocating during early boot? */
 #define HASH_SMALL	0x00000002	/* sub-page allocation allowed, min
 					 * shift passed via *_hash_shift */
+#define HASH_ZERO	0x00000004	/* Zero allocated hash table */
 
 /* Only NUMA needs hash distribution. 64bit NUMA architectures have
  * sufficient vmalloc space.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a7a6aac..1b0f7a4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7142,6 +7142,7 @@ static unsigned long __init arch_reserved_kernel_pages(void)
 	unsigned long long max = high_limit;
 	unsigned long log2qty, size;
 	void *table = NULL;
+	gfp_t gfp_flags;
 
 	/* allow the kernel cmdline to have a say */
 	if (!numentries) {
@@ -7186,12 +7187,17 @@ static unsigned long __init arch_reserved_kernel_pages(void)
 
 	log2qty = ilog2(numentries);
 
+	/*
+	 * memblock allocator returns zeroed memory already, so HASH_ZERO is
+	 * currently not used when HASH_EARLY is specified.
+	 */
+	gfp_flags = (flags & HASH_ZERO) ? GFP_ATOMIC | __GFP_ZERO : GFP_ATOMIC;
 	do {
 		size = bucketsize << log2qty;
 		if (flags & HASH_EARLY)
 			table = memblock_virt_alloc_nopanic(size, 0);
 		else if (hashdist)
-			table = __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL);
+			table = __vmalloc(size, gfp_flags, PAGE_KERNEL);
 		else {
 			/*
 			 * If bucketsize is not a power-of-two, we may free
@@ -7199,8 +7205,8 @@ static unsigned long __init arch_reserved_kernel_pages(void)
 			 * alloc_pages_exact() automatically does
 			 */
 			if (get_order(size) < MAX_ORDER) {
-				table = alloc_pages_exact(size, GFP_ATOMIC);
-				kmemleak_alloc(table, size, 1, GFP_ATOMIC);
+				table = alloc_pages_exact(size, gfp_flags);
+				kmemleak_alloc(table, size, 1, gfp_flags);
 			}
 		}
 	} while (!table && size > PAGE_SIZE && --log2qty);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread


* [PATCH v2 3/3] mm: Updated callers to use HASH_ZERO flag
  2017-03-01  0:14 ` Pavel Tatashin
@ 2017-03-01  0:14   ` Pavel Tatashin
  -1 siblings, 0 replies; 26+ messages in thread
From: Pavel Tatashin @ 2017-03-01  0:14 UTC (permalink / raw)
  To: linux-mm, sparclinux

Update the dcache, inode, pid, mountpoint, and mount hash tables to use
HASH_ZERO, and remove the initialization after the allocations. In places
where HASH_EARLY was used, such as __pv_init_lock_hash(), a zeroed hash
table was already assumed, because memblock zeroes the memory.

CPU: SPARC M6, Memory: 7T
Before fix:
Dentry cache hash table entries: 1073741824
Inode-cache hash table entries: 536870912
Mount-cache hash table entries: 16777216
Mountpoint-cache hash table entries: 16777216
ftrace: allocating 20414 entries in 40 pages
Total time: 11.798s

After fix:
Dentry cache hash table entries: 1073741824
Inode-cache hash table entries: 536870912
Mount-cache hash table entries: 16777216
Mountpoint-cache hash table entries: 16777216
ftrace: allocating 20414 entries in 40 pages
Total time: 3.198s

CPU: Intel Xeon E5-2630, Memory: 2.2T:
Before fix:
Dentry cache hash table entries: 536870912
Inode-cache hash table entries: 268435456
Mount-cache hash table entries: 8388608
Mountpoint-cache hash table entries: 8388608
CPU: Physical Processor ID: 0
Total time: 3.245s

After fix:
Dentry cache hash table entries: 536870912
Inode-cache hash table entries: 268435456
Mount-cache hash table entries: 8388608
Mountpoint-cache hash table entries: 8388608
CPU: Physical Processor ID: 0
Total time: 3.244s

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
---
 fs/dcache.c                         |   18 ++++--------------
 fs/inode.c                          |   14 ++------------
 fs/namespace.c                      |   10 ++--------
 kernel/locking/qspinlock_paravirt.h |    3 ++-
 kernel/pid.c                        |    7 ++-----
 5 files changed, 12 insertions(+), 40 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 95d71ed..363502f 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -3548,8 +3548,6 @@ static int __init set_dhash_entries(char *str)
 
 static void __init dcache_init_early(void)
 {
-	unsigned int loop;
-
 	/* If hashes are distributed across NUMA nodes, defer
 	 * hash allocation until vmalloc space is available.
 	 */
@@ -3561,24 +3559,19 @@ static void __init dcache_init_early(void)
 					sizeof(struct hlist_bl_head),
 					dhash_entries,
 					13,
-					HASH_EARLY,
+					HASH_EARLY | HASH_ZERO,
 					&d_hash_shift,
 					&d_hash_mask,
 					0,
 					0);
-
-	for (loop = 0; loop < (1U << d_hash_shift); loop++)
-		INIT_HLIST_BL_HEAD(dentry_hashtable + loop);
 }
 
 static void __init dcache_init(void)
 {
-	unsigned int loop;
-
-	/* 
+	/*
 	 * A constructor could be added for stable state like the lists,
 	 * but it is probably not worth it because of the cache nature
-	 * of the dcache. 
+	 * of the dcache.
 	 */
 	dentry_cache = KMEM_CACHE(dentry,
 		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT);
@@ -3592,14 +3585,11 @@ static void __init dcache_init(void)
 					sizeof(struct hlist_bl_head),
 					dhash_entries,
 					13,
-					0,
+					HASH_ZERO,
 					&d_hash_shift,
 					&d_hash_mask,
 					0,
 					0);
-
-	for (loop = 0; loop < (1U << d_hash_shift); loop++)
-		INIT_HLIST_BL_HEAD(dentry_hashtable + loop);
 }
 
 /* SLAB cache for __getname() consumers */
diff --git a/fs/inode.c b/fs/inode.c
index 88110fd..1b15a7c 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1916,8 +1916,6 @@ static int __init set_ihash_entries(char *str)
  */
 void __init inode_init_early(void)
 {
-	unsigned int loop;
-
 	/* If hashes are distributed across NUMA nodes, defer
 	 * hash allocation until vmalloc space is available.
 	 */
@@ -1929,20 +1927,15 @@ void __init inode_init_early(void)
 					sizeof(struct hlist_head),
 					ihash_entries,
 					14,
-					HASH_EARLY,
+					HASH_EARLY | HASH_ZERO,
 					&i_hash_shift,
 					&i_hash_mask,
 					0,
 					0);
-
-	for (loop = 0; loop < (1U << i_hash_shift); loop++)
-		INIT_HLIST_HEAD(&inode_hashtable[loop]);
 }
 
 void __init inode_init(void)
 {
-	unsigned int loop;
-
 	/* inode slab cache */
 	inode_cachep = kmem_cache_create("inode_cache",
 					 sizeof(struct inode),
@@ -1960,14 +1953,11 @@ void __init inode_init(void)
 					sizeof(struct hlist_head),
 					ihash_entries,
 					14,
-					0,
+					HASH_ZERO,
 					&i_hash_shift,
 					&i_hash_mask,
 					0,
 					0);
-
-	for (loop = 0; loop < (1U << i_hash_shift); loop++)
-		INIT_HLIST_HEAD(&inode_hashtable[loop]);
 }
 
 void init_special_inode(struct inode *inode, umode_t mode, dev_t rdev)
diff --git a/fs/namespace.c b/fs/namespace.c
index 8bfad42..275e6e2 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3238,7 +3238,6 @@ static void __init init_mount_tree(void)
 
 void __init mnt_init(void)
 {
-	unsigned u;
 	int err;
 
 	mnt_cache = kmem_cache_create("mnt_cache", sizeof(struct mount),
@@ -3247,22 +3246,17 @@ void __init mnt_init(void)
 	mount_hashtable = alloc_large_system_hash("Mount-cache",
 				sizeof(struct hlist_head),
 				mhash_entries, 19,
-				0,
+				HASH_ZERO,
 				&m_hash_shift, &m_hash_mask, 0, 0);
 	mountpoint_hashtable = alloc_large_system_hash("Mountpoint-cache",
 				sizeof(struct hlist_head),
 				mphash_entries, 19,
-				0,
+				HASH_ZERO,
 				&mp_hash_shift, &mp_hash_mask, 0, 0);
 
 	if (!mount_hashtable || !mountpoint_hashtable)
 		panic("Failed to allocate mount hash table\n");
 
-	for (u = 0; u <= m_hash_mask; u++)
-		INIT_HLIST_HEAD(&mount_hashtable[u]);
-	for (u = 0; u <= mp_hash_mask; u++)
-		INIT_HLIST_HEAD(&mountpoint_hashtable[u]);
-
 	kernfs_init();
 
 	err = sysfs_init();
diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index e6b2f7a..4ccfcaa 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -193,7 +193,8 @@ void __init __pv_init_lock_hash(void)
 	 */
 	pv_lock_hash = alloc_large_system_hash("PV qspinlock",
 					       sizeof(struct pv_hash_entry),
-					       pv_hash_size, 0, HASH_EARLY,
+					       pv_hash_size, 0,
+					       HASH_EARLY | HASH_ZERO,
 					       &pv_lock_hash_bits, NULL,
 					       pv_hash_size, pv_hash_size);
 }
diff --git a/kernel/pid.c b/kernel/pid.c
index 0291804..013e023 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -572,16 +572,13 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns)
  */
 void __init pidhash_init(void)
 {
-	unsigned int i, pidhash_size;
+	unsigned int pidhash_size;
 
 	pid_hash = alloc_large_system_hash("PID", sizeof(*pid_hash), 0, 18,
-					   HASH_EARLY | HASH_SMALL,
+					   HASH_EARLY | HASH_SMALL | HASH_ZERO,
 					   &pidhash_shift, NULL,
 					   0, 4096);
 	pidhash_size = 1U << pidhash_shift;
-
-	for (i = 0; i < pidhash_size; i++)
-		INIT_HLIST_HEAD(&pid_hash[i]);
 }
 
 void __init pidmap_init(void)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 26+ messages in thread


* Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01  0:14   ` Pavel Tatashin
@ 2017-03-01  0:24     ` Andi Kleen
  -1 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2017-03-01  0:24 UTC (permalink / raw)
  To: Pavel Tatashin; +Cc: linux-mm, sparclinux

Pavel Tatashin <pasha.tatashin@oracle.com> writes:
>
> While investigating how to improve initialization time of dentry_hashtable
> which is 8G long on M6 ldom with 7T of main memory, I noticed that memset()

I don't think an 8G dentry (or other kernel) hash table makes much
sense. I would rather fix the hash table sizing algorithm to have some
reasonable upper limit than optimize the zeroing.

I believe there are already boot options for it, but it would be better
if it worked out of the box.

-Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01  0:24     ` Andi Kleen
@ 2017-03-01 14:51       ` Pasha Tatashin
  -1 siblings, 0 replies; 26+ messages in thread
From: Pasha Tatashin @ 2017-03-01 14:51 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-mm, sparclinux

On 2017-02-28 19:24, Andi Kleen wrote:
> Pavel Tatashin <pasha.tatashin@oracle.com> writes:
>>
>> While investigating how to improve initialization time of dentry_hashtable
>> which is 8G long on M6 ldom with 7T of main memory, I noticed that memset()
>
> I don't think a 8G dentry (or other kernel) hash table makes much
> sense. I would rather fix the hash table sizing algorithm to have some
> reasonable upper limit than to optimize the zeroing.
>
> I believe there are already boot options for it, but it would be better
> if it worked out of the box.
>
> -Andi


Hi Andi,

I agree that there should be some smarter cap for maximum hash table 
sizes, and as you said it is already possible to set the limits via 
parameters. I still think, however, this HASH_ZERO patch makes sense for 
the following reasons:

- Even if the default maximum size is reduced, the size of these tables 
should still be tunable, as it really depends on the way the machine is 
used, and it is possible that for some use patterns large hash tables 
are necessary.

- Most of them are initialized before the smp_init() call. The time from 
bootloader to smp_init() should be minimized, as parallelization is not 
available yet. For example, the LDOM on which I tested this patch, with 
a few more optimizations, takes 8.5 seconds to get from grub to 
smp_init() (760 CPUs and 7T of memory); out of these 8.5 seconds, 3.1s 
(vs. 11.8s before this patch) are spent initializing these hash tables. 
So, even 3.1s is still significant and should be improved further by 
changing the default maximums, but that should be a different patch.

Thank you,
Pasha


> --
> To unsubscribe from this list: send the line "unsubscribe sparclinux" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01 14:51       ` Pasha Tatashin
@ 2017-03-01 15:19         ` Andi Kleen
  -1 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2017-03-01 15:19 UTC (permalink / raw)
  To: Pasha Tatashin; +Cc: Andi Kleen, linux-mm, sparclinux

> - Even if the default maximum size is reduced the size of these
> tables should still be tunable, as it really depends on the way
> machine is used, and in it is possible that for some use patterns
> large hash tables are necessary.

I consider it very unlikely that an 8G dentry hash table ever makes
sense. I cannot even imagine a workload where you would have that
many active files. It's just a bad configuration that should be avoided.

And when the tables are small enough you don't need these hacks.

-Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01 15:19         ` Andi Kleen
@ 2017-03-01 16:34           ` Pasha Tatashin
  -1 siblings, 0 replies; 26+ messages in thread
From: Pasha Tatashin @ 2017-03-01 16:34 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-mm, sparclinux

Hi Andi,

Thank you for your comment. I am thinking of limiting the default maximum 
hash table sizes to 512M.

If it is bigger than 512M, we would still need my patch to improve the 
performance. This is because initialization of hash tables would still 
take over 1s out of the 6s bootloader-to-smp_init() interval on larger 
machines.

I am not sure HASH_ZERO is a hack because if you look at the way 
pv_lock_hash is allocated, it assumes that the memory is already zeroed 
since it provides HASH_EARLY flag. It quietly assumes that the memblock 
boot allocator zeroes the memory for us. On the other hand, in other 
places where HASH_EARLY is specified we still explicitly zero the 
hashes. At least with HASH_ZERO flag this becomes a defined interface, 
and in the future if memblock allocator is changed to zero memory only 
on demand (as it really should), the HASH_ZERO flag can be passed there 
the same way it is passed to vmalloc() in my patch.

Does something like this look OK to you? If yes, I will send out a new 
patch.


  index 1b0f7a4..5ddf741 100644
  --- a/mm/page_alloc.c
  +++ b/mm/page_alloc.c
  @@ -79,6 +79,12 @@
   EXPORT_PER_CPU_SYMBOL(numa_node);
   #endif

  +/*
  + * This is the default maximum number of entries system hashes can have, the
  + * value can be overwritten by setting hash table sizes via kernel parameters.
  + */
  +#define SYSTEM_HASH_MAX_ENTRIES		(1 << 26)
  +
   #ifdef CONFIG_HAVE_MEMORYLESS_NODES
   /*
    * N.B., Do NOT reference the '_numa_mem_' per cpu variable directly.
  @@ -7154,6 +7160,11 @@ static unsigned long __init arch_reserved_kernel_pages(void)
                  if (PAGE_SHIFT < 20)
                          numentries = round_up(numentries, (1<<20)/PAGE_SIZE);

  +               /* Limit default maximum number of entries */
  +               if (numentries > SYSTEM_HASH_MAX_ENTRIES) {
  +                       numentries = SYSTEM_HASH_MAX_ENTRIES;
  +               }
  +
                  /* limit to 1 bucket per 2^scale bytes of low memory */
                  if (scale > PAGE_SHIFT)
                          numentries >>= (scale - PAGE_SHIFT);

Thank you
Pasha

On 2017-03-01 10:19, Andi Kleen wrote:
>> - Even if the default maximum size is reduced the size of these
>> tables should still be tunable, as it really depends on the way
>> machine is used, and in it is possible that for some use patterns
>> large hash tables are necessary.
>
> I consider it very unlikely that a 8G dentry hash table ever makes
> sense. I cannot even imagine a workload where you would have that
> many active files. It's just a bad configuration that should be avoided.
>
> And when the tables are small enough you don't need these hacks.
>
> -Andi
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01 16:34           ` Pasha Tatashin
@ 2017-03-01 17:31             ` Andi Kleen
  -1 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2017-03-01 17:31 UTC (permalink / raw)
  To: Pasha Tatashin; +Cc: Andi Kleen, linux-mm, sparclinux

On Wed, Mar 01, 2017 at 11:34:10AM -0500, Pasha Tatashin wrote:
> Hi Andi,
> 
> Thank you for your comment, I am thinking to limit the default
> maximum hash tables sizes to 512M.
> 
> If it is bigger than 512M, we would still need my patch to improve

Even 512MB seems too large. I wouldn't go larger than a few tens
of MB, maybe 32MB.

Also you would need to cover all the big hashes.

The most critical ones are likely the network hash tables; these
may be a bit larger (but certainly also not 0.5TB).

-Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01 17:31             ` Andi Kleen
@ 2017-03-01 21:20               ` Pasha Tatashin
  -1 siblings, 0 replies; 26+ messages in thread
From: Pasha Tatashin @ 2017-03-01 21:20 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-mm, sparclinux

Hi Andi,

After thinking some more about this issue, I figured that I would not 
want to set default maximums.

Currently, the defaults are scaled with system memory size, which seems 
like the right thing to do to me. They size hash tables at one entry 
per page and, if a scale argument is provided, scale them down to 
1/2, 1/4, 1/8 entry per page, etc.

So, in some cases the scale argument may be wrong, and dentry, inode, or 
some other client of alloc_large_system_hash() should be adjusted.

For example, I am pretty sure that the scale value in most places should 
be changed from a literal value (inode scale = 14, dentry scale = 13, etc.) 
to (PAGE_SHIFT + value): inode scale would become (PAGE_SHIFT + 2), dentry 
scale would become (PAGE_SHIFT + 1), etc. This is because we want 1/4 
inode and 1/2 dentry entries for every page in the system.
In alloc_large_system_hash() we basically have:
nentries = nr_kernel_pages >> (scale - PAGE_SHIFT);

Using literal values there is basically a bug; fixing it would not change 
the underlying theory. But I am sure that changing the scales without at 
least some theoretical backing is not a good idea and would most likely 
lead to regressions, especially on some smaller configurations.

Therefore, in my opinion having one fast way to zero hash tables, as 
this patch tries to do, is a good thing. In the next patch revision I 
can go ahead and change scales to be (PAGE_SHIFT + val) from current 
literals.

Thank you,
Pasha

On 2017-03-01 12:31, Andi Kleen wrote:
> On Wed, Mar 01, 2017 at 11:34:10AM -0500, Pasha Tatashin wrote:
>> Hi Andi,
>>
>> Thank you for your comment, I am thinking to limit the default
>> maximum hash tables sizes to 512M.
>>
>> If it is bigger than 512M, we would still need my patch to improve
>
> Even 512MB seems too large. I wouldn't go larger than a few tens
> of MB, maybe 32MB.
>
> Also you would need to cover all the big hashes.
>
> The most critical ones are likely the network hash tables, these
> maybe be a bit larger (but certainly also not 0.5TB)
>
> -Andi
>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01 21:20               ` Pasha Tatashin
@ 2017-03-01 23:10                 ` Andi Kleen
  -1 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2017-03-01 23:10 UTC (permalink / raw)
  To: Pasha Tatashin; +Cc: Andi Kleen, linux-mm, sparclinux, linux-fsdevel

> For example, I am pretty sure that scale value in most places should
> be changed from literal value (inode scale = 14, dentry scale = 13,
> etc to: (PAGE_SHIFT + value): inode scale would become (PAGE_SHIFT +
> 2), dentry scale would become (PAGE_SHIFT + 1), etc. This is because
> we want 1/4 inodes and 1/2 dentries per every page in the system.

This is still far too much for a large system. The algorithm
simply was not designed for TB systems.

It's unlikely to have anywhere near that many small files active; it's 
better to use the memory for something that is actually useful.

Also, even a few hops in the open hash table are normally not a problem
for dentry/inode; file lookups are not that critical.

For networking the picture may be different, but I suspect GBs worth of
hash tables are still overkill there (Dave et.al. may have stronger opinions on this) 

I think an upper size (with the user override which already exists) is fine,
but if you really don't want to do that then scale the factor down 
very aggressively for larger sizes, so that we don't end up with more
than a few tens of MB.

> This is basically a bug, and would not change the theory, but I am
> sure that changing scales without at least some theoretical backup

One dentry per page would only make sense if the files are zero sized.
If the file even has one byte then it already needs more than 1 page just to
cache the contents (even ignoring inodes and other caches)

With larger files that need multiple pages it makes even less sense.

So clearly one dentry per page theory is nonsense if the files are actually
used.

There is the "make find / + stat fast" case (where only the dentries 
and inodes are cached). But even there it is unlikely that the TB system
has a much larger file system with more files than the 100GB system, so
once a reasonable plateau is reached I don't see why you would want 
to exceed that.

Also, the reason to make hash tables big is to minimize collisions,
but we have fairly good hash functions, and a few hops in the worst case 
are likely not a problem for an already expensive file access
or open.

BTW the other option would be to switch all the large system hashes to a
rhashtable and do the resizing only when it is actually needed. 
But that would be more work than just adding a reasonable upper limit.

-Andi


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01 21:20               ` Pasha Tatashin
@ 2017-03-02  0:12                 ` Matthew Wilcox
  -1 siblings, 0 replies; 26+ messages in thread
From: Matthew Wilcox @ 2017-03-02  0:12 UTC (permalink / raw)
  To: Pasha Tatashin; +Cc: Andi Kleen, linux-mm, sparclinux

On Wed, Mar 01, 2017 at 04:20:28PM -0500, Pasha Tatashin wrote:
> Hi Andi,
> 
> After thinking some more about this issue, I figured that I would not want
> to set default maximums.
> 
> Currently, the defaults are scaled with system memory size, which seems like
> the right thing to do to me. They are set to size hash tables one entry per
> page and, if a scale argument is provided, scale them down to 1/2, 1/4, 1/8
> entry per page etc.

I disagree that it's the right thing to do.  You want your dentry cache
to scale with the number of dentries in use.  Scaling with memory size
is a reasonable approximation for smaller memory sizes, but allocating
8GB of *hash table entries* for dentries is plainly ridiculous, no matter
how much memory you have.  You won't have half a billion dentries active
in most uses of such a large machine.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow
  2017-03-01 23:10                 ` Andi Kleen
@ 2017-03-02 19:15                   ` Pasha Tatashin
  -1 siblings, 0 replies; 26+ messages in thread
From: Pasha Tatashin @ 2017-03-02 19:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-mm, sparclinux, linux-fsdevel

Hi Andi,

>
> I think a upper size (with user override which already exists) is fine,
> but if you really don't want to do it then scale the factor down
> very aggressively for larger sizes, so that we don't end up with more
> than a few tens of MB.
>

I have scaled it; I do not think setting a default upper limit is a 
future-proof strategy.

Thank you,
Pasha


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2017-03-02 19:15 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-01  0:14 [PATCH v2 0/3] Zeroing hash tables in allocator Pavel Tatashin
2017-03-01  0:14 ` Pavel Tatashin
2017-03-01  0:14 ` [PATCH v2 1/3] sparc64: NG4 memset 32 bits overflow Pavel Tatashin
2017-03-01  0:14   ` Pavel Tatashin
2017-03-01  0:24   ` Andi Kleen
2017-03-01  0:24     ` Andi Kleen
2017-03-01 14:51     ` Pasha Tatashin
2017-03-01 14:51       ` Pasha Tatashin
2017-03-01 15:19       ` Andi Kleen
2017-03-01 15:19         ` Andi Kleen
2017-03-01 16:34         ` Pasha Tatashin
2017-03-01 16:34           ` Pasha Tatashin
2017-03-01 17:31           ` Andi Kleen
2017-03-01 17:31             ` Andi Kleen
2017-03-01 21:20             ` Pasha Tatashin
2017-03-01 21:20               ` Pasha Tatashin
2017-03-01 23:10               ` Andi Kleen
2017-03-01 23:10                 ` Andi Kleen
2017-03-02 19:15                 ` Pasha Tatashin
2017-03-02 19:15                   ` Pasha Tatashin
2017-03-02  0:12               ` Matthew Wilcox
2017-03-02  0:12                 ` Matthew Wilcox
2017-03-01  0:14 ` [PATCH v2 2/3] mm: Zeroing hash tables in allocator Pavel Tatashin
2017-03-01  0:14   ` Pavel Tatashin
2017-03-01  0:14 ` [PATCH v2 3/3] mm: Updated callers to use HASH_ZERO flag Pavel Tatashin
2017-03-01  0:14   ` Pavel Tatashin
