linux-kernel.vger.kernel.org archive mirror
* Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT'
@ 2017-05-19 16:46 Guenter Roeck
  2017-05-19 23:24 ` Kevin Hilman
  2017-05-20  7:26 ` Michal Hocko
  0 siblings, 2 replies; 8+ messages in thread
From: Guenter Roeck @ 2017-05-19 16:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, Pavel Tatashin, Andrew Morton, Stephen Rothwell

Hi,

my qemu tests of next-20170519 show the following results:
	total: 122 pass: 30 fail: 92

I won't bother listing all of the failures; they are available at
http://kerneltests.org/builders. I bisected one (openrisc, because
it gives me some console output before dying). It points to
'mm: drop HASH_ADAPT' as the culprit. Bisect log is attached.

A quick glance suggests that 64-bit kernels pass and 32-bit kernels fail.
32-bit x86 images fail and should provide an easy test case.

Guenter

---
# bad: [5666af8ae4a18b5ea6758d0c7c42ea765de216d2] Add linux-next specific files for 20170519
# good: [2ea659a9ef488125eb46da6eb571de5eae5c43f6] Linux 4.12-rc1
git bisect start 'HEAD' 'v4.12-rc1'
# good: [c7115270d8cc333801b11e541ddbc43e320a88ef] Merge remote-tracking branch 'drm/drm-next'
git bisect good c7115270d8cc333801b11e541ddbc43e320a88ef
# good: [6bf84ee44e057051577be7edf306dc595b8d3c0f] Merge remote-tracking branch 'tip/auto-latest'
git bisect good 6bf84ee44e057051577be7edf306dc595b8d3c0f
# good: [8def67a06d65a1b08c87a65a8ef4fd6e71b6745c] Merge remote-tracking branch 'staging/staging-next'
git bisect good 8def67a06d65a1b08c87a65a8ef4fd6e71b6745c
# good: [0d538a750eaab91fc3f6dffe4c0e7d98d6252b81] Merge remote-tracking branch 'userns/for-next'
git bisect good 0d538a750eaab91fc3f6dffe4c0e7d98d6252b81
# good: [eb64959cd8c405de533122dc72b64d6ca197cee1] powerpc/mm/hugetlb: remove follow_huge_addr for powerpc
git bisect good eb64959cd8c405de533122dc72b64d6ca197cee1
# bad: [eb520e759caf124ba1c64e277939ff379d0ca8bd] procfs: fdinfo: extend information about epoll target files
git bisect bad eb520e759caf124ba1c64e277939ff379d0ca8bd
# bad: [45f5e427d6326ca1c44cd6897b9939441063fb96] lib/kstrtox.c: use "unsigned int" more
git bisect bad 45f5e427d6326ca1c44cd6897b9939441063fb96
# bad: [d75db247b8f204bfa2e6d2b40afcae74f3b4c7fd] mm: drop NULL return check of pte_offset_map_lock()
git bisect bad d75db247b8f204bfa2e6d2b40afcae74f3b4c7fd
# good: [d4c9af9111d483efd5f302916639a0e9a626f90f] mm: adaptive hash table scaling
git bisect good d4c9af9111d483efd5f302916639a0e9a626f90f
# bad: [90d2d8d8960a1b2ed868ce3bfd91e2ac8d4ff9aa] mm/hugetlb: clean up ARCH_HAS_GIGANTIC_PAGE
git bisect bad 90d2d8d8960a1b2ed868ce3bfd91e2ac8d4ff9aa
# bad: [67d0687224a93ef2adae7a2ed10f25b275f2ee91] mm: drop HASH_ADAPT
git bisect bad 67d0687224a93ef2adae7a2ed10f25b275f2ee91
# first bad commit: [67d0687224a93ef2adae7a2ed10f25b275f2ee91] mm: drop HASH_ADAPT

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT'
  2017-05-19 16:46 Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT' Guenter Roeck
@ 2017-05-19 23:24 ` Kevin Hilman
  2017-05-20  7:26 ` Michal Hocko
  1 sibling, 0 replies; 8+ messages in thread
From: Kevin Hilman @ 2017-05-19 23:24 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Michal Hocko, lkml, Pavel Tatashin, Andrew Morton, Stephen Rothwell

On Fri, May 19, 2017 at 9:46 AM, Guenter Roeck <linux@roeck-us.net> wrote:
> Hi,
>
> my qemu tests of next-20170519 show the following results:
>         total: 122 pass: 30 fail: 92
>
> I won't bother listing all of the failures; they are available at
> http://kerneltests.org/builders. I bisected one (openrisc, because
> it gives me some console output before dying). It points to
> 'mm: drop HASH_ADAPT' as the culprit. Bisect log is attached.
>
> A quick glance suggests that 64 bit kernels pass and 32 bit kernels fail.
> 32-bit x86 images fail and should provide an easy test case.

32-bit ARM platforms are also failing.

I also noticed widespread boot failures on kernelci.org, bisected one
of them (32-bit ARM, BeagleBone Black), and it pointed at the same
patch.

Kevin

> Guenter
>
> ---
> # bad: [5666af8ae4a18b5ea6758d0c7c42ea765de216d2] Add linux-next specific files for 20170519
> # good: [2ea659a9ef488125eb46da6eb571de5eae5c43f6] Linux 4.12-rc1
> git bisect start 'HEAD' 'v4.12-rc1'
> # good: [c7115270d8cc333801b11e541ddbc43e320a88ef] Merge remote-tracking branch 'drm/drm-next'
> git bisect good c7115270d8cc333801b11e541ddbc43e320a88ef
> # good: [6bf84ee44e057051577be7edf306dc595b8d3c0f] Merge remote-tracking branch 'tip/auto-latest'
> git bisect good 6bf84ee44e057051577be7edf306dc595b8d3c0f
> # good: [8def67a06d65a1b08c87a65a8ef4fd6e71b6745c] Merge remote-tracking branch 'staging/staging-next'
> git bisect good 8def67a06d65a1b08c87a65a8ef4fd6e71b6745c
> # good: [0d538a750eaab91fc3f6dffe4c0e7d98d6252b81] Merge remote-tracking branch 'userns/for-next'
> git bisect good 0d538a750eaab91fc3f6dffe4c0e7d98d6252b81
> # good: [eb64959cd8c405de533122dc72b64d6ca197cee1] powerpc/mm/hugetlb: remove follow_huge_addr for powerpc
> git bisect good eb64959cd8c405de533122dc72b64d6ca197cee1
> # bad: [eb520e759caf124ba1c64e277939ff379d0ca8bd] procfs: fdinfo: extend information about epoll target files
> git bisect bad eb520e759caf124ba1c64e277939ff379d0ca8bd
> # bad: [45f5e427d6326ca1c44cd6897b9939441063fb96] lib/kstrtox.c: use "unsigned int" more
> git bisect bad 45f5e427d6326ca1c44cd6897b9939441063fb96
> # bad: [d75db247b8f204bfa2e6d2b40afcae74f3b4c7fd] mm: drop NULL return check of pte_offset_map_lock()
> git bisect bad d75db247b8f204bfa2e6d2b40afcae74f3b4c7fd
> # good: [d4c9af9111d483efd5f302916639a0e9a626f90f] mm: adaptive hash table scaling
> git bisect good d4c9af9111d483efd5f302916639a0e9a626f90f
> # bad: [90d2d8d8960a1b2ed868ce3bfd91e2ac8d4ff9aa] mm/hugetlb: clean up ARCH_HAS_GIGANTIC_PAGE
> git bisect bad 90d2d8d8960a1b2ed868ce3bfd91e2ac8d4ff9aa
> # bad: [67d0687224a93ef2adae7a2ed10f25b275f2ee91] mm: drop HASH_ADAPT
> git bisect bad 67d0687224a93ef2adae7a2ed10f25b275f2ee91
> # first bad commit: [67d0687224a93ef2adae7a2ed10f25b275f2ee91] mm: drop HASH_ADAPT


* Re: Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT'
  2017-05-19 16:46 Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT' Guenter Roeck
  2017-05-19 23:24 ` Kevin Hilman
@ 2017-05-20  7:26 ` Michal Hocko
  2017-05-20 14:21   ` Guenter Roeck
  2017-05-22  8:45   ` Michal Hocko
  1 sibling, 2 replies; 8+ messages in thread
From: Michal Hocko @ 2017-05-20  7:26 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, Pavel Tatashin, Andrew Morton, Stephen Rothwell,
	Kevin Hilman

On Fri 19-05-17 09:46:23, Guenter Roeck wrote:
> Hi,
> 
> my qemu tests of next-20170519 show the following results:
> 	total: 122 pass: 30 fail: 92
> 
> I won't bother listing all of the failures; they are available at
> http://kerneltests.org/builders. I bisected one (openrisc, because
> it gives me some console output before dying). It points to
> 'mm: drop HASH_ADAPT' as the culprit. Bisect log is attached.
> 
> A quick glance suggests that 64 bit kernels pass and 32 bit kernels fail.
> 32-bit x86 images fail and should provide an easy test case.

Hmm, this is quite unexpected as the patch is not supposed to change
things much. It just removes the flag and performs the new hash scaling
automatically for all requests which do not have any high limit.
Some of those didn't have HASH_ADAPT before, but that shouldn't change
the picture much. The only thing that I can imagine is that what
formerly failed for early memblock allocations is now succeeding and that
depletes the early memory. Do you have any serial console output from the boot?
-- 
Michal Hocko
SUSE Labs


* Re: Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT'
  2017-05-20  7:26 ` Michal Hocko
@ 2017-05-20 14:21   ` Guenter Roeck
  2017-05-20 16:38     ` Pasha Tatashin
  2017-05-22  8:45   ` Michal Hocko
  1 sibling, 1 reply; 8+ messages in thread
From: Guenter Roeck @ 2017-05-20 14:21 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, Pavel Tatashin, Andrew Morton, Stephen Rothwell,
	Kevin Hilman

On 05/20/2017 12:26 AM, Michal Hocko wrote:
> On Fri 19-05-17 09:46:23, Guenter Roeck wrote:
>> Hi,
>>
>> my qemu tests of next-20170519 show the following results:
>> 	total: 122 pass: 30 fail: 92
>>
>> I won't bother listing all of the failures; they are available at
>> http://kerneltests.org/builders. I bisected one (openrisc, because
>> it gives me some console output before dying). It points to
>> 'mm: drop HASH_ADAPT' as the culprit. Bisect log is attached.
>>
>> A quick glance suggests that 64 bit kernels pass and 32 bit kernels fail.
>> 32-bit x86 images fail and should provide an easy test case.
> 
> Hmm, this is quite unexpected as the patch is not supposed to change
> things much. It just removes the flag and performs the new hash scaling

It may well be that the problem was introduced by an earlier patch and just
exposed by this one.

> automatically for all requests which do not have any high limit.
> Some of those didn't have HASH_ADAPT before but that shouldn't change
> the picture much. The only thing that I can imagine is that what
> formerly failed for early memblock allocations is now succeeding and that
> depletes the early memory. Do you have any serial console from the boot?
> 

They are all the same. Either nothing or the following. Picking a couple:

metag:

Linux version 4.12.0-rc1-next-20170519 (groeck@jupiter.roeck-us.net) (gcc version 4.2.4 (IMG-1.4.0.300)) #1 Fri May 19 00:50:50 PDT 2017
LNKGET/SET go through cache but CONFIG_METAG_LNKGET_AROUND_CACHE=y
DA present
console [ttyDA1] enabled
OF: fdt: Machine model: toumaz,tz1090
Machine name: Generic Meta
Node 0: start_pfn = 0xb0000, low = 0xbfff7
Zone ranges:
   Normal   [mem 0x00000000b0000000-0x00000000bfff6fff]
Movable zone start for each node
Early memory node ranges
   node   0: [mem 0x00000000b0000000-0x00000000bfff6fff]
Initmem setup node 0 [mem 0x00000000b0000000-0x00000000bfff6fff]
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 65015
Kernel command line: rdinit=/sbin/init doreboot
PID hash table entries: 1024 (order: 0, 4096 bytes)

crisv32:

Linux version 4.12.0-rc1-next-20170519 (groeck@desktop.roeck-us.net) (gcc version 4.9.2 (Buildroot 2015.02-rc1-00005-gb13bd8e-dirty) ) #1 Fri May 19 00:52:55 PDT 2017
bootconsole [early0] enabled
Setting up paging and the MMU.
Linux/CRISv32 port on ETRAX FS (C) 2003, 2004 Axis Communications AB
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 4080
Kernel command line: console=ttyS0,115200,N,8 rdinit=/sbin/init
PID hash table entries: 128 (order: -4, 512 bytes)

powerpc:mpc8548cds:

Memory CAM mapping: 256 Mb, residual: 0Mb
Linux version 4.12.0-rc1-next-20170519 (groeck@jupiter.roeck-us.net) (gcc version 4.8.1 (GCC) ) #1 Fri May 19 01:17:29 PDT 2017
Found initrd at 0xc4000000:0xc4200c00
Using MPC85xx CDS machine description
bootconsole [udbg0] enabled
-----------------------------------------------------
phys_mem_size     = 0x10000000
dcache_bsize      = 0x20
icache_bsize      = 0x20
cpu_features      = 0x0000000012100460
   possible        = 0x0000000012100460
   always          = 0x0000000000100000
cpu_user_features = 0x84e08000 0x08000000
mmu_features      = 0x00020010
-----------------------------------------------------
mpc85xx_cds_setup_arch()
Could not find FPGA node.
Zone ranges:
   DMA      [mem 0x0000000000000000-0x000000000fffffff]
   Normal   empty
Movable zone start for each node
Early memory node ranges
   node   0: [mem 0x0000000000000000-0x000000000fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff]
MMU: Allocated 1088 bytes of context maps for 255 contexts
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 65024
Kernel command line: rdinit=/sbin/init console=ttyS0 console=tty doreboot
PID hash table entries: 1024 (order: 0, 4096 bytes)

Guenter


* Re: Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT'
  2017-05-20 14:21   ` Guenter Roeck
@ 2017-05-20 16:38     ` Pasha Tatashin
  0 siblings, 0 replies; 8+ messages in thread
From: Pasha Tatashin @ 2017-05-20 16:38 UTC (permalink / raw)
  To: Guenter Roeck, Michal Hocko
  Cc: linux-kernel, Andrew Morton, Stephen Rothwell, Kevin Hilman

The problem is a 32-bit integer overflow involving ADAPT_SCALE_BASE and
adapt in dcache_init_early(). That path was not enabled
before 'mm: drop HASH_ADAPT' but is enabled now, and the hang should occur
right after: "PID hash table entries: 1024 (order: 0, 4096 bytes)"

main()
   pidhash_init();
   vfs_caches_init_early();
     dcache_init_early()
       alloc_large_system_hash("Dentry cache", ...)

for (adapt = ADAPT_SCALE_NPAGES; adapt < numentries;
			     adapt <<= ADAPT_SCALE_SHIFT)

numentries is very small here, so it should always be smaller than adapt
and the algorithm should not kick in, but the 32-bit overflow causes adapt
to be smaller than numentries.

I will send out an updated "mm: Adaptive hash table scaling", with "mm: 
drop HASH_ADAPT" integrated.

Pasha

On 05/20/2017 10:21 AM, Guenter Roeck wrote:
> On 05/20/2017 12:26 AM, Michal Hocko wrote:
>> On Fri 19-05-17 09:46:23, Guenter Roeck wrote:
>>> Hi,
>>>
>>> my qemu tests of next-20170519 show the following results:
>>>     total: 122 pass: 30 fail: 92
>>>
>>> I won't bother listing all of the failures; they are available at
>>> http://kerneltests.org/builders. I bisected one (openrisc, because
>>> it gives me some console output before dying). It points to
>>> 'mm: drop HASH_ADAPT' as the culprit. Bisect log is attached.
>>>
>>> A quick glance suggests that 64 bit kernels pass and 32 bit kernels 
>>> fail.
>>> 32-bit x86 images fail and should provide an easy test case.
>>
>> Hmm, this is quite unexpected as the patch is not supposed to change
>> things much. It just removes the flag and performs the new hash scaling
> 
> It may well be that the problem was introduced by an earlier patch and
> just exposed by this one.
> 
>> automatically for all requests which do not have any high limit.
>> Some of those didn't have HASH_ADAPT before but that shouldn't change
>> the picture much. The only thing that I can imagine is that what
>> formerly failed for early memblock allocations is now succeeding and that
>> depletes the early memory. Do you have any serial console from the boot?
>>
> 
> They are all the same. Either nothing or the following. Picking a couple:
> 
> metag:
> 
> Linux version 4.12.0-rc1-next-20170519 (groeck@jupiter.roeck-us.net) 
> (gcc version 4.2.4 (IMG-1.4.0.300)) #1 Fri May 19 00:50:50 PDT 2017
> LNKGET/SET go through cache but CONFIG_METAG_LNKGET_AROUND_CACHE=y
> DA present
> console [ttyDA1] enabled
> OF: fdt: Machine model: toumaz,tz1090
> Machine name: Generic Meta
> Node 0: start_pfn = 0xb0000, low = 0xbfff7
> Zone ranges:
>    Normal   [mem 0x00000000b0000000-0x00000000bfff6fff]
> Movable zone start for each node
> Early memory node ranges
>    node   0: [mem 0x00000000b0000000-0x00000000bfff6fff]
> Initmem setup node 0 [mem 0x00000000b0000000-0x00000000bfff6fff]
> Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 65015
> Kernel command line: rdinit=/sbin/init doreboot
> PID hash table entries: 1024 (order: 0, 4096 bytes)
> 
> crisv32:
> 
> Linux version 4.12.0-rc1-next-20170519 (groeck@desktop.roeck-us.net) 
> (gcc version 4.9.2 (Buildroot 2015.02-rc1-00005-gb13bd8e-dirty) ) #1 Fri 
> May 19 00:52:55 PDT 2017
> bootconsole [early0] enabled
> Setting up paging and the MMU.
> Linux/CRISv32 port on ETRAX FS (C) 2003, 2004 Axis Communications AB
> Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 4080
> Kernel command line: console=ttyS0,115200,N,8 rdinit=/sbin/init
> PID hash table entries: 128 (order: -4, 512 bytes)
> 
> powerpc:mpc8548cds:
> 
> Memory CAM mapping: 256 Mb, residual: 0Mb
> Linux version 4.12.0-rc1-next-20170519 (groeck@jupiter.roeck-us.net) 
> (gcc version 4.8.1 (GCC) ) #1 Fri May 19 01:17:29 PDT 2017
> Found initrd at 0xc4000000:0xc4200c00
> Using MPC85xx CDS machine description
> bootconsole [udbg0] enabled
> -----------------------------------------------------
> phys_mem_size     = 0x10000000
> dcache_bsize      = 0x20
> icache_bsize      = 0x20
> cpu_features      = 0x0000000012100460
>    possible        = 0x0000000012100460
>    always          = 0x0000000000100000
> cpu_user_features = 0x84e08000 0x08000000
> mmu_features      = 0x00020010
> -----------------------------------------------------
> mpc85xx_cds_setup_arch()
> Could not find FPGA node.
> Zone ranges:
>    DMA      [mem 0x0000000000000000-0x000000000fffffff]
>    Normal   empty
> Movable zone start for each node
> Early memory node ranges
>    node   0: [mem 0x0000000000000000-0x000000000fffffff]
> Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff]
> MMU: Allocated 1088 bytes of context maps for 255 contexts
> Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 65024
> Kernel command line: rdinit=/sbin/init console=ttyS0 console=tty doreboot
> PID hash table entries: 1024 (order: 0, 4096 bytes)
> 
> Guenter


* Re: Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT'
  2017-05-20  7:26 ` Michal Hocko
  2017-05-20 14:21   ` Guenter Roeck
@ 2017-05-22  8:45   ` Michal Hocko
  2017-05-22  9:03     ` Guenter Roeck
  1 sibling, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2017-05-22  8:45 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, Pavel Tatashin, Andrew Morton, Stephen Rothwell,
	Kevin Hilman, Michael Ellerman

On Sat 20-05-17 09:26:34, Michal Hocko wrote:
> On Fri 19-05-17 09:46:23, Guenter Roeck wrote:
> > Hi,
> > 
> > my qemu tests of next-20170519 show the following results:
> > 	total: 122 pass: 30 fail: 92
> > 
> > I won't bother listing all of the failures; they are available at
> > http://kerneltests.org/builders. I bisected one (openrisc, because
> > it gives me some console output before dying). It points to
> > 'mm: drop HASH_ADAPT' as the culprit. Bisect log is attached.
> > 
> > A quick glance suggests that 64 bit kernels pass and 32 bit kernels fail.
> > 32-bit x86 images fail and should provide an easy test case.
> 
> Hmm, this is quite unexpected as the patch is not supposed to change
> things much. It just removes the flag and performs the new hash scaling
> automatically for all requests which do not have any high limit.
> Some of those didn't have HASH_ADAPT before but that shouldn't change
> the picture much. The only thing that I can imagine is that what
> formerly failed for early memblock allocations is now succeeding and that
> depletes the early memory. Do you have any serial console from the boot?

OK, I guess I know what is going on here. Adaptive hash scaling is not
really suited for 32-bit. ADAPT_SCALE_BASE is just too large for the word
size and so we end up in an endless loop. So the issue was introduced
by the original "mm: adaptive hash table scaling", but my patch made it
more visible because the [di]cache hash tables most probably succeeded
in the early initialization, which didn't have HASH_ADAPT.
The following should fix the hang. I am not yet sure about the maximum
size for the scaling; something even smaller would make sense to me
because the kernel address space is just too small for such large hash
tables.
---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a26e19c3e1ff..70c5fc1fb89a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7174,11 +7174,15 @@ static unsigned long __init arch_reserved_kernel_pages(void)
 /*
  * Adaptive scale is meant to reduce sizes of hash tables on large memory
  * machines. As memory size is increased the scale is also increased but at
- * slower pace.  Starting from ADAPT_SCALE_BASE (64G), every time memory
- * quadruples the scale is increased by one, which means the size of hash table
- * only doubles, instead of quadrupling as well.
+ * slower pace.  Starting from ADAPT_SCALE_BASE (64G on 64b systems and 32M
+ * on 32b), every time memory quadruples the scale is increased by one, which
+ * means the size of hash table only doubles, instead of quadrupling as well.
  */
+#if __BITS_PER_LONG == 64
 #define ADAPT_SCALE_BASE	(64ul << 30)
+#else
+#define ADAPT_SCALE_BASE	(32ul << 20)
+#endif
 #define ADAPT_SCALE_SHIFT	2
 #define ADAPT_SCALE_NPAGES	(ADAPT_SCALE_BASE >> PAGE_SHIFT)
 
-- 
Michal Hocko
SUSE Labs


* Re: Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT'
  2017-05-22  8:45   ` Michal Hocko
@ 2017-05-22  9:03     ` Guenter Roeck
  2017-05-22  9:25       ` Michal Hocko
  0 siblings, 1 reply; 8+ messages in thread
From: Guenter Roeck @ 2017-05-22  9:03 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, Pavel Tatashin, Andrew Morton, Stephen Rothwell,
	Kevin Hilman, Michael Ellerman

On 05/22/2017 01:45 AM, Michal Hocko wrote:
> On Sat 20-05-17 09:26:34, Michal Hocko wrote:
>> On Fri 19-05-17 09:46:23, Guenter Roeck wrote:
>>> Hi,
>>>
>>> my qemu tests of next-20170519 show the following results:
>>> 	total: 122 pass: 30 fail: 92
>>>
>>> I won't bother listing all of the failures; they are available at
>>> http://kerneltests.org/builders. I bisected one (openrisc, because
>>> it gives me some console output before dying). It points to
>>> 'mm: drop HASH_ADAPT' as the culprit. Bisect log is attached.
>>>
>>> A quick glance suggests that 64 bit kernels pass and 32 bit kernels fail.
>>> 32-bit x86 images fail and should provide an easy test case.
>>
>> Hmm, this is quite unexpected as the patch is not supposed to change
>> things much. It just removes the flag and performs the new hash scaling
>> automatically for all requests which do not have any high limit.
>> Some of those didn't have HASH_ADAPT before but that shouldn't change
>> the picture much. The only thing that I can imagine is that what
>> formerly failed for early memblock allocations is now succeeding and that
>> depletes the early memory. Do you have any serial console from the boot?
> 
> OK, I guess I know what is going on here. Adaptive hash scaling is not
> really suited for 32-bit. ADAPT_SCALE_BASE is just too large for the word
> size and so we end up in an endless loop. So the issue was introduced
> by the original "mm: adaptive hash table scaling", but my patch made it
> more visible because the [di]cache hash tables most probably succeeded
> in the early initialization, which didn't have HASH_ADAPT.
> The following should fix the hang. I am not yet sure about the maximum
> size for the scaling; something even smaller would make sense to me
> because the kernel address space is just too small for such large hash
> tables.
> ---
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a26e19c3e1ff..70c5fc1fb89a 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7174,11 +7174,15 @@ static unsigned long __init arch_reserved_kernel_pages(void)
>   /*
>    * Adaptive scale is meant to reduce sizes of hash tables on large memory
>    * machines. As memory size is increased the scale is also increased but at
> - * slower pace.  Starting from ADAPT_SCALE_BASE (64G), every time memory
> - * quadruples the scale is increased by one, which means the size of hash table
> - * only doubles, instead of quadrupling as well.
> + * slower pace.  Starting from ADAPT_SCALE_BASE (64G on 64b systems and 32M
> + * on 32b), every time memory quadruples the scale is increased by one, which
> + * means the size of hash table only doubles, instead of quadrupling as well.
>    */
> +#if __BITS_PER_LONG == 64
>   #define ADAPT_SCALE_BASE	(64ul << 30)
> +#else
> +#define ADAPT_SCALE_BASE	(32ul << 20)
> +#endif
>   #define ADAPT_SCALE_SHIFT	2
>   #define ADAPT_SCALE_NPAGES	(ADAPT_SCALE_BASE >> PAGE_SHIFT)
>   
> 
I have seen another patch making it 64ull. I am not sure adaptive scaling
on 32-bit systems really makes sense; unless there is a clear need I'd rather
leave it alone.

Guenter


* Re: Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT'
  2017-05-22  9:03     ` Guenter Roeck
@ 2017-05-22  9:25       ` Michal Hocko
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2017-05-22  9:25 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, Pavel Tatashin, Andrew Morton, Stephen Rothwell,
	Kevin Hilman, Michael Ellerman

On Mon 22-05-17 02:03:21, Guenter Roeck wrote:
> On 05/22/2017 01:45 AM, Michal Hocko wrote:
> >On Sat 20-05-17 09:26:34, Michal Hocko wrote:
> >>On Fri 19-05-17 09:46:23, Guenter Roeck wrote:
> >>>Hi,
> >>>
> >>>my qemu tests of next-20170519 show the following results:
> >>>	total: 122 pass: 30 fail: 92
> >>>
> >>>I won't bother listing all of the failures; they are available at
> >>>http://kerneltests.org/builders. I bisected one (openrisc, because
> >>>it gives me some console output before dying). It points to
> >>>'mm: drop HASH_ADAPT' as the culprit. Bisect log is attached.
> >>>
> >>>A quick glance suggests that 64 bit kernels pass and 32 bit kernels fail.
> >>>32-bit x86 images fail and should provide an easy test case.
> >>
> >>Hmm, this is quite unexpected as the patch is not supposed to change
> >>things much. It just removes the flag and performs the new hash scaling
> >>automatically for all requests which do not have any high limit.
> >>Some of those didn't have HASH_ADAPT before but that shouldn't change
> >>the picture much. The only thing that I can imagine is that what
> >>formerly failed for early memblock allocations is now succeeding and that
> >>depletes the early memory. Do you have any serial console from the boot?
> >
> >OK, I guess I know what is going on here. Adaptive hash scaling is not
> >really suited for 32-bit. ADAPT_SCALE_BASE is just too large for the word
> >size and so we end up in an endless loop. So the issue was introduced
> >by the original "mm: adaptive hash table scaling", but my patch made it
> >more visible because the [di]cache hash tables most probably succeeded
> >in the early initialization, which didn't have HASH_ADAPT.
> >The following should fix the hang. I am not yet sure about the maximum
> >size for the scaling; something even smaller would make sense to me
> >because the kernel address space is just too small for such large hash
> >tables.
> >---
> >diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >index a26e19c3e1ff..70c5fc1fb89a 100644
> >--- a/mm/page_alloc.c
> >+++ b/mm/page_alloc.c
> >@@ -7174,11 +7174,15 @@ static unsigned long __init arch_reserved_kernel_pages(void)
> >  /*
> >   * Adaptive scale is meant to reduce sizes of hash tables on large memory
> >   * machines. As memory size is increased the scale is also increased but at
> >- * slower pace.  Starting from ADAPT_SCALE_BASE (64G), every time memory
> >- * quadruples the scale is increased by one, which means the size of hash table
> >- * only doubles, instead of quadrupling as well.
> >+ * slower pace.  Starting from ADAPT_SCALE_BASE (64G on 64b systems and 32M
> >+ * on 32b), every time memory quadruples the scale is increased by one, which
> >+ * means the size of hash table only doubles, instead of quadrupling as well.
> >   */
> >+#if __BITS_PER_LONG == 64
> >  #define ADAPT_SCALE_BASE	(64ul << 30)
> >+#else
> >+#define ADAPT_SCALE_BASE	(32ul << 20)
> >+#endif
> >  #define ADAPT_SCALE_SHIFT	2
> >  #define ADAPT_SCALE_NPAGES	(ADAPT_SCALE_BASE >> PAGE_SHIFT)
> >
> I have seen another patch making it 64ull. I am not sure adaptive scaling
> on 32-bit systems really makes sense; unless there is a clear need I'd rather
> leave it alone.

I've just found out that my incoming email sync hasn't worked since
Friday, so I missed those follow-up emails. I will double check.
-- 
Michal Hocko
SUSE Labs


