linux-kernel.vger.kernel.org archive mirror
* Regression on ARMs in next-20170531
@ 2017-05-31 16:45 Tony Lindgren
  2017-05-31 17:43 ` Russell King - ARM Linux
  0 siblings, 1 reply; 9+ messages in thread
From: Tony Lindgren @ 2017-05-31 16:45 UTC (permalink / raw)
  To: Johannes Weiner, Andrew Morton
  Cc: Josef Bacik, Michal Hocko, Vladimir Davydov, Rik van Riel,
	Mark Brown, Russell King, linux-kernel

Hi,

Looks like current Linux next won't boot on most ARMs
and git bisect points to commit b6bc6724488a ("mm: vmstat:
move slab statistics from zone to node counters").

Mark Brown noticed that so far the only booting
ARMs are the ones with CONFIG_SMP disabled, and I just
confirmed that's the case.

Any ideas?

Regards,

Tony

8< --------------------
Unable to handle kernel paging request at virtual address 2e116007
pgd = c0004000
[2e116007] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc3-00153-gb6bc6724488a #200
Hardware name: Generic DRA74X (Flattened Device Tree)
task: c0d0adc0 task.stack: c0d00000
PC is at __mod_node_page_state+0x2c/0xc8
LR is at __per_cpu_offset+0x0/0x8
pc : [<c0271de8>]    lr : [<c0d07da4>]    psr: 600000d3
sp : c0d01eec  ip : 00000000  fp : c15782f4
r10: 00000000  r9 : c1591280  r8 : 00004000
r7 : 00000001  r6 : 00000006  r5 : 2e116000  r4 : 00000007
r3 : 00000007  r2 : 00000001  r1 : 00000006  r0 : c0dc27c0
Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 8000406a  DAC: 00000051
Process swapper (pid: 0, stack limit = 0xc0d00218)
Stack: (0xc0d01eec to 0xc0d02000)
1ee0:                            600000d3 c0dc27c0 c0271efc 00000001 c0d58864
1f00: ef470000 00008000 00004000 c029fbb0 01000000 c1572b5c 00002000 00000000
1f20: 00000001 00000001 00008000 c029f584 00000000 c0d58864 00008000 00008000
1f40: 01008000 c0c23790 c15782f4 a00000d3 c0d58864 c02a0364 00000000 c0819388
1f60: c0d58864 000000c0 01000000 c1572a58 c0aa57a4 00000080 00002000 c0dca000
1f80: efffe980 c0c53a48 00000000 c0c23790 c1572a58 c0c59e48 c0c59de8 c1572b5c
1fa0: c0dca000 c0c257a4 00000000 ffffffff c0dca000 c0d07940 c0dca000 c0c00a9c
1fc0: ffffffff ffffffff 00000000 c0c00680 00000000 c0c53a48 c0dca214 c0d07958
1fe0: c0c53a44 c0d0caa4 8000406a 412fc0f2 00000000 8000807c 00000000 00000000
[<c0271de8>] (__mod_node_page_state) from [<c0271efc>] (mod_node_page_state+0x2c/0x4c)
[<c0271efc>] (mod_node_page_state) from [<c029fbb0>] (cache_alloc_refill+0x5b8/0x828)
[<c029fbb0>] (cache_alloc_refill) from [<c02a0364>] (kmem_cache_alloc+0x24c/0x2d0)
[<c02a0364>] (kmem_cache_alloc) from [<c0c23790>] (create_kmalloc_cache+0x20/0x8c)
[<c0c23790>] (create_kmalloc_cache) from [<c0c257a4>] (kmem_cache_init+0xac/0x11c)
[<c0c257a4>] (kmem_cache_init) from [<c0c00a9c>] (start_kernel+0x1b8/0x3c0)
[<c0c00a9c>] (start_kernel) from [<8000807c>] (0x8000807c)
Code: e79e5103 e28c3001 e0833001 e1a04003 (e19440d5)
---[ end trace 0000000000000000 ]---


* Re: Regression on ARMs in next-20170531
  2017-05-31 16:45 Regression on ARMs in next-20170531 Tony Lindgren
@ 2017-05-31 17:43 ` Russell King - ARM Linux
  2017-05-31 18:00   ` Tony Lindgren
  2017-06-04 11:32   ` Johannes Weiner
  0 siblings, 2 replies; 9+ messages in thread
From: Russell King - ARM Linux @ 2017-05-31 17:43 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Johannes Weiner, Andrew Morton, Josef Bacik, Michal Hocko,
	Vladimir Davydov, Rik van Riel, Mark Brown, linux-kernel

On Wed, May 31, 2017 at 09:45:45AM -0700, Tony Lindgren wrote:
> Mark Brown noticed that so far the only booting
> ARMs are the ones with CONFIG_SMP disabled, and I just
> confirmed that's the case.

> 8< --------------------
> Unable to handle kernel paging request at virtual address 2e116007
> pgd = c0004000
> [2e116007] *pgd=00000000
> Internal error: Oops: 5 [#1] SMP ARM
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc3-00153-gb6bc6724488a #200
> Hardware name: Generic DRA74X (Flattened Device Tree)
> task: c0d0adc0 task.stack: c0d00000
> PC is at __mod_node_page_state+0x2c/0xc8
> LR is at __per_cpu_offset+0x0/0x8
> pc : [<c0271de8>]    lr : [<c0d07da4>]    psr: 600000d3
> sp : c0d01eec  ip : 00000000  fp : c15782f4
> r10: 00000000  r9 : c1591280  r8 : 00004000
> r7 : 00000001  r6 : 00000006  r5 : 2e116000  r4 : 00000007
> r3 : 00000007  r2 : 00000001  r1 : 00000006  r0 : c0dc27c0
> Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment none
...
> Code: e79e5103 e28c3001 e0833001 e1a04003 (e19440d5)

This disassembles to:

   0:   e79e5103        ldr     r5, [lr, r3, lsl #2]
   4:   e28c3001        add     r3, ip, #1
   8:   e0833001        add     r3, r3, r1
   c:   e1a04003        mov     r4, r3
  10:   e19440d5        ldrsb   r4, [r4, r5]

I don't have a similarly configured kernel, but here I have for the
start of this function:

00000680 <__mod_node_page_state>:
     680:       e1a0c00d        mov     ip, sp
     684:       e92dd870        push    {r4, r5, r6, fp, ip, lr, pc}
     688:       e24cb004        sub     fp, ip, #4
     68c:       e590cc00        ldr     ip, [r0, #3072] ; 0xc00
     690:       e1a0400d        mov     r4, sp
     694:       ee1d6f90        mrc     15, 0, r6, cr13, cr0, {4}
     698:       e08c5001        add     r5, ip, r1
     69c:       e2855001        add     r5, r5, #1
     6a0:       e1a03005        mov     r3, r5
     6a4:       e196c0dc        ldrsb   ip, [r6, ip]
     6a8:       e19630d3        ldrsb   r3, [r6, r3]

r5 in your code is the equivalent of r6, r4 => r3, r3 -> r5.
lr is the __per_cpu_offset array, so the first instruction is
trying to load the percpu offset.

The faulting code is:

        x = delta + __this_cpu_read(*p);

specifically "__this_cpu_read(*p)".

"ip" holds "pcp" from:

        struct per_cpu_nodestat __percpu *pcp = pgdat->per_cpu_nodestats;

and you may notice that it's zero in the register dump.  So,
pgdat->per_cpu_nodestats is NULL here.
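
As a cross-check of the analysis above, the faulting address can be
reproduced from the register dump.  The sketch below is a userspace
model, not kernel code.  It assumes that struct per_cpu_nodestat starts
with a single stat_threshold byte followed by the vm_node_stat_diff[]
array, so the array sits at offset 1.  The names per_cpu_nodestat_model,
NR_ITEMS_MODEL, fault_address and check_reported_oopses are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical array size; only the offset of the array matters here. */
#define NR_ITEMS_MODEL 32

/* Assumed layout: one threshold byte, then one diff byte per stat item. */
struct per_cpu_nodestat_model {
	int8_t stat_threshold;
	int8_t vm_node_stat_diff[NR_ITEMS_MODEL];
};

/*
 * Address touched by __this_cpu_read(*p) on a 32-bit machine:
 * per-cpu offset + pcp + offset of the diff array + item index.
 */
uint32_t fault_address(uint32_t percpu_offset, uint32_t pcp, uint32_t item)
{
	uint32_t p = pcp
		+ (uint32_t)offsetof(struct per_cpu_nodestat_model,
				     vm_node_stat_diff)
		+ item;

	return percpu_offset + p;
}

/* Compare the model with the two oopses reported in this thread. */
void check_reported_oopses(void)
{
	/* r5 = 2e116000 (per-cpu offset), ip = 0 (pcp), r1 = 6 (item) */
	assert(fault_address(0x2e116000u, 0u, 6u) == 0x2e116007u);
	/* r5 = 2ea2d000, ip = 0, r1 = 6 in the later oops */
	assert(fault_address(0x2ea2d000u, 0u, 6u) == 0x2ea2d007u);
}
```

With pcp = 0 the computed address is just the per-cpu offset plus 7,
which is an unmapped low address, hence the paging request fault rather
than a plain NULL dereference at address 0.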

This seems to be set up in setup_per_cpu_pageset(), which in the init
order, happens way after mm_init() (which contains kmem_cache_init()).

So, looks to me like an init ordering bug.  I'm not sure why SMP
would be working - maybe it's only working because it's managing to
scribble over some memory that isn't faulting?  I suspect a
WARN_ON(!pcp) here will report even on SMP.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* Re: Regression on ARMs in next-20170531
  2017-05-31 17:43 ` Russell King - ARM Linux
@ 2017-05-31 18:00   ` Tony Lindgren
  2017-06-04 11:32   ` Johannes Weiner
  1 sibling, 0 replies; 9+ messages in thread
From: Tony Lindgren @ 2017-05-31 18:00 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Johannes Weiner, Andrew Morton, Josef Bacik, Michal Hocko,
	Vladimir Davydov, Rik van Riel, Mark Brown, linux-kernel

* Russell King - ARM Linux <linux@armlinux.org.uk> [170531 10:47]:
> I don't have a similarly configured kernel, but here I have for the
> start of this function:
> 
> 00000680 <__mod_node_page_state>:
>      680:       e1a0c00d        mov     ip, sp
>      684:       e92dd870        push    {r4, r5, r6, fp, ip, lr, pc}
>      688:       e24cb004        sub     fp, ip, #4
>      68c:       e590cc00        ldr     ip, [r0, #3072] ; 0xc00
>      690:       e1a0400d        mov     r4, sp
>      694:       ee1d6f90        mrc     15, 0, r6, cr13, cr0, {4}
>      698:       e08c5001        add     r5, ip, r1
>      69c:       e2855001        add     r5, r5, #1
>      6a0:       e1a03005        mov     r3, r5
>      6a4:       e196c0dc        ldrsb   ip, [r6, ip]
>      6a8:       e19630d3        ldrsb   r3, [r6, r3]
> 
> r5 in your code is the equivalent of r6, r4 => r3, r3 -> r5.
> lr is the __per_cpu_offset array, so the first instruction is
> trying to load the percpu offset.
> 
> The faulting code is:
> 
>         x = delta + __this_cpu_read(*p);
> 
> specifically "__this_cpu_read(*p)".
> 
> "ip" holds "pcp" from:
> 
>         struct per_cpu_nodestat __percpu *pcp = pgdat->per_cpu_nodestats;
> 
> and you may notice that it's zero in the register dump.  So,
> pgdat->per_cpu_nodestats is NULL here.
> 
> This seems to be set up in setup_per_cpu_pageset(), which in the init
> order, happens way after mm_init() (which contains kmem_cache_init()).

OK thanks, so that should help :)

> So, looks to me like an init ordering bug.  I'm not sure why SMP
> would be working - maybe it's only working because it's managing to
> scribble over some memory that isn't faulting?  I suspect a
> WARN_ON(!pcp) here will report even on SMP.

It's the other way around: CONFIG_SMP=y is not booting, and
disabling it boots.

Regards,

Tony


* Re: Regression on ARMs in next-20170531
  2017-05-31 17:43 ` Russell King - ARM Linux
  2017-05-31 18:00   ` Tony Lindgren
@ 2017-06-04 11:32   ` Johannes Weiner
  2017-06-06  5:55     ` Tony Lindgren
  1 sibling, 1 reply; 9+ messages in thread
From: Johannes Weiner @ 2017-06-04 11:32 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Tony Lindgren, Andrew Morton, Josef Bacik, Michal Hocko,
	Vladimir Davydov, Rik van Riel, Mark Brown, linux-kernel

On Wed, May 31, 2017 at 06:43:33PM +0100, Russell King - ARM Linux wrote:
> On Wed, May 31, 2017 at 09:45:45AM -0700, Tony Lindgren wrote:
> > Mark Brown noticed that so far the only booting
> > ARMs are the ones with CONFIG_SMP disabled, and I just
> > confirmed that's the case.
> 
> > 8< --------------------
> > Unable to handle kernel paging request at virtual address 2e116007
> > pgd = c0004000
> > [2e116007] *pgd=00000000
> > Internal error: Oops: 5 [#1] SMP ARM
> > Modules linked in:
> > CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc3-00153-gb6bc6724488a #200
> > Hardware name: Generic DRA74X (Flattened Device Tree)
> > task: c0d0adc0 task.stack: c0d00000
> > PC is at __mod_node_page_state+0x2c/0xc8
> > LR is at __per_cpu_offset+0x0/0x8
> > pc : [<c0271de8>]    lr : [<c0d07da4>]    psr: 600000d3
> > sp : c0d01eec  ip : 00000000  fp : c15782f4
> > r10: 00000000  r9 : c1591280  r8 : 00004000
> > r7 : 00000001  r6 : 00000006  r5 : 2e116000  r4 : 00000007
> > r3 : 00000007  r2 : 00000001  r1 : 00000006  r0 : c0dc27c0
> > Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment none
> ...
> > Code: e79e5103 e28c3001 e0833001 e1a04003 (e19440d5)
> 
> This disassembles to:
> 
>    0:   e79e5103        ldr     r5, [lr, r3, lsl #2]
>    4:   e28c3001        add     r3, ip, #1
>    8:   e0833001        add     r3, r3, r1
>    c:   e1a04003        mov     r4, r3
>   10:   e19440d5        ldrsb   r4, [r4, r5]
> 
> I don't have a similarly configured kernel, but here I have for the
> start of this function:
> 
> 00000680 <__mod_node_page_state>:
>      680:       e1a0c00d        mov     ip, sp
>      684:       e92dd870        push    {r4, r5, r6, fp, ip, lr, pc}
>      688:       e24cb004        sub     fp, ip, #4
>      68c:       e590cc00        ldr     ip, [r0, #3072] ; 0xc00
>      690:       e1a0400d        mov     r4, sp
>      694:       ee1d6f90        mrc     15, 0, r6, cr13, cr0, {4}
>      698:       e08c5001        add     r5, ip, r1
>      69c:       e2855001        add     r5, r5, #1
>      6a0:       e1a03005        mov     r3, r5
>      6a4:       e196c0dc        ldrsb   ip, [r6, ip]
>      6a8:       e19630d3        ldrsb   r3, [r6, r3]
> 
> r5 in your code is the equivalent of r6, r4 => r3, r3 -> r5.
> lr is the __per_cpu_offset array, so the first instruction is
> trying to load the percpu offset.
> 
> The faulting code is:
> 
>         x = delta + __this_cpu_read(*p);
> 
> specifically "__this_cpu_read(*p)".
> 
> "ip" holds "pcp" from:
> 
>         struct per_cpu_nodestat __percpu *pcp = pgdat->per_cpu_nodestats;
> 
> and you may notice that it's zero in the register dump.  So,
> pgdat->per_cpu_nodestats is NULL here.
> 
> This seems to be set up in setup_per_cpu_pageset(), which in the init
> order, happens way after mm_init() (which contains kmem_cache_init()).

Thanks for the analysis, Russell.

I think it's NULL because the slab allocation happens before even the
root_mem_cgroup is set up, and so root_mem_cgroup -> lruvec -> pgdat
gives us garbage.

Tony, Josef, since the patches are dropped from -next, could you test
the -mm tree at git://git.cmpxchg.org/linux-mmots.git and verify that
this patch below fixes the issue?

---

From 47007dfcd7873cb93d11466a93b1f41f6a7a434f Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Sun, 4 Jun 2017 07:02:44 -0400
Subject: [PATCH] mm: memcontrol: per-lruvec stats infrastructure fix 2

Even with the previous fix routing !page->mem_cgroup stats to the root
cgroup, we still see crashes in certain configurations, as the root is
not yet initialized at the earliest possible accounting sites.

Don't track uncharged pages at all, not even in the root. This takes
care of early accounting as well as special pages that aren't tracked.

Because we still need to account at the pgdat level, we can no longer
implement the lruvec_page_state functions on top of the lruvec_state
ones. But that's okay, it was a little silly to look up the nodeinfo
and descend to the lruvec, only to container_of() back to the nodeinfo
where the lruvec_stat structure is sitting.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/memcontrol.h | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index bea6f08e9e16..da9360885260 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -585,27 +585,27 @@ static inline void mod_lruvec_state(struct lruvec *lruvec,
 static inline void __mod_lruvec_page_state(struct page *page,
 					   enum node_stat_item idx, int val)
 {
-	struct mem_cgroup *memcg;
-	struct lruvec *lruvec;
-
-	/* Special pages in the VM aren't charged, use root */
-	memcg = page->mem_cgroup ? : root_mem_cgroup;
+	struct mem_cgroup_per_node *pn;
 
-	lruvec = mem_cgroup_lruvec(page_pgdat(page), memcg);
-	__mod_lruvec_state(lruvec, idx, val);
+	__mod_node_page_state(page_pgdat(page), idx, val);
+	if (mem_cgroup_disabled() || !page->mem_cgroup)
+		return;
+	__mod_memcg_state(page->mem_cgroup, idx, val);
+	pn = page->mem_cgroup->nodeinfo[page_to_nid(page)];
+	__this_cpu_add(pn->lruvec_stat->count[idx], val);
 }
 
 static inline void mod_lruvec_page_state(struct page *page,
 					 enum node_stat_item idx, int val)
 {
-	struct mem_cgroup *memcg;
-	struct lruvec *lruvec;
-
-	/* Special pages in the VM aren't charged, use root */
-	memcg = page->mem_cgroup ? : root_mem_cgroup;
+	struct mem_cgroup_per_node *pn;
 
-	lruvec = mem_cgroup_lruvec(page_pgdat(page), memcg);
-	mod_lruvec_state(lruvec, idx, val);
+	mod_node_page_state(page_pgdat(page), idx, val);
+	if (mem_cgroup_disabled() || !page->mem_cgroup)
+		return;
+	mod_memcg_state(page->mem_cgroup, idx, val);
+	pn = page->mem_cgroup->nodeinfo[page_to_nid(page)];
+	this_cpu_add(pn->lruvec_stat->count[idx], val);
 }
 
 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
-- 
2.13.0
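
The control flow of the patched helpers can be illustrated outside the
kernel.  The following is a minimal userspace sketch, not the kernel
implementation: the node counter is always updated, and the memcg and
per-lruvec counters are skipped when the page is uncharged or memcg is
disabled.  All *_model names, the plain long counters (the kernel uses
per-cpu counters) and NR_STAT_ITEMS_MODEL are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { NR_STAT_ITEMS_MODEL = 8 };	/* hypothetical number of items */

struct node_model {
	long state[NR_STAT_ITEMS_MODEL];
};

struct memcg_model {
	long state[NR_STAT_ITEMS_MODEL];
	long lruvec_stat[NR_STAT_ITEMS_MODEL];
};

struct page_model {
	struct node_model *node;	/* stands in for page_pgdat(page) */
	struct memcg_model *mem_cgroup;	/* NULL for uncharged pages */
};

/* Stands in for mem_cgroup_disabled(). */
bool memcg_disabled_model = false;

void mod_lruvec_page_state_model(struct page_model *page, int idx, int val)
{
	/* Node-level accounting always happens. */
	page->node->state[idx] += val;

	/* Uncharged pages are not tracked in any cgroup, not even root. */
	if (memcg_disabled_model || !page->mem_cgroup)
		return;

	page->mem_cgroup->state[idx] += val;
	page->mem_cgroup->lruvec_stat[idx] += val;
}

/* Usage example: one uncharged and one charged page on the same node. */
void check_uncharged_page_model(void)
{
	struct node_model node = { {0} };
	struct memcg_model cg = { {0}, {0} };
	struct page_model uncharged = { &node, NULL };
	struct page_model charged = { &node, &cg };

	mod_lruvec_page_state_model(&uncharged, 3, 2);
	assert(node.state[3] == 2 && cg.state[3] == 0);

	mod_lruvec_page_state_model(&charged, 3, 5);
	assert(node.state[3] == 7);
	assert(cg.state[3] == 5 && cg.lruvec_stat[3] == 5);
}
```

Note that this only models the memcg side.  It does not address the
NULL pgdat->per_cpu_nodestats that Russell identified, which the
node-level update still depends on.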


* Re: Regression on ARMs in next-20170531
  2017-06-04 11:32   ` Johannes Weiner
@ 2017-06-06  5:55     ` Tony Lindgren
  2017-06-06 12:30       ` Tony Lindgren
  0 siblings, 1 reply; 9+ messages in thread
From: Tony Lindgren @ 2017-06-06  5:55 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Russell King - ARM Linux, Andrew Morton, Josef Bacik,
	Michal Hocko, Vladimir Davydov, Rik van Riel, Mark Brown,
	linux-kernel

* Johannes Weiner <hannes@cmpxchg.org> [170604 04:36]:
> I think it's NULL because the slab allocation happens before even the
> root_mem_cgroup is set up, and so root_mem_cgroup -> lruvec -> pgdat
> gives us garbage.
> 
> Tony, Josef, since the patches are dropped from -next, could you test
> the -mm tree at git://git.cmpxchg.org/linux-mmots.git and verify that
> this patch below fixes the issue?

Looks like next-20170605 is broken for ARMs again.. And the patch
below does not apply for me against mmots/master or next.
Care to update?

Regards,

Tony


* Re: Regression on ARMs in next-20170531
  2017-06-06  5:55     ` Tony Lindgren
@ 2017-06-06 12:30       ` Tony Lindgren
  2017-06-06 14:36         ` Johannes Weiner
  0 siblings, 1 reply; 9+ messages in thread
From: Tony Lindgren @ 2017-06-06 12:30 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Russell King - ARM Linux, Andrew Morton, Josef Bacik,
	Michal Hocko, Vladimir Davydov, Rik van Riel, Mark Brown,
	linux-kernel

* Tony Lindgren <tony@atomide.com> [170605 22:55]:
> * Johannes Weiner <hannes@cmpxchg.org> [170604 04:36]:
> > I think it's NULL because the slab allocation happens before even the
> > root_mem_cgroup is set up, and so root_mem_cgroup -> lruvec -> pgdat
> > gives us garbage.
> > 
> > Tony, Josef, since the patches are dropped from -next, could you test
> > the -mm tree at git://git.cmpxchg.org/linux-mmots.git and verify that
> > this patch below fixes the issue?
> 
> Looks like next-20170605 is broken for ARMs again.. And the patch
> below does not apply for me against mmots/master or next.
> Care to update?

Oh, I got it to apply on next-20170605. I must have had something
else applied that caused issues on the earlier attempt. Now I'm getting
the following error.

Regards,

Tony

8< -----------------
Unable to handle kernel paging request at virtual address 2ea2d007
pgd = c0004000
[2ea2d007] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc3-next-20170605+ #1227
Hardware name: Generic OMAP4 (Flattened Device Tree)
task: c0d0ae00 task.stack: c0d00000
PC is at __mod_node_page_state+0x2c/0xc8
LR is at __per_cpu_offset+0x0/0x8
pc : [<c0280078>]    lr : [<c0d07d6c>]    psr: 200001d3
sp : c0d01eec  ip : 00000000  fp : 00000001
r10: c0c7cf68  r9 : 00008000  r8 : 00000000
r7 : 00000001  r6 : 00000006  r5 : 2ea2d000  r4 : 00000007
r3 : 00000007  r2 : 00000001  r1 : 00000006  r0 : c0dc1fc0
Flags: nzCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 8000404a  DAC: 00000051
Process swapper (pid: 0, stack limit = 0xc0d00218)
Stack: (0xc0d01eec to 0xc0d02000)
1ee0:                            400001d3 c0dc1fc0 c028018c 00000001 c1599440
1f00: c0d58834 efd83000 00000000 c02af214 01000000 c157a890 00002000 00008000
1f20: 00000001 00000001 00008000 c02aeb4c 00000000 00008000 c0d58834 00008000
1f40: 01008000 c0c23a88 c0d58834 c1580034 400001d3 c02afa9c 00000000 c086b230
1f60: c0d58834 000000c0 01000000 c157a78c c0abe0fc 00000080 00002000 c0dd4000
1f80: efffec40 c0c55a48 00000000 c0c23a88 c157a78c c0c5be48 c0c5bde8 c157a890
1fa0: c0dd4000 c0c25a9c 00000000 ffffffff c0dd4000 c0d07940 c0dd4000 c0c00abc
1fc0: ffffffff ffffffff 00000000 c0c006a0 00000000 c0c55a48 c0dd4214 c0d07958
1fe0: c0c55a44 c0d0cae4 8000406a 411fc093 00000000 8000807c 00000000 00000000
[<c0280078>] (__mod_node_page_state) from [<c028018c>] (mod_node_page_state+0x2c/0x4c)
[<c028018c>] (mod_node_page_state) from [<c02af214>] (cache_alloc_refill+0x654/0x898)
[<c02af214>] (cache_alloc_refill) from [<c02afa9c>] (kmem_cache_alloc+0x2d4/0x364)
[<c02afa9c>] (kmem_cache_alloc) from [<c0c23a88>] (create_kmalloc_cache+0x20/0x8c)
[<c0c23a88>] (create_kmalloc_cache) from [<c0c25a9c>] (kmem_cache_init+0xac/0x11c)
[<c0c25a9c>] (kmem_cache_init) from [<c0c00abc>] (start_kernel+0x1b8/0x3d8)
[<c0c00abc>] (start_kernel) from [<8000807c>] (0x8000807c)
Code: e79e5103 e28c3001 e0833001 e1a04003 (e19440d5)


* Re: Regression on ARMs in next-20170531
  2017-06-06 12:30       ` Tony Lindgren
@ 2017-06-06 14:36         ` Johannes Weiner
  2017-06-06 15:03           ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Johannes Weiner @ 2017-06-06 14:36 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Russell King - ARM Linux, Andrew Morton, Josef Bacik,
	Michal Hocko, Vladimir Davydov, Rik van Riel, Mark Brown,
	linux-kernel

On Tue, Jun 06, 2017 at 05:30:10AM -0700, Tony Lindgren wrote:
> PC is at __mod_node_page_state+0x2c/0xc8
> LR is at __per_cpu_offset+0x0/0x8
> pc : [<c0280078>]    lr : [<c0d07d6c>]    psr: 200001d3
> sp : c0d01eec  ip : 00000000  fp : 00000001
> r10: c0c7cf68  r9 : 00008000  r8 : 00000000
> r7 : 00000001  r6 : 00000006  r5 : 2ea2d000  r4 : 00000007
> r3 : 00000007  r2 : 00000001  r1 : 00000006  r0 : c0dc1fc0
> Flags: nzCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: 8000404a  DAC: 00000051
> Process swapper (pid: 0, stack limit = 0xc0d00218)
> Stack: (0xc0d01eec to 0xc0d02000)
> 1ee0:                            400001d3 c0dc1fc0 c028018c 00000001 c1599440
> 1f00: c0d58834 efd83000 00000000 c02af214 01000000 c157a890 00002000 00008000
> 1f20: 00000001 00000001 00008000 c02aeb4c 00000000 00008000 c0d58834 00008000
> 1f40: 01008000 c0c23a88 c0d58834 c1580034 400001d3 c02afa9c 00000000 c086b230
> 1f60: c0d58834 000000c0 01000000 c157a78c c0abe0fc 00000080 00002000 c0dd4000
> 1f80: efffec40 c0c55a48 00000000 c0c23a88 c157a78c c0c5be48 c0c5bde8 c157a890
> 1fa0: c0dd4000 c0c25a9c 00000000 ffffffff c0dd4000 c0d07940 c0dd4000 c0c00abc
> 1fc0: ffffffff ffffffff 00000000 c0c006a0 00000000 c0c55a48 c0dd4214 c0d07958
> 1fe0: c0c55a44 c0d0cae4 8000406a 411fc093 00000000 8000807c 00000000 00000000
> [<c0280078>] (__mod_node_page_state) from [<c028018c>] (mod_node_page_state+0x2c/0x4c)
> [<c028018c>] (mod_node_page_state) from [<c02af214>] (cache_alloc_refill+0x654/0x898)
> [<c02af214>] (cache_alloc_refill) from [<c02afa9c>] (kmem_cache_alloc+0x2d4/0x364)
> [<c02afa9c>] (kmem_cache_alloc) from [<c0c23a88>] (create_kmalloc_cache+0x20/0x8c)
> [<c0c23a88>] (create_kmalloc_cache) from [<c0c25a9c>] (kmem_cache_init+0xac/0x11c)
> [<c0c25a9c>] (kmem_cache_init) from [<c0c00abc>] (start_kernel+0x1b8/0x3d8)

That's the one Russell analyzed and I misinterpreted. We put a fix
into -next to initialize pgdat->per_cpu_nodestats in time for slab
initialization during boot.

Is today's -next working again?
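
The fix itself is not shown in this thread.  One way to make a stats
pointer valid "in time" for early callers is to point it at a static
boot-time area first and switch to the real allocation later.  The
sketch below is a userspace model of that pattern under this
assumption, not the code that went into -next.  All names in it
(nodestat_model, node_model, boot_stats_model, node_early_init,
node_late_init, node_mod_state) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

enum { NR_NODE_ITEMS_MODEL = 8 };	/* hypothetical number of items */

struct nodestat_model {
	signed char diff[NR_NODE_ITEMS_MODEL];
};

struct node_model {
	struct nodestat_model *stats;	/* NULL until initialized */
};

/* Static placeholder, usable before any allocator is available. */
struct nodestat_model boot_stats_model;

/* Called very early: makes the pointer valid for the first callers. */
void node_early_init(struct node_model *node)
{
	node->stats = &boot_stats_model;
}

/* Called once allocation works: switch to a dynamically allocated area. */
int node_late_init(struct node_model *node)
{
	struct nodestat_model *s = calloc(1, sizeof(*s));

	if (!s)
		return -1;
	*s = *node->stats;	/* carry over what was counted during boot */
	node->stats = s;
	return 0;
}

/* Counter update; safe as soon as node_early_init() has run. */
void node_mod_state(struct node_model *node, int idx, int val)
{
	node->stats->diff[idx] += val;
}

/* Usage example: count once before and once after the late init. */
void check_boot_placeholder(void)
{
	struct node_model node = { NULL };

	node_early_init(&node);
	node_mod_state(&node, 6, 1);
	assert(node_late_init(&node) == 0);
	node_mod_state(&node, 6, 1);
	assert(node.stats != &boot_stats_model);
	assert(node.stats->diff[6] == 2);
	free(node.stats);
	boot_stats_model.diff[6] = 0;	/* reset the shared placeholder */
}
```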


* Re: Regression on ARMs in next-20170531
  2017-06-06 14:36         ` Johannes Weiner
@ 2017-06-06 15:03           ` Andrew Morton
  2017-06-07  4:38             ` Tony Lindgren
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2017-06-06 15:03 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Tony Lindgren, Russell King - ARM Linux, Josef Bacik,
	Michal Hocko, Vladimir Davydov, Rik van Riel, Mark Brown,
	linux-kernel, Stephen Rothwell

On Tue, 6 Jun 2017 10:36:48 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:

> On Tue, Jun 06, 2017 at 05:30:10AM -0700, Tony Lindgren wrote:
> > PC is at __mod_node_page_state+0x2c/0xc8
> > LR is at __per_cpu_offset+0x0/0x8
> > pc : [<c0280078>]    lr : [<c0d07d6c>]    psr: 200001d3
> > sp : c0d01eec  ip : 00000000  fp : 00000001
> > r10: c0c7cf68  r9 : 00008000  r8 : 00000000
> > r7 : 00000001  r6 : 00000006  r5 : 2ea2d000  r4 : 00000007
> > r3 : 00000007  r2 : 00000001  r1 : 00000006  r0 : c0dc1fc0
> > Flags: nzCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment none
> > Control: 10c5387d  Table: 8000404a  DAC: 00000051
> > Process swapper (pid: 0, stack limit = 0xc0d00218)
> > Stack: (0xc0d01eec to 0xc0d02000)
> > 1ee0:                            400001d3 c0dc1fc0 c028018c 00000001 c1599440
> > 1f00: c0d58834 efd83000 00000000 c02af214 01000000 c157a890 00002000 00008000
> > 1f20: 00000001 00000001 00008000 c02aeb4c 00000000 00008000 c0d58834 00008000
> > 1f40: 01008000 c0c23a88 c0d58834 c1580034 400001d3 c02afa9c 00000000 c086b230
> > 1f60: c0d58834 000000c0 01000000 c157a78c c0abe0fc 00000080 00002000 c0dd4000
> > 1f80: efffec40 c0c55a48 00000000 c0c23a88 c157a78c c0c5be48 c0c5bde8 c157a890
> > 1fa0: c0dd4000 c0c25a9c 00000000 ffffffff c0dd4000 c0d07940 c0dd4000 c0c00abc
> > 1fc0: ffffffff ffffffff 00000000 c0c006a0 00000000 c0c55a48 c0dd4214 c0d07958
> > 1fe0: c0c55a44 c0d0cae4 8000406a 411fc093 00000000 8000807c 00000000 00000000
> > [<c0280078>] (__mod_node_page_state) from [<c028018c>] (mod_node_page_state+0x2c/0x4c)
> > [<c028018c>] (mod_node_page_state) from [<c02af214>] (cache_alloc_refill+0x654/0x898)
> > [<c02af214>] (cache_alloc_refill) from [<c02afa9c>] (kmem_cache_alloc+0x2d4/0x364)
> > [<c02afa9c>] (kmem_cache_alloc) from [<c0c23a88>] (create_kmalloc_cache+0x20/0x8c)
> > [<c0c23a88>] (create_kmalloc_cache) from [<c0c25a9c>] (kmem_cache_init+0xac/0x11c)
> > [<c0c25a9c>] (kmem_cache_init) from [<c0c00abc>] (start_kernel+0x1b8/0x3d8)
> 
> That's the one Russell analyzed and I misinterpreted. We put a fix
> into -next to initialize pgdat->per_cpu_nodestats in time for slab
> initialization during boot.
> 
> Is today's -next working again?

I'll be even less than usually functional for the next week, so please
cc Stephen on any -next hotfixes.


* Re: Regression on ARMs in next-20170531
  2017-06-06 15:03           ` Andrew Morton
@ 2017-06-07  4:38             ` Tony Lindgren
  0 siblings, 0 replies; 9+ messages in thread
From: Tony Lindgren @ 2017-06-07  4:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Russell King - ARM Linux, Josef Bacik,
	Michal Hocko, Vladimir Davydov, Rik van Riel, Mark Brown,
	linux-kernel, Stephen Rothwell

* Andrew Morton <akpm@linux-foundation.org> [170606 08:07]:
> On Tue, 6 Jun 2017 10:36:48 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:
> 
> > On Tue, Jun 06, 2017 at 05:30:10AM -0700, Tony Lindgren wrote:
> > > PC is at __mod_node_page_state+0x2c/0xc8
> > > LR is at __per_cpu_offset+0x0/0x8
> > > pc : [<c0280078>]    lr : [<c0d07d6c>]    psr: 200001d3
> > > sp : c0d01eec  ip : 00000000  fp : 00000001
> > > r10: c0c7cf68  r9 : 00008000  r8 : 00000000
> > > r7 : 00000001  r6 : 00000006  r5 : 2ea2d000  r4 : 00000007
> > > r3 : 00000007  r2 : 00000001  r1 : 00000006  r0 : c0dc1fc0
> > > Flags: nzCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment none
> > > Control: 10c5387d  Table: 8000404a  DAC: 00000051
> > > Process swapper (pid: 0, stack limit = 0xc0d00218)
> > > Stack: (0xc0d01eec to 0xc0d02000)
> > > 1ee0:                            400001d3 c0dc1fc0 c028018c 00000001 c1599440
> > > 1f00: c0d58834 efd83000 00000000 c02af214 01000000 c157a890 00002000 00008000
> > > 1f20: 00000001 00000001 00008000 c02aeb4c 00000000 00008000 c0d58834 00008000
> > > 1f40: 01008000 c0c23a88 c0d58834 c1580034 400001d3 c02afa9c 00000000 c086b230
> > > 1f60: c0d58834 000000c0 01000000 c157a78c c0abe0fc 00000080 00002000 c0dd4000
> > > 1f80: efffec40 c0c55a48 00000000 c0c23a88 c157a78c c0c5be48 c0c5bde8 c157a890
> > > 1fa0: c0dd4000 c0c25a9c 00000000 ffffffff c0dd4000 c0d07940 c0dd4000 c0c00abc
> > > 1fc0: ffffffff ffffffff 00000000 c0c006a0 00000000 c0c55a48 c0dd4214 c0d07958
> > > 1fe0: c0c55a44 c0d0cae4 8000406a 411fc093 00000000 8000807c 00000000 00000000
> > > [<c0280078>] (__mod_node_page_state) from [<c028018c>] (mod_node_page_state+0x2c/0x4c)
> > > [<c028018c>] (mod_node_page_state) from [<c02af214>] (cache_alloc_refill+0x654/0x898)
> > > [<c02af214>] (cache_alloc_refill) from [<c02afa9c>] (kmem_cache_alloc+0x2d4/0x364)
> > > [<c02afa9c>] (kmem_cache_alloc) from [<c0c23a88>] (create_kmalloc_cache+0x20/0x8c)
> > > [<c0c23a88>] (create_kmalloc_cache) from [<c0c25a9c>] (kmem_cache_init+0xac/0x11c)
> > > [<c0c25a9c>] (kmem_cache_init) from [<c0c00abc>] (start_kernel+0x1b8/0x3d8)
> > 
> > That's the one Russell analyzed and I misinterpreted. We put a fix
> > into -next to initialize pgdat->per_cpu_nodestats in time for slab
> > initialization during boot.

OK

> > Is today's -next working again?

Yes, I just tested that next-20170606 is working again, thanks!

> I'll be even less than usually functional for the next week, so please
> cc Stephen on any -next hotfixes.

OK, good to know in case new issues show up.

Regards,

Tony
