linux-kernel.vger.kernel.org archive mirror
* linux-next crash during very early boot
@ 2016-04-14  0:29 Valdis Kletnieks
  2016-04-14  1:35 ` Joonsoo Kim
  0 siblings, 1 reply; 7+ messages in thread
From: Valdis Kletnieks @ 2016-04-14  0:29 UTC (permalink / raw)
  To: Joonsoo Kim, Andrew Morton; +Cc: linux-kernel, linux-mm

I'm seeing my laptop crash/wedge up/something during very early
boot - before it can write anything to the console.  Nothing in pstore,
need to hold down the power button for 6 seconds and reboot.

git bisect points at:

commit 7a6bacb133752beacb76775797fd550417e9d3a2
Author: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Date:   Thu Apr 7 13:59:39 2016 +1000

    mm/slab: factor out kmem_cache_node initialization code

    It can be reused on other place, so factor out it.  Following patch will
    use it.


Not sure what the problem is - the logic *looks* ok at first read.  The
patch *does* remove a spin_lock_irq() - but I find it difficult to
believe that with it gone, my laptop is able to hit the race condition
the spinlock protects against *every single boot*.

The only other thing I see is that n->free_limit used to be assigned
every time, and now it's only assigned at initial creation.



* Re: linux-next crash during very early boot
  2016-04-14  0:29 linux-next crash during very early boot Valdis Kletnieks
@ 2016-04-14  1:35 ` Joonsoo Kim
  2016-04-14 19:22   ` Valdis.Kletnieks
  2016-04-15 14:10   ` Valdis.Kletnieks
  0 siblings, 2 replies; 7+ messages in thread
From: Joonsoo Kim @ 2016-04-14  1:35 UTC (permalink / raw)
  To: Valdis Kletnieks; +Cc: Andrew Morton, linux-kernel, linux-mm

On Wed, Apr 13, 2016 at 08:29:46PM -0400, Valdis Kletnieks wrote:
> I'm seeing my laptop crash/wedge up/something during very early
> boot - before it can write anything to the console.  Nothing in pstore,
> need to hold down the power button for 6 seconds and reboot.
> 
> git bisect points at:
> 
> commit 7a6bacb133752beacb76775797fd550417e9d3a2
> Author: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> Date:   Thu Apr 7 13:59:39 2016 +1000
> 
>     mm/slab: factor out kmem_cache_node initialization code
> 
>     It can be reused on other place, so factor out it.  Following patch will
>     use it.
> 
> 
> Not sure what the problem is - the logic *looks* ok at first read.  The
> patch *does* remove a spin_lock_irq() - but I find it difficult to
> believe that with it gone, my laptop is able to hit the race condition
> the spinlock protects against *every single boot*.
> 
> The only other thing I see is that n->free_limit used to be assigned
> every time, and now it's only assigned at initial creation.

Hello,

My fault. It should be assigned every time. Please test the patch below.
I will send it with a proper SOB after you confirm that the problem disappears.
Thanks for the report and analysis!

Thanks.

---------------->8-----------------
diff --git a/mm/slab.c b/mm/slab.c
index 13e74aa..59dd94a 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -856,8 +856,14 @@ static int init_cache_node(struct kmem_cache *cachep, int node, gfp_t gfp)
 	 * node has not already allocated this
 	 */
 	n = get_node(cachep, node);
-	if (n)
+	if (n) {
+		spin_lock_irq(&n->list_lock);
+		n->free_limit = (1 + nr_cpus_node(node)) * cachep->batchcount +
+				cachep->num;
+		spin_unlock_irq(&n->list_lock);
+
 		return 0;
+	}
 
 	n = kmalloc_node(sizeof(struct kmem_cache_node), gfp, node);
 	if (!n)

* Re: linux-next crash during very early boot
  2016-04-14  1:35 ` Joonsoo Kim
@ 2016-04-14 19:22   ` Valdis.Kletnieks
  2016-04-15  1:25     ` Joonsoo Kim
  2016-04-15 14:10   ` Valdis.Kletnieks
  1 sibling, 1 reply; 7+ messages in thread
From: Valdis.Kletnieks @ 2016-04-14 19:22 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Andrew Morton, linux-kernel, linux-mm

On Thu, 14 Apr 2016 10:35:47 +0900, Joonsoo Kim said:

> My fault. It should be assigned every time. Please test the patch below.
> I will send it with a proper SOB after you confirm that the problem disappears.
> Thanks for the report and analysis!

Still bombs out, sorry.  Will do more debugging this evening if I have
a chance - will follow up tomorrow morning US time....


* Re: linux-next crash during very early boot
  2016-04-14 19:22   ` Valdis.Kletnieks
@ 2016-04-15  1:25     ` Joonsoo Kim
  0 siblings, 0 replies; 7+ messages in thread
From: Joonsoo Kim @ 2016-04-15  1:25 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Joonsoo Kim, Andrew Morton, LKML, Linux Memory Management List

2016-04-15 4:22 GMT+09:00  <Valdis.Kletnieks@vt.edu>:
> On Thu, 14 Apr 2016 10:35:47 +0900, Joonsoo Kim said:
>
>> My fault. It should be assigned every time. Please test the patch below.
>> I will send it with a proper SOB after you confirm that the problem disappears.
>> Thanks for the report and analysis!
>
> Still bombs out, sorry.  Will do more debugging this evening if I have
> a chance - will follow up tomorrow morning US time....

Hmm... could you also apply the patch at the link below?
It fixes another issue that I introduced.

https://lkml.org/lkml/2016/4/10/703

Thanks.

* Re: linux-next crash during very early boot
  2016-04-14  1:35 ` Joonsoo Kim
  2016-04-14 19:22   ` Valdis.Kletnieks
@ 2016-04-15 14:10   ` Valdis.Kletnieks
  2016-04-20  8:13     ` Joonsoo Kim
  2016-04-21  3:14     ` Valdis.Kletnieks
  1 sibling, 2 replies; 7+ messages in thread
From: Valdis.Kletnieks @ 2016-04-15 14:10 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Andrew Morton, linux-kernel, linux-mm

On Thu, 14 Apr 2016 10:35:47 +0900, Joonsoo Kim said:
> On Wed, Apr 13, 2016 at 08:29:46PM -0400, Valdis Kletnieks wrote:
> > I'm seeing my laptop crash/wedge up/something during very early
> > boot - before it can write anything to the console.  Nothing in pstore,
> > need to hold down the power button for 6 seconds and reboot.
> >
> > git bisect points at:
> >
> > commit 7a6bacb133752beacb76775797fd550417e9d3a2
> > Author: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > Date:   Thu Apr 7 13:59:39 2016 +1000
> >
> >     mm/slab: factor out kmem_cache_node initialization code
> >
> >     It can be reused on other place, so factor out it.  Following patch wil
l
> >     use it.
> >
> >
> > Not sure what the problem is - the logic *looks* ok at first read.  The
> > patch *does* remove a spin_lock_irq() - but I find it difficult to
> > believe that with it gone, my laptop is able to hit the race condition
> > the spinlock protects against *every single boot*.
> >
> > The only other thing I see is that n->free_limit used to be assigned
> > every time, and now it's only assigned at initial creation.
>
> Hello,
>
> My fault. It should be assigned every time. Please test the patch below.
> I will send it with a proper SOB after you confirm that the problem disappears.
> Thanks for the report and analysis!

Following up - I verified that it was your patch series and not a bad bisect
by starting with a clean next-20160413 and reverting that series - and the
resulting kernel boots fine.

Will take a closer look at your fix patch and figure out what's still changed
afterwards - there's obviously some small semantic change that actually
matters, but we're not spotting it yet...


* Re: linux-next crash during very early boot
  2016-04-15 14:10   ` Valdis.Kletnieks
@ 2016-04-20  8:13     ` Joonsoo Kim
  2016-04-21  3:14     ` Valdis.Kletnieks
  1 sibling, 0 replies; 7+ messages in thread
From: Joonsoo Kim @ 2016-04-20  8:13 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: Andrew Morton, linux-kernel, linux-mm

On Fri, Apr 15, 2016 at 10:10:33AM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Thu, 14 Apr 2016 10:35:47 +0900, Joonsoo Kim said:
> > On Wed, Apr 13, 2016 at 08:29:46PM -0400, Valdis Kletnieks wrote:
> > > I'm seeing my laptop crash/wedge up/something during very early
> > > boot - before it can write anything to the console.  Nothing in pstore,
> > > need to hold down the power button for 6 seconds and reboot.
> > >
> > > git bisect points at:
> > >
> > > commit 7a6bacb133752beacb76775797fd550417e9d3a2
> > > Author: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > Date:   Thu Apr 7 13:59:39 2016 +1000
> > >
> > >     mm/slab: factor out kmem_cache_node initialization code
> > >
> > >     It can be reused on other place, so factor out it.  Following patch wil
> l
> > >     use it.
> > >
> > >
> > > Not sure what the problem is - the logic *looks* ok at first read.  The
> > > patch *does* remove a spin_lock_irq() - but I find it difficult to
> > > believe that with it gone, my laptop is able to hit the race condition
> > > the spinlock protects against *every single boot*.
> > >
> > > The only other thing I see is that n->free_limit used to be assigned
> > > every time, and now it's only assigned at initial creation.
> >
> > Hello,
> >
> > > My fault. It should be assigned every time. Please test the patch below.
> > > I will send it with a proper SOB after you confirm that the problem disappears.
> > > Thanks for the report and analysis!
> 
> Following up - I verified that it was your patch series and not a bad bisect
> by starting with a clean next-20160413 and reverting that series - and the
> resulting kernel boots fine.
> 
> Will take a closer look at your fix patch and figure out what's still changed
> afterwards - there's obviously some small semantic change that actually
> matters, but we're not spotting it yet...

Hello,

Did you try testing the patch at the following link on top of my fix for
"mm/slab: factor out kmem_cache_node initialization code"?

https://lkml.org/lkml/2016/4/10/703

I mentioned it in another thread, but you didn't reply to it, so I'm
curious.

Thanks.

* Re: linux-next crash during very early boot
  2016-04-15 14:10   ` Valdis.Kletnieks
  2016-04-20  8:13     ` Joonsoo Kim
@ 2016-04-21  3:14     ` Valdis.Kletnieks
  1 sibling, 0 replies; 7+ messages in thread
From: Valdis.Kletnieks @ 2016-04-21  3:14 UTC (permalink / raw)
  Cc: Joonsoo Kim, Andrew Morton, linux-kernel, linux-mm

On Fri, 15 Apr 2016 10:10:33 -0400, Valdis.Kletnieks@vt.edu said:
> On Thu, 14 Apr 2016 10:35:47 +0900, Joonsoo Kim said:
> > On Wed, Apr 13, 2016 at 08:29:46PM -0400, Valdis Kletnieks wrote:
> > > I'm seeing my laptop crash/wedge up/something during very early
> > > boot - before it can write anything to the console.  Nothing in pstore,
> > > need to hold down the power button for 6 seconds and reboot.
> > >
> > > git bisect points at:
> > >
> > > commit 7a6bacb133752beacb76775797fd550417e9d3a2
> > > Author: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > > Date:   Thu Apr 7 13:59:39 2016 +1000
> > >
> > >     mm/slab: factor out kmem_cache_node initialization code
> > >
> > >     It can be reused on other place, so factor out it.  Following patch will
> > >     use it.
> > >
> > >
> > > Not sure what the problem is - the logic *looks* ok at first read.  The
> > > patch *does* remove a spin_lock_irq() - but I find it difficult to
> > > believe that with it gone, my laptop is able to hit the race condition
> > > the spinlock protects against *every single boot*.
> > >
> > > The only other thing I see is that n->free_limit used to be assigned
> > > every time, and now it's only assigned at initial creation.
> >
> > Hello,
> >
> > > My fault. It should be assigned every time. Please test the patch below.
> > > I will send it with a proper SOB after you confirm that the problem disappears.
> > > Thanks for the report and analysis!
>
> Following up - I verified that it was your patch series and not a bad bisect
> by starting with a clean next-20160413 and reverting that series - and the
> resulting kernel boots fine.

Following up some more - next-20160420 seems to work just fine, even with
no sign in 'git log -- mm/slab.c' of the fix-patch....

I'm obviously having a very bad "things that go bump in the night" with
kernels lately - this makes 3 different "makes no sense" things I've posted
in the last 6 hours... :)

