* 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page)
@ 2014-05-22 13:58 Dave Jones
  2014-05-22 15:08 ` Vlastimil Babka
  2014-05-23  4:21 ` Sasha Levin
  0 siblings, 2 replies; 8+ messages in thread
From: Dave Jones @ 2014-05-22 13:58 UTC (permalink / raw)
  To: Linux Kernel; +Cc: linux-mm, Linus Torvalds

Not sure if Sasha has already reported this on -next (It's getting hard
to keep track of all the VM bugs he's been finding), but I hit this overnight
on .15-rc6.  First time I've seen this one.


page:ffffea0004599800 count:0 mapcount:0 mapping:          (null) index:0x2
page flags: 0x20000000008000(tail)
------------[ cut here ]------------
kernel BUG at include/linux/page-flags.h:415!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 1 PID: 6858 Comm: trinity-c42 Not tainted 3.15.0-rc6+ #216
task: ffff88012d18e900 ti: ffff88009e87a000 task.ti: ffff88009e87a000
RIP: 0010:[<ffffffffbb718d98>]  [<ffffffffbb718d98>] PageTransHuge.part.23+0xb/0xd
RSP: 0000:ffff88009e87b940  EFLAGS: 00010246
RAX: 0000000000000001 RBX: 0000000000116660 RCX: 0000000000000006
RDX: 0000000000000000 RSI: ffffffffbb0c00f8 RDI: ffffffffbb0bfed2
RBP: ffff88009e87b940 R08: ffffffffbc01203c R09: 00000000000003da
R10: 00000000000003d9 R11: 0000000000000003 R12: 0000000000000001
R13: 0000000000116800 R14: ffff88024d64ce00 R15: ffffea0004599800
FS:  00007f4fd192e740(0000) GS:ffff88024d040000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000004c00000 CR3: 00000000a19ce000 CR4: 00000000001407e0
DR0: 00000000024f4000 DR1: 0000000001d43000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
 ffff88009e87b9e8 ffffffffbb1728a3 ffff88009e87b9e8 ffff88009e87baa8
 ffff88012d18e900 ffff88009e87ba60 0000000000000000 0000000400000016
 0000000000000000 ffff88009e87bfd8 00000000000008b3 ffff88009e87ba50
Call Trace:
 [<ffffffffbb1728a3>] isolate_migratepages_range+0x7a3/0x870
 [<ffffffffbb172d90>] compact_zone+0x370/0x560
 [<ffffffffbb173022>] compact_zone_order+0xa2/0x110
 [<ffffffffbb1733f1>] try_to_compact_pages+0x101/0x130
 [<ffffffffbb71861b>] __alloc_pages_direct_compact+0xac/0x1d0
 [<ffffffffbb15760b>] __alloc_pages_nodemask+0x6ab/0xaf0
 [<ffffffffbb19c9ea>] alloc_pages_vma+0x9a/0x160
 [<ffffffffbb1aef0d>] do_huge_pmd_anonymous_page+0xfd/0x3c0
 [<ffffffffbb0a19cd>] ? get_parent_ip+0xd/0x50
 [<ffffffffbb17ac18>] handle_mm_fault+0x158/0xcb0
 [<ffffffffbb72594d>] ? retint_restore_args+0xe/0xe
 [<ffffffffbb728bb6>] __do_page_fault+0x1a6/0x620
 [<ffffffffbb11011e>] ? __acct_update_integrals+0x8e/0x120
 [<ffffffffbb0a19cd>] ? get_parent_ip+0xd/0x50
 [<ffffffffbb72949b>] ? preempt_count_sub+0x6b/0xf0
 [<ffffffffbb72904e>] do_page_fault+0x1e/0x70
Code: 75 1d 55 be 6c 00 00 00 48 c7 c7 8a 2f a2 bb 48 89 e5 e8 6c 49 95 ff 5d c6 05 74 16 65 00 01 c3 55 31 f6 48 89 e5 e8 28 bd a3 ff <0f> 0b 0f 1f 44 00 00 55 48 89 e5 41 57 45 31 ff 41 56 49 89 fe 
RIP  [<ffffffffbb718d98>]

That BUG is..

413 static inline int PageTransHuge(struct page *page)
414 {
415         VM_BUG_ON_PAGE(PageTail(page), page);
416         return PageHead(page);
417 }
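To make the failing check concrete, here is a minimal userspace sketch (not kernel code) of what that helper tests. The `sketch_` names and the bit positions are assumptions made for illustration: bit 15 as "tail" agrees with the dump above, which decodes 0x8000 as `tail`, while bit 14 as "head" is only a guess. In the kernel, VM_BUG_ON_PAGE() is only compiled in with CONFIG_DEBUG_VM; the sketch returns -1 where it would fire.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions for this sketch only. */
#define SKETCH_PG_HEAD (UINT64_C(1) << 14) /* guess */
#define SKETCH_PG_TAIL (UINT64_C(1) << 15) /* matches 0x8000 "tail" in the dump */

/* Hypothetical stand-in for struct page: only the flags word. */
struct sketch_page {
	uint64_t flags;
};

static bool sketch_page_head(const struct sketch_page *page)
{
	return (page->flags & SKETCH_PG_HEAD) != 0;
}

static bool sketch_page_tail(const struct sketch_page *page)
{
	return (page->flags & SKETCH_PG_TAIL) != 0;
}

/*
 * Mirrors the shape of PageTransHuge() quoted above:
 * returns -1 where the assertion would trigger (tail page),
 * otherwise 1 for a head page and 0 for anything else.
 */
static int sketch_page_trans_huge(const struct sketch_page *page)
{
	if (sketch_page_tail(page))
		return -1;
	return sketch_page_head(page) ? 1 : 0;
}
```

Feeding it the flags word from the report (0x20000000008000) yields -1, i.e. the caller handed PageTransHuge() a tail page.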



* Re: 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page)
  2014-05-22 13:58 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page) Dave Jones
@ 2014-05-22 15:08 ` Vlastimil Babka
  2014-05-22 15:41   ` Dave Jones
  2014-05-23  4:21 ` Sasha Levin
  1 sibling, 1 reply; 8+ messages in thread
From: Vlastimil Babka @ 2014-05-22 15:08 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm, Linus Torvalds

On 05/22/2014 03:58 PM, Dave Jones wrote:
> Not sure if Sasha has already reported this on -next (It's getting hard
> to keep track of all the VM bugs he's been finding), but I hit this overnight
> on .15-rc6.  First time I've seen this one.
>
>
> page:ffffea0004599800 count:0 mapcount:0 mapping:          (null) index:0x2
> page flags: 0x20000000008000(tail)
> ------------[ cut here ]------------
> kernel BUG at include/linux/page-flags.h:415!
> invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU: 1 PID: 6858 Comm: trinity-c42 Not tainted 3.15.0-rc6+ #216
> task: ffff88012d18e900 ti: ffff88009e87a000 task.ti: ffff88009e87a000
> RIP: 0010:[<ffffffffbb718d98>]  [<ffffffffbb718d98>] PageTransHuge.part.23+0xb/0xd
> RSP: 0000:ffff88009e87b940  EFLAGS: 00010246
> RAX: 0000000000000001 RBX: 0000000000116660 RCX: 0000000000000006
> RDX: 0000000000000000 RSI: ffffffffbb0c00f8 RDI: ffffffffbb0bfed2
> RBP: ffff88009e87b940 R08: ffffffffbc01203c R09: 00000000000003da
> R10: 00000000000003d9 R11: 0000000000000003 R12: 0000000000000001
> R13: 0000000000116800 R14: ffff88024d64ce00 R15: ffffea0004599800
> FS:  00007f4fd192e740(0000) GS:ffff88024d040000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000004c00000 CR3: 00000000a19ce000 CR4: 00000000001407e0
> DR0: 00000000024f4000 DR1: 0000000001d43000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> Stack:
>   ffff88009e87b9e8 ffffffffbb1728a3 ffff88009e87b9e8 ffff88009e87baa8
>   ffff88012d18e900 ffff88009e87ba60 0000000000000000 0000000400000016
>   0000000000000000 ffff88009e87bfd8 00000000000008b3 ffff88009e87ba50
> Call Trace:
>   [<ffffffffbb1728a3>] isolate_migratepages_range+0x7a3/0x870
>   [<ffffffffbb172d90>] compact_zone+0x370/0x560
>   [<ffffffffbb173022>] compact_zone_order+0xa2/0x110
>   [<ffffffffbb1733f1>] try_to_compact_pages+0x101/0x130
>   [<ffffffffbb71861b>] __alloc_pages_direct_compact+0xac/0x1d0
>   [<ffffffffbb15760b>] __alloc_pages_nodemask+0x6ab/0xaf0
>   [<ffffffffbb19c9ea>] alloc_pages_vma+0x9a/0x160
>   [<ffffffffbb1aef0d>] do_huge_pmd_anonymous_page+0xfd/0x3c0
>   [<ffffffffbb0a19cd>] ? get_parent_ip+0xd/0x50
>   [<ffffffffbb17ac18>] handle_mm_fault+0x158/0xcb0
>   [<ffffffffbb72594d>] ? retint_restore_args+0xe/0xe
>   [<ffffffffbb728bb6>] __do_page_fault+0x1a6/0x620
>   [<ffffffffbb11011e>] ? __acct_update_integrals+0x8e/0x120
>   [<ffffffffbb0a19cd>] ? get_parent_ip+0xd/0x50
>   [<ffffffffbb72949b>] ? preempt_count_sub+0x6b/0xf0
>   [<ffffffffbb72904e>] do_page_fault+0x1e/0x70
> Code: 75 1d 55 be 6c 00 00 00 48 c7 c7 8a 2f a2 bb 48 89 e5 e8 6c 49 95 ff 5d c6 05 74 16 65 00 01 c3 55 31 f6 48 89 e5 e8 28 bd a3 ff <0f> 0b 0f 1f 44 00 00 55 48 89 e5 41 57 45 31 ff 41 56 49 89 fe
> RIP  [<ffffffffbb718d98>]
>
> That BUG is..
>
> 413 static inline int PageTransHuge(struct page *page)
> 414 {
> 415         VM_BUG_ON_PAGE(PageTail(page), page);
> 416         return PageHead(page);
> 417 }

Any idea which of the two PageTransHuge() calls in
isolate_migratepages_range() that is? The offset far into the function
suggests it's the one where the lru lock is already held, but I'm not
sure, as the decodecode output of your dump and the objdump of my own
build look wildly different.

If it's indeed the later PageTransHuge() call, it means that somebody
else has cleared PageLRU and set PageTail (I don't think a page could
have both at once) between the checks for PageLRU() and PageTransHuge()
in isolate_migratepages_range(), while isolate_migratepages_range() was
holding lru_lock. That's quite weird...
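
The sequence described above can be modelled in a few lines of userspace C. This is only a sketch of the ordering of the two checks, not the actual compaction code: the `model_` names, the flag bit positions and the callback are all hypothetical, and the model is single-threaded. The callback stands in for whatever would have to change the page's flags between the two checks; lru_lock is deliberately not modelled, since the whole point is that such a change should not be possible while it is held.

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bits are assumptions for this model, not taken from the thread. */
#define MODEL_PG_LRU  (UINT64_C(1) << 5)
#define MODEL_PG_HEAD (UINT64_C(1) << 14)
#define MODEL_PG_TAIL (UINT64_C(1) << 15)

struct model_page {
	uint64_t flags;
};

enum model_result {
	MODEL_SKIP_NOT_LRU, /* PageLRU() check failed, page skipped */
	MODEL_SKIP_HUGE,    /* head page, the huge page is skipped */
	MODEL_ISOLATE,      /* ordinary LRU page, would be isolated */
	MODEL_BUG,          /* tail page reached the PageTransHuge() check */
};

/* Stand-in for a concurrent writer acting between the two checks. */
typedef void (*model_racer)(struct model_page *page);

/* The hypothesised change: PageLRU cleared and PageTail set. */
static void model_become_tail(struct model_page *page)
{
	page->flags &= ~MODEL_PG_LRU;
	page->flags |= MODEL_PG_TAIL;
}

static enum model_result model_check_page(struct model_page *page,
					  model_racer between)
{
	/* First check: only LRU pages are of interest. */
	if (!(page->flags & MODEL_PG_LRU))
		return MODEL_SKIP_NOT_LRU;

	/* Window in which the flags would have to change. */
	if (between)
		between(page);

	/* Second check: a tail page here is what trips the assertion. */
	if (page->flags & MODEL_PG_TAIL)
		return MODEL_BUG;
	if (page->flags & MODEL_PG_HEAD)
		return MODEL_SKIP_HUGE;
	return MODEL_ISOLATE;
}
```

Without the callback, an LRU page is either isolated or skipped as a huge page; only when the flags change inside the window does the model reach MODEL_BUG, which is the situation the report seems to show.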

Vlastimil




* Re: 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page)
  2014-05-22 15:08 ` Vlastimil Babka
@ 2014-05-22 15:41   ` Dave Jones
  2014-05-23  9:45     ` Vlastimil Babka
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Jones @ 2014-05-22 15:41 UTC (permalink / raw)
  To: Vlastimil Babka; +Cc: Linux Kernel, linux-mm, Linus Torvalds

On Thu, May 22, 2014 at 05:08:09PM +0200, Vlastimil Babka wrote:
 
 > > RIP: 0010:[<ffffffffbb718d98>]  [<ffffffffbb718d98>] PageTransHuge.part.23+0xb/0xd
 > > Call Trace:
 > >   [<ffffffffbb1728a3>] isolate_migratepages_range+0x7a3/0x870
 > >   [<ffffffffbb172d90>] compact_zone+0x370/0x560
 > >   [<ffffffffbb173022>] compact_zone_order+0xa2/0x110
 > >   [<ffffffffbb1733f1>] try_to_compact_pages+0x101/0x130
 > > ...
 > > Code: 75 1d 55 be 6c 00 00 00 48 c7 c7 8a 2f a2 bb 48 89 e5 e8 6c 49 95 ff 5d c6 05 74 16 65 00 01 c3 55 31 f6 48 89 e5 e8 28 bd a3 ff <0f> 0b 0f 1f 44 00 00 55 48 89 e5 41 57 45 31 ff 41 56 49 89 fe
 > > RIP  [<ffffffffbb718d98>]
 > >
 > > That BUG is..
 > >
 > > 413 static inline int PageTransHuge(struct page *page)
 > > 414 {
 > > 415         VM_BUG_ON_PAGE(PageTail(page), page);
 > > 416         return PageHead(page);
 > > 417 }
 > 
 > Any idea which of the two PageTransHuge() calls in
 > isolate_migratepages_range() that is? The offset far into the function
 > suggests it's the one where the lru lock is already held, but I'm not
 > sure, as the decodecode output of your dump and the objdump of my own
 > build look wildly different.

Yeah, the only thing the Code: line matches is the BUG(), which is in another section.
(see end of file at http://paste.fedoraproject.org/104155/40077293/raw/)

Maybe you can make more sense of that disassembly than I can..

	Dave




* Re: 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page)
  2014-05-22 13:58 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page) Dave Jones
  2014-05-22 15:08 ` Vlastimil Babka
@ 2014-05-23  4:21 ` Sasha Levin
  2014-05-23  5:54   ` Vlastimil Babka
  2014-05-23  9:16   ` Kirill A. Shutemov
  1 sibling, 2 replies; 8+ messages in thread
From: Sasha Levin @ 2014-05-23  4:21 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm, Linus Torvalds

On 05/22/2014 09:58 AM, Dave Jones wrote:
> Not sure if Sasha has already reported this on -next (It's getting hard
> to keep track of all the VM bugs he's been finding), but I hit this overnight
> on .15-rc6.  First time I've seen this one.

Unfortunately I had to disable transhuge/hugetlb in my testing .config since
the open issues in -next get hit pretty often, and have been unfixed for a
while now.

Keeping them enabled just made it impossible to test anything else.


Thanks,
Sasha


* Re: 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page)
  2014-05-23  4:21 ` Sasha Levin
@ 2014-05-23  5:54   ` Vlastimil Babka
  2014-05-23  9:16   ` Kirill A. Shutemov
  1 sibling, 0 replies; 8+ messages in thread
From: Vlastimil Babka @ 2014-05-23  5:54 UTC (permalink / raw)
  To: Sasha Levin, Dave Jones, Linux Kernel, linux-mm, Linus Torvalds

On 23.5.2014 6:21, Sasha Levin wrote:
> On 05/22/2014 09:58 AM, Dave Jones wrote:
>> Not sure if Sasha has already reported this on -next (It's getting hard
>> to keep track of all the VM bugs he's been finding), but I hit this overnight
>> on .15-rc6.  First time I've seen this one.
> Unfortunately I had to disable transhuge/hugetlb in my testing .config since
> the open issues in -next get hit pretty often, and have been unfixed for a
> while now.

Meh, I lost track. We should really consider something like bugzilla for 
this, IMHO.

> Keeping them enabled just made it impossible to test anything else.
>
>
> Thanks,
> Sasha
>



* Re: 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page)
  2014-05-23  4:21 ` Sasha Levin
  2014-05-23  5:54   ` Vlastimil Babka
@ 2014-05-23  9:16   ` Kirill A. Shutemov
  2014-05-23 13:54     ` Sasha Levin
  1 sibling, 1 reply; 8+ messages in thread
From: Kirill A. Shutemov @ 2014-05-23  9:16 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Dave Jones, Linux Kernel, linux-mm, Linus Torvalds

On Fri, May 23, 2014 at 12:21:47AM -0400, Sasha Levin wrote:
> On 05/22/2014 09:58 AM, Dave Jones wrote:
> > Not sure if Sasha has already reported this on -next (It's getting hard
> > to keep track of all the VM bugs he's been finding), but I hit this overnight
> > on .15-rc6.  First time I've seen this one.
> 
> Unfortunately I had to disable transhuge/hugetlb in my testing .config since
> the open issues in -next get hit pretty often, and have been unfixed for a
> while now.

Which THP-related issues are not fixed by now? The collapse hang? What else?

-- 
 Kirill A. Shutemov


* Re: 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page)
  2014-05-22 15:41   ` Dave Jones
@ 2014-05-23  9:45     ` Vlastimil Babka
  0 siblings, 0 replies; 8+ messages in thread
From: Vlastimil Babka @ 2014-05-23  9:45 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, linux-mm, Linus Torvalds

On 05/22/2014 05:41 PM, Dave Jones wrote:
> On Thu, May 22, 2014 at 05:08:09PM +0200, Vlastimil Babka wrote:
>
>   > > RIP: 0010:[<ffffffffbb718d98>]  [<ffffffffbb718d98>] PageTransHuge.part.23+0xb/0xd
>   > > Call Trace:
>   > >   [<ffffffffbb1728a3>] isolate_migratepages_range+0x7a3/0x870
>   > >   [<ffffffffbb172d90>] compact_zone+0x370/0x560
>   > >   [<ffffffffbb173022>] compact_zone_order+0xa2/0x110
>   > >   [<ffffffffbb1733f1>] try_to_compact_pages+0x101/0x130
>   > > ...
>   > > Code: 75 1d 55 be 6c 00 00 00 48 c7 c7 8a 2f a2 bb 48 89 e5 e8 6c 49 95 ff 5d c6 05 74 16 65 00 01 c3 55 31 f6 48 89 e5 e8 28 bd a3 ff <0f> 0b 0f 1f 44 00 00 55 48 89 e5 41 57 45 31 ff 41 56 49 89 fe
>   > > RIP  [<ffffffffbb718d98>]
>   > >
>   > > That BUG is..
>   > >
>   > > 413 static inline int PageTransHuge(struct page *page)
>   > > 414 {
>   > > 415         VM_BUG_ON_PAGE(PageTail(page), page);
>   > > 416         return PageHead(page);
>   > > 417 }
>   >
>   > Any idea which of the two PageTransHuge() calls in
>   > isolate_migratepages_range() that is? The offset far into the function
>   > suggests it's the one where the lru lock is already held, but I'm not
>   > sure, as the decodecode output of your dump and the objdump of my own
>   > build look wildly different.
>
> Yeah, the only thing the Code: line matches is the BUG(), which is in another section.

Oh right, it's not a simple BUG_ON that would be inlined.

> (see end of file at http://paste.fedoraproject.org/104155/40077293/raw/)
>
> Maybe you can make more sense of that disassembly than I can..

Could you try adding -r to objdump, as now I have no idea where all 
those calls go :/

> 	Dave
>
>



* Re: 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page)
  2014-05-23  9:16   ` Kirill A. Shutemov
@ 2014-05-23 13:54     ` Sasha Levin
  0 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2014-05-23 13:54 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: Dave Jones, Linux Kernel, linux-mm, Linus Torvalds

On 05/23/2014 05:16 AM, Kirill A. Shutemov wrote:
> On Fri, May 23, 2014 at 12:21:47AM -0400, Sasha Levin wrote:
>> On 05/22/2014 09:58 AM, Dave Jones wrote:
>>> Not sure if Sasha has already reported this on -next (It's getting hard
>>> to keep track of all the VM bugs he's been finding), but I hit this overnight
>>> on .15-rc6.  First time I've seen this one.
>>
>> Unfortunately I had to disable transhuge/hugetlb in my testing .config since
>> the open issues in -next get hit pretty often, and have been unfixed for a
>> while now.
> 
> Which THP-related issues are not fixed by now? The collapse hang? What else?

Besides the collapse hang, we have this: https://lkml.org/lkml/2013/3/29/103 .

I know it's not a "real" bug, just DEBUG_PAGEALLOC misbehaving, but it's
still something that makes testing difficult.


Thanks,
Sasha

