From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751558AbaEVPIM (ORCPT );
	Thu, 22 May 2014 11:08:12 -0400
Received: from cantor2.suse.de ([195.135.220.15]:43196 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750729AbaEVPIL (ORCPT );
	Thu, 22 May 2014 11:08:11 -0400
Message-ID: <537E12D9.6090709@suse.cz>
Date: Thu, 22 May 2014 17:08:09 +0200
From: Vlastimil Babka
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
MIME-Version: 1.0
To: Dave Jones, Linux Kernel, linux-mm@kvack.org, Linus Torvalds
Subject: Re: 3.15.0-rc6: VM_BUG_ON_PAGE(PageTail(page), page)
References: <20140522135828.GA24879@redhat.com>
In-Reply-To: <20140522135828.GA24879@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/22/2014 03:58 PM, Dave Jones wrote:
> Not sure if Sasha has already reported this on -next (It's getting hard
> to keep track of all the VM bugs he's been finding), but I hit this overnight
> on .15-rc6. First time I've seen this one.
>
>
> page:ffffea0004599800 count:0 mapcount:0 mapping: (null) index:0x2
> page flags: 0x20000000008000(tail)
> ------------[ cut here ]------------
> kernel BUG at include/linux/page-flags.h:415!
> invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU: 1 PID: 6858 Comm: trinity-c42 Not tainted 3.15.0-rc6+ #216
> task: ffff88012d18e900 ti: ffff88009e87a000 task.ti: ffff88009e87a000
> RIP: 0010:[] [] PageTransHuge.part.23+0xb/0xd
> RSP: 0000:ffff88009e87b940 EFLAGS: 00010246
> RAX: 0000000000000001 RBX: 0000000000116660 RCX: 0000000000000006
> RDX: 0000000000000000 RSI: ffffffffbb0c00f8 RDI: ffffffffbb0bfed2
> RBP: ffff88009e87b940 R08: ffffffffbc01203c R09: 00000000000003da
> R10: 00000000000003d9 R11: 0000000000000003 R12: 0000000000000001
> R13: 0000000000116800 R14: ffff88024d64ce00 R15: ffffea0004599800
> FS: 00007f4fd192e740(0000) GS:ffff88024d040000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000004c00000 CR3: 00000000a19ce000 CR4: 00000000001407e0
> DR0: 00000000024f4000 DR1: 0000000001d43000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> Stack:
> ffff88009e87b9e8 ffffffffbb1728a3 ffff88009e87b9e8 ffff88009e87baa8
> ffff88012d18e900 ffff88009e87ba60 0000000000000000 0000000400000016
> 0000000000000000 ffff88009e87bfd8 00000000000008b3 ffff88009e87ba50
> Call Trace:
> [] isolate_migratepages_range+0x7a3/0x870
> [] compact_zone+0x370/0x560
> [] compact_zone_order+0xa2/0x110
> [] try_to_compact_pages+0x101/0x130
> [] __alloc_pages_direct_compact+0xac/0x1d0
> [] __alloc_pages_nodemask+0x6ab/0xaf0
> [] alloc_pages_vma+0x9a/0x160
> [] do_huge_pmd_anonymous_page+0xfd/0x3c0
> [] ? get_parent_ip+0xd/0x50
> [] handle_mm_fault+0x158/0xcb0
> [] ? retint_restore_args+0xe/0xe
> [] __do_page_fault+0x1a6/0x620
> [] ? __acct_update_integrals+0x8e/0x120
> [] ? get_parent_ip+0xd/0x50
> [] ? preempt_count_sub+0x6b/0xf0
> [] do_page_fault+0x1e/0x70
> Code: 75 1d 55 be 6c 00 00 00 48 c7 c7 8a 2f a2 bb 48 89 e5 e8 6c 49 95 ff 5d c6 05 74 16 65 00 01 c3 55 31 f6 48 89 e5 e8 28 bd a3 ff <0f> 0b 0f 1f 44 00 00 55 48 89 e5 41 57 45 31 ff 41 56 49 89 fe
> RIP []
>
> That BUG is..
>
> 413 static inline int PageTransHuge(struct page *page)
> 414 {
> 415         VM_BUG_ON_PAGE(PageTail(page), page);
> 416         return PageHead(page);
> 417 }

Any idea which of the two PageTransHuge() calls in
isolate_migratepages_range() that is? The large offset into the function
suggests it's the later one, where the lru lock is already held, but I'm
not sure, as decodecode of your dump and objdump of my own compile look
wildly different.

If it's indeed the later PageTransHuge() call, it means that somebody
else has cleared PageLRU and set PageTail (I don't think a page could
have both at once) between the checks for PageLRU() and PageTransHuge()
in isolate_migratepages_range(), while isolate_migratepages_range() was
holding lru_lock. That's quite weird...

Vlastimil
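
The suspected sequence can be sketched as a toy userspace model. This is not the kernel code: every toy_* name, the flag encoding and the callback are hypothetical, and the callback only stands in for "somebody else" changing the page flags between the two checks. It shows why a page that passed the PageLRU() check would still trip VM_BUG_ON_PAGE(PageTail(page), page) if it became a tail page in that window.

```c
/* Toy page flags; the real kernel encodes page flags differently. */
enum {
	TOY_PG_LRU  = 1u << 0,
	TOY_PG_HEAD = 1u << 1,
	TOY_PG_TAIL = 1u << 2,
};

struct toy_page {
	unsigned int flags;
};

enum toy_result {
	TOY_SKIPPED,	/* page was not on the LRU, nothing to isolate */
	TOY_ISOLATED,	/* both checks passed */
	TOY_WOULD_BUG,	/* the point where the VM_BUG_ON_PAGE would fire */
};

/* Hypothetical hook modelling a concurrent change of the page flags. */
typedef void (*toy_racer)(struct toy_page *page);

static void toy_no_race(struct toy_page *page)
{
	(void)page;	/* flags stay stable, as lru_lock should guarantee */
}

static void toy_becomes_tail(struct toy_page *page)
{
	/* PageLRU cleared and PageTail set, as described above. */
	page->flags = TOY_PG_TAIL;
}

static enum toy_result toy_isolate_one(struct toy_page *page,
				       toy_racer between_checks)
{
	/* Stands in for the PageLRU() check. */
	if (!(page->flags & TOY_PG_LRU))
		return TOY_SKIPPED;

	/* The window that holding lru_lock is expected to close. */
	between_checks(page);

	/* Stands in for the later PageTransHuge() call. */
	if (page->flags & TOY_PG_TAIL)
		return TOY_WOULD_BUG;

	return TOY_ISOLATED;
}
```

Calling toy_isolate_one() with toy_no_race on an LRU page gives TOY_ISOLATED, while toy_becomes_tail gives TOY_WOULD_BUG, which is the case that should be impossible if lru_lock really serializes the flag change.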