linux-kernel.vger.kernel.org archive mirror
* mm: kernel BUG at mm/huge_memory.c:3272!
@ 2015-11-30 14:37 Sasha Levin
  2015-12-01 21:26 ` Kirill A. Shutemov
  0 siblings, 1 reply; 3+ messages in thread
From: Sasha Levin @ 2015-11-30 14:37 UTC
  To: Kirill A. Shutemov; +Cc: LKML, linux-mm

Hi Kirill,

I've hit the following while fuzzing with trinity on the latest -next kernel:

[  321.348184] page:ffffea0011a20080 count:1 mapcount:1 mapping:ffff8802d745f601 index:0x1802
[  321.350607] flags: 0x320035c00040078(uptodate|dirty|lru|active|swapbacked)
[  321.453706] page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
[  321.455353] page->mem_cgroup:ffff880286620000
[  321.456482] ------------[ cut here ]------------
[  321.457158] kernel BUG at mm/huge_memory.c:3272!
[  321.457811] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
[  321.458598] Modules linked in:
[  321.459057] CPU: 18 PID: 24106 Comm: trinity-c129 Not tainted 4.4.0-rc2-next-20151127-sasha-00012-gf0498ca-dirty #2661
[  321.460516] task: ffff880042fd2000 ti: ffff8800428c0000 task.ti: ffff8800428c0000
[  321.461732] RIP: split_huge_page_to_list (mm/huge_memory.c:3272 (discriminator 1))
[  321.464004] RSP: 0000:ffff8800428c71d0  EFLAGS: 00010246
[  321.464733] RAX: ffff880042fd2000 RBX: ffffea0011a20080 RCX: 0000000000000000
[  321.465735] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffed0008518e1f
[  321.466719] RBP: ffff8800428c72b0 R08: fffffbfff4f9eaf1 R09: ffffffffa7cf578f
[  321.467704] R10: ffffed0105fe6293 R11: 1ffffffff4f9eaed R12: ffffea0011a20060
[  321.468702] R13: ffffea0011a200a0 R14: ffffea0011a20080 R15: ffff8800428c7300
[  321.469718] FS:  00007f9d611bb700(0000) GS:ffff880686800000(0000) knlGS:0000000000000000
[  321.470807] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  321.471608] CR2: 0000000001b54fe8 CR3: 0000000042869000 CR4: 00000000000006a0
[  321.472633] DR0: 00007f9d5cb76000 DR1: 0000000000000000 DR2: 0000000000000000
[  321.473612] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[  321.474619] Stack:
[  321.474935]  ffff8800428c7300 ffff8800428c72b0 ffffffff9950f2c8 dffffc0000000000
[  321.476071]  0000000041b58ab3 ffffffffa4871335 ffffffff9950f0c0 dffffc0000000000
[  321.477184]  ffffea0011a20000 0000000000000000 0000000000000000 0000000000000001
[  321.478297] Call Trace:
[  321.481234] deferred_split_scan (mm/huge_memory.c:3392)
[  321.484688] shrink_slab (mm/vmscan.c:354 mm/vmscan.c:446)
[  321.488008] shrink_zone (mm/vmscan.c:2449)
[  321.493105] do_try_to_free_pages (mm/vmscan.c:2600 mm/vmscan.c:2650)
[  321.496657] try_to_free_pages (mm/vmscan.c:2858)
[  321.498346] __alloc_pages_nodemask (mm/page_alloc.c:2878 mm/page_alloc.c:2896 mm/page_alloc.c:3149 mm/page_alloc.c:3260)
[  321.508819] alloc_pages_vma (mm/mempolicy.c:2042)
[  321.509629] wp_page_copy.isra.41 (mm/memory.c:2064)
[  321.512347] do_wp_page (mm/memory.c:2339)
[  321.518569] handle_mm_fault (mm/memory.c:3302 mm/memory.c:3396 mm/memory.c:3425)
[  321.527500] __do_page_fault (arch/x86/mm/fault.c:1239)
[  321.528411] do_page_fault (arch/x86/mm/fault.c:1301 include/linux/context_tracking_state.h:30 include/linux/context_tracking.h:50 arch/x86/mm/fault.c:1302)
[  321.530053] do_async_page_fault (./arch/x86/include/asm/traps.h:82 arch/x86/kernel/kvm.c:264)
[  321.532125] async_page_fault (arch/x86/entry/entry_64.S:989)
[ 321.533057] Code: ea 03 80 3c 02 00 74 08 48 89 df e8 58 4d fe ff 48 8b 03 a8 01 75 16 e8 7c 51 fe ff 48 c7 c6 80 3c 4f a2 4c 89 f7 e8 2d 84 f5 ff <0f> 0b e8 66 51 fe ff 48 8b 55 c8 48 b8 00 00 00 00 00 fc ff df
All code
========
   0:   ea                      (bad)
   1:   03 80 3c 02 00 74       add    0x7400023c(%rax),%eax
   7:   08 48 89                or     %cl,-0x77(%rax)
   a:   df e8                   fucomip %st(0),%st
   c:   58                      pop    %rax
   d:   4d fe                   rex.WRB (bad)
   f:   ff 48 8b                decl   -0x75(%rax)
  12:   03 a8 01 75 16 e8       add    -0x17e98aff(%rax),%ebp
  18:   7c 51                   jl     0x6b
  1a:   fe                      (bad)
  1b:   ff 48 c7                decl   -0x39(%rax)
  1e:   c6 80 3c 4f a2 4c 89    movb   $0x89,0x4ca24f3c(%rax)
  25:   f7 e8                   imul   %eax
  27:   2d 84 f5 ff 0f          sub    $0xffff584,%eax
  2c:   0b e8                   or     %eax,%ebp
  2e:   66 51                   push   %cx
  30:   fe                      (bad)
  31:   ff 48 8b                decl   -0x75(%rax)
  34:   55                      push   %rbp
  35:*  c8 48 b8 00             enterq $0xb848,$0x0             <-- trapping instruction
  39:   00 00                   add    %al,(%rax)
  3b:   00 00                   add    %al,(%rax)
  3d:   fc                      cld
  3e:   ff df                   lcallq *<internal disassembler error>
        ...

Code starting with the faulting instruction
===========================================
   0:   0f 0b                   ud2
   2:   e8 66 51 fe ff          callq  0xfffffffffffe516d
   7:   48 8b 55 c8             mov    -0x38(%rbp),%rdx
   b:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
  12:   fc ff df
        ...
[  321.537072] RIP split_huge_page_to_list (mm/huge_memory.c:3272 (discriminator 1))
[  321.537942]  RSP <ffff8800428c71d0>


* Re: mm: kernel BUG at mm/huge_memory.c:3272!
  2015-11-30 14:37 mm: kernel BUG at mm/huge_memory.c:3272! Sasha Levin
@ 2015-12-01 21:26 ` Kirill A. Shutemov
  2015-12-01 23:41   ` Minchan Kim
  0 siblings, 1 reply; 3+ messages in thread
From: Kirill A. Shutemov @ 2015-12-01 21:26 UTC
  To: Sasha Levin, Minchan Kim, Andrew Morton; +Cc: LKML, linux-mm

On Mon, Nov 30, 2015 at 09:37:33AM -0500, Sasha Levin wrote:
> Hi Kirill,
> 
> I've hit the following while fuzzing with trinity on the latest -next kernel:
> 
> [  321.348184] page:ffffea0011a20080 count:1 mapcount:1 mapping:ffff8802d745f601 index:0x1802
> [  321.350607] flags: 0x320035c00040078(uptodate|dirty|lru|active|swapbacked)
> [  321.453706] page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> [  321.455353] page->mem_cgroup:ffff880286620000

I think this should help:

From aadc911f047b094c68b350550556dafabf05af13 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Fri, 20 Nov 2015 12:20:00 +0200
Subject: [PATCH] thp: fix split_huge_page vs. deferred_split_scan race

Minchan[1] and Sasha[2] reported a crash in split_huge_page_to_list(),
called from deferred_split_scan(), due to
VM_BUG_ON_PAGE(!PageLocked(page)).

This can happen because of a race between deferred_split_scan() and
split_huge_page(). As a result of the race, the page can be split
under deferred_split_scan().

The patch prevents this by taking split_queue_lock in
split_huge_page_to_list() when we check whether the page can be split.
If the page is suitable for splitting, we remove the page from the
splitting queue under the same lock, before splitting starts.

[1] http://lkml.kernel.org/g/20151117073539.GB32578@bbox
[2] http://lkml.kernel.org/g/565C5F2D.5060003@oracle.com

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Minchan Kim <minchan@kernel.org>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
---
 mm/huge_memory.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index dc2b947d4f85..7c0ad4d9110b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3186,13 +3186,6 @@ static void __split_huge_page(struct page *page, struct list_head *list)
 	spin_lock_irq(&zone->lru_lock);
 	lruvec = mem_cgroup_page_lruvec(head, zone);
 
-	spin_lock(&split_queue_lock);
-	if (!list_empty(page_deferred_list(head))) {
-		split_queue_len--;
-		list_del(page_deferred_list(head));
-	}
-	spin_unlock(&split_queue_lock);
-
 	/* complete memcg works before add pages to LRU */
 	mem_cgroup_split_huge_fixup(head);
 
@@ -3299,12 +3292,20 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	freeze_page(anon_vma, head);
 	VM_BUG_ON_PAGE(compound_mapcount(head), head);
 
+	/* Prevent deferred_split_scan() touching ->_count */
+	spin_lock(&split_queue_lock);
 	count = page_count(head);
 	mapcount = total_mapcount(head);
 	if (mapcount == count - 1) {
+		if (!list_empty(page_deferred_list(head))) {
+			split_queue_len--;
+			list_del(page_deferred_list(head));
+		}
+		spin_unlock(&split_queue_lock);
 		__split_huge_page(page, list);
 		ret = 0;
 	} else if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount > count - 1) {
+		spin_unlock(&split_queue_lock);
 		pr_alert("total_mapcount: %u, page_count(): %u\n",
 				mapcount, count);
 		if (PageTail(page))
@@ -3312,6 +3313,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		dump_page(page, "total_mapcount(head) > page_count(head) - 1");
 		BUG();
 	} else {
+		spin_unlock(&split_queue_lock);
 		unfreeze_page(anon_vma, head);
 		ret = -EBUSY;
 	}
-- 
2.6.2

-- 
 Kirill A. Shutemov
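
The locking pattern the patch describes can be sketched in user space as
follows. This is a minimal illustration, not the kernel code: struct
fake_page, queue_deferred_split(), scanner_grab() and try_split() are
hypothetical names, a pthread mutex stands in for the split_queue_lock
spinlock, and plain integers stand in for page_count() and
total_mapcount(). It assumes, based on the comment the patch adds, that the
scanner takes its page reference while holding the queue lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a huge page; NOT the kernel's struct page. */
struct fake_page {
	int count;     /* references held, stands in for page_count() */
	int mapcount;  /* mappings, stands in for total_mapcount() */
	bool queued;   /* true while on the deferred split queue */
	bool split;    /* true once the page has been split */
};

/* A mutex stands in for the kernel spinlock of the same name. */
static pthread_mutex_t split_queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int split_queue_len;

/* Put a page on the (simulated) deferred split queue. */
static void queue_deferred_split(struct fake_page *p)
{
	pthread_mutex_lock(&split_queue_lock);
	if (!p->queued) {
		p->queued = true;
		split_queue_len++;
	}
	pthread_mutex_unlock(&split_queue_lock);
}

/* Assumed scanner behaviour: pin a queued page while holding the lock. */
static bool scanner_grab(struct fake_page *p)
{
	bool got = false;

	pthread_mutex_lock(&split_queue_lock);
	if (p->queued) {
		p->count++;
		got = true;
	}
	pthread_mutex_unlock(&split_queue_lock);
	return got;
}

/*
 * Check the counts and dequeue under one lock, so the scanner cannot pin
 * the page between the check and the start of the split.
 * Returns 0 on success, -1 if the page is busy (the kernel uses -EBUSY).
 */
static int try_split(struct fake_page *p)
{
	pthread_mutex_lock(&split_queue_lock);
	if (p->mapcount == p->count - 1) {
		if (p->queued) {
			assert(split_queue_len > 0);
			p->queued = false;
			split_queue_len--;
		}
		pthread_mutex_unlock(&split_queue_lock);
		p->split = true;  /* the real split would happen here */
		return 0;
	}
	pthread_mutex_unlock(&split_queue_lock);
	return -1;
}
```

With a page at count 2 and mapcount 1, try_split() succeeds and removes the
page from the queue, so a later scanner_grab() finds nothing to pin. If
scanner_grab() runs first, the count becomes 3 and try_split() returns -1,
the analogue of the -EBUSY path in the patch.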


* Re: mm: kernel BUG at mm/huge_memory.c:3272!
  2015-12-01 21:26 ` Kirill A. Shutemov
@ 2015-12-01 23:41   ` Minchan Kim
  0 siblings, 0 replies; 3+ messages in thread
From: Minchan Kim @ 2015-12-01 23:41 UTC
  To: Kirill A. Shutemov; +Cc: Sasha Levin, Andrew Morton, LKML, linux-mm

On Tue, Dec 01, 2015 at 11:26:36PM +0200, Kirill A. Shutemov wrote:
> On Mon, Nov 30, 2015 at 09:37:33AM -0500, Sasha Levin wrote:
> > Hi Kirill,
> > 
> > I've hit the following while fuzzing with trinity on the latest -next kernel:
> > 
> > [  321.348184] page:ffffea0011a20080 count:1 mapcount:1 mapping:ffff8802d745f601 index:0x1802
> > [  321.350607] flags: 0x320035c00040078(uptodate|dirty|lru|active|swapbacked)
> > [  321.453706] page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
> > [  321.455353] page->mem_cgroup:ffff880286620000
> 
> I think this should help:
> 
> From aadc911f047b094c68b350550556dafabf05af13 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Fri, 20 Nov 2015 12:20:00 +0200
> Subject: [PATCH] thp: fix split_huge_page vs. deferred_split_scan race
> 
> Minchan[1] and Sasha[2] reported a crash in split_huge_page_to_list(),
> called from deferred_split_scan(), due to
> VM_BUG_ON_PAGE(!PageLocked(page)).
> 
> This can happen because of a race between deferred_split_scan() and
> split_huge_page(). As a result of the race, the page can be split
> under deferred_split_scan().
> 
> The patch prevents this by taking split_queue_lock in
> split_huge_page_to_list() when we check whether the page can be split.
> If the page is suitable for splitting, we remove the page from the
> splitting queue under the same lock, before splitting starts.
> 
> [1] http://lkml.kernel.org/g/20151117073539.GB32578@bbox
> [2] http://lkml.kernel.org/g/565C5F2D.5060003@oracle.com
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Reported-by: Minchan Kim <minchan@kernel.org>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>

With this, I cannot reproduce the error.

