* kernel BUG at mm/gup.c:LINE!
@ 2018-07-04  4:19 syzbot
  2018-07-04 10:01 ` Tetsuo Handa
  0 siblings, 1 reply; 19+ messages in thread
From: syzbot @ 2018-07-04  4:19 UTC (permalink / raw)
  To: akpm, aneesh.kumar, dan.j.williams, kirill.shutemov,
	linux-kernel, linux-mm, mst, syzkaller-bugs, viro, ying.huang,
	zi.yan

Hello,

syzbot found the following crash on:

HEAD commit:    d3bc0e67f852 Merge tag 'for-4.18-rc2-tag' of git://git.ker..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1000077c400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a63be0c83e84d370
dashboard link: https://syzkaller.appspot.com/bug?extid=5dcb560fe12aa5091c06
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
userspace arch: i386
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=158577a2400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com

IPv6: ADDRCONF(NETDEV_UP): veth0: link is not ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
8021q: adding VLAN 0 to HW filter on device team0
------------[ cut here ]------------
kernel BUG at mm/gup.c:1242!
invalid opcode: 0000 [#1] SMP KASAN
CPU: 1 PID: 4837 Comm: syz-executor0 Not tainted 4.18.0-rc2+ #29
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__mm_populate+0x472/0x520 mm/gup.c:1242
Code: ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e aa 00 00 00 44 8b 75 98 45 31 e4 e9 58 ff ff ff e8 b5 9e d1 ff 0f 0b e8 ae 9e d1 ff <0f> 0b 48 8b bd 60 ff ff ff e8 d0 72 0f 00 e9 52 fc ff ff 48 8b bd
RSP: 0018:ffff8801aae77ae0 EFLAGS: 00010293
RAX: ffff8801cfb48280 RBX: 0000000000008000 RCX: ffffffff81aa6a68
RDX: 0000000000000000 RSI: ffffffff81aa6dc2 RDI: 0000000000000006
RBP: ffff8801aae77ba0 R08: ffff8801cfb48280 R09: fffffbfff133d66a
R10: 0000000000000003 R11: 0000000000000000 R12: 000000007bf81000
R13: 0000000000007676 R14: dffffc0000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8801daf00000(0063) knlGS:000000000865b900
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000080e3a94 CR3: 00000001cb021000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  mm_populate include/linux/mm.h:2296 [inline]
  vm_brk_flags+0x1fe/0x240 mm/mmap.c:3038
  vm_brk+0x1f/0x30 mm/mmap.c:3045
  load_elf_library+0x711/0x8e0 fs/binfmt_elf.c:1266
  __do_sys_uselib fs/exec.c:161 [inline]
  __se_sys_uselib fs/exec.c:120 [inline]
  __ia32_sys_uselib+0x37e/0x4c0 fs/exec.c:120
  do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline]
  do_fast_syscall_32+0x34d/0xfb2 arch/x86/entry/common.c:397
  entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7fcbcb9
Code: 55 08 8b 88 64 cd ff ff 8b 98 68 cd ff ff 89 c8 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 1c 24 c3 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 002b:00000000ff8df4ac EFLAGS: 00000282 ORIG_RAX: 0000000000000056
RAX: ffffffffffffffda RBX: 0000000020000040 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
Dumping ftrace buffer:
    (ftrace buffer empty)
---[ end trace f964ea7008b66351 ]---
RIP: 0010:__mm_populate+0x472/0x520 mm/gup.c:1242
Code: ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e aa 00 00 00 44 8b 75 98 45 31 e4 e9 58 ff ff ff e8 b5 9e d1 ff 0f 0b e8 ae 9e d1 ff <0f> 0b 48 8b bd 60 ff ff ff e8 d0 72 0f 00 e9 52 fc ff ff 48 8b bd
RSP: 0018:ffff8801aae77ae0 EFLAGS: 00010293
RAX: ffff8801cfb48280 RBX: 0000000000008000 RCX: ffffffff81aa6a68
RDX: 0000000000000000 RSI: ffffffff81aa6dc2 RDI: 0000000000000006
RBP: ffff8801aae77ba0 R08: ffff8801cfb48280 R09: fffffbfff133d66a
R10: 0000000000000003 R11: 0000000000000000 R12: 000000007bf81000
R13: 0000000000007676 R14: dffffc0000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8801daf00000(0063) knlGS:000000000865b900
CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000080e3a94 CR3: 00000001cb021000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-04  4:19 kernel BUG at mm/gup.c:LINE! syzbot
@ 2018-07-04 10:01 ` Tetsuo Handa
  2018-07-04 11:17   ` Michal Hocko
  0 siblings, 1 reply; 19+ messages in thread
From: Tetsuo Handa @ 2018-07-04 10:01 UTC (permalink / raw)
  To: Michal Hocko
  Cc: syzbot, akpm, aneesh.kumar, dan.j.williams, kirill.shutemov,
	linux-mm, mst, syzkaller-bugs, viro, ying.huang, zi.yan

+Michal Hocko

On 2018/07/04 13:19, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    d3bc0e67f852 Merge tag 'for-4.18-rc2-tag' of git://git.ker..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1000077c400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a63be0c83e84d370
> dashboard link: https://syzkaller.appspot.com/bug?extid=5dcb560fe12aa5091c06
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> userspace arch: i386
> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=158577a2400000

Here is a C reproducer made from the syz reproducer. mlockall(MCL_FUTURE) is involved.

This problem can be triggered by an unprivileged user.
It shows different results on x86_64 (crash) and x86_32 (stall).

------------------------------------------------------------
/* Need to compile using "-m32" option if host is 64bit. */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
int uselib(const char *library);

int main(int argc, char *argv[])
{
	int fd = open("file", O_WRONLY | O_CREAT, 0644);
	write(fd, "\x7f\x45\x4c\x46\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02"
	      "\x00\x06\x00\xca\x3f\x8b\xca\x00\x00\x00\x00\x38\x00\x00\x00\x00\x00"
	      "\x00\xf7\xff\xff\xff\xff\xff\xff\x1f\x00\x02\x00\x00\x00\x00\x00\x00"
	      "\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf8\x7b"
	      "\x66\xff\x00\x00\x05\x00\x00\x00\x76\x86\x00\x00\x00\x00\x00\x00\x00"
	      "\x00\x00\x00\x31\x0f\xf3\xee\xc1\xb0\x00\x0c\x08\x53\x55\xbe\x88\x47"
	      "\xc2\x2e\x30\xf5\x62\x82\xc6\x2c\x95\x72\x3f\x06\x8f\xe4\x2d\x27\x96"
	      "\xcc", 120);
	fchmod(fd, 0755);
	close(fd);
	mlockall(MCL_FUTURE); /* Removing this line avoids the bug. */
	uselib("file");
	return 0;
}
------------------------------------------------------------

------------------------------------------------------------
CentOS Linux 7 (Core)
Kernel 4.18.0-rc3 on an x86_64

localhost login: [   81.210241] emacs (9634) used greatest stack depth: 10416 bytes left
[  140.099935] ------------[ cut here ]------------
[  140.101904] kernel BUG at mm/gup.c:1242!
[  140.103572] invalid opcode: 0000 [#1] SMP
[  140.105220] CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
[  140.107762] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  140.112000] RIP: 0010:__mm_populate+0x1e2/0x1f0
[  140.113875] Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb 
[  140.121403] RSP: 0018:ffffc90000dffd78 EFLAGS: 00010293
[  140.123516] RAX: ffff8801366c63c0 RBX: 000000007bf81000 RCX: ffffffff813e4ee2
[  140.126352] RDX: 0000000000000000 RSI: 0000000000007676 RDI: 000000007bf81000
[  140.129236] RBP: ffffc90000dffdc0 R08: 0000000000000000 R09: 0000000000000000
[  140.132110] R10: ffff880135895c80 R11: 0000000000000000 R12: 0000000000007676
[  140.134955] R13: 0000000000008000 R14: 0000000000000000 R15: 0000000000007676
[  140.137785] FS:  0000000000000000(0000) GS:ffff88013a680000(0063) knlGS:00000000f7db9700
[  140.140998] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[  140.143303] CR2: 00000000f7ea56e0 CR3: 0000000134674004 CR4: 00000000000606e0
[  140.145906] Call Trace:
[  140.146728]  vm_brk_flags+0xc3/0x100
[  140.147830]  vm_brk+0x1f/0x30
[  140.148714]  load_elf_library+0x281/0x2e0
[  140.149875]  __ia32_sys_uselib+0x170/0x1e0
[  140.151028]  ? copy_overflow+0x30/0x30
[  140.152105]  ? __ia32_sys_uselib+0x170/0x1e0
[  140.153301]  do_fast_syscall_32+0xca/0x420
[  140.154455]  entry_SYSENTER_compat+0x70/0x7f
[  140.155651] RIP: 0023:0xf7f9fc99
[  140.156568] Code: 89 c8 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 0c 24 c3 8b 1c 24 c3 8b 3c 24 c3 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 
[  140.161951] RSP: 002b:00000000ffcca47c EFLAGS: 00000246 ORIG_RAX: 0000000000000056
[  140.164292] RAX: ffffffffffffffda RBX: 0000000008048614 RCX: 00000000000001ed
[  140.166390] RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000000
[  140.168400] RBP: 00000000ffcca4a8 R08: 0000000000000000 R09: 0000000000000000
[  140.170352] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  140.172302] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  140.174255] Modules linked in:
[  140.175255] ---[ end trace d38f4666ebf4809c ]---
[  140.176838] RIP: 0010:__mm_populate+0x1e2/0x1f0
[  140.178239] Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb 
[  140.183795] RSP: 0018:ffffc90000dffd78 EFLAGS: 00010293
[  140.185293] RAX: ffff8801366c63c0 RBX: 000000007bf81000 RCX: ffffffff813e4ee2
[  140.187285] RDX: 0000000000000000 RSI: 0000000000007676 RDI: 000000007bf81000
[  140.189282] RBP: ffffc90000dffdc0 R08: 0000000000000000 R09: 0000000000000000
[  140.191298] R10: ffff880135895c80 R11: 0000000000000000 R12: 0000000000007676
[  140.193478] R13: 0000000000008000 R14: 0000000000000000 R15: 0000000000007676
[  140.195740] FS:  0000000000000000(0000) GS:ffff88013a680000(0063) knlGS:00000000f7db9700
[  140.198178] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[  140.199864] CR2: 00000000f7ea56e0 CR3: 0000000134674004 CR4: 00000000000606e0
[  140.201998] Kernel panic - not syncing: Fatal exception
------------------------------------------------------------

------------------------------------------------------------
CentOS Linux 7 (AltArch)
Kernel 4.18.0-rc3-00113-gfc36def on an i686

localhost login: [  231.139466] INFO: rcu_sched self-detected stall on CPU
[  231.140169] INFO: rcu_sched detected stalls on CPUs/tasks:
[  231.141010] 	5-....: (20761 ticks this GP) idle=0b6/1/1073741826 softirq=1654/1654 fqs=5193 
[  231.145209] 	
[  231.145213] 	5-....: (20761 ticks this GP) idle=0b6/1/1073741826 softirq=1654/1654 fqs=5194 
[  231.145216]  (t=21003 jiffies g=884 c=883 q=12)
[  231.145777] 	
[  231.148182] NMI backtrace for cpu 5
[  231.149527] (detected by 4, t=21011 jiffies, g=884, c=883, q=12)
[  231.150049] CPU: 5 PID: 956 Comm: a.out Not tainted 4.18.0-rc3-00113-gfc36def #365
[  231.155315] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  231.158549] Call Trace:
[  231.159341]  dump_stack+0x57/0x7b
[  231.160422]  nmi_cpu_backtrace+0xc4/0xd0
[  231.161641]  nmi_trigger_cpumask_backtrace+0x9a/0xe0
[  231.163174]  ? vprintk_default+0x32/0x40
[  231.164408]  ? lapic_can_unplug_cpu+0xa0/0xa0
[  231.165760]  arch_trigger_cpumask_backtrace+0x10/0x20
[  231.167321]  rcu_dump_cpu_stacks+0x6f/0x96
[  231.168596]  rcu_check_callbacks+0x532/0x680
[  231.169994]  ? account_process_tick+0x55/0x120
[  231.171371]  ? tick_sched_do_timer+0x50/0x50
[  231.172700]  update_process_times+0x23/0x50
[  231.174016]  tick_sched_handle+0x3a/0x50
[  231.175277]  tick_sched_timer+0x34/0x80
[  231.176492]  __hrtimer_run_queues+0xe4/0x170
[  231.177822]  hrtimer_interrupt+0x10d/0x2b0
[  231.179101]  smp_apic_timer_interrupt+0x4f/0x90
[  231.180511]  ? smp_apic_timer_interrupt+0x54/0x90
[  231.181968]  apic_timer_interrupt+0x3c/0x44
[  231.183262] EIP: __get_user_pages+0x3/0x3e0
[  231.184559] Code: e4 89 f0 89 1c 24 e8 fc 1b 03 00 8b 55 e4 c6 02 00 85 c0 0f 85 b0 fb ff ff e9 21 fd ff ff 89 f6 8d bc 27 00 00 00 00 55 89 e5 <57> 56 53 83 ec 44 8b 7d 08 89 45 dc 8b 45 10 89 55 d8 89 4d e8 89 
[  231.190324] EAX: f2301300 EBX: 00001053 ECX: 7bf88000 EDX: f235c240
[  231.192259] ESI: 7bf88000 EDI: f6ebbea4 EBP: f6ebbe5c ESP: f6ebbe5c
[  231.194170] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000206
[  231.196252]  populate_vma_page_range+0x77/0x80
[  231.197631]  __mm_populate+0x8c/0x110
[  231.198780]  vm_brk_flags+0xab/0xc0
[  231.199867]  vm_brk+0xa/0x10
[  231.200803]  load_elf_library+0x1c0/0x1e0
[  231.202073]  sys_uselib+0x11a/0x160
[  231.203266]  do_fast_syscall_32+0x95/0x188
[  231.204562]  entry_SYSENTER_32+0x4e/0x7c
[  231.205787] EIP: 0xb7f98fd1
[  231.206676] Code: c1 9e f3 ff ff 89 e5 8b 55 08 85 d2 8b 81 64 cd ff ff 74 02 89 02 5d c3 8b 0c 24 c3 8b 1c 24 c3 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76 
[  231.212378] EAX: ffffffda EBX: 08048614 ECX: 000001ed EDX: 00000003
[  231.214302] ESI: 00000000 EDI: 00000000 EBP: bfbfaa28 ESP: bfbfa9fc
[  231.216223] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
[  231.218302] Sending NMI from CPU 4 to CPUs 5:
[  231.219719] NMI backtrace for cpu 5
[  231.219722] CPU: 5 PID: 956 Comm: a.out Not tainted 4.18.0-rc3-00113-gfc36def #365
[  231.219722] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  231.219726] EIP: queued_spin_lock_slowpath+0x32/0x200
[  231.219727] Code: 66 66 66 66 90 ba 01 00 00 00 8d b6 00 00 00 00 8b 01 85 c0 75 12 f0 0f b1 11 85 c0 75 f2 5b 5e 5f 5d c3 90 8d 74 26 00 f3 90 <eb> e4 8d 74 26 00 81 fa 00 01 00 00 66 90 0f 84 3f 01 00 00 81 e2 
[  231.219745] EAX: 00000001 EBX: 00000001 ECX: d66ce500 EDX: 00000001
[  231.219746] ESI: 00000046 EDI: d66ce500 EBP: f6ebbcf0 ESP: f6ebbce4
[  231.219747] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000002
[  231.219748] CR0: 80050033 CR2: b7ebb3f0 CR3: 32646760 CR4: 000406f0
[  231.219788] Call Trace:
[  231.219792]  _raw_spin_lock_irqsave+0x33/0x40
[  231.219795]  rcu_check_callbacks+0x539/0x680
[  231.219798]  ? account_process_tick+0x55/0x120
[  231.219801]  ? tick_sched_do_timer+0x50/0x50
[  231.219803]  update_process_times+0x23/0x50
[  231.219804]  tick_sched_handle+0x3a/0x50
[  231.219806]  tick_sched_timer+0x34/0x80
[  231.219807]  __hrtimer_run_queues+0xe4/0x170
[  231.219809]  hrtimer_interrupt+0x10d/0x2b0
[  231.219811]  smp_apic_timer_interrupt+0x4f/0x90
[  231.219812]  ? smp_apic_timer_interrupt+0x54/0x90
[  231.219814]  apic_timer_interrupt+0x3c/0x44
[  231.219816] EIP: __get_user_pages+0x3/0x3e0
[  231.219816] Code: e4 89 f0 89 1c 24 e8 fc 1b 03 00 8b 55 e4 c6 02 00 85 c0 0f 85 b0 fb ff ff e9 21 fd ff ff 89 f6 8d bc 27 00 00 00 00 55 89 e5 <57> 56 53 83 ec 44 8b 7d 08 89 45 dc 8b 45 10 89 55 d8 89 4d e8 89 
[  231.219834] EAX: f2301300 EBX: 00001053 ECX: 7bf88000 EDX: f235c240
[  231.219835] ESI: 7bf88000 EDI: f6ebbea4 EBP: f6ebbe5c ESP: f6ebbe5c
[  231.219836] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000206
[  231.219838]  populate_vma_page_range+0x77/0x80
[  231.219840]  __mm_populate+0x8c/0x110
[  231.219842]  vm_brk_flags+0xab/0xc0
[  231.219844]  vm_brk+0xa/0x10
[  231.219846]  load_elf_library+0x1c0/0x1e0
[  231.219849]  sys_uselib+0x11a/0x160
[  231.219850]  do_fast_syscall_32+0x95/0x188
[  231.219852]  entry_SYSENTER_32+0x4e/0x7c
[  231.219853] EIP: 0xb7f98fd1
[  231.219854] Code: c1 9e f3 ff ff 89 e5 8b 55 08 85 d2 8b 81 64 cd ff ff 74 02 89 02 5d c3 8b 0c 24 c3 8b 1c 24 c3 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76 
[  231.219872] EAX: ffffffda EBX: 08048614 ECX: 000001ed EDX: 00000003
[  231.219873] ESI: 00000000 EDI: 00000000 EBP: bfbfaa28 ESP: bfbfa9fc
[  231.219874] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
[  294.144215] INFO: rcu_sched self-detected stall on CPU
[  294.145578] INFO: rcu_sched detected stalls on CPUs/tasks:
[  294.145926] 	5-....: (83606 ticks this GP) idle=0b6/1/1073741826 softirq=1654/1654 fqs=20101 
[  294.145927] 	
[  294.147855] 	5-....: (83606 ticks this GP) idle=0b6/1/1073741826 softirq=1654/1654 fqs=20101 
[  294.150966]  (t=84007 jiffies g=884 c=883 q=411)
[  294.151577] 	
[  294.154593] NMI backtrace for cpu 5
[  294.156334] (detected by 4, t=84007 jiffies, g=884, c=883, q=411)
[  294.156958] CPU: 5 PID: 956 Comm: a.out Not tainted 4.18.0-rc3-00113-gfc36def #365
[  294.163053] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  294.166957] Call Trace:
[  294.167772]  dump_stack+0x57/0x7b
[  294.168860]  nmi_cpu_backtrace+0xc4/0xd0
[  294.170289]  nmi_trigger_cpumask_backtrace+0x9a/0xe0
[  294.171852]  ? vprintk_default+0x32/0x40
[  294.173228]  ? lapic_can_unplug_cpu+0xa0/0xa0
[  294.174801]  arch_trigger_cpumask_backtrace+0x10/0x20
[  294.176446]  rcu_dump_cpu_stacks+0x6f/0x96
[  294.177803]  rcu_check_callbacks+0x532/0x680
[  294.179231]  ? account_process_tick+0x55/0x120
[  294.180643]  ? tick_sched_do_timer+0x50/0x50
[  294.182056]  update_process_times+0x23/0x50
[  294.183408]  tick_sched_handle+0x3a/0x50
[  294.184664]  tick_sched_timer+0x34/0x80
[  294.185874]  __hrtimer_run_queues+0xe4/0x170
[  294.187221]  hrtimer_interrupt+0x10d/0x2b0
[  294.188657]  ? apic_timer_interrupt+0x3c/0x44
[  294.190079]  smp_apic_timer_interrupt+0x4f/0x90
[  294.191720]  apic_timer_interrupt+0x3c/0x44
[  294.193175] EIP: populate_vma_page_range+0x19/0x80
[  294.194722] Code: 2d 04 f3 ff 0f 0b 0f 0b 89 f6 8d bc 27 00 00 00 00 55 89 e5 57 56 89 d6 53 29 f1 83 ec 18 8b 50 20 8b 40 2c c1 e9 0c 89 0c 24 <89> f1 c7 44 24 0c 00 00 00 00 89 55 f0 89 c3 89 c7 81 e3 00 00 08 
[  294.200571] EAX: 00102073 EBX: f600e888 ECX: 00000000 EDX: f235c240
[  294.202604] ESI: 7bf88000 EDI: 7bf88000 EBP: f6ebbe88 ESP: f6ebbe64
[  294.204756] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000246
[  294.207295]  __mm_populate+0x8c/0x110
[  294.208653]  vm_brk_flags+0xab/0xc0
[  294.209949]  vm_brk+0xa/0x10
[  294.211018]  load_elf_library+0x1c0/0x1e0
[  294.212580]  sys_uselib+0x11a/0x160
[  294.213901]  do_fast_syscall_32+0x95/0x188
[  294.215436]  entry_SYSENTER_32+0x4e/0x7c
[  294.216955] EIP: 0xb7f98fd1
[  294.217995] Code: c1 9e f3 ff ff 89 e5 8b 55 08 85 d2 8b 81 64 cd ff ff 74 02 89 02 5d c3 8b 0c 24 c3 8b 1c 24 c3 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76 
[  294.224703] EAX: ffffffda EBX: 08048614 ECX: 000001ed EDX: 00000003
[  294.226681] ESI: 00000000 EDI: 00000000 EBP: bfbfaa28 ESP: bfbfa9fc
[  294.228654] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
[  294.230799] Sending NMI from CPU 4 to CPUs 5:
[  294.253453] NMI backtrace for cpu 5
[  294.253458] CPU: 5 PID: 956 Comm: a.out Not tainted 4.18.0-rc3-00113-gfc36def #365
[  294.253459] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  294.253465] EIP: __mm_populate+0x7a/0x110
[  294.253466] Code: 39 7b 04 77 03 8b 5b 08 85 db 74 6c 8b 03 8b 4d e8 39 c1 76 63 8b 73 04 39 f1 0f 46 f1 f7 43 2c 00 44 00 00 75 21 39 c7 89 f1 <0f> 42 f8 8d 45 ec 89 fa 89 04 24 89 d8 e8 f4 fe ff ff 85 c0 78 40 
[  294.253495] EAX: 7bf81000 EBX: f600e888 ECX: 7bf88676 EDX: f235c240
[  294.253496] ESI: 7bf88676 EDI: 7bf88000 EBP: f6ebbeb8 ESP: f6ebbe90
[  294.253498] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000206
[  294.253499] CR0: 80050033 CR2: b7ebb3f0 CR3: 32646760 CR4: 000406f0
[  294.253608] Call Trace:
[  294.253613]  vm_brk_flags+0xab/0xc0
[  294.253616]  vm_brk+0xa/0x10
[  294.253619]  load_elf_library+0x1c0/0x1e0
[  294.253622]  sys_uselib+0x11a/0x160
[  294.253625]  do_fast_syscall_32+0x95/0x188
[  294.253630]  entry_SYSENTER_32+0x4e/0x7c
[  294.253632] EIP: 0xb7f98fd1
[  294.253632] Code: c1 9e f3 ff ff 89 e5 8b 55 08 85 d2 8b 81 64 cd ff ff 74 02 89 02 5d c3 8b 0c 24 c3 8b 1c 24 c3 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76 
[  294.253660] EAX: ffffffda EBX: 08048614 ECX: 000001ed EDX: 00000003
[  294.253662] ESI: 00000000 EDI: 00000000 EBP: bfbfaa28 ESP: bfbfa9fc
[  294.253663] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000246
------------------------------------------------------------


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-04 10:01 ` Tetsuo Handa
@ 2018-07-04 11:17   ` Michal Hocko
  2018-07-04 11:48     ` Zi Yan
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2018-07-04 11:17 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzbot, akpm, aneesh.kumar, dan.j.williams, kirill.shutemov,
	linux-mm, mst, syzkaller-bugs, viro, ying.huang, zi.yan

On Wed 04-07-18 19:01:51, Tetsuo Handa wrote:
> +Michal Hocko
> 
> On 2018/07/04 13:19, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following crash on:
> > 
> > HEAD commit:    d3bc0e67f852 Merge tag 'for-4.18-rc2-tag' of git://git.ker..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1000077c400000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=a63be0c83e84d370
> > dashboard link: https://syzkaller.appspot.com/bug?extid=5dcb560fe12aa5091c06
> > compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> > userspace arch: i386
> > syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=158577a2400000
> 
> Here is C reproducer made from syz reproducer. mlockall(MCL_FUTURE) is involved.
> 
> This problem is triggerable by an unprivileged user.
> Shows different result on x86_64 (crash) and x86_32 (stall).
> 
> ------------------------------------------------------------
> /* Need to compile using "-m32" option if host is 64bit. */
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sys/mman.h>
> int uselib(const char *library);
> 
> int main(int argc, char *argv[])
> {
> 	int fd = open("file", O_WRONLY | O_CREAT, 0644);
> 	write(fd, "\x7f\x45\x4c\x46\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02"
> 	      "\x00\x06\x00\xca\x3f\x8b\xca\x00\x00\x00\x00\x38\x00\x00\x00\x00\x00"
> 	      "\x00\xf7\xff\xff\xff\xff\xff\xff\x1f\x00\x02\x00\x00\x00\x00\x00\x00"
> 	      "\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf8\x7b"
> 	      "\x66\xff\x00\x00\x05\x00\x00\x00\x76\x86\x00\x00\x00\x00\x00\x00\x00"
> 	      "\x00\x00\x00\x31\x0f\xf3\xee\xc1\xb0\x00\x0c\x08\x53\x55\xbe\x88\x47"
> 	      "\xc2\x2e\x30\xf5\x62\x82\xc6\x2c\x95\x72\x3f\x06\x8f\xe4\x2d\x27\x96"
> 	      "\xcc", 120);
> 	fchmod(fd, 0755);
> 	close(fd);
> 	mlockall(MCL_FUTURE); /* Removing this line avoids the bug. */
> 	uselib("file");
> 	return 0;
> }
> ------------------------------------------------------------
> 
> ------------------------------------------------------------
> CentOS Linux 7 (Core)
> Kernel 4.18.0-rc3 on an x86_64
> 
> localhost login: [   81.210241] emacs (9634) used greatest stack depth: 10416 bytes left
> [  140.099935] ------------[ cut here ]------------
> [  140.101904] kernel BUG at mm/gup.c:1242!

Is this
	VM_BUG_ON(len != PAGE_ALIGN(len));
in __mm_populate? To be honest, I do not really get why we should VM_BUG_ON
when the len is not page aligned. The library probably contains
some funky setup, but if we simply cannot round up to the next PAGE_SIZE
boundary then we should probably just error out and fail. This is not an
area I am really familiar with, so I cannot really judge.
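
As a rough illustration of the check quoted above, here is a minimal
userspace sketch, not kernel code. It assumes 4 KiB pages, and the
sketch_* names are hypothetical stand-ins for PAGE_SIZE and PAGE_ALIGN.
The value 0x7676 is taken from RSI in the x86_64 register dump earlier in
the thread, on the assumption that it is the len argument at the time of
the BUG.

```c
#include <stdbool.h>

/* Assumption: 4 KiB pages, as is typical on x86. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Round len up to the next page boundary (stand-in for PAGE_ALIGN). */
static unsigned long sketch_page_align(unsigned long len)
{
	return (len + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;
}

/* True when the condition inside the quoted VM_BUG_ON would fire. */
static bool sketch_populate_would_bug(unsigned long len)
{
	return len != sketch_page_align(len);
}
```

With these helpers, sketch_page_align(0x7676) gives 0x8000, so an
unaligned length of 0x7676 trips the check while 0x8000 does not.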
-- 
Michal Hocko
SUSE Labs


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-04 11:17   ` Michal Hocko
@ 2018-07-04 11:48     ` Zi Yan
  2018-07-04 12:11       ` Michal Hocko
  2018-07-04 12:12       ` kernel BUG at mm/gup.c:LINE! Oscar Salvador
  0 siblings, 2 replies; 19+ messages in thread
From: Zi Yan @ 2018-07-04 11:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Tetsuo Handa, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, viro, ying.huang


On 4 Jul 2018, at 7:17, Michal Hocko wrote:

> On Wed 04-07-18 19:01:51, Tetsuo Handa wrote:
>> +Michal Hocko
>>
>> On 2018/07/04 13:19, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit:    d3bc0e67f852 Merge tag 'for-4.18-rc2-tag' of git://git.ker..
>>> git tree:       upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1000077c400000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a63be0c83e84d370
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=5dcb560fe12aa5091c06
>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>> userspace arch: i386
>>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=158577a2400000
>>
>> Here is C reproducer made from syz reproducer. mlockall(MCL_FUTURE) is involved.
>>
>> This problem is triggerable by an unprivileged user.
>> Shows different result on x86_64 (crash) and x86_32 (stall).
>>
>> ------------------------------------------------------------
>> /* Need to compile using "-m32" option if host is 64bit. */
>> #include <sys/types.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <unistd.h>
>> #include <sys/mman.h>
>> int uselib(const char *library);
>>
>> int main(int argc, char *argv[])
>> {
>> 	int fd = open("file", O_WRONLY | O_CREAT, 0644);
>> 	write(fd, "\x7f\x45\x4c\x46\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02"
>> 	      "\x00\x06\x00\xca\x3f\x8b\xca\x00\x00\x00\x00\x38\x00\x00\x00\x00\x00"
>> 	      "\x00\xf7\xff\xff\xff\xff\xff\xff\x1f\x00\x02\x00\x00\x00\x00\x00\x00"
>> 	      "\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf8\x7b"
>> 	      "\x66\xff\x00\x00\x05\x00\x00\x00\x76\x86\x00\x00\x00\x00\x00\x00\x00"
>> 	      "\x00\x00\x00\x31\x0f\xf3\xee\xc1\xb0\x00\x0c\x08\x53\x55\xbe\x88\x47"
>> 	      "\xc2\x2e\x30\xf5\x62\x82\xc6\x2c\x95\x72\x3f\x06\x8f\xe4\x2d\x27\x96"
>> 	      "\xcc", 120);
>> 	fchmod(fd, 0755);
>> 	close(fd);
>> 	mlockall(MCL_FUTURE); /* Removing this line avoids the bug. */
>> 	uselib("file");
>> 	return 0;
>> }
>> ------------------------------------------------------------
>>
>> ------------------------------------------------------------
>> CentOS Linux 7 (Core)
>> Kernel 4.18.0-rc3 on an x86_64
>>
>> localhost login: [   81.210241] emacs (9634) used greatest stack depth: 10416 bytes left
>> [  140.099935] ------------[ cut here ]------------
>> [  140.101904] kernel BUG at mm/gup.c:1242!
>
> Is this
> 	VM_BUG_ON(len != PAGE_ALIGN(len));
> in __mm_populate? I do not really get why we should VM_BUG_ON when the
> len is not page aligned to be honest. The library is probably containing
> some funky setup but if we simply cannot round up to the next PAGE_SIZE
> boundary then we should probably just error out and fail. This is not an
> area I am really familiar with so I cannot really judge.

A strange thing is that __mm_populate() is only called by do_mlock() from mm/mlock.c,
which makes len PAGE_ALIGN already. That VM_BUG_ON should not be triggered.

—
Best Regards,
Yan Zi



* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-04 11:48     ` Zi Yan
@ 2018-07-04 12:11       ` Michal Hocko
  2018-07-04 15:15         ` Oscar Salvador
  2018-07-04 12:12       ` kernel BUG at mm/gup.c:LINE! Oscar Salvador
  1 sibling, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2018-07-04 12:11 UTC (permalink / raw)
  To: Zi Yan
  Cc: Tetsuo Handa, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, viro, ying.huang

On Wed 04-07-18 07:48:27, Zi Yan wrote:
> On 4 Jul 2018, at 7:17, Michal Hocko wrote:
> 
> > On Wed 04-07-18 19:01:51, Tetsuo Handa wrote:
> >> +Michal Hocko
> >>
> >> On 2018/07/04 13:19, syzbot wrote:
> >>> Hello,
> >>>
> >>> syzbot found the following crash on:
> >>>
> >>> HEAD commit:    d3bc0e67f852 Merge tag 'for-4.18-rc2-tag' of git://git.ker..
> >>> git tree:       upstream
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=1000077c400000
> >>> kernel config:  https://syzkaller.appspot.com/x/.config?x=a63be0c83e84d370
> >>> dashboard link: https://syzkaller.appspot.com/bug?extid=5dcb560fe12aa5091c06
> >>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> >>> userspace arch: i386
> >>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=158577a2400000
> >>
> >> Here is C reproducer made from syz reproducer. mlockall(MCL_FUTURE) is involved.
> >>
> >> This problem is triggerable by an unprivileged user.
> >> Shows different result on x86_64 (crash) and x86_32 (stall).
> >>
> >> ------------------------------------------------------------
> >> /* Need to compile using "-m32" option if host is 64bit. */
> >> #include <sys/types.h>
> >> #include <sys/stat.h>
> >> #include <fcntl.h>
> >> #include <unistd.h>
> >> #include <sys/mman.h>
> >> int uselib(const char *library);
> >>
> >> int main(int argc, char *argv[])
> >> {
> >> 	int fd = open("file", O_WRONLY | O_CREAT, 0644);
> >> 	write(fd, "\x7f\x45\x4c\x46\x00\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02"
> >> 	      "\x00\x06\x00\xca\x3f\x8b\xca\x00\x00\x00\x00\x38\x00\x00\x00\x00\x00"
> >> 	      "\x00\xf7\xff\xff\xff\xff\xff\xff\x1f\x00\x02\x00\x00\x00\x00\x00\x00"
> >> 	      "\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\xf8\x7b"
> >> 	      "\x66\xff\x00\x00\x05\x00\x00\x00\x76\x86\x00\x00\x00\x00\x00\x00\x00"
> >> 	      "\x00\x00\x00\x31\x0f\xf3\xee\xc1\xb0\x00\x0c\x08\x53\x55\xbe\x88\x47"
> >> 	      "\xc2\x2e\x30\xf5\x62\x82\xc6\x2c\x95\x72\x3f\x06\x8f\xe4\x2d\x27\x96"
> >> 	      "\xcc", 120);
> >> 	fchmod(fd, 0755);
> >> 	close(fd);
> >> 	mlockall(MCL_FUTURE); /* Removing this line avoids the bug. */
> >> 	uselib("file");
> >> 	return 0;
> >> }
> >> ------------------------------------------------------------
> >>
> >> ------------------------------------------------------------
> >> CentOS Linux 7 (Core)
> >> Kernel 4.18.0-rc3 on an x86_64
> >>
> >> localhost login: [   81.210241] emacs (9634) used greatest stack depth: 10416 bytes left
> >> [  140.099935] ------------[ cut here ]------------
> >> [  140.101904] kernel BUG at mm/gup.c:1242!
> >
> > Is this
> > 	VM_BUG_ON(len != PAGE_ALIGN(len));
> > in __mm_populate? I do not really get why we should VM_BUG_ON when the
> > len is not page aligned to be honest. The library is probably containing
> > some funky setup but if we simply cannot round up to the next PAGE_SIZE
> > boundary then we should probably just error out and fail. This is not an
> > area I am really familiar with so I cannot really judge.
> 
> A strange thing is that __mm_populate() is only called by do_mlock() from mm/mlock.c,
> which makes len PAGE_ALIGN already. That VM_BUG_ON should not be triggered.

Not really. vm_brk_flags does call mm_populate for mlocked brk which is
the case for mlockall. I do not see any len sanitization in that path.
Well do_brk_flags does the roundup. I think we should simply remove the
bug on and round up there. mm_populate is an internal API and we should
trust our callers.

Anyway, the minimum fix seems to be the following (untested):

diff --git a/mm/mmap.c b/mm/mmap.c
index 9859cd4e19b9..56ad19cf2aea 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -186,8 +186,8 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
 	return next;
 }
 
-static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf);
-
+static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags,
+		struct list_head *uf);
 SYSCALL_DEFINE1(brk, unsigned long, brk)
 {
 	unsigned long retval;
@@ -245,7 +245,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
 		goto out;
 
 	/* Ok, looks good - let it rip. */
-	if (do_brk(oldbrk, newbrk-oldbrk, &uf) < 0)
+	if (do_brk_flags(oldbrk, newbrk-oldbrk, 0, &uf) < 0)
 		goto out;
 
 set_brk:
@@ -2939,12 +2939,6 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long
 	pgoff_t pgoff = addr >> PAGE_SHIFT;
 	int error;
 
-	len = PAGE_ALIGN(request);
-	if (len < request)
-		return -ENOMEM;
-	if (!len)
-		return 0;
-
 	/* Until we need other flags, refuse anything except VM_EXEC. */
 	if ((flags & (~VM_EXEC)) != 0)
 		return -EINVAL;
@@ -3016,18 +3010,20 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long
 	return 0;
 }
 
-static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf)
-{
-	return do_brk_flags(addr, len, 0, uf);
-}
-
-int vm_brk_flags(unsigned long addr, unsigned long len, unsigned long flags)
+int vm_brk_flags(unsigned long addr, unsigned long request, unsigned long flags)
 {
 	struct mm_struct *mm = current->mm;
+	unsigned long len;
 	int ret;
 	bool populate;
 	LIST_HEAD(uf);
 
+	len = PAGE_ALIGN(request);
+	if (len < request)
+		return -ENOMEM;
+	if (!len)
+		return 0;
+
 	if (down_write_killable(&mm->mmap_sem))
 		return -EINTR;
 
-- 
Michal Hocko
SUSE Labs


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-04 11:48     ` Zi Yan
  2018-07-04 12:11       ` Michal Hocko
@ 2018-07-04 12:12       ` Oscar Salvador
  1 sibling, 0 replies; 19+ messages in thread
From: Oscar Salvador @ 2018-07-04 12:12 UTC (permalink / raw)
  To: Zi Yan
  Cc: Michal Hocko, Tetsuo Handa, syzbot, akpm, aneesh.kumar,
	dan.j.williams, kirill.shutemov, linux-mm, mst, syzkaller-bugs,
	viro, ying.huang

> A strange thing is that __mm_populate() is only called by do_mlock() from mm/mlock.c,
> which makes len PAGE_ALIGN already. That VM_BUG_ON should not be triggered.

Unless I overlooked something, __mm_populate() gets called from:

load_elf_library() -> vm_brk() -> vm_brk_flags():

vm_brk_flags() {
	...
	populate = ((mm->def_flags & VM_LOCKED) != 0);
	...
	if (populate && !ret)
		mm_populate(addr, len);
}

mm_populate() -> __mm_populate():

__mm_populate() {
	...
	VM_BUG_ON(len != PAGE_ALIGN(len));
	...
}


In load_elf_library(), we have:

len = ELF_PAGESTART(eppnt->p_filesz + eppnt->p_vaddr +
			    ELF_MIN_ALIGN - 1);
bss = eppnt->p_memsz + eppnt->p_vaddr;
if (bss > len) {
	error = vm_brk(len, bss - len);
	if (error)
		goto out_free_ph;
}

So len gets page aligned, but bss (eppnt->p_memsz + eppnt->p_vaddr) does not.
Maybe that's the problem?
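
To make the arithmetic concrete, here is a minimal user-space sketch (not
kernel code). It assumes ELF_MIN_ALIGN is 4096, and it takes
p_vaddr = 0x7bf80000, p_filesz = 5 and p_memsz = 0x8676, which is what the
first program header of the reproducer appears to decode to. All sketch_*
names are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Assumption: 4 KiB pages, as on x86. */
#define SKETCH_MIN_ALIGN 4096UL
#define SKETCH_PAGESTART(v) ((v) & ~(SKETCH_MIN_ALIGN - 1))

/* Hypothetical stand-in for the struct elf_phdr fields used here. */
struct sketch_phdr {
	unsigned long p_vaddr;
	unsigned long p_filesz;
	unsigned long p_memsz;
};

/* Mirrors how load_elf_library() computes the page-aligned "len". */
static unsigned long sketch_len(const struct sketch_phdr *p)
{
	return SKETCH_PAGESTART(p->p_filesz + p->p_vaddr +
				SKETCH_MIN_ALIGN - 1);
}

/* Mirrors the unaligned "bss" end address. */
static unsigned long sketch_bss(const struct sketch_phdr *p)
{
	return p->p_memsz + p->p_vaddr;
}

/* Size that would be passed as vm_brk(len, bss - len); 0 if bss <= len. */
static unsigned long sketch_brk_request(const struct sketch_phdr *p)
{
	unsigned long len = sketch_len(p);
	unsigned long bss = sketch_bss(p);

	return bss > len ? bss - len : 0;
}

/* True when v is a multiple of the assumed page size. */
static bool sketch_is_page_aligned(unsigned long v)
{
	return (v & (SKETCH_MIN_ALIGN - 1)) == 0;
}
```

Under those assumed values the request comes out as 0x7676 bytes, which is
not a multiple of the page size, so a length check like the one in
__mm_populate() would fire.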


-- 
Oscar Salvador
SUSE L3


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-04 12:11       ` Michal Hocko
@ 2018-07-04 15:15         ` Oscar Salvador
  2018-07-05  0:35           ` Tetsuo Handa
  2018-07-05  6:44           ` Michal Hocko
  0 siblings, 2 replies; 19+ messages in thread
From: Oscar Salvador @ 2018-07-04 15:15 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Zi Yan, Tetsuo Handa, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, viro, ying.huang

> 
> Not really. vm_brk_flags does call mm_populate for mlocked brk which is
> the case for mlockall. I do not see any len sanitization in that path.
> Well do_brk_flags does the roundup. I think we should simply remove the
> bug on and round up there. mm_populate is an internal API and we should
> trust our callers.
> 
> Anyway, the minimum fix seems to be the following (untested):
> 
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9859cd4e19b9..56ad19cf2aea 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -186,8 +186,8 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
>  	return next;
>  }
>  
> -static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf);
> -
> +static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags,
> +		struct list_head *uf);
>  SYSCALL_DEFINE1(brk, unsigned long, brk)
>  {
>  	unsigned long retval;
> @@ -245,7 +245,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
>  		goto out;
>  
>  	/* Ok, looks good - let it rip. */
> -	if (do_brk(oldbrk, newbrk-oldbrk, &uf) < 0)
> +	if (do_brk_flags(oldbrk, newbrk-oldbrk, 0, &uf) < 0)
>  		goto out;
>  
>  set_brk:
> @@ -2939,12 +2939,6 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long
>  	pgoff_t pgoff = addr >> PAGE_SHIFT;
>  	int error;
>  
> -	len = PAGE_ALIGN(request);
> -	if (len < request)
> -		return -ENOMEM;
> -	if (!len)
> -		return 0;
> -
>  	/* Until we need other flags, refuse anything except VM_EXEC. */
>  	if ((flags & (~VM_EXEC)) != 0)
>  		return -EINVAL;
> @@ -3016,18 +3010,20 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long
>  	return 0;
>  }
>  
> -static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf)
> -{
> -	return do_brk_flags(addr, len, 0, uf);
> -}
> -
> -int vm_brk_flags(unsigned long addr, unsigned long len, unsigned long flags)
> +int vm_brk_flags(unsigned long addr, unsigned long request, unsigned long flags)
>  {
>  	struct mm_struct *mm = current->mm;
> +	unsigned long len;
>  	int ret;
>  	bool populate;
>  	LIST_HEAD(uf);
>  
> +	len = PAGE_ALIGN(request);
> +	if (len < request)
> +		return -ENOMEM;
> +	if (!len)
> +		return 0;
> +
>  	if (down_write_killable(&mm->mmap_sem))
>  		return -EINTR;

I gave this patch a try but the system doesn't boot.
Unfortunately, I don't have the stacktrace on hand, but I'll get back to it tomorrow.

Anyway, I just gave it a try, and making sure that bss gets page aligned seems to
"fix" the issue (at least the process doesn't hang anymore):

-       bss = eppnt->p_memsz + eppnt->p_vaddr;
+       bss = ELF_PAGESTART(eppnt->p_memsz + eppnt->p_vaddr);
	if (bss > len) {
                error = vm_brk(len, bss - len);

Although I'm not sure about the correctness of this.

-- 
Oscar Salvador
SUSE L3


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-04 15:15         ` Oscar Salvador
@ 2018-07-05  0:35           ` Tetsuo Handa
  2018-07-05  7:18             ` Oscar Salvador
  2018-07-05  6:44           ` Michal Hocko
  1 sibling, 1 reply; 19+ messages in thread
From: Tetsuo Handa @ 2018-07-05  0:35 UTC (permalink / raw)
  To: Oscar Salvador, viro
  Cc: Michal Hocko, Zi Yan, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, ying.huang

Oscar Salvador wrote:
> Anyway, I just gave it a try, and making sure that bss gets page aligned seems to
> "fix" the issue (at least the process doesn't hang anymore):
> 
> -       bss = eppnt->p_memsz + eppnt->p_vaddr;
> +       bss = ELF_PAGESTART(eppnt->p_memsz + eppnt->p_vaddr);
> 	if (bss > len) {
>                 error = vm_brk(len, bss - len);
> 
> Although I'm not sure about the correctness of this.

static int set_brk(unsigned long start, unsigned long end, int prot)
{
        start = ELF_PAGEALIGN(start);
        end = ELF_PAGEALIGN(end);
        if (end > start) {
                /*
                 * Map the last of the bss segment.
                 * If the header is requesting these pages to be
                 * executable, honour that (ppc32 needs this).
                 */
                int error = vm_brk_flags(start, end - start,
                                prot & PROT_EXEC ? VM_EXEC : 0);
                if (error)
                        return error;
        }
        current->mm->start_brk = current->mm->brk = end;
        return 0;
}

static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
                struct file *interpreter, unsigned long *interp_map_addr,
                unsigned long no_base, struct elf_phdr *interp_elf_phdata)
{
(...snipped...)
        /*
         * Next, align both the file and mem bss up to the page size,
         * since this is where elf_bss was just zeroed up to, and where
         * last_bss will end after the vm_brk_flags() below.
         */
        elf_bss = ELF_PAGEALIGN(elf_bss);
        last_bss = ELF_PAGEALIGN(last_bss);
        /* Finally, if there is still more bss to allocate, do it. */
        if (last_bss > elf_bss) {
                error = vm_brk_flags(elf_bss, last_bss - elf_bss,
                                bss_prot & PROT_EXEC ? VM_EXEC : 0);
                if (error)
                        goto out;
        }
(...snipped...)
}

static int load_elf_library(struct file *file)
{
(...snipped...)
        len = ELF_PAGESTART(eppnt->p_filesz + eppnt->p_vaddr +
                            ELF_MIN_ALIGN - 1);
        bss = eppnt->p_memsz + eppnt->p_vaddr;
        if (bss > len) {
                error = vm_brk(len, bss - len);
                if (error)
                        goto out_free_ph;
        }
(...snipped...)
}

So, indeed "bss" needs to be aligned.
But ELF_PAGESTART() or ELF_PAGEALIGN(), which one to use?

#define ELF_PAGESTART(_v) ((_v) & ~(unsigned long)(ELF_MIN_ALIGN-1))
#define ELF_PAGEALIGN(_v) (((_v) + ELF_MIN_ALIGN - 1) & ~(ELF_MIN_ALIGN - 1))

Is

-	len = ELF_PAGESTART(eppnt->p_filesz + eppnt->p_vaddr +
-			    ELF_MIN_ALIGN - 1);
+	len = ELF_PAGEALIGN(eppnt->p_filesz + eppnt->p_vaddr);

suggesting that

-	bss = eppnt->p_memsz + eppnt->p_vaddr;
+	bss = ELF_PAGEALIGN(eppnt->p_memsz + eppnt->p_vaddr);

is the right choice? I don't know...


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-04 15:15         ` Oscar Salvador
  2018-07-05  0:35           ` Tetsuo Handa
@ 2018-07-05  6:44           ` Michal Hocko
  2018-07-05  7:18             ` Oscar Salvador
  1 sibling, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2018-07-05  6:44 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Zi Yan, Tetsuo Handa, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, viro, ying.huang

On Wed 04-07-18 17:15:29, Oscar Salvador wrote:
> > 
> > Not really. vm_brk_flags does call mm_populate for mlocked brk which is
> > the case for mlockall. I do not see any len sanitization in that path.
> > Well do_brk_flags does the roundup. I think we should simply remove the
> > bug on and round up there. mm_populate is an internal API and we should
> > trust our callers.
> > 
> > Anyway, the minimum fix seems to be the following (untested):
> > 
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 9859cd4e19b9..56ad19cf2aea 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -186,8 +186,8 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
> >  	return next;
> >  }
> >  
> > -static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf);
> > -
> > +static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags,
> > +		struct list_head *uf);
> >  SYSCALL_DEFINE1(brk, unsigned long, brk)
> >  {
> >  	unsigned long retval;
> > @@ -245,7 +245,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
> >  		goto out;
> >  
> >  	/* Ok, looks good - let it rip. */
> > -	if (do_brk(oldbrk, newbrk-oldbrk, &uf) < 0)
> > +	if (do_brk_flags(oldbrk, newbrk-oldbrk, 0, &uf) < 0)
> >  		goto out;
> >  
> >  set_brk:
> > @@ -2939,12 +2939,6 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long
> >  	pgoff_t pgoff = addr >> PAGE_SHIFT;
> >  	int error;
> >  
> > -	len = PAGE_ALIGN(request);
> > -	if (len < request)
> > -		return -ENOMEM;
> > -	if (!len)
> > -		return 0;
> > -
> >  	/* Until we need other flags, refuse anything except VM_EXEC. */
> >  	if ((flags & (~VM_EXEC)) != 0)
> >  		return -EINVAL;
> > @@ -3016,18 +3010,20 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long
> >  	return 0;
> >  }
> >  
> > -static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf)
> > -{
> > -	return do_brk_flags(addr, len, 0, uf);
> > -}
> > -
> > -int vm_brk_flags(unsigned long addr, unsigned long len, unsigned long flags)
> > +int vm_brk_flags(unsigned long addr, unsigned long request, unsigned long flags)
> >  {
> >  	struct mm_struct *mm = current->mm;
> > +	unsigned long len;
> >  	int ret;
> >  	bool populate;
> >  	LIST_HEAD(uf);
> >  
> > +	len = PAGE_ALIGN(request);
> > +	if (len < request)
> > +		return -ENOMEM;
> > +	if (!len)
> > +		return 0;
> > +
> >  	if (down_write_killable(&mm->mmap_sem))
> >  		return -EINTR;
> 
> I gave this patch a try but the system doesn't boot.
> Unfortunately, I don't have the stacktrace on hand, but I'll get back to it tomorrow.

This is more than unexpected. The patch merely moves the alignment check
up. I will try to investigate some more but I am off for the next four days
and won't be online most of the time.

Btw. does the same happen if you keep do_brk helper and add the length
sanitization there as well?
-- 
Michal Hocko
SUSE Labs


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-05  0:35           ` Tetsuo Handa
@ 2018-07-05  7:18             ` Oscar Salvador
  2018-07-05 11:40               ` Oscar Salvador
  0 siblings, 1 reply; 19+ messages in thread
From: Oscar Salvador @ 2018-07-05  7:18 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: viro, Michal Hocko, Zi Yan, syzbot, akpm, aneesh.kumar,
	dan.j.williams, kirill.shutemov, linux-mm, mst, syzkaller-bugs,
	ying.huang

> So, indeed "bss" needs to be aligned.
> But ELF_PAGESTART() or ELF_PAGEALIGN(), which one to use?
> 
> #define ELF_PAGESTART(_v) ((_v) & ~(unsigned long)(ELF_MIN_ALIGN-1))
> #define ELF_PAGEALIGN(_v) (((_v) + ELF_MIN_ALIGN - 1) & ~(ELF_MIN_ALIGN - 1))
> 
> Is
> 
> -	len = ELF_PAGESTART(eppnt->p_filesz + eppnt->p_vaddr +
> -			    ELF_MIN_ALIGN - 1);
> +	len = ELF_PAGEALIGN(eppnt->p_filesz + eppnt->p_vaddr);
> 
> suggesting that
> 
> -	bss = eppnt->p_memsz + eppnt->p_vaddr;
> +	bss = ELF_PAGEALIGN(eppnt->p_memsz + eppnt->p_vaddr);
> 
> is the right choice? I don't know...

Yes, I think that ELF_PAGEALIGN is the right choice here.
Given that bss is 0x7bf88676, ELF_PAGESTART rounds it down to the start of its
page, while ELF_PAGEALIGN rounds it up to the next page boundary, which is
what we want:

bss = 0x7bf88676
ELF_PAGESTART (bss) = 0x7bf88000
ELF_PAGEALIGN (bss) = 0x7bf89000
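
For anyone who wants to check those numbers, a minimal user-space sketch of
the two macros follows. It assumes ELF_MIN_ALIGN is 4096 (4 KiB pages); the
sketch_* function names are hypothetical wrappers, not kernel symbols.

```c
/* Assumption: 4 KiB pages, so ELF_MIN_ALIGN would be 4096. */
#define SKETCH_ELF_MIN_ALIGN 4096UL

/* Same arithmetic as ELF_PAGESTART(): round down to a page boundary. */
static unsigned long sketch_elf_pagestart(unsigned long v)
{
	return v & ~(SKETCH_ELF_MIN_ALIGN - 1);
}

/* Same arithmetic as ELF_PAGEALIGN(): round up to a page boundary. */
static unsigned long sketch_elf_pagealign(unsigned long v)
{
	return (v + SKETCH_ELF_MIN_ALIGN - 1) & ~(SKETCH_ELF_MIN_ALIGN - 1);
}
```

Both functions leave an already aligned address unchanged; they differ only
for addresses that fall inside a page.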

-- 
Oscar Salvador
SUSE L3


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-05  6:44           ` Michal Hocko
@ 2018-07-05  7:18             ` Oscar Salvador
  2018-07-05 12:30               ` Oscar Salvador
  0 siblings, 1 reply; 19+ messages in thread
From: Oscar Salvador @ 2018-07-05  7:18 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Zi Yan, Tetsuo Handa, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, viro, ying.huang

 
> This is more than unexpected. The patch merely moves the alignment check
> up. I will try to investigate some more but I am off for the next four days
> and won't be online most of the time.
> 
> Btw. does the same happen if you keep do_brk helper and add the length
> sanitization there as well?

I will give it a try and I will let you know.

-- 
Oscar Salvador
SUSE L3


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-05  7:18             ` Oscar Salvador
@ 2018-07-05 11:40               ` Oscar Salvador
  0 siblings, 0 replies; 19+ messages in thread
From: Oscar Salvador @ 2018-07-05 11:40 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: viro, Michal Hocko, Zi Yan, syzbot, akpm, aneesh.kumar,
	dan.j.williams, kirill.shutemov, linux-mm, mst, syzkaller-bugs,
	ying.huang

On Thu, Jul 05, 2018 at 09:18:08AM +0200, Oscar Salvador wrote:
> > So, indeed "bss" needs to be aligned.
> > But ELF_PAGESTART() or ELF_PAGEALIGN(), which one to use?
> > 
> > #define ELF_PAGESTART(_v) ((_v) & ~(unsigned long)(ELF_MIN_ALIGN-1))
> > #define ELF_PAGEALIGN(_v) (((_v) + ELF_MIN_ALIGN - 1) & ~(ELF_MIN_ALIGN - 1))
> > 
> > Is
> > 
> > -	len = ELF_PAGESTART(eppnt->p_filesz + eppnt->p_vaddr +
> > -			    ELF_MIN_ALIGN - 1);
> > +	len = ELF_PAGEALIGN(eppnt->p_filesz + eppnt->p_vaddr);
> > 
> > suggesting that
> > 
> > -	bss = eppnt->p_memsz + eppnt->p_vaddr;
> > +	bss = ELF_PAGEALIGN(eppnt->p_memsz + eppnt->p_vaddr);
> > 
> > is the right choice? I don't know...
> 
> Yes, I think that ELF_PAGEALIGN is the right choice here.
> Given that bss is 0x7bf88676, ELF_PAGESTART rounds it down to the start of its
> page, while ELF_PAGEALIGN rounds it up to the next page boundary, which is
> what we want:
> 
> bss = 0x7bf88676
> ELF_PAGESTART (bss) = 0x7bf88000
> ELF_PAGEALIGN (bss) = 0x7bf89000

I think this should do the trick:

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 0ac456b52bdd..6c7e005ae12d 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1259,9 +1259,9 @@ static int load_elf_library(struct file *file)
                goto out_free_ph;
        }
 
-       len = ELF_PAGESTART(eppnt->p_filesz + eppnt->p_vaddr +
-                           ELF_MIN_ALIGN - 1);
-       bss = eppnt->p_memsz + eppnt->p_vaddr;
+
+       len = ELF_PAGEALIGN(eppnt->p_filesz + eppnt->p_vaddr);
+       bss = ELF_PAGEALIGN(eppnt->p_memsz + eppnt->p_vaddr);
        if (bss > len) {
                error = vm_brk(len, bss - len);
                if (error)

I could only test it on x86_64 (with -m32).
Could you test it on x86_32?

-- 
Oscar Salvador
SUSE L3


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-05  7:18             ` Oscar Salvador
@ 2018-07-05 12:30               ` Oscar Salvador
  2018-07-05 13:40                 ` Tetsuo Handa
  2018-07-06  5:35                 ` Michal Hocko
  0 siblings, 2 replies; 19+ messages in thread
From: Oscar Salvador @ 2018-07-05 12:30 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Zi Yan, Tetsuo Handa, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, viro, ying.huang

On Thu, Jul 05, 2018 at 09:18:39AM +0200, Oscar Salvador wrote:
>  
> > This is more than unexpected. The patch merely moves the alignment check
> > up. I will try to investigate some more but I am off for the next four days
> > and won't be online most of the time.
> > 
> > Btw. does the same happen if you keep do_brk helper and add the length
> > sanitization there as well?

I took another look.
The problem was that deleting the check in do_brk_flags() left the "len"
local variable uninitialized, but we need it to contain the request value
because we use it in further calls in do_brk_flags(), like:

while (find_vma_links(mm, addr, addr + len, &prev, &rb_link,
                              &rb_parent)) {
	if (do_munmap(mm, addr, len, uf))
		return -ENOMEM;
}

or

if (!may_expand_vm(mm, flags, len >> PAGE_SHIFT))

and so on.

This boots and works with the reproducer:

diff --git a/mm/mmap.c b/mm/mmap.c
index 9859cd4e19b9..e4c9e995870f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -186,8 +186,8 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
 	return next;
 }
 
-static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf);
-
+static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags,
+								struct list_head *uf);
 SYSCALL_DEFINE1(brk, unsigned long, brk)
 {
 	unsigned long retval;
@@ -245,7 +245,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
 		goto out;
 
 	/* Ok, looks good - let it rip. */
-	if (do_brk(oldbrk, newbrk-oldbrk, &uf) < 0)
+	if (do_brk_flags(oldbrk, newbrk-oldbrk, 0, &uf) < 0)
 		goto out;
 
 set_brk:
@@ -2934,17 +2934,11 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
-	unsigned long len;
+	unsigned long len = request;
 	struct rb_node **rb_link, *rb_parent;
 	pgoff_t pgoff = addr >> PAGE_SHIFT;
 	int error;
 
-	len = PAGE_ALIGN(request);
-	if (len < request)
-		return -ENOMEM;
-	if (!len)
-		return 0;
-
 	/* Until we need other flags, refuse anything except VM_EXEC. */
 	if ((flags & (~VM_EXEC)) != 0)
 		return -EINVAL;
@@ -3016,18 +3010,20 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long
 	return 0;
 }
 
-static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf)
-{
-	return do_brk_flags(addr, len, 0, uf);
-}
-
-int vm_brk_flags(unsigned long addr, unsigned long len, unsigned long flags)
+int vm_brk_flags(unsigned long addr, unsigned long request, unsigned long flags)
 {
 	struct mm_struct *mm = current->mm;
 	int ret;
+	unsigned long len;
 	bool populate;
 	LIST_HEAD(uf);
 
+	len = PAGE_ALIGN(request);
+	if (len < request)
+		return -ENOMEM;
+	if (!len)
+		return 0;
+
 	if (down_write_killable(&mm->mmap_sem))
 		return -EINTR;


But I think that we should also add:

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 0ac456b52bdd..6c7e005ae12d 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1259,9 +1259,9 @@ static int load_elf_library(struct file *file)
 		goto out_free_ph;
 	}
 
-	len = ELF_PAGESTART(eppnt->p_filesz + eppnt->p_vaddr +
-			    ELF_MIN_ALIGN - 1);
-	bss = eppnt->p_memsz + eppnt->p_vaddr;
+
+	len = ELF_PAGEALIGN(eppnt->p_filesz + eppnt->p_vaddr);
+	bss = ELF_PAGEALIGN(eppnt->p_memsz + eppnt->p_vaddr);
 	if (bss > len) {
 		error = vm_brk(len, bss - len);
 		if (error)

-- 
Oscar Salvador
SUSE L3


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-05 12:30               ` Oscar Salvador
@ 2018-07-05 13:40                 ` Tetsuo Handa
  2018-07-06  5:35                 ` Michal Hocko
  1 sibling, 0 replies; 19+ messages in thread
From: Tetsuo Handa @ 2018-07-05 13:40 UTC (permalink / raw)
  To: Oscar Salvador, Michal Hocko
  Cc: Zi Yan, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, viro, ying.huang

On 2018/07/05 21:30, Oscar Salvador wrote:
> This boots and works with the reproducer:

Yes, this patch fixes the problem on x86_32.

> But I think that we should also add:

Yes, this patch also fixes the problem on x86_32.


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-05 12:30               ` Oscar Salvador
  2018-07-05 13:40                 ` Tetsuo Handa
@ 2018-07-06  5:35                 ` Michal Hocko
  2018-07-06  7:40                   ` Oscar Salvador
  2018-07-06  7:50                   ` [PATCH] mm: do not bug_on on incorrect lenght in __mm_populate kbuild test robot
  1 sibling, 2 replies; 19+ messages in thread
From: Michal Hocko @ 2018-07-06  5:35 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Zi Yan, Tetsuo Handa, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, viro, ying.huang

On Thu 05-07-18 14:30:17, Oscar Salvador wrote:
> On Thu, Jul 05, 2018 at 09:18:39AM +0200, Oscar Salvador wrote:
> >  
> > > This is more than unexpected. The patch merely moves the alignment check
> > > up. I will try to investigate some more but I am off for the next four days
> > > and won't be online most of the time.
> > > 
> > > Btw. does the same happen if you keep do_brk helper and add the length
> > > sanitization there as well?
> 
> I took another look.
> The problem was that deleting the check in do_brk_flags() left the "len"
> local variable uninitialized, but we need it to contain the request value
> because we use it in further calls in do_brk_flags(), like:

Very well spotted. Thanks for noticing! I am really half online so I
cannot give it much testing right now. But here is the updated patch
with the changelog. I cannot really tell whether the other change to
align up in load_elf_library is correct as well. It seems OK but
everything around elf loading is tricky from my past experience.

My patch simply makes vm_brk_flags behavior consistent so I believe we
should do that regardless (unless I've screwed something else here of
course).


* Re: kernel BUG at mm/gup.c:LINE!
  2018-07-06  5:35                 ` Michal Hocko
@ 2018-07-06  7:40                   ` Oscar Salvador
  2018-07-06  7:50                   ` [PATCH] mm: do not bug_on on incorrect lenght in __mm_populate kbuild test robot
  1 sibling, 0 replies; 19+ messages in thread
From: Oscar Salvador @ 2018-07-06  7:40 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Zi Yan, Tetsuo Handa, syzbot, akpm, aneesh.kumar, dan.j.williams,
	kirill.shutemov, linux-mm, mst, syzkaller-bugs, viro, ying.huang

> Reported-by: syzbot <syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com>
> [osalvador: fix up vm_brk_flags s@request@len@]
> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: stable
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Hi Michal,

I gave it another spin and it works for me.

FWIW:
Reviewed-by: Oscar Salvador <osalvador@suse.de>
-- 
Oscar Salvador
SUSE L3


* Re: [PATCH] mm: do not bug_on on incorrect lenght in __mm_populate
  2018-07-06  5:35                 ` Michal Hocko
  2018-07-06  7:40                   ` Oscar Salvador
@ 2018-07-06  7:50                   ` kbuild test robot
  2018-07-06  8:23                     ` Oscar Salvador
  1 sibling, 1 reply; 19+ messages in thread
From: kbuild test robot @ 2018-07-06  7:50 UTC (permalink / raw)
  To: Michal Hocko
  Cc: kbuild-all, Oscar Salvador, Zi Yan, Tetsuo Handa, syzbot, akpm,
	aneesh.kumar, dan.j.williams, kirill.shutemov, linux-mm, mst,
	syzkaller-bugs, viro, ying.huang

[-- Attachment #1: Type: text/plain, Size: 8053 bytes --]

Hi Michal,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.18-rc3 next-20180705]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate/20180706-134850
config: x86_64-randconfig-x015-201826 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   mm/mmap.c: In function 'do_brk_flags':
>> mm/mmap.c:2936:16: error: 'len' redeclared as different kind of symbol
     unsigned long len;
                   ^~~
   mm/mmap.c:2932:59: note: previous definition of 'len' was here
    static int do_brk_flags(unsigned long addr, unsigned long len, unsigned long flags, struct list_head *uf)
                                                              ^~~

vim +/len +2936 mm/mmap.c

^1da177e4 Linus Torvalds        2005-04-16  2926  
^1da177e4 Linus Torvalds        2005-04-16  2927  /*
^1da177e4 Linus Torvalds        2005-04-16  2928   *  this is really a simplified "do_mmap".  it only handles
^1da177e4 Linus Torvalds        2005-04-16  2929   *  anonymous maps.  eventually we may be able to do some
^1da177e4 Linus Torvalds        2005-04-16  2930   *  brk-specific accounting here.
^1da177e4 Linus Torvalds        2005-04-16  2931   */
e3049e198 Michal Hocko          2018-07-06  2932  static int do_brk_flags(unsigned long addr, unsigned long len, unsigned long flags, struct list_head *uf)
^1da177e4 Linus Torvalds        2005-04-16  2933  {
^1da177e4 Linus Torvalds        2005-04-16  2934  	struct mm_struct *mm = current->mm;
^1da177e4 Linus Torvalds        2005-04-16  2935  	struct vm_area_struct *vma, *prev;
16e72e9b3 Denys Vlasenko        2017-02-22 @2936  	unsigned long len;
^1da177e4 Linus Torvalds        2005-04-16  2937  	struct rb_node **rb_link, *rb_parent;
^1da177e4 Linus Torvalds        2005-04-16  2938  	pgoff_t pgoff = addr >> PAGE_SHIFT;
3a4597568 Kirill Korotaev       2006-09-07  2939  	int error;
^1da177e4 Linus Torvalds        2005-04-16  2940  
16e72e9b3 Denys Vlasenko        2017-02-22  2941  	/* Until we need other flags, refuse anything except VM_EXEC. */
16e72e9b3 Denys Vlasenko        2017-02-22  2942  	if ((flags & (~VM_EXEC)) != 0)
16e72e9b3 Denys Vlasenko        2017-02-22  2943  		return -EINVAL;
16e72e9b3 Denys Vlasenko        2017-02-22  2944  	flags |= VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;
3a4597568 Kirill Korotaev       2006-09-07  2945  
2c6a10161 Al Viro               2009-12-03  2946  	error = get_unmapped_area(NULL, addr, len, 0, MAP_FIXED);
de1741a13 Alexander Kuleshov    2015-11-05  2947  	if (offset_in_page(error))
3a4597568 Kirill Korotaev       2006-09-07  2948  		return error;
3a4597568 Kirill Korotaev       2006-09-07  2949  
363ee17f0 Davidlohr Bueso       2014-01-21  2950  	error = mlock_future_check(mm, mm->def_flags, len);
363ee17f0 Davidlohr Bueso       2014-01-21  2951  	if (error)
363ee17f0 Davidlohr Bueso       2014-01-21  2952  		return error;
^1da177e4 Linus Torvalds        2005-04-16  2953  
^1da177e4 Linus Torvalds        2005-04-16  2954  	/*
^1da177e4 Linus Torvalds        2005-04-16  2955  	 * mm->mmap_sem is required to protect against another thread
^1da177e4 Linus Torvalds        2005-04-16  2956  	 * changing the mappings in case we sleep.
^1da177e4 Linus Torvalds        2005-04-16  2957  	 */
^1da177e4 Linus Torvalds        2005-04-16  2958  	verify_mm_writelocked(mm);
^1da177e4 Linus Torvalds        2005-04-16  2959  
^1da177e4 Linus Torvalds        2005-04-16  2960  	/*
^1da177e4 Linus Torvalds        2005-04-16  2961  	 * Clear old maps.  this also does some error checking for us
^1da177e4 Linus Torvalds        2005-04-16  2962  	 */
9fcd14571 Rasmus Villemoes      2015-04-15  2963  	while (find_vma_links(mm, addr, addr + len, &prev, &rb_link,
9fcd14571 Rasmus Villemoes      2015-04-15  2964  			      &rb_parent)) {
897ab3e0c Mike Rapoport         2017-02-24  2965  		if (do_munmap(mm, addr, len, uf))
^1da177e4 Linus Torvalds        2005-04-16  2966  			return -ENOMEM;
^1da177e4 Linus Torvalds        2005-04-16  2967  	}
^1da177e4 Linus Torvalds        2005-04-16  2968  
^1da177e4 Linus Torvalds        2005-04-16  2969  	/* Check against address space limits *after* clearing old maps... */
846383359 Konstantin Khlebnikov 2016-01-14  2970  	if (!may_expand_vm(mm, flags, len >> PAGE_SHIFT))
^1da177e4 Linus Torvalds        2005-04-16  2971  		return -ENOMEM;
^1da177e4 Linus Torvalds        2005-04-16  2972  
^1da177e4 Linus Torvalds        2005-04-16  2973  	if (mm->map_count > sysctl_max_map_count)
^1da177e4 Linus Torvalds        2005-04-16  2974  		return -ENOMEM;
^1da177e4 Linus Torvalds        2005-04-16  2975  
191c54244 Al Viro               2012-02-13  2976  	if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT))
^1da177e4 Linus Torvalds        2005-04-16  2977  		return -ENOMEM;
^1da177e4 Linus Torvalds        2005-04-16  2978  
^1da177e4 Linus Torvalds        2005-04-16  2979  	/* Can we just expand an old private anonymous mapping? */
ba470de43 Rik van Riel          2008-10-18  2980  	vma = vma_merge(mm, prev, addr, addr + len, flags,
19a809afe Andrea Arcangeli      2015-09-04  2981  			NULL, NULL, pgoff, NULL, NULL_VM_UFFD_CTX);
ba470de43 Rik van Riel          2008-10-18  2982  	if (vma)
^1da177e4 Linus Torvalds        2005-04-16  2983  		goto out;
^1da177e4 Linus Torvalds        2005-04-16  2984  
^1da177e4 Linus Torvalds        2005-04-16  2985  	/*
^1da177e4 Linus Torvalds        2005-04-16  2986  	 * create a vma struct for an anonymous mapping
^1da177e4 Linus Torvalds        2005-04-16  2987  	 */
c5e3b83e9 Pekka Enberg          2006-03-25  2988  	vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
^1da177e4 Linus Torvalds        2005-04-16  2989  	if (!vma) {
^1da177e4 Linus Torvalds        2005-04-16  2990  		vm_unacct_memory(len >> PAGE_SHIFT);
^1da177e4 Linus Torvalds        2005-04-16  2991  		return -ENOMEM;
^1da177e4 Linus Torvalds        2005-04-16  2992  	}
^1da177e4 Linus Torvalds        2005-04-16  2993  
5beb49305 Rik van Riel          2010-03-05  2994  	INIT_LIST_HEAD(&vma->anon_vma_chain);
^1da177e4 Linus Torvalds        2005-04-16  2995  	vma->vm_mm = mm;
^1da177e4 Linus Torvalds        2005-04-16  2996  	vma->vm_start = addr;
^1da177e4 Linus Torvalds        2005-04-16  2997  	vma->vm_end = addr + len;
^1da177e4 Linus Torvalds        2005-04-16  2998  	vma->vm_pgoff = pgoff;
^1da177e4 Linus Torvalds        2005-04-16  2999  	vma->vm_flags = flags;
3ed75eb8f Coly Li               2007-10-18  3000  	vma->vm_page_prot = vm_get_page_prot(flags);
^1da177e4 Linus Torvalds        2005-04-16  3001  	vma_link(mm, vma, prev, rb_link, rb_parent);
^1da177e4 Linus Torvalds        2005-04-16  3002  out:
3af9e8592 Eric B Munson         2010-05-18  3003  	perf_event_mmap(vma);
^1da177e4 Linus Torvalds        2005-04-16  3004  	mm->total_vm += len >> PAGE_SHIFT;
846383359 Konstantin Khlebnikov 2016-01-14  3005  	mm->data_vm += len >> PAGE_SHIFT;
128557ffe Michel Lespinasse     2013-02-22  3006  	if (flags & VM_LOCKED)
ba470de43 Rik van Riel          2008-10-18  3007  		mm->locked_vm += (len >> PAGE_SHIFT);
d9104d1ca Cyrill Gorcunov       2013-09-11  3008  	vma->vm_flags |= VM_SOFTDIRTY;
5d22fc25d Linus Torvalds        2016-05-27  3009  	return 0;
^1da177e4 Linus Torvalds        2005-04-16  3010  }
^1da177e4 Linus Torvalds        2005-04-16  3011  

:::::: The code at line 2936 was first introduced by commit
:::::: 16e72e9b30986ee15f17fbb68189ca842c32af58 powerpc: do not make the entire heap executable

:::::: TO: Denys Vlasenko <dvlasenk@redhat.com>
:::::: CC: Linus Torvalds <torvalds@linux-foundation.org>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 29269 bytes --]


* Re: [PATCH] mm: do not bug_on on incorrect lenght in __mm_populate
  2018-07-06  7:50                   ` [PATCH] mm: do not bug_on on incorrect lenght in __mm_populate kbuild test robot
@ 2018-07-06  8:23                     ` Oscar Salvador
  2018-07-06  9:02                       ` Michal Hocko
  0 siblings, 1 reply; 19+ messages in thread
From: Oscar Salvador @ 2018-07-06  8:23 UTC (permalink / raw)
  To: kbuild test robot
  Cc: Michal Hocko, kbuild-all, Zi Yan, Tetsuo Handa, syzbot, akpm,
	aneesh.kumar, dan.j.williams, kirill.shutemov, linux-mm, mst,
	syzkaller-bugs, viro, ying.huang

On Fri, Jul 06, 2018 at 03:50:53PM +0800, kbuild test robot wrote:
> Hi Michal,
> 
> I love your patch! Yet something to improve:
> 
> [auto build test ERROR on linus/master]
> [also build test ERROR on v4.18-rc3 next-20180705]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate/20180706-134850
> config: x86_64-randconfig-x015-201826 (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=x86_64 
> 
> All errors (new ones prefixed by >>):
> 
>    mm/mmap.c: In function 'do_brk_flags':
> >> mm/mmap.c:2936:16: error: 'len' redeclared as different kind of symbol
>      unsigned long len;
>                    ^~~
>    mm/mmap.c:2932:59: note: previous definition of 'len' was here
>     static int do_brk_flags(unsigned long addr, unsigned long len, unsigned long flags, struct list_head *uf)

Somehow I missed that.
Maybe some remains from yesterday.

The local variable "len" must be dropped.
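
For illustration only, a minimal sketch of why gcc rejects the declaration
and what dropping it looks like. In C a parameter shares the scope of the
function's outermost block, so a second "unsigned long len;" at that level
is a redeclaration. The names brk_len_sketch and SKETCH_PAGE_SIZE are
hypothetical and the body is not the actual patch; it only shows the
parameter being used directly.

```c
#include <limits.h>

/* Assumed page size, for illustration only (not taken from the kernel). */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical stand-in for a function that receives 'len' as a parameter.
 * Stores the page-aligned length in *aligned and returns 0, or returns -1
 * if rounding up to a page boundary would overflow.
 */
static int brk_len_sketch(unsigned long len, unsigned long *aligned)
{
	/*
	 * unsigned long len;
	 * Declaring this here is what triggers
	 * "'len' redeclared as different kind of symbol":
	 * the parameter already occupies this scope.
	 */

	if (len > ULONG_MAX - (SKETCH_PAGE_SIZE - 1))
		return -1;

	/* Use the parameter directly; no shadowing local is needed. */
	*aligned = (len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
	return 0;
}
```

Uncommenting the inner declaration should reproduce the same class of
build error as the one reported above.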
-- 
Oscar Salvador
SUSE L3


* Re: [PATCH] mm: do not bug_on on incorrect lenght in __mm_populate
  2018-07-06  8:23                     ` Oscar Salvador
@ 2018-07-06  9:02                       ` Michal Hocko
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2018-07-06  9:02 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: kbuild test robot, kbuild-all, Zi Yan, Tetsuo Handa, syzbot,
	akpm, aneesh.kumar, dan.j.williams, kirill.shutemov, linux-mm,
	mst, syzkaller-bugs, viro, ying.huang

On Fri 06-07-18 10:23:48, Oscar Salvador wrote:
> On Fri, Jul 06, 2018 at 03:50:53PM +0800, kbuild test robot wrote:
> > Hi Michal,
> > 
> > I love your patch! Yet something to improve:
> > 
> > [auto build test ERROR on linus/master]
> > [also build test ERROR on v4.18-rc3 next-20180705]
> > [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> > 
> > url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate/20180706-134850
> > config: x86_64-randconfig-x015-201826 (attached as .config)
> > compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
> > reproduce:
> >         # save the attached .config to linux build tree
> >         make ARCH=x86_64 
> > 
> > All errors (new ones prefixed by >>):
> > 
> >    mm/mmap.c: In function 'do_brk_flags':
> > >> mm/mmap.c:2936:16: error: 'len' redeclared as different kind of symbol
> >      unsigned long len;
> >                    ^~~
> >    mm/mmap.c:2932:59: note: previous definition of 'len' was here
> >     static int do_brk_flags(unsigned long addr, unsigned long len, unsigned long flags, struct list_head *uf)
> 
> Somehow I missed that.
> Maybe some remains from yesterday.
> 
> The local variable "len" must be dropped.

Of course. This is what it looks like when you post patches in a hurry
before leaving. Mea culpa. Sorry about that. Refreshed.


Thread overview: 19+ messages
2018-07-04  4:19 kernel BUG at mm/gup.c:LINE! syzbot
2018-07-04 10:01 ` Tetsuo Handa
2018-07-04 11:17   ` Michal Hocko
2018-07-04 11:48     ` Zi Yan
2018-07-04 12:11       ` Michal Hocko
2018-07-04 15:15         ` Oscar Salvador
2018-07-05  0:35           ` Tetsuo Handa
2018-07-05  7:18             ` Oscar Salvador
2018-07-05 11:40               ` Oscar Salvador
2018-07-05  6:44           ` Michal Hocko
2018-07-05  7:18             ` Oscar Salvador
2018-07-05 12:30               ` Oscar Salvador
2018-07-05 13:40                 ` Tetsuo Handa
2018-07-06  5:35                 ` Michal Hocko
2018-07-06  7:40                   ` Oscar Salvador
2018-07-06  7:50                   ` [PATCH] mm: do not bug_on on incorrect lenght in __mm_populate kbuild test robot
2018-07-06  8:23                     ` Oscar Salvador
2018-07-06  9:02                       ` Michal Hocko
2018-07-04 12:12       ` kernel BUG at mm/gup.c:LINE! Oscar Salvador
