Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

commit efad4e475c312456edb3c789d0996d12ed744c13
Author:     Michal Hocko
AuthorDate: Fri Feb 1 14:20:34 2019 -0800
Commit:     Linus Torvalds
CommitDate: Fri Feb 1 15:46:23 2019 -0800

    mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone

    Patch series "mm, memory_hotplug: fix uninitialized pages fallouts", v2.

    Mikhail Zaslonko has posted fixes for the two bugs quite some time ago
    [1].  I have pushed back on those fixes because I believed that it is
    much better to plug the problem at the initialization time rather than
    play whack-a-mole all over the hotplug code and find all the places
    which expect the full memory section to be initialized.

    We have ended up with commit 2830bf6f05fb ("mm, memory_hotplug:
    initialize struct pages for the full memory section") merged and cause a
    regression [2][3].  The reason is that there might be memory layouts
    when two NUMA nodes share the same memory section so the merged fix is
    simply incorrect.

    In order to plug this hole we really have to be zone range aware in
    those handlers.  I have split up the original patch into two.  One is
    unchanged (patch 2) and I took a different approach for `removable'
    crash.

    [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
    [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
    [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz

    This patch (of 2):

    Mikhail has reported the following VM_BUG_ON triggered when reading sysfs
    removable state of a memory block:

     page:000003d08300c000 is uninitialized and poisoned
     page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
     Call Trace:
       is_mem_section_removable+0xb4/0x190
       show_mem_removable+0x9a/0xd8
       dev_attr_show+0x34/0x70
       sysfs_kf_seq_show+0xc8/0x148
       seq_read+0x204/0x480
       __vfs_read+0x32/0x178
       vfs_read+0x82/0x138
       ksys_read+0x5a/0xb0
       system_call+0xdc/0x2d8
     Last Breaking-Event-Address:
       is_mem_section_removable+0xb4/0x190
     Kernel panic - not syncing: Fatal exception: panic_on_oops

    The reason is that the memory block spans the zone boundary and we are
    stumbling over an unitialized struct page.  Fix this by enforcing zone
    range in is_mem_section_removable so that we never run away from a zone.

    Link: http://lkml.kernel.org/r/20190128144506.15603-2-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Reported-by: Mikhail Zaslonko
    Debugged-by: Mikhail Zaslonko
    Tested-by: Gerald Schaefer
    Tested-by: Mikhail Gavrilov
    Reviewed-by: Oscar Salvador
    Cc: Pavel Tatashin
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

9bcdeb51bd  oom, oom_reaper: do not enqueue same task twice
efad4e475c  mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
f17b5f06cb  Linux 5.0-rc4
7a92eb7cc1  Add linux-next specific files for 20190215

+-----------------------------------------------------+------------+------------+----------+---------------+
|                                                     | 9bcdeb51bd | efad4e475c | v5.0-rc4 | next-20190215 |
+-----------------------------------------------------+------------+------------+----------+---------------+
| boot_successes                                      | 31         | 2          | 21       | 0             |
| boot_failures                                       | 0          | 11         | 6        | 10            |
| Oops:#[##]                                          | 0          | 11         |          |               |
| RIP:page_mapping                                    | 0          | 11         |          |               |
| WARNING:at_kernel/locking/lockdep.c:#lock_downgrade | 0          | 3          |          |               |
| RIP:lock_downgrade                                  | 0          | 3          |          |               |
| Kernel_panic-not_syncing:Fatal_exception            | 0          | 11         | 0        | 10            |
| BUG:unable_to_handle_kernel                         | 0          | 6          |          |               |
| BUG:kernel_in_stage                                 | 0          | 0          | 6        |               |
| kernel_BUG_at_include/linux/mm.h                    | 0          | 0          | 0        | 10            |
| invalid_opcode:#[##]                                | 0          | 0          | 0        | 10            |
| RIP:is_mem_section_removable                        | 0          | 0          | 0        | 10            |
+-----------------------------------------------------+------------+------------+----------+---------------+

udevd[311]: failed to execute '/sbin/modprobe' '/sbin/modprobe -bv pci:v00001234d00001111sv00001AF4sd00001100bc03sc00i00': No such file or directory
udevd[312]: failed to execute '/sbin/modprobe' '/sbin/modprobe -bv acpi:QEMU0002:': No such file or directory
udevd[314]: failed to execute '/sbin/modprobe' '/sbin/modprobe -bv platform:Fixed MDIO bus': No such file or directory
udevd[315]: failed to execute '/sbin/modprobe' '/sbin/modprobe -bv acpi:PNP0103:': No such file or directory
[   40.305212] PGD 0 P4D 0
[   40.308255] Oops: 0000 [#1] PREEMPT SMP PTI
[   40.313055] CPU: 1 PID: 239 Comm: udevd Not tainted 5.0.0-rc4-00149-gefad4e4 #1
[   40.321348] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   40.330813] RIP: 0010:page_mapping+0x12/0x80
[   40.335709] Code: 5d c3 48 89 df e8 0e ad 02 00 85 c0 75 da 89 e8 5b 5d c3 0f 1f 44 00 00 53 48 89 fb 48 8b 43 08 48 8d 50 ff a8 01 48 0f 45 da <48> 8b 53 08 48 8d 42 ff 83 e2 01 48 0f 44 c3 48 83 38 ff 74 2f 48
[   40.356704] RSP: 0018:ffff88801fa87cd8 EFLAGS: 00010202
[   40.362714] RAX: ffffffffffffffff RBX: fffffffffffffffe RCX: 000000000000000a
[   40.370798] RDX: fffffffffffffffe RSI: ffffffff820b9a20 RDI: ffff88801e5c0000
[   40.378830] RBP: 6db6db6db6db6db7 R08: ffff88801e8bb000 R09: 0000000001b64d13
[   40.386902] R10: ffff88801fa87cf8 R11: 0000000000000001 R12: ffff88801e640000
[   40.395033] R13: ffffffff820b9a20 R14: ffff88801f145258 R15: 0000000000000001
[   40.403138] FS:  00007fb2079817c0(0000) GS:ffff88801dd00000(0000) knlGS:0000000000000000
[   40.412243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   40.418846] CR2: 0000000000000006 CR3: 000000001fa82000 CR4: 00000000000006a0
[   40.426951] Call Trace:
[   40.429843]  __dump_page+0x14/0x2c0
[   40.433947]  is_mem_section_removable+0x24c/0x2c0
[   40.439327]  removable_show+0x87/0xa0
[   40.443613]  dev_attr_show+0x25/0x60
[   40.447763]  sysfs_kf_seq_show+0xba/0x110
[   40.452363]  seq_read+0x196/0x3f0
[   40.456282]  __vfs_read+0x34/0x180
[   40.460233]  ? lock_acquire+0xb6/0x1e0
[   40.464610]  vfs_read+0xa0/0x150
[   40.468372]  ksys_read+0x44/0xb0
[   40.472129]  ? do_syscall_64+0x1f/0x4a0
[   40.476593]  do_syscall_64+0x5e/0x4a0
[   40.480809]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[   40.486195]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   40.491961] RIP: 0033:0x7fb2070680a0
[   40.496078] Code: 73 01 c3 48 8b 0d a0 0d 2d 00 31 d2 48 29 c2 64 89 11 48 83 c8 ff eb ea 90 90 83 3d 3d 71 2d 00 00 75 10 b8 00 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 3e b1 01 00 48 89 04 24
[   40.517047] RSP: 002b:00007ffeee09f0b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   40.525660] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fb2070680a0
[   40.533780] RDX: 0000000000001000 RSI: 00007ffeee09f158 RDI: 0000000000000005
[   40.541853] RBP: 000056092c0f0ac3 R08: 7379732f73656369 R09: 6f6d656d2f6d6574
[   40.549930] R10: 726f6d656d2f7972 R11: 0000000000000246 R12: 0000000000000000
[   40.557982] R13: 000056092c0ef7a0 R14: 0000000000000000 R15: 00007ffeee0a4f08
[   40.566089] Modules linked in:
[   40.569651] CR2: 0000000000000006
udevd[316]: failed to execute '/sbin/modprobe' '/sbin/modprobe -bv platform:i5k_amb': No such file or directory
[   40.609875] WARNING: CPU: 1 PID: 235 at kernel/locking/lockdep.c:3553 lock_downgrade+0x167/0x1b0
[   40.626045] Modules linked in:
[   40.629632] CPU: 1 PID: 235 Comm: udevd Tainted: G D 5.0.0-rc4-00149-gefad4e4 #1
[   40.639486] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   40.648956] RIP: 0010:lock_downgrade+0x167/0x1b0
[   40.654231] Code: c9 75 a9 48 c7 c6 c7 08 0c 82 48 c7 c7 58 f9 0a 82 e8 dd e6 fa ff 0f 0b eb 92 48 c7 c7 eb 08 0c 82 48 89 04 24 e8 c9 e6 fa ff <0f> 0b 8b 54 24 0c 48 8b 04 24 e9 2e ff ff ff e8 e5 fb 1e 00 85 c0
[   40.675231] RSP: 0018:ffff88801fa13de8 EFLAGS: 00010096
[   40.681229] RAX: 0000000000000017 RBX: ffff88801fa0c000 RCX: 0000000000000000
[   40.689326] RDX: ffffffff811285f4 RSI: 0000000000000001 RDI: ffffffff81128610
[   40.697401] RBP: ffff88801f93e0f8 R08: 0000000000000000 R09: 6572206120676e69
[   40.705498] R10: ffff88801fa13e08 R11: 6b636f6c20646165 R12: 0000000000000246
[   40.713630] R13: ffffffff812145c1 R14: 0000000000000001 R15: ffff88801f16a1d0
[   40.721734] FS:  00007fb2079817c0(0000) GS:ffff88801dd00000(0000) knlGS:0000000000000000
[   40.730878] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   40.737418] CR2: 0000000000fa8000 CR3: 000000001fa0e000 CR4: 00000000000006a0
[   40.745516] Call Trace:
[   40.748404]  downgrade_write+0x12/0x80
[   40.752748]  __do_munmap+0x3f1/0x430
[   40.756926]  __vm_munmap+0x5d/0x90
[   40.760854]  __x64_sys_munmap+0x25/0x30
[   40.765257]  do_syscall_64+0x5e/0x4a0
[   40.769566]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[   40.774950]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   40.780753] RIP: 0033:0x7fb207071897
[   40.784895] Code: f0 ff ff 73 01 c3 48 8b 0d a6 75 2c 00 31 d2 48 29 c2 64 89 11 48 83 c8 ff eb ea 90 90 90 90 90 90 90 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 79 75 2c 00 31 d2 48 29 c2 64
[   40.806706] RSP: 002b:00007ffeee09c9e8 EFLAGS: 00000206 ORIG_RAX: 000000000000000b
[   40.816041] RAX: ffffffffffffffda RBX: 000056092c0e9720 RCX: 00007fb207071897
[   40.824406] RDX: 0000000000000000 RSI: 0000000000001000 RDI: 00007fb207986000
[   40.832697] RBP: 0000000000000000 R08: 00007fb2079817c0 R09: 00000000ffffffff
[   40.840871] R10: 0000000000000022 R11: 0000000000000206 R12: 0000000000000000
[   40.848911] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffeee09ca6e
[   40.857009] irq event stamp: 8258
[   40.860875] hardirqs last enabled at (8257): [] preempt_schedule_irq+0x3b/0x90
[   40.870941] hardirqs last disabled at (8258): [] __schedule+0x99/0x9e0
[   40.880106] softirqs last enabled at (8256): [] __do_softirq+0x3f4/0x4c1
[   40.889506] softirqs last disabled at (8249): [] irq_exit+0xdd/0xf0
[   40.898329] ---[ end trace 0f9a24fdf9c73c71 ]---

# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 5bb0643c4108bb06d8766b4bd48d20215deef4af f17b5f06cb92ef2250513a1e154c47b78df07d40 --
git bisect bad 8e26062e1c829f1656e91461f95a7b83bda16ffd # 02:34 B 0 10 25 0 Merge 'tip/ras/core' into devel-hourly-2019021719
git bisect bad 39b94eff9f252bd7b6f2dfe716f6b5dd894ada6f # 02:49 B 0 4 19 0 Merge 'sunxi/sunxi/h3-h5-for-5.1' into devel-hourly-2019021719
git bisect bad cce96fc008ac0e3a5f96280557b02dcb83e70eee # 03:02 B 0 10 25 0 Merge 'linux-review/Gustavo-A-R-Silva/igc-Use-struct_size-helper/20190208-163630' into devel-hourly-2019021719
git bisect bad 544d67be09fcf4054db60b0b2b6fcb7386c095fe # 03:13 B 0 7 22 0 Merge 'linux-review/Noralf-Tr-nnes/drm-drv-Rework-drm_dev_unplug-was-Remove-drm_dev_unplug/20190208-223952' into devel-hourly-2019021719
git bisect good 6dfcfd278beadb8857b94c0382348625943044be # 03:25 G 11 0 0 0 Merge 'linux-review/Qing-Xia/staging-android-ion-fix-sys-heap-pool-s-gfp_flags/20190204-124705' into devel-hourly-2019021719
git bisect bad 238358184e8bfb7c34701fc858f93400ffd8207d # 03:35 B 0 10 25 0 Merge 'linux-review/Colin-King-via-dri-devel/video-fbdev-savage-fix-indentation-issue/20190212-234031' into devel-hourly-2019021719
git bisect good 8833753cc966fbe02ec9dadcd73601f23da7dc2d # 03:44 G 10 0 0 0 Merge 'linux-review/Kamalesh-Babulal/static_keys-txt-Fix-trivial-spelling-mistake/20190204-230620' into devel-hourly-2019021719
git bisect bad efcb5c0b0e4e5bd29320ef5d7ef3e0654c182abf # 03:52 B 0 8 23 0 Merge 'net/master' into devel-hourly-2019021719
git bisect good 9312d5340da6a6018c851d03107ae24ef1a7ccb5 # 04:08 G 11 0 0 0 Merge 'linux-review/Yuri-Benditovich/virtio_net-Introduce-extended-RSC-feature/20190204-114604' into devel-hourly-2019021719
git bisect bad 680905431b9de8c7224b15b76b1826a1481cfeaf # 04:18 B 0 9 24 0 Merge tag 'char-misc-5.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
git bisect bad b9de6efed25cb713c1648e71302f4af83bd14ee6 # 04:31 B 0 11 26 0 Merge branch 'akpm' (patches from Andrew)
git bisect good 44e56f325b7d63e8a53008956ce7b28e4272a599 # 04:39 G 11 0 0 0 Merge tag 'pci-v5.0-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
git bisect good a8e911d13540487942d53137c156bd7707f66e5d # 04:50 G 10 0 0 0 x86_64: increase stack size for KASAN_EXTRA
git bisect good cd984a5be21549273a3f13b52a8b7b84097b32a7 # 05:01 G 11 0 0 0 Merge tag 'xtensa-20190201' of git://github.com/jcmvbkbc/linux-xtensa
git bisect bad db7ddeab3ce5d64c9696e70d61f45ea9909cd196 # 05:10 B 0 7 22 0 lib/test_kmod.c: potential double free in error handling
git bisect bad 24feb47c5fa5b825efb0151f28906dfdad027e61 # 05:20 B 0 4 19 0 mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone
git bisect good 80409c65e2c6cd1540045ee01fc55e50d95e0983 # 05:50 G 11 0 1 1 mm: migrate: make buffer_migrate_page_norefs() actually succeed
git bisect bad efad4e475c312456edb3c789d0996d12ed744c13 # 06:03 B 0 3 18 0 mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
git bisect good 9bcdeb51bd7d2ae9fe65ea4d60643d2aeef5bfe3 # 06:25 G 11 0 0 0 oom, oom_reaper: do not enqueue same task twice
# first bad commit: [efad4e475c312456edb3c789d0996d12ed744c13] mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
git bisect good 9bcdeb51bd7d2ae9fe65ea4d60643d2aeef5bfe3 # 06:29 G 31 0 0 0 oom, oom_reaper: do not enqueue same task twice
# extra tests with debug options
git bisect bad efad4e475c312456edb3c789d0996d12ed744c13 # 06:50 B 0 2 17 0 mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
# extra tests on HEAD of linux-devel/devel-hourly-2019021719
git bisect bad 5bb0643c4108bb06d8766b4bd48d20215deef4af # 06:55 B 0 12 31 1 0day head guard for 'devel-hourly-2019021719'
# extra tests on tree/branch linus/master
git bisect good f17b5f06cb92ef2250513a1e154c47b78df07d40 # 06:56 G 10 0 0 6 Linux 5.0-rc4
# extra tests with first bad commit reverted
git bisect good cc8685c9af14503b93c6aca3330789384fcb62ac # 07:25 G 10 0 0 0 Revert "mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone"
# extra tests on tree/branch linux-next/master
git bisect bad 7a92eb7cc1dc4c63e3a2fa9ab8e3c1049f199249 # 07:50 B 0 10 25 0 Add linux-next specific files for 20190215

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/lkp                        Intel Corporation
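
For readers unfamiliar with the approach named in the quoted commit message ("enforcing zone range ... so that we never run away from a zone"), the idea can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel patch: `struct zone_span`, `span_end_pfn()` and `clamp_block_end()` are hypothetical names invented here, and the real code operates on `struct zone` and `struct page`.

```c
#include <assert.h>

/* Hypothetical stand-in for a zone: only its page frame number (PFN) range. */
struct zone_span {
    unsigned long start_pfn;      /* first PFN that belongs to the zone */
    unsigned long spanned_pages;  /* number of PFNs the zone spans */
};

/* Exclusive end PFN of the zone. */
static unsigned long span_end_pfn(const struct zone_span *z)
{
    return z->start_pfn + z->spanned_pages;
}

/*
 * Exclusive end PFN to scan for a memory block that starts at start_pfn
 * and covers nr_pages, clamped so the scan stops at the zone boundary
 * instead of walking into pages that were never initialized for this zone.
 * Assumption: the block starts inside the zone.
 */
static unsigned long clamp_block_end(const struct zone_span *z,
                                     unsigned long start_pfn,
                                     unsigned long nr_pages)
{
    unsigned long block_end = start_pfn + nr_pages;
    unsigned long zone_end = span_end_pfn(z);

    assert(start_pfn >= z->start_pfn && start_pfn < zone_end);
    return block_end < zone_end ? block_end : zone_end;
}
```

With a zone spanning PFNs 0..99, a 64-page block starting at PFN 64 would be scanned only up to PFN 100, while a block starting at PFN 0 is scanned in full. Note that the sketch assumes the first page of the block is itself inside the zone; it says nothing about what happens when that assumption does not hold.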