* [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) @ 2016-07-20 11:44 Kirill A. Shutemov 2016-07-20 11:53 ` Michal Hocko 0 siblings, 1 reply; 8+ messages in thread From: Kirill A. Shutemov @ 2016-07-20 11:44 UTC (permalink / raw) To: linux-mm, akpm; +Cc: mhocko, riel, rientjes, vbabka, mgorman Hello, Looks like current mmotm is broken. See trace below. It's easy to reproduce in my setup: virtual machine with some amount of swap space and try allocate about the size of RAM in userspace (I used usemem[1] for that). Any clues? [1] http://www.spinics.net/lists/linux-mm/attachments/gtarazbJaHPaAT.gtar [ 39.413099] kswapd2: page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) [ 39.414122] CPU: 2 PID: 64 Comm: kswapd2 Not tainted 4.7.0-rc7-mm1-00428-gc3e13e4dab1b-dirty #2878 [ 39.416018] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 [ 39.416018] 0000000000000002 ffff88002f807690 ffffffff81c8fb0d 1ffff10005f00ed6 [ 39.416018] 0000000000000000 0000000000000002 ffff88002f807750 ffff88002f8077a8 [ 39.416018] ffffffff813e728b ffff88002f8077a8 0200000000000000 0000000041b58ab3 [ 39.416018] Call Trace: [ 39.416018] <IRQ> [<ffffffff81c8fb0d>] dump_stack+0x95/0xe8 [ 39.416018] [<ffffffff813e728b>] warn_alloc_failed+0x1cb/0x250 [ 39.416018] [<ffffffff813e70c0>] ? zone_watermark_ok_safe+0x250/0x250 [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 [ 39.416018] [<ffffffff813e7f4c>] __alloc_pages_nodemask+0x92c/0x1fe0 [ 39.416018] [<ffffffff8119047c>] ? sched_clock_cpu+0x12c/0x1e0 [ 39.416018] [<ffffffff81d24a17>] ? depot_save_stack+0x1b7/0x5b0 [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 [ 39.416018] [<ffffffff81313595>] ? is_ftrace_trampoline+0xe5/0x120 [ 39.416018] [<ffffffff813e7620>] ? __free_pages+0x90/0x90 [ 39.416018] [<ffffffff811f9870>] ? debug_show_all_locks+0x290/0x290 [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 [ 39.416018] [<ffffffff81245617>] ? debug_lockdep_rcu_enabled+0x77/0x90 [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 [ 39.416018] [<ffffffff8105d05b>] ? print_context_stack+0x7b/0x100 [ 39.416018] [<ffffffff814d42dc>] alloc_pages_current+0xbc/0x1f0 [ 39.416018] [<ffffffff81d24d5f>] depot_save_stack+0x4ff/0x5b0 [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 [ 39.416018] [<ffffffff814eb4c7>] kasan_slab_free+0x157/0x180 [ 39.416018] [<ffffffff8107c58b>] ? save_stack_trace+0x2b/0x50 [ 39.416018] [<ffffffff814eb453>] ? kasan_slab_free+0xe3/0x180 [ 39.416018] [<ffffffff814e73e5>] ? kmem_cache_free+0x95/0x300 [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 [ 39.416018] [<ffffffff813d59a9>] ? mempool_free+0xd9/0x1d0 [ 39.416018] [<ffffffff81be96e5>] ? bio_free+0x145/0x220 [ 39.416018] [<ffffffff81bea8bf>] ? bio_put+0x8f/0xb0 [ 39.416018] [<ffffffff814a7bfe>] ? end_swap_bio_write+0x22e/0x310 [ 39.416018] [<ffffffff81bf1687>] ? bio_endio+0x187/0x1f0 [ 39.416018] [<ffffffff81c0e89b>] ? blk_update_request+0x1bb/0xc30 [ 39.416018] [<ffffffff81c3238c>] ? blk_mq_end_request+0x4c/0x130 [ 39.416018] [<ffffffff8208a330>] ? virtblk_request_done+0xb0/0x2a0 [ 39.416018] [<ffffffff81c2d17d>] ? __blk_mq_complete_request_remote+0x5d/0x70 [ 39.416018] [<ffffffff8129fe3c>] ? flush_smp_call_function_queue+0xdc/0x3a0 [ 39.416018] [<ffffffff812a0548>] ? generic_smp_call_function_single_interrupt+0x18/0x20 [ 39.416018] [<ffffffff8109c654>] ? smp_call_function_single_interrupt+0x64/0x90 [ 39.416018] [<ffffffff829584a9>] ? call_function_single_interrupt+0x89/0x90 [ 39.416018] [<ffffffff81c350f6>] ? blk_mq_map_request+0xe6/0xc00 [ 39.416018] [<ffffffff81c36f6f>] ? blk_sq_make_request+0x9af/0xca0 [ 39.416018] [<ffffffff81c0b05e>] ? generic_make_request+0x30e/0x660 [ 39.416018] [<ffffffff81c0b540>] ? submit_bio+0x190/0x470 [ 39.416018] [<ffffffff814a8fd8>] ? __swap_writepage+0x6e8/0x940 [ 39.416018] [<ffffffff814a926a>] ? swap_writepage+0x3a/0x70 [ 39.416018] [<ffffffff8141376b>] ? shrink_page_list+0x1bdb/0x2f00 [ 39.416018] [<ffffffff81416038>] ? shrink_inactive_list+0x538/0xc70 [ 39.416018] [<ffffffff81417a1b>] ? shrink_node_memcg+0xa1b/0x1160 [ 39.416018] [<ffffffff81418436>] ? shrink_node+0x2d6/0xc60 [ 39.416018] [<ffffffff8141bf1e>] ? kswapd+0x82e/0x1460 [ 39.416018] [<ffffffff81156d4a>] ? kthread+0x24a/0x2e0 [ 39.416018] [<ffffffff8295773f>] ? ret_from_fork+0x1f/0x40 [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 [ 39.416018] [<ffffffff814e73c5>] ? kmem_cache_free+0x75/0x300 [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 [ 39.416018] [<ffffffff814e747f>] ? kmem_cache_free+0x12f/0x300 [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 [ 39.416018] [<ffffffff814e73e5>] kmem_cache_free+0x95/0x300 [ 39.416018] [<ffffffff813d5aa0>] ? mempool_free+0x1d0/0x1d0 [ 39.416018] [<ffffffff813d5ac2>] mempool_free_slab+0x22/0x30 [ 39.416018] [<ffffffff813d59a9>] mempool_free+0xd9/0x1d0 [ 39.416018] [<ffffffff81be96e5>] bio_free+0x145/0x220 [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 [ 39.416018] [<ffffffff81bea8bf>] bio_put+0x8f/0xb0 [ 39.416018] [<ffffffff814a7bfe>] end_swap_bio_write+0x22e/0x310 [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 [ 39.416018] [<ffffffff81bf1687>] bio_endio+0x187/0x1f0 [ 39.416018] [<ffffffff81c0e89b>] blk_update_request+0x1bb/0xc30 [ 39.416018] [<ffffffff81c2d120>] ? blkdev_issue_zeroout+0x3f0/0x3f0 [ 39.416018] [<ffffffff81c3238c>] blk_mq_end_request+0x4c/0x130 [ 39.416018] [<ffffffff8208a330>] virtblk_request_done+0xb0/0x2a0 [ 39.416018] [<ffffffff81c2d17d>] __blk_mq_complete_request_remote+0x5d/0x70 [ 39.416018] [<ffffffff8129fe3c>] flush_smp_call_function_queue+0xdc/0x3a0 [ 39.416018] [<ffffffff812a0548>] generic_smp_call_function_single_interrupt+0x18/0x20 [ 39.416018] [<ffffffff8109c654>] smp_call_function_single_interrupt+0x64/0x90 [ 39.416018] [<ffffffff829584a9>] call_function_single_interrupt+0x89/0x90 [ 39.416018] <EOI> [<ffffffff8120138b>] ? lock_acquire+0x15b/0x340 [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 [ 39.416018] [<ffffffff81c350f6>] blk_mq_map_request+0xe6/0xc00 [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 [ 39.416018] [<ffffffff81c8d434>] ? blk_integrity_merge_bio+0xb4/0x3b0 [ 39.416018] [<ffffffff81c35010>] ? blk_mq_alloc_request+0x490/0x490 [ 39.416018] [<ffffffff81c0dd66>] ? blk_attempt_plug_merge+0x226/0x2c0 [ 39.416018] [<ffffffff81c36f6f>] blk_sq_make_request+0x9af/0xca0 [ 39.416018] [<ffffffff81c365c0>] ? blk_mq_insert_requests+0x940/0x940 [ 39.416018] [<ffffffff81c07d20>] ? blk_exit_rl+0x60/0x60 [ 39.416018] [<ffffffff81c027b0>] ? handle_bad_sector+0x1e0/0x1e0 [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81c0b05e>] generic_make_request+0x30e/0x660 [ 39.416018] [<ffffffff81c0ad50>] ? blk_plug_queued_count+0x160/0x160 [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 [ 39.416018] [<ffffffff8152890b>] ? unlock_page_memcg+0x7b/0x130 [ 39.416018] [<ffffffff81c0b540>] submit_bio+0x190/0x470 [ 39.416018] [<ffffffff811dff60>] ? woken_wake_function+0x60/0x60 [ 39.416018] [<ffffffff81c0b3b0>] ? generic_make_request+0x660/0x660 [ 39.416018] [<ffffffff813f769d>] ? __test_set_page_writeback+0x36d/0x8c0 [ 39.416018] [<ffffffff814a8fd8>] __swap_writepage+0x6e8/0x940 [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 [ 39.416018] [<ffffffff814a88f0>] ? generic_swapfile_activate+0x490/0x490 [ 39.416018] [<ffffffff814abd45>] ? swap_info_get+0x165/0x240 [ 39.416018] [<ffffffff814affda>] ? page_swapcount+0xba/0xf0 [ 39.416018] [<ffffffff82956ba1>] ? _raw_spin_unlock+0x31/0x50 [ 39.416018] [<ffffffff814affdf>] ? page_swapcount+0xbf/0xf0 [ 39.416018] [<ffffffff814a926a>] swap_writepage+0x3a/0x70 [ 39.416018] [<ffffffff8141376b>] shrink_page_list+0x1bdb/0x2f00 [ 39.416018] [<ffffffff81411b90>] ? putback_lru_page+0x3b0/0x3b0 [ 39.416018] [<ffffffff81cef9ac>] ? __this_cpu_preempt_check+0x1c/0x20 [ 39.416018] [<ffffffff81438ed4>] ? __mod_node_page_state+0x94/0xe0 [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 [ 39.416018] [<ffffffff8140fcb0>] ? __isolate_lru_page+0x3b0/0x3b0 [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 [ 39.416018] [<ffffffff81416038>] shrink_inactive_list+0x538/0xc70 [ 39.416018] [<ffffffff81415b00>] ? putback_inactive_pages+0xaa0/0xaa0 [ 39.416018] [<ffffffff81416770>] ? shrink_inactive_list+0xc70/0xc70 [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cd5780>] ? _find_next_bit.part.0+0xe0/0x120 [ 39.416018] [<ffffffff8140e654>] ? pgdat_reclaimable_pages+0x764/0x9d0 [ 39.416018] [<ffffffff8140f2ec>] ? pgdat_reclaimable+0x13c/0x1d0 [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 [ 39.416018] [<ffffffff81417a1b>] shrink_node_memcg+0xa1b/0x1160 [ 39.416018] [<ffffffff81417000>] ? shrink_active_list+0x890/0x890 [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 [ 39.416018] [<ffffffff8151f208>] ? mem_cgroup_iter+0x1b8/0xd10 [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 [ 39.416018] [<ffffffff81418436>] shrink_node+0x2d6/0xc60 [ 39.416018] [<ffffffff81418160>] ? shrink_node_memcg+0x1160/0x1160 [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 [ 39.416018] [<ffffffff8141bf1e>] kswapd+0x82e/0x1460 [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 [ 39.416018] [<ffffffff82956c37>] ? _raw_spin_unlock_irq+0x37/0x50 [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 [ 39.416018] [<ffffffff811de8f0>] ? __wake_up_common+0x150/0x150 [ 39.416018] [<ffffffff82948c3e>] ? __schedule+0x55e/0x1b60 [ 39.416018] [<ffffffff81156a32>] ? __kthread_parkme+0x172/0x240 [ 39.416018] [<ffffffff81156d4a>] kthread+0x24a/0x2e0 [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 [ 39.416018] [<ffffffff81171c9c>] ? finish_task_switch+0x14c/0x5b0 [ 39.416018] [<ffffffff8295773f>] ret_from_fork+0x1f/0x40 [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 -- Kirill A. Shutemov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) 2016-07-20 11:44 [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) Kirill A. Shutemov @ 2016-07-20 11:53 ` Michal Hocko 2016-07-20 13:39 ` Vlastimil Babka 0 siblings, 1 reply; 8+ messages in thread From: Michal Hocko @ 2016-07-20 11:53 UTC (permalink / raw) To: Kirill A. Shutemov; +Cc: linux-mm, akpm, riel, rientjes, vbabka, mgorman On Wed 20-07-16 14:44:17, Kirill A. Shutemov wrote: > Hello, > > Looks like current mmotm is broken. See trace below. Why do you think it is broken? This is order-2 NOWAIT allocation. So we are relying on atomic highorder reserve and kcompactd to make sufficient progress. It is hard to find out more without the full log including the meminfo. > It's easy to reproduce in my setup: virtual machine with some amount of > swap space and try allocate about the size of RAM in userspace (I used > usemem[1] for that). Have you tried to bisect it? Some of the recent compaction changes might have made a difference. > Any clues? > > [1] http://www.spinics.net/lists/linux-mm/attachments/gtarazbJaHPaAT.gtar > > [ 39.413099] kswapd2: page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) > [ 39.414122] CPU: 2 PID: 64 Comm: kswapd2 Not tainted 4.7.0-rc7-mm1-00428-gc3e13e4dab1b-dirty #2878 > [ 39.416018] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 > [ 39.416018] 0000000000000002 ffff88002f807690 ffffffff81c8fb0d 1ffff10005f00ed6 > [ 39.416018] 0000000000000000 0000000000000002 ffff88002f807750 ffff88002f8077a8 > [ 39.416018] ffffffff813e728b ffff88002f8077a8 0200000000000000 0000000041b58ab3 > [ 39.416018] Call Trace: > [ 39.416018] <IRQ> [<ffffffff81c8fb0d>] dump_stack+0x95/0xe8 > [ 39.416018] [<ffffffff813e728b>] warn_alloc_failed+0x1cb/0x250 > [ 39.416018] [<ffffffff813e70c0>] ? zone_watermark_ok_safe+0x250/0x250 > [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 > [ 39.416018] [<ffffffff813e7f4c>] __alloc_pages_nodemask+0x92c/0x1fe0 > [ 39.416018] [<ffffffff8119047c>] ? sched_clock_cpu+0x12c/0x1e0 > [ 39.416018] [<ffffffff81d24a17>] ? depot_save_stack+0x1b7/0x5b0 > [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 > [ 39.416018] [<ffffffff81313595>] ? is_ftrace_trampoline+0xe5/0x120 > [ 39.416018] [<ffffffff813e7620>] ? __free_pages+0x90/0x90 > [ 39.416018] [<ffffffff811f9870>] ? debug_show_all_locks+0x290/0x290 > [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 > [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 > [ 39.416018] [<ffffffff81245617>] ? debug_lockdep_rcu_enabled+0x77/0x90 > [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 > [ 39.416018] [<ffffffff8105d05b>] ? print_context_stack+0x7b/0x100 > [ 39.416018] [<ffffffff814d42dc>] alloc_pages_current+0xbc/0x1f0 > [ 39.416018] [<ffffffff81d24d5f>] depot_save_stack+0x4ff/0x5b0 > [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 > [ 39.416018] [<ffffffff814eb4c7>] kasan_slab_free+0x157/0x180 > [ 39.416018] [<ffffffff8107c58b>] ? save_stack_trace+0x2b/0x50 > [ 39.416018] [<ffffffff814eb453>] ? kasan_slab_free+0xe3/0x180 > [ 39.416018] [<ffffffff814e73e5>] ? kmem_cache_free+0x95/0x300 > [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 > [ 39.416018] [<ffffffff813d59a9>] ? mempool_free+0xd9/0x1d0 > [ 39.416018] [<ffffffff81be96e5>] ? bio_free+0x145/0x220 > [ 39.416018] [<ffffffff81bea8bf>] ? bio_put+0x8f/0xb0 > [ 39.416018] [<ffffffff814a7bfe>] ? end_swap_bio_write+0x22e/0x310 > [ 39.416018] [<ffffffff81bf1687>] ? bio_endio+0x187/0x1f0 > [ 39.416018] [<ffffffff81c0e89b>] ? blk_update_request+0x1bb/0xc30 > [ 39.416018] [<ffffffff81c3238c>] ? blk_mq_end_request+0x4c/0x130 > [ 39.416018] [<ffffffff8208a330>] ? virtblk_request_done+0xb0/0x2a0 > [ 39.416018] [<ffffffff81c2d17d>] ? __blk_mq_complete_request_remote+0x5d/0x70 > [ 39.416018] [<ffffffff8129fe3c>] ? flush_smp_call_function_queue+0xdc/0x3a0 > [ 39.416018] [<ffffffff812a0548>] ? generic_smp_call_function_single_interrupt+0x18/0x20 > [ 39.416018] [<ffffffff8109c654>] ? smp_call_function_single_interrupt+0x64/0x90 > [ 39.416018] [<ffffffff829584a9>] ? call_function_single_interrupt+0x89/0x90 > [ 39.416018] [<ffffffff81c350f6>] ? blk_mq_map_request+0xe6/0xc00 > [ 39.416018] [<ffffffff81c36f6f>] ? blk_sq_make_request+0x9af/0xca0 > [ 39.416018] [<ffffffff81c0b05e>] ? generic_make_request+0x30e/0x660 > [ 39.416018] [<ffffffff81c0b540>] ? submit_bio+0x190/0x470 > [ 39.416018] [<ffffffff814a8fd8>] ? __swap_writepage+0x6e8/0x940 > [ 39.416018] [<ffffffff814a926a>] ? swap_writepage+0x3a/0x70 > [ 39.416018] [<ffffffff8141376b>] ? shrink_page_list+0x1bdb/0x2f00 > [ 39.416018] [<ffffffff81416038>] ? shrink_inactive_list+0x538/0xc70 > [ 39.416018] [<ffffffff81417a1b>] ? shrink_node_memcg+0xa1b/0x1160 > [ 39.416018] [<ffffffff81418436>] ? shrink_node+0x2d6/0xc60 > [ 39.416018] [<ffffffff8141bf1e>] ? kswapd+0x82e/0x1460 > [ 39.416018] [<ffffffff81156d4a>] ? kthread+0x24a/0x2e0 > [ 39.416018] [<ffffffff8295773f>] ? ret_from_fork+0x1f/0x40 > [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 > [ 39.416018] [<ffffffff814e73c5>] ? kmem_cache_free+0x75/0x300 > [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 > [ 39.416018] [<ffffffff814e747f>] ? kmem_cache_free+0x12f/0x300 > [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 > [ 39.416018] [<ffffffff814e73e5>] kmem_cache_free+0x95/0x300 > [ 39.416018] [<ffffffff813d5aa0>] ? mempool_free+0x1d0/0x1d0 > [ 39.416018] [<ffffffff813d5ac2>] mempool_free_slab+0x22/0x30 > [ 39.416018] [<ffffffff813d59a9>] mempool_free+0xd9/0x1d0 > [ 39.416018] [<ffffffff81be96e5>] bio_free+0x145/0x220 > [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 > [ 39.416018] [<ffffffff81bea8bf>] bio_put+0x8f/0xb0 > [ 39.416018] [<ffffffff814a7bfe>] end_swap_bio_write+0x22e/0x310 > [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 > [ 39.416018] [<ffffffff81bf1687>] bio_endio+0x187/0x1f0 > [ 39.416018] [<ffffffff81c0e89b>] blk_update_request+0x1bb/0xc30 > [ 39.416018] [<ffffffff81c2d120>] ? blkdev_issue_zeroout+0x3f0/0x3f0 > [ 39.416018] [<ffffffff81c3238c>] blk_mq_end_request+0x4c/0x130 > [ 39.416018] [<ffffffff8208a330>] virtblk_request_done+0xb0/0x2a0 > [ 39.416018] [<ffffffff81c2d17d>] __blk_mq_complete_request_remote+0x5d/0x70 > [ 39.416018] [<ffffffff8129fe3c>] flush_smp_call_function_queue+0xdc/0x3a0 > [ 39.416018] [<ffffffff812a0548>] generic_smp_call_function_single_interrupt+0x18/0x20 > [ 39.416018] [<ffffffff8109c654>] smp_call_function_single_interrupt+0x64/0x90 > [ 39.416018] [<ffffffff829584a9>] call_function_single_interrupt+0x89/0x90 > [ 39.416018] <EOI> [<ffffffff8120138b>] ? lock_acquire+0x15b/0x340 > [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 > [ 39.416018] [<ffffffff81c350f6>] blk_mq_map_request+0xe6/0xc00 > [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 > [ 39.416018] [<ffffffff81c8d434>] ? blk_integrity_merge_bio+0xb4/0x3b0 > [ 39.416018] [<ffffffff81c35010>] ? blk_mq_alloc_request+0x490/0x490 > [ 39.416018] [<ffffffff81c0dd66>] ? blk_attempt_plug_merge+0x226/0x2c0 > [ 39.416018] [<ffffffff81c36f6f>] blk_sq_make_request+0x9af/0xca0 > [ 39.416018] [<ffffffff81c365c0>] ? blk_mq_insert_requests+0x940/0x940 > [ 39.416018] [<ffffffff81c07d20>] ? blk_exit_rl+0x60/0x60 > [ 39.416018] [<ffffffff81c027b0>] ? handle_bad_sector+0x1e0/0x1e0 > [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81c0b05e>] generic_make_request+0x30e/0x660 > [ 39.416018] [<ffffffff81c0ad50>] ? blk_plug_queued_count+0x160/0x160 > [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 > [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 > [ 39.416018] [<ffffffff8152890b>] ? unlock_page_memcg+0x7b/0x130 > [ 39.416018] [<ffffffff81c0b540>] submit_bio+0x190/0x470 > [ 39.416018] [<ffffffff811dff60>] ? woken_wake_function+0x60/0x60 > [ 39.416018] [<ffffffff81c0b3b0>] ? generic_make_request+0x660/0x660 > [ 39.416018] [<ffffffff813f769d>] ? __test_set_page_writeback+0x36d/0x8c0 > [ 39.416018] [<ffffffff814a8fd8>] __swap_writepage+0x6e8/0x940 > [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 > [ 39.416018] [<ffffffff814a88f0>] ? generic_swapfile_activate+0x490/0x490 > [ 39.416018] [<ffffffff814abd45>] ? swap_info_get+0x165/0x240 > [ 39.416018] [<ffffffff814affda>] ? page_swapcount+0xba/0xf0 > [ 39.416018] [<ffffffff82956ba1>] ? _raw_spin_unlock+0x31/0x50 > [ 39.416018] [<ffffffff814affdf>] ? page_swapcount+0xbf/0xf0 > [ 39.416018] [<ffffffff814a926a>] swap_writepage+0x3a/0x70 > [ 39.416018] [<ffffffff8141376b>] shrink_page_list+0x1bdb/0x2f00 > [ 39.416018] [<ffffffff81411b90>] ? putback_lru_page+0x3b0/0x3b0 > [ 39.416018] [<ffffffff81cef9ac>] ? __this_cpu_preempt_check+0x1c/0x20 > [ 39.416018] [<ffffffff81438ed4>] ? __mod_node_page_state+0x94/0xe0 > [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 > [ 39.416018] [<ffffffff8140fcb0>] ? __isolate_lru_page+0x3b0/0x3b0 > [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 > [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 > [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 > [ 39.416018] [<ffffffff81416038>] shrink_inactive_list+0x538/0xc70 > [ 39.416018] [<ffffffff81415b00>] ? putback_inactive_pages+0xaa0/0xaa0 > [ 39.416018] [<ffffffff81416770>] ? shrink_inactive_list+0xc70/0xc70 > [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cd5780>] ? _find_next_bit.part.0+0xe0/0x120 > [ 39.416018] [<ffffffff8140e654>] ? pgdat_reclaimable_pages+0x764/0x9d0 > [ 39.416018] [<ffffffff8140f2ec>] ? pgdat_reclaimable+0x13c/0x1d0 > [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 > [ 39.416018] [<ffffffff81417a1b>] shrink_node_memcg+0xa1b/0x1160 > [ 39.416018] [<ffffffff81417000>] ? shrink_active_list+0x890/0x890 > [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 > [ 39.416018] [<ffffffff8151f208>] ? mem_cgroup_iter+0x1b8/0xd10 > [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 > [ 39.416018] [<ffffffff81418436>] shrink_node+0x2d6/0xc60 > [ 39.416018] [<ffffffff81418160>] ? shrink_node_memcg+0x1160/0x1160 > [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 > [ 39.416018] [<ffffffff8141bf1e>] kswapd+0x82e/0x1460 > [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 > [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 > [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 > [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 > [ 39.416018] [<ffffffff82956c37>] ? _raw_spin_unlock_irq+0x37/0x50 > [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 > [ 39.416018] [<ffffffff811de8f0>] ? __wake_up_common+0x150/0x150 > [ 39.416018] [<ffffffff82948c3e>] ? __schedule+0x55e/0x1b60 > [ 39.416018] [<ffffffff81156a32>] ? __kthread_parkme+0x172/0x240 > [ 39.416018] [<ffffffff81156d4a>] kthread+0x24a/0x2e0 > [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 > [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 > [ 39.416018] [<ffffffff81171c9c>] ? finish_task_switch+0x14c/0x5b0 > [ 39.416018] [<ffffffff8295773f>] ret_from_fork+0x1f/0x40 > [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 > > -- > Kirill A. Shutemov -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) 2016-07-20 11:53 ` Michal Hocko @ 2016-07-20 13:39 ` Vlastimil Babka 2016-07-20 15:19 ` Kirill A. Shutemov 0 siblings, 1 reply; 8+ messages in thread From: Vlastimil Babka @ 2016-07-20 13:39 UTC (permalink / raw) To: Michal Hocko, Kirill A. Shutemov; +Cc: linux-mm, akpm, riel, rientjes, mgorman On 07/20/2016 01:53 PM, Michal Hocko wrote: > On Wed 20-07-16 14:44:17, Kirill A. Shutemov wrote: >> Hello, >> >> Looks like current mmotm is broken. See trace below. > > Why do you think it is broken? This is order-2 NOWAIT allocation. So we > are relying on atomic highorder reserve and kcompactd to make sufficient > progress. It is hard to find out more without the full log including the > meminfo. Also it seems to come from kasan allocating stackdepot space to record who freed a slab object, or something. >> It's easy to reproduce in my setup: virtual machine with some amount of >> swap space and try allocate about the size of RAM in userspace (I used >> usemem[1] for that). > > Have you tried to bisect it? Some of the recent compaction changes might > have made a difference. AFAIK recent compaction changes are not in mmotm yet. The node-based lru reclaim might have shifted some balances perhaps. >> Any clues? >> >> [1] http://www.spinics.net/lists/linux-mm/attachments/gtarazbJaHPaAT.gtar >> >> [ 39.413099] kswapd2: page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) >> [ 39.414122] CPU: 2 PID: 64 Comm: kswapd2 Not tainted 4.7.0-rc7-mm1-00428-gc3e13e4dab1b-dirty #2878 >> [ 39.416018] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 >> [ 39.416018] 0000000000000002 ffff88002f807690 ffffffff81c8fb0d 1ffff10005f00ed6 >> [ 39.416018] 0000000000000000 0000000000000002 ffff88002f807750 ffff88002f8077a8 >> [ 39.416018] ffffffff813e728b ffff88002f8077a8 0200000000000000 0000000041b58ab3 >> [ 39.416018] Call Trace: >> [ 39.416018] <IRQ> [<ffffffff81c8fb0d>] dump_stack+0x95/0xe8 >> [ 39.416018] [<ffffffff813e728b>] warn_alloc_failed+0x1cb/0x250 >> [ 39.416018] [<ffffffff813e70c0>] ? zone_watermark_ok_safe+0x250/0x250 >> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 >> [ 39.416018] [<ffffffff813e7f4c>] __alloc_pages_nodemask+0x92c/0x1fe0 >> [ 39.416018] [<ffffffff8119047c>] ? sched_clock_cpu+0x12c/0x1e0 >> [ 39.416018] [<ffffffff81d24a17>] ? depot_save_stack+0x1b7/0x5b0 >> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >> [ 39.416018] [<ffffffff81313595>] ? is_ftrace_trampoline+0xe5/0x120 >> [ 39.416018] [<ffffffff813e7620>] ? __free_pages+0x90/0x90 >> [ 39.416018] [<ffffffff811f9870>] ? debug_show_all_locks+0x290/0x290 >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 >> [ 39.416018] [<ffffffff81245617>] ? debug_lockdep_rcu_enabled+0x77/0x90 >> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 >> [ 39.416018] [<ffffffff8105d05b>] ? print_context_stack+0x7b/0x100 >> [ 39.416018] [<ffffffff814d42dc>] alloc_pages_current+0xbc/0x1f0 >> [ 39.416018] [<ffffffff81d24d5f>] depot_save_stack+0x4ff/0x5b0 >> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >> [ 39.416018] [<ffffffff814eb4c7>] kasan_slab_free+0x157/0x180 >> [ 39.416018] [<ffffffff8107c58b>] ? save_stack_trace+0x2b/0x50 >> [ 39.416018] [<ffffffff814eb453>] ? kasan_slab_free+0xe3/0x180 >> [ 39.416018] [<ffffffff814e73e5>] ? kmem_cache_free+0x95/0x300 >> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >> [ 39.416018] [<ffffffff813d59a9>] ? mempool_free+0xd9/0x1d0 >> [ 39.416018] [<ffffffff81be96e5>] ? bio_free+0x145/0x220 >> [ 39.416018] [<ffffffff81bea8bf>] ? bio_put+0x8f/0xb0 >> [ 39.416018] [<ffffffff814a7bfe>] ? end_swap_bio_write+0x22e/0x310 >> [ 39.416018] [<ffffffff81bf1687>] ? bio_endio+0x187/0x1f0 >> [ 39.416018] [<ffffffff81c0e89b>] ? blk_update_request+0x1bb/0xc30 >> [ 39.416018] [<ffffffff81c3238c>] ? blk_mq_end_request+0x4c/0x130 >> [ 39.416018] [<ffffffff8208a330>] ? virtblk_request_done+0xb0/0x2a0 >> [ 39.416018] [<ffffffff81c2d17d>] ? __blk_mq_complete_request_remote+0x5d/0x70 >> [ 39.416018] [<ffffffff8129fe3c>] ? flush_smp_call_function_queue+0xdc/0x3a0 >> [ 39.416018] [<ffffffff812a0548>] ? generic_smp_call_function_single_interrupt+0x18/0x20 >> [ 39.416018] [<ffffffff8109c654>] ? smp_call_function_single_interrupt+0x64/0x90 >> [ 39.416018] [<ffffffff829584a9>] ? call_function_single_interrupt+0x89/0x90 >> [ 39.416018] [<ffffffff81c350f6>] ? blk_mq_map_request+0xe6/0xc00 >> [ 39.416018] [<ffffffff81c36f6f>] ? blk_sq_make_request+0x9af/0xca0 >> [ 39.416018] [<ffffffff81c0b05e>] ? generic_make_request+0x30e/0x660 >> [ 39.416018] [<ffffffff81c0b540>] ? submit_bio+0x190/0x470 >> [ 39.416018] [<ffffffff814a8fd8>] ? __swap_writepage+0x6e8/0x940 >> [ 39.416018] [<ffffffff814a926a>] ? swap_writepage+0x3a/0x70 >> [ 39.416018] [<ffffffff8141376b>] ? shrink_page_list+0x1bdb/0x2f00 >> [ 39.416018] [<ffffffff81416038>] ? shrink_inactive_list+0x538/0xc70 >> [ 39.416018] [<ffffffff81417a1b>] ? shrink_node_memcg+0xa1b/0x1160 >> [ 39.416018] [<ffffffff81418436>] ? shrink_node+0x2d6/0xc60 >> [ 39.416018] [<ffffffff8141bf1e>] ? kswapd+0x82e/0x1460 >> [ 39.416018] [<ffffffff81156d4a>] ? kthread+0x24a/0x2e0 >> [ 39.416018] [<ffffffff8295773f>] ? ret_from_fork+0x1f/0x40 >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 >> [ 39.416018] [<ffffffff814e73c5>] ? kmem_cache_free+0x75/0x300 >> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 >> [ 39.416018] [<ffffffff814e747f>] ? kmem_cache_free+0x12f/0x300 >> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >> [ 39.416018] [<ffffffff814e73e5>] kmem_cache_free+0x95/0x300 >> [ 39.416018] [<ffffffff813d5aa0>] ? mempool_free+0x1d0/0x1d0 >> [ 39.416018] [<ffffffff813d5ac2>] mempool_free_slab+0x22/0x30 >> [ 39.416018] [<ffffffff813d59a9>] mempool_free+0xd9/0x1d0 >> [ 39.416018] [<ffffffff81be96e5>] bio_free+0x145/0x220 >> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 >> [ 39.416018] [<ffffffff81bea8bf>] bio_put+0x8f/0xb0 >> [ 39.416018] [<ffffffff814a7bfe>] end_swap_bio_write+0x22e/0x310 >> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 >> [ 39.416018] [<ffffffff81bf1687>] bio_endio+0x187/0x1f0 >> [ 39.416018] [<ffffffff81c0e89b>] blk_update_request+0x1bb/0xc30 >> [ 39.416018] [<ffffffff81c2d120>] ? blkdev_issue_zeroout+0x3f0/0x3f0 >> [ 39.416018] [<ffffffff81c3238c>] blk_mq_end_request+0x4c/0x130 >> [ 39.416018] [<ffffffff8208a330>] virtblk_request_done+0xb0/0x2a0 >> [ 39.416018] [<ffffffff81c2d17d>] __blk_mq_complete_request_remote+0x5d/0x70 >> [ 39.416018] [<ffffffff8129fe3c>] flush_smp_call_function_queue+0xdc/0x3a0 >> [ 39.416018] [<ffffffff812a0548>] generic_smp_call_function_single_interrupt+0x18/0x20 >> [ 39.416018] [<ffffffff8109c654>] smp_call_function_single_interrupt+0x64/0x90 >> [ 39.416018] [<ffffffff829584a9>] call_function_single_interrupt+0x89/0x90 >> [ 39.416018] <EOI> [<ffffffff8120138b>] ? lock_acquire+0x15b/0x340 >> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 >> [ 39.416018] [<ffffffff81c350f6>] blk_mq_map_request+0xe6/0xc00 >> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 >> [ 39.416018] [<ffffffff81c8d434>] ? blk_integrity_merge_bio+0xb4/0x3b0 >> [ 39.416018] [<ffffffff81c35010>] ? blk_mq_alloc_request+0x490/0x490 >> [ 39.416018] [<ffffffff81c0dd66>] ? blk_attempt_plug_merge+0x226/0x2c0 >> [ 39.416018] [<ffffffff81c36f6f>] blk_sq_make_request+0x9af/0xca0 >> [ 39.416018] [<ffffffff81c365c0>] ? blk_mq_insert_requests+0x940/0x940 >> [ 39.416018] [<ffffffff81c07d20>] ? blk_exit_rl+0x60/0x60 >> [ 39.416018] [<ffffffff81c027b0>] ? handle_bad_sector+0x1e0/0x1e0 >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81c0b05e>] generic_make_request+0x30e/0x660 >> [ 39.416018] [<ffffffff81c0ad50>] ? blk_plug_queued_count+0x160/0x160 >> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 >> [ 39.416018] [<ffffffff8152890b>] ? unlock_page_memcg+0x7b/0x130 >> [ 39.416018] [<ffffffff81c0b540>] submit_bio+0x190/0x470 >> [ 39.416018] [<ffffffff811dff60>] ? woken_wake_function+0x60/0x60 >> [ 39.416018] [<ffffffff81c0b3b0>] ? generic_make_request+0x660/0x660 >> [ 39.416018] [<ffffffff813f769d>] ? __test_set_page_writeback+0x36d/0x8c0 >> [ 39.416018] [<ffffffff814a8fd8>] __swap_writepage+0x6e8/0x940 >> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 >> [ 39.416018] [<ffffffff814a88f0>] ? generic_swapfile_activate+0x490/0x490 >> [ 39.416018] [<ffffffff814abd45>] ? swap_info_get+0x165/0x240 >> [ 39.416018] [<ffffffff814affda>] ? page_swapcount+0xba/0xf0 >> [ 39.416018] [<ffffffff82956ba1>] ? _raw_spin_unlock+0x31/0x50 >> [ 39.416018] [<ffffffff814affdf>] ? page_swapcount+0xbf/0xf0 >> [ 39.416018] [<ffffffff814a926a>] swap_writepage+0x3a/0x70 >> [ 39.416018] [<ffffffff8141376b>] shrink_page_list+0x1bdb/0x2f00 >> [ 39.416018] [<ffffffff81411b90>] ? putback_lru_page+0x3b0/0x3b0 >> [ 39.416018] [<ffffffff81cef9ac>] ? __this_cpu_preempt_check+0x1c/0x20 >> [ 39.416018] [<ffffffff81438ed4>] ? __mod_node_page_state+0x94/0xe0 >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 >> [ 39.416018] [<ffffffff8140fcb0>] ? __isolate_lru_page+0x3b0/0x3b0 >> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 >> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 >> [ 39.416018] [<ffffffff81416038>] shrink_inactive_list+0x538/0xc70 >> [ 39.416018] [<ffffffff81415b00>] ? putback_inactive_pages+0xaa0/0xaa0 >> [ 39.416018] [<ffffffff81416770>] ? shrink_inactive_list+0xc70/0xc70 >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cd5780>] ? _find_next_bit.part.0+0xe0/0x120 >> [ 39.416018] [<ffffffff8140e654>] ? pgdat_reclaimable_pages+0x764/0x9d0 >> [ 39.416018] [<ffffffff8140f2ec>] ? pgdat_reclaimable+0x13c/0x1d0 >> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 >> [ 39.416018] [<ffffffff81417a1b>] shrink_node_memcg+0xa1b/0x1160 >> [ 39.416018] [<ffffffff81417000>] ? shrink_active_list+0x890/0x890 >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >> [ 39.416018] [<ffffffff8151f208>] ? mem_cgroup_iter+0x1b8/0xd10 >> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 >> [ 39.416018] [<ffffffff81418436>] shrink_node+0x2d6/0xc60 >> [ 39.416018] [<ffffffff81418160>] ? shrink_node_memcg+0x1160/0x1160 >> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 >> [ 39.416018] [<ffffffff8141bf1e>] kswapd+0x82e/0x1460 >> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 >> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 >> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 >> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 >> [ 39.416018] [<ffffffff82956c37>] ? _raw_spin_unlock_irq+0x37/0x50 >> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 >> [ 39.416018] [<ffffffff811de8f0>] ? __wake_up_common+0x150/0x150 >> [ 39.416018] [<ffffffff82948c3e>] ? __schedule+0x55e/0x1b60 >> [ 39.416018] [<ffffffff81156a32>] ? __kthread_parkme+0x172/0x240 >> [ 39.416018] [<ffffffff81156d4a>] kthread+0x24a/0x2e0 >> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 >> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 >> [ 39.416018] [<ffffffff81171c9c>] ? finish_task_switch+0x14c/0x5b0 >> [ 39.416018] [<ffffffff8295773f>] ret_from_fork+0x1f/0x40 >> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 >> >> -- >> Kirill A. Shutemov > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) 2016-07-20 13:39 ` Vlastimil Babka @ 2016-07-20 15:19 ` Kirill A. Shutemov 2016-07-20 15:33 ` Vlastimil Babka 0 siblings, 1 reply; 8+ messages in thread From: Kirill A. Shutemov @ 2016-07-20 15:19 UTC (permalink / raw) To: Vlastimil Babka, Alexander Potapenko, Michal Hocko Cc: linux-mm, akpm, riel, rientjes, mgorman On Wed, Jul 20, 2016 at 03:39:24PM +0200, Vlastimil Babka wrote: > On 07/20/2016 01:53 PM, Michal Hocko wrote: > > On Wed 20-07-16 14:44:17, Kirill A. Shutemov wrote: > >> Hello, > >> > >> Looks like current mmotm is broken. See trace below. > > > > Why do you think it is broken? This is order-2 NOWAIT allocation. So we > > are relying on atomic highorder reserve and kcompactd to make sufficient > > progress. It is hard to find out more without the full log including the > > meminfo. > > Also it seems to come from kasan allocating stackdepot space to record > who freed a slab object, or something. > > >> It's easy to reproduce in my setup: virtual machine with some amount of > >> swap space and try allocate about the size of RAM in userspace (I used > >> usemem[1] for that). > > > > Have you tried to bisect it? Bisected to a590d2628f08 ("mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB"). I guess it's candidate for __GFP_WARN. Not sure if there's a better solution. This helps: diff --git a/lib/stackdepot.c b/lib/stackdepot.c index 53ad6c0831ae..60f77f1d470a 100644 --- a/lib/stackdepot.c +++ b/lib/stackdepot.c @@ -242,6 +242,7 @@ depot_stack_handle_t depot_save_stack(struct stack_trace *trace, */ alloc_flags &= ~GFP_ZONEMASK; alloc_flags &= (GFP_ATOMIC | GFP_KERNEL); + alloc_flags |= __GFP_NOWARN; page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER); if (page) prealloc = page_address(page); > > Some of the recent compaction changes might have made a difference. > > AFAIK recent compaction changes are not in mmotm yet. The node-based lru > reclaim might have shifted some balances perhaps. > > >> Any clues? > >> > >> [1] http://www.spinics.net/lists/linux-mm/attachments/gtarazbJaHPaAT.gtar > >> > >> [ 39.413099] kswapd2: page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) > >> [ 39.414122] CPU: 2 PID: 64 Comm: kswapd2 Not tainted 4.7.0-rc7-mm1-00428-gc3e13e4dab1b-dirty #2878 > >> [ 39.416018] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 > >> [ 39.416018] 0000000000000002 ffff88002f807690 ffffffff81c8fb0d 1ffff10005f00ed6 > >> [ 39.416018] 0000000000000000 0000000000000002 ffff88002f807750 ffff88002f8077a8 > >> [ 39.416018] ffffffff813e728b ffff88002f8077a8 0200000000000000 0000000041b58ab3 > >> [ 39.416018] Call Trace: > >> [ 39.416018] <IRQ> [<ffffffff81c8fb0d>] dump_stack+0x95/0xe8 > >> [ 39.416018] [<ffffffff813e728b>] warn_alloc_failed+0x1cb/0x250 > >> [ 39.416018] [<ffffffff813e70c0>] ? zone_watermark_ok_safe+0x250/0x250 > >> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 > >> [ 39.416018] [<ffffffff813e7f4c>] __alloc_pages_nodemask+0x92c/0x1fe0 > >> [ 39.416018] [<ffffffff8119047c>] ? sched_clock_cpu+0x12c/0x1e0 > >> [ 39.416018] [<ffffffff81d24a17>] ? depot_save_stack+0x1b7/0x5b0 > >> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 > >> [ 39.416018] [<ffffffff81313595>] ? is_ftrace_trampoline+0xe5/0x120 > >> [ 39.416018] [<ffffffff813e7620>] ? __free_pages+0x90/0x90 > >> [ 39.416018] [<ffffffff811f9870>] ? debug_show_all_locks+0x290/0x290 > >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 > >> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 > >> [ 39.416018] [<ffffffff81245617>] ? debug_lockdep_rcu_enabled+0x77/0x90 > >> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 > >> [ 39.416018] [<ffffffff8105d05b>] ? print_context_stack+0x7b/0x100 > >> [ 39.416018] [<ffffffff814d42dc>] alloc_pages_current+0xbc/0x1f0 > >> [ 39.416018] [<ffffffff81d24d5f>] depot_save_stack+0x4ff/0x5b0 > >> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 > >> [ 39.416018] [<ffffffff814eb4c7>] kasan_slab_free+0x157/0x180 > >> [ 39.416018] [<ffffffff8107c58b>] ? save_stack_trace+0x2b/0x50 > >> [ 39.416018] [<ffffffff814eb453>] ? kasan_slab_free+0xe3/0x180 > >> [ 39.416018] [<ffffffff814e73e5>] ? kmem_cache_free+0x95/0x300 > >> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 > >> [ 39.416018] [<ffffffff813d59a9>] ? mempool_free+0xd9/0x1d0 > >> [ 39.416018] [<ffffffff81be96e5>] ? bio_free+0x145/0x220 > >> [ 39.416018] [<ffffffff81bea8bf>] ? bio_put+0x8f/0xb0 > >> [ 39.416018] [<ffffffff814a7bfe>] ? end_swap_bio_write+0x22e/0x310 > >> [ 39.416018] [<ffffffff81bf1687>] ? bio_endio+0x187/0x1f0 > >> [ 39.416018] [<ffffffff81c0e89b>] ? blk_update_request+0x1bb/0xc30 > >> [ 39.416018] [<ffffffff81c3238c>] ? blk_mq_end_request+0x4c/0x130 > >> [ 39.416018] [<ffffffff8208a330>] ? virtblk_request_done+0xb0/0x2a0 > >> [ 39.416018] [<ffffffff81c2d17d>] ? __blk_mq_complete_request_remote+0x5d/0x70 > >> [ 39.416018] [<ffffffff8129fe3c>] ? flush_smp_call_function_queue+0xdc/0x3a0 > >> [ 39.416018] [<ffffffff812a0548>] ? generic_smp_call_function_single_interrupt+0x18/0x20 > >> [ 39.416018] [<ffffffff8109c654>] ? smp_call_function_single_interrupt+0x64/0x90 > >> [ 39.416018] [<ffffffff829584a9>] ? call_function_single_interrupt+0x89/0x90 > >> [ 39.416018] [<ffffffff81c350f6>] ? blk_mq_map_request+0xe6/0xc00 > >> [ 39.416018] [<ffffffff81c36f6f>] ? blk_sq_make_request+0x9af/0xca0 > >> [ 39.416018] [<ffffffff81c0b05e>] ? generic_make_request+0x30e/0x660 > >> [ 39.416018] [<ffffffff81c0b540>] ? submit_bio+0x190/0x470 > >> [ 39.416018] [<ffffffff814a8fd8>] ? __swap_writepage+0x6e8/0x940 > >> [ 39.416018] [<ffffffff814a926a>] ? swap_writepage+0x3a/0x70 > >> [ 39.416018] [<ffffffff8141376b>] ? shrink_page_list+0x1bdb/0x2f00 > >> [ 39.416018] [<ffffffff81416038>] ? shrink_inactive_list+0x538/0xc70 > >> [ 39.416018] [<ffffffff81417a1b>] ? shrink_node_memcg+0xa1b/0x1160 > >> [ 39.416018] [<ffffffff81418436>] ? shrink_node+0x2d6/0xc60 > >> [ 39.416018] [<ffffffff8141bf1e>] ? kswapd+0x82e/0x1460 > >> [ 39.416018] [<ffffffff81156d4a>] ? kthread+0x24a/0x2e0 > >> [ 39.416018] [<ffffffff8295773f>] ? ret_from_fork+0x1f/0x40 > >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 > >> [ 39.416018] [<ffffffff814e73c5>] ? kmem_cache_free+0x75/0x300 > >> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 > >> [ 39.416018] [<ffffffff814e747f>] ? kmem_cache_free+0x12f/0x300 > >> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 > >> [ 39.416018] [<ffffffff814e73e5>] kmem_cache_free+0x95/0x300 > >> [ 39.416018] [<ffffffff813d5aa0>] ? mempool_free+0x1d0/0x1d0 > >> [ 39.416018] [<ffffffff813d5ac2>] mempool_free_slab+0x22/0x30 > >> [ 39.416018] [<ffffffff813d59a9>] mempool_free+0xd9/0x1d0 > >> [ 39.416018] [<ffffffff81be96e5>] bio_free+0x145/0x220 > >> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 > >> [ 39.416018] [<ffffffff81bea8bf>] bio_put+0x8f/0xb0 > >> [ 39.416018] [<ffffffff814a7bfe>] end_swap_bio_write+0x22e/0x310 > >> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 > >> [ 39.416018] [<ffffffff81bf1687>] bio_endio+0x187/0x1f0 > >> [ 39.416018] [<ffffffff81c0e89b>] blk_update_request+0x1bb/0xc30 > >> [ 39.416018] [<ffffffff81c2d120>] ? blkdev_issue_zeroout+0x3f0/0x3f0 > >> [ 39.416018] [<ffffffff81c3238c>] blk_mq_end_request+0x4c/0x130 > >> [ 39.416018] [<ffffffff8208a330>] virtblk_request_done+0xb0/0x2a0 > >> [ 39.416018] [<ffffffff81c2d17d>] __blk_mq_complete_request_remote+0x5d/0x70 > >> [ 39.416018] [<ffffffff8129fe3c>] flush_smp_call_function_queue+0xdc/0x3a0 > >> [ 39.416018] [<ffffffff812a0548>] generic_smp_call_function_single_interrupt+0x18/0x20 > >> [ 39.416018] [<ffffffff8109c654>] smp_call_function_single_interrupt+0x64/0x90 > >> [ 39.416018] [<ffffffff829584a9>] call_function_single_interrupt+0x89/0x90 > >> [ 39.416018] <EOI> [<ffffffff8120138b>] ? lock_acquire+0x15b/0x340 > >> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 > >> [ 39.416018] [<ffffffff81c350f6>] blk_mq_map_request+0xe6/0xc00 > >> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 > >> [ 39.416018] [<ffffffff81c8d434>] ? blk_integrity_merge_bio+0xb4/0x3b0 > >> [ 39.416018] [<ffffffff81c35010>] ? blk_mq_alloc_request+0x490/0x490 > >> [ 39.416018] [<ffffffff81c0dd66>] ? blk_attempt_plug_merge+0x226/0x2c0 > >> [ 39.416018] [<ffffffff81c36f6f>] blk_sq_make_request+0x9af/0xca0 > >> [ 39.416018] [<ffffffff81c365c0>] ? blk_mq_insert_requests+0x940/0x940 > >> [ 39.416018] [<ffffffff81c07d20>] ? blk_exit_rl+0x60/0x60 > >> [ 39.416018] [<ffffffff81c027b0>] ? handle_bad_sector+0x1e0/0x1e0 > >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81c0b05e>] generic_make_request+0x30e/0x660 > >> [ 39.416018] [<ffffffff81c0ad50>] ? blk_plug_queued_count+0x160/0x160 > >> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 > >> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 > >> [ 39.416018] [<ffffffff8152890b>] ? unlock_page_memcg+0x7b/0x130 > >> [ 39.416018] [<ffffffff81c0b540>] submit_bio+0x190/0x470 > >> [ 39.416018] [<ffffffff811dff60>] ? woken_wake_function+0x60/0x60 > >> [ 39.416018] [<ffffffff81c0b3b0>] ? generic_make_request+0x660/0x660 > >> [ 39.416018] [<ffffffff813f769d>] ? __test_set_page_writeback+0x36d/0x8c0 > >> [ 39.416018] [<ffffffff814a8fd8>] __swap_writepage+0x6e8/0x940 > >> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 > >> [ 39.416018] [<ffffffff814a88f0>] ? generic_swapfile_activate+0x490/0x490 > >> [ 39.416018] [<ffffffff814abd45>] ? swap_info_get+0x165/0x240 > >> [ 39.416018] [<ffffffff814affda>] ? page_swapcount+0xba/0xf0 > >> [ 39.416018] [<ffffffff82956ba1>] ? _raw_spin_unlock+0x31/0x50 > >> [ 39.416018] [<ffffffff814affdf>] ? page_swapcount+0xbf/0xf0 > >> [ 39.416018] [<ffffffff814a926a>] swap_writepage+0x3a/0x70 > >> [ 39.416018] [<ffffffff8141376b>] shrink_page_list+0x1bdb/0x2f00 > >> [ 39.416018] [<ffffffff81411b90>] ? putback_lru_page+0x3b0/0x3b0 > >> [ 39.416018] [<ffffffff81cef9ac>] ? __this_cpu_preempt_check+0x1c/0x20 > >> [ 39.416018] [<ffffffff81438ed4>] ? __mod_node_page_state+0x94/0xe0 > >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 > >> [ 39.416018] [<ffffffff8140fcb0>] ? __isolate_lru_page+0x3b0/0x3b0 > >> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 > >> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 > >> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 > >> [ 39.416018] [<ffffffff81416038>] shrink_inactive_list+0x538/0xc70 > >> [ 39.416018] [<ffffffff81415b00>] ? putback_inactive_pages+0xaa0/0xaa0 > >> [ 39.416018] [<ffffffff81416770>] ? shrink_inactive_list+0xc70/0xc70 > >> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cd5780>] ? _find_next_bit.part.0+0xe0/0x120 > >> [ 39.416018] [<ffffffff8140e654>] ? pgdat_reclaimable_pages+0x764/0x9d0 > >> [ 39.416018] [<ffffffff8140f2ec>] ? pgdat_reclaimable+0x13c/0x1d0 > >> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 > >> [ 39.416018] [<ffffffff81417a1b>] shrink_node_memcg+0xa1b/0x1160 > >> [ 39.416018] [<ffffffff81417000>] ? shrink_active_list+0x890/0x890 > >> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 > >> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 > >> [ 39.416018] [<ffffffff8151f208>] ? mem_cgroup_iter+0x1b8/0xd10 > >> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 > >> [ 39.416018] [<ffffffff81418436>] shrink_node+0x2d6/0xc60 > >> [ 39.416018] [<ffffffff81418160>] ? shrink_node_memcg+0x1160/0x1160 > >> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 > >> [ 39.416018] [<ffffffff8141bf1e>] kswapd+0x82e/0x1460 > >> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 > >> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 > >> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 > >> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 > >> [ 39.416018] [<ffffffff82956c37>] ? _raw_spin_unlock_irq+0x37/0x50 > >> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 > >> [ 39.416018] [<ffffffff811de8f0>] ? __wake_up_common+0x150/0x150 > >> [ 39.416018] [<ffffffff82948c3e>] ? __schedule+0x55e/0x1b60 > >> [ 39.416018] [<ffffffff81156a32>] ? __kthread_parkme+0x172/0x240 > >> [ 39.416018] [<ffffffff81156d4a>] kthread+0x24a/0x2e0 > >> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 > >> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 > >> [ 39.416018] [<ffffffff81171c9c>] ? finish_task_switch+0x14c/0x5b0 > >> [ 39.416018] [<ffffffff8295773f>] ret_from_fork+0x1f/0x40 > >> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 > >> > >> -- > >> Kirill A. Shutemov > > > -- Kirill A. Shutemov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) 2016-07-20 15:19 ` Kirill A. Shutemov @ 2016-07-20 15:33 ` Vlastimil Babka 2016-07-20 18:12 ` Alexander Potapenko 0 siblings, 1 reply; 8+ messages in thread From: Vlastimil Babka @ 2016-07-20 15:33 UTC (permalink / raw) To: Kirill A. Shutemov, Alexander Potapenko, Michal Hocko Cc: linux-mm, akpm, riel, rientjes, mgorman On 07/20/2016 05:19 PM, Kirill A. Shutemov wrote: > On Wed, Jul 20, 2016 at 03:39:24PM +0200, Vlastimil Babka wrote: >> On 07/20/2016 01:53 PM, Michal Hocko wrote: >>> On Wed 20-07-16 14:44:17, Kirill A. Shutemov wrote: >>>> Hello, >>>> >>>> Looks like current mmotm is broken. See trace below. >>> >>> Why do you think it is broken? This is order-2 NOWAIT allocation. So we >>> are relying on atomic highorder reserve and kcompactd to make sufficient >>> progress. It is hard to find out more without the full log including the >>> meminfo. >> >> Also it seems to come from kasan allocating stackdepot space to record >> who freed a slab object, or something. >> >>>> It's easy to reproduce in my setup: virtual machine with some amount of >>>> swap space and try allocate about the size of RAM in userspace (I used >>>> usemem[1] for that). >>> >>> Have you tried to bisect it? > > Bisected to a590d2628f08 ("mm, kasan: switch SLUB to stackdepot, enable > memory quarantine for SLUB"). > > I guess it's candidate for __GFP_WARN. Not sure if there's a better > solution. An order-0 fallback maybe? Agree with NOWARN, if stackdepot (or its users) are able to tell that a trace is missing because allocation has failed - the precise allocation trace isn't that useful I guess. Order-2 allocation that's potentially atomic and frequent just can fail. > This helps: > > diff --git a/lib/stackdepot.c b/lib/stackdepot.c > index 53ad6c0831ae..60f77f1d470a 100644 > --- a/lib/stackdepot.c > +++ b/lib/stackdepot.c > @@ -242,6 +242,7 @@ depot_stack_handle_t depot_save_stack(struct stack_trace *trace, > */ > alloc_flags &= ~GFP_ZONEMASK; > alloc_flags &= (GFP_ATOMIC | GFP_KERNEL); > + alloc_flags |= __GFP_NOWARN; > page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER); > if (page) > prealloc = page_address(page); > >>> Some of the recent compaction changes might have made a difference. >> >> AFAIK recent compaction changes are not in mmotm yet. The node-based lru >> reclaim might have shifted some balances perhaps. >> >>>> Any clues? >>>> >>>> [1] http://www.spinics.net/lists/linux-mm/attachments/gtarazbJaHPaAT.gtar >>>> >>>> [ 39.413099] kswapd2: page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) >>>> [ 39.414122] CPU: 2 PID: 64 Comm: kswapd2 Not tainted 4.7.0-rc7-mm1-00428-gc3e13e4dab1b-dirty #2878 >>>> [ 39.416018] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 >>>> [ 39.416018] 0000000000000002 ffff88002f807690 ffffffff81c8fb0d 1ffff10005f00ed6 >>>> [ 39.416018] 0000000000000000 0000000000000002 ffff88002f807750 ffff88002f8077a8 >>>> [ 39.416018] ffffffff813e728b ffff88002f8077a8 0200000000000000 0000000041b58ab3 >>>> [ 39.416018] Call Trace: >>>> [ 39.416018] <IRQ> [<ffffffff81c8fb0d>] dump_stack+0x95/0xe8 >>>> [ 39.416018] [<ffffffff813e728b>] warn_alloc_failed+0x1cb/0x250 >>>> [ 39.416018] [<ffffffff813e70c0>] ? zone_watermark_ok_safe+0x250/0x250 >>>> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 >>>> [ 39.416018] [<ffffffff813e7f4c>] __alloc_pages_nodemask+0x92c/0x1fe0 >>>> [ 39.416018] [<ffffffff8119047c>] ? sched_clock_cpu+0x12c/0x1e0 >>>> [ 39.416018] [<ffffffff81d24a17>] ? depot_save_stack+0x1b7/0x5b0 >>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >>>> [ 39.416018] [<ffffffff81313595>] ? is_ftrace_trampoline+0xe5/0x120 >>>> [ 39.416018] [<ffffffff813e7620>] ? __free_pages+0x90/0x90 >>>> [ 39.416018] [<ffffffff811f9870>] ? debug_show_all_locks+0x290/0x290 >>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >>>> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 >>>> [ 39.416018] [<ffffffff81245617>] ? debug_lockdep_rcu_enabled+0x77/0x90 >>>> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 >>>> [ 39.416018] [<ffffffff8105d05b>] ? print_context_stack+0x7b/0x100 >>>> [ 39.416018] [<ffffffff814d42dc>] alloc_pages_current+0xbc/0x1f0 >>>> [ 39.416018] [<ffffffff81d24d5f>] depot_save_stack+0x4ff/0x5b0 >>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >>>> [ 39.416018] [<ffffffff814eb4c7>] kasan_slab_free+0x157/0x180 >>>> [ 39.416018] [<ffffffff8107c58b>] ? save_stack_trace+0x2b/0x50 >>>> [ 39.416018] [<ffffffff814eb453>] ? kasan_slab_free+0xe3/0x180 >>>> [ 39.416018] [<ffffffff814e73e5>] ? kmem_cache_free+0x95/0x300 >>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >>>> [ 39.416018] [<ffffffff813d59a9>] ? mempool_free+0xd9/0x1d0 >>>> [ 39.416018] [<ffffffff81be96e5>] ? bio_free+0x145/0x220 >>>> [ 39.416018] [<ffffffff81bea8bf>] ? bio_put+0x8f/0xb0 >>>> [ 39.416018] [<ffffffff814a7bfe>] ? end_swap_bio_write+0x22e/0x310 >>>> [ 39.416018] [<ffffffff81bf1687>] ? bio_endio+0x187/0x1f0 >>>> [ 39.416018] [<ffffffff81c0e89b>] ? blk_update_request+0x1bb/0xc30 >>>> [ 39.416018] [<ffffffff81c3238c>] ? blk_mq_end_request+0x4c/0x130 >>>> [ 39.416018] [<ffffffff8208a330>] ? virtblk_request_done+0xb0/0x2a0 >>>> [ 39.416018] [<ffffffff81c2d17d>] ? __blk_mq_complete_request_remote+0x5d/0x70 >>>> [ 39.416018] [<ffffffff8129fe3c>] ? flush_smp_call_function_queue+0xdc/0x3a0 >>>> [ 39.416018] [<ffffffff812a0548>] ? generic_smp_call_function_single_interrupt+0x18/0x20 >>>> [ 39.416018] [<ffffffff8109c654>] ? smp_call_function_single_interrupt+0x64/0x90 >>>> [ 39.416018] [<ffffffff829584a9>] ? call_function_single_interrupt+0x89/0x90 >>>> [ 39.416018] [<ffffffff81c350f6>] ? blk_mq_map_request+0xe6/0xc00 >>>> [ 39.416018] [<ffffffff81c36f6f>] ? blk_sq_make_request+0x9af/0xca0 >>>> [ 39.416018] [<ffffffff81c0b05e>] ? generic_make_request+0x30e/0x660 >>>> [ 39.416018] [<ffffffff81c0b540>] ? submit_bio+0x190/0x470 >>>> [ 39.416018] [<ffffffff814a8fd8>] ? __swap_writepage+0x6e8/0x940 >>>> [ 39.416018] [<ffffffff814a926a>] ? swap_writepage+0x3a/0x70 >>>> [ 39.416018] [<ffffffff8141376b>] ? shrink_page_list+0x1bdb/0x2f00 >>>> [ 39.416018] [<ffffffff81416038>] ? shrink_inactive_list+0x538/0xc70 >>>> [ 39.416018] [<ffffffff81417a1b>] ? shrink_node_memcg+0xa1b/0x1160 >>>> [ 39.416018] [<ffffffff81418436>] ? shrink_node+0x2d6/0xc60 >>>> [ 39.416018] [<ffffffff8141bf1e>] ? kswapd+0x82e/0x1460 >>>> [ 39.416018] [<ffffffff81156d4a>] ? kthread+0x24a/0x2e0 >>>> [ 39.416018] [<ffffffff8295773f>] ? ret_from_fork+0x1f/0x40 >>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 >>>> [ 39.416018] [<ffffffff814e73c5>] ? kmem_cache_free+0x75/0x300 >>>> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 >>>> [ 39.416018] [<ffffffff814e747f>] ? kmem_cache_free+0x12f/0x300 >>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >>>> [ 39.416018] [<ffffffff814e73e5>] kmem_cache_free+0x95/0x300 >>>> [ 39.416018] [<ffffffff813d5aa0>] ? mempool_free+0x1d0/0x1d0 >>>> [ 39.416018] [<ffffffff813d5ac2>] mempool_free_slab+0x22/0x30 >>>> [ 39.416018] [<ffffffff813d59a9>] mempool_free+0xd9/0x1d0 >>>> [ 39.416018] [<ffffffff81be96e5>] bio_free+0x145/0x220 >>>> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 >>>> [ 39.416018] [<ffffffff81bea8bf>] bio_put+0x8f/0xb0 >>>> [ 39.416018] [<ffffffff814a7bfe>] end_swap_bio_write+0x22e/0x310 >>>> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 >>>> [ 39.416018] [<ffffffff81bf1687>] bio_endio+0x187/0x1f0 >>>> [ 39.416018] [<ffffffff81c0e89b>] blk_update_request+0x1bb/0xc30 >>>> [ 39.416018] [<ffffffff81c2d120>] ? blkdev_issue_zeroout+0x3f0/0x3f0 >>>> [ 39.416018] [<ffffffff81c3238c>] blk_mq_end_request+0x4c/0x130 >>>> [ 39.416018] [<ffffffff8208a330>] virtblk_request_done+0xb0/0x2a0 >>>> [ 39.416018] [<ffffffff81c2d17d>] __blk_mq_complete_request_remote+0x5d/0x70 >>>> [ 39.416018] [<ffffffff8129fe3c>] flush_smp_call_function_queue+0xdc/0x3a0 >>>> [ 39.416018] [<ffffffff812a0548>] generic_smp_call_function_single_interrupt+0x18/0x20 >>>> [ 39.416018] [<ffffffff8109c654>] smp_call_function_single_interrupt+0x64/0x90 >>>> [ 39.416018] [<ffffffff829584a9>] call_function_single_interrupt+0x89/0x90 >>>> [ 39.416018] <EOI> [<ffffffff8120138b>] ? lock_acquire+0x15b/0x340 >>>> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 >>>> [ 39.416018] [<ffffffff81c350f6>] blk_mq_map_request+0xe6/0xc00 >>>> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 >>>> [ 39.416018] [<ffffffff81c8d434>] ? blk_integrity_merge_bio+0xb4/0x3b0 >>>> [ 39.416018] [<ffffffff81c35010>] ? blk_mq_alloc_request+0x490/0x490 >>>> [ 39.416018] [<ffffffff81c0dd66>] ? blk_attempt_plug_merge+0x226/0x2c0 >>>> [ 39.416018] [<ffffffff81c36f6f>] blk_sq_make_request+0x9af/0xca0 >>>> [ 39.416018] [<ffffffff81c365c0>] ? blk_mq_insert_requests+0x940/0x940 >>>> [ 39.416018] [<ffffffff81c07d20>] ? blk_exit_rl+0x60/0x60 >>>> [ 39.416018] [<ffffffff81c027b0>] ? handle_bad_sector+0x1e0/0x1e0 >>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81c0b05e>] generic_make_request+0x30e/0x660 >>>> [ 39.416018] [<ffffffff81c0ad50>] ? blk_plug_queued_count+0x160/0x160 >>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >>>> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 >>>> [ 39.416018] [<ffffffff8152890b>] ? unlock_page_memcg+0x7b/0x130 >>>> [ 39.416018] [<ffffffff81c0b540>] submit_bio+0x190/0x470 >>>> [ 39.416018] [<ffffffff811dff60>] ? woken_wake_function+0x60/0x60 >>>> [ 39.416018] [<ffffffff81c0b3b0>] ? generic_make_request+0x660/0x660 >>>> [ 39.416018] [<ffffffff813f769d>] ? __test_set_page_writeback+0x36d/0x8c0 >>>> [ 39.416018] [<ffffffff814a8fd8>] __swap_writepage+0x6e8/0x940 >>>> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 >>>> [ 39.416018] [<ffffffff814a88f0>] ? generic_swapfile_activate+0x490/0x490 >>>> [ 39.416018] [<ffffffff814abd45>] ? swap_info_get+0x165/0x240 >>>> [ 39.416018] [<ffffffff814affda>] ? page_swapcount+0xba/0xf0 >>>> [ 39.416018] [<ffffffff82956ba1>] ? _raw_spin_unlock+0x31/0x50 >>>> [ 39.416018] [<ffffffff814affdf>] ? page_swapcount+0xbf/0xf0 >>>> [ 39.416018] [<ffffffff814a926a>] swap_writepage+0x3a/0x70 >>>> [ 39.416018] [<ffffffff8141376b>] shrink_page_list+0x1bdb/0x2f00 >>>> [ 39.416018] [<ffffffff81411b90>] ? putback_lru_page+0x3b0/0x3b0 >>>> [ 39.416018] [<ffffffff81cef9ac>] ? __this_cpu_preempt_check+0x1c/0x20 >>>> [ 39.416018] [<ffffffff81438ed4>] ? __mod_node_page_state+0x94/0xe0 >>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 >>>> [ 39.416018] [<ffffffff8140fcb0>] ? __isolate_lru_page+0x3b0/0x3b0 >>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >>>> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 >>>> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 >>>> [ 39.416018] [<ffffffff81416038>] shrink_inactive_list+0x538/0xc70 >>>> [ 39.416018] [<ffffffff81415b00>] ? putback_inactive_pages+0xaa0/0xaa0 >>>> [ 39.416018] [<ffffffff81416770>] ? shrink_inactive_list+0xc70/0xc70 >>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cd5780>] ? _find_next_bit.part.0+0xe0/0x120 >>>> [ 39.416018] [<ffffffff8140e654>] ? pgdat_reclaimable_pages+0x764/0x9d0 >>>> [ 39.416018] [<ffffffff8140f2ec>] ? pgdat_reclaimable+0x13c/0x1d0 >>>> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 >>>> [ 39.416018] [<ffffffff81417a1b>] shrink_node_memcg+0xa1b/0x1160 >>>> [ 39.416018] [<ffffffff81417000>] ? shrink_active_list+0x890/0x890 >>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >>>> [ 39.416018] [<ffffffff8151f208>] ? mem_cgroup_iter+0x1b8/0xd10 >>>> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 >>>> [ 39.416018] [<ffffffff81418436>] shrink_node+0x2d6/0xc60 >>>> [ 39.416018] [<ffffffff81418160>] ? shrink_node_memcg+0x1160/0x1160 >>>> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 >>>> [ 39.416018] [<ffffffff8141bf1e>] kswapd+0x82e/0x1460 >>>> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 >>>> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 >>>> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 >>>> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 >>>> [ 39.416018] [<ffffffff82956c37>] ? _raw_spin_unlock_irq+0x37/0x50 >>>> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 >>>> [ 39.416018] [<ffffffff811de8f0>] ? __wake_up_common+0x150/0x150 >>>> [ 39.416018] [<ffffffff82948c3e>] ? __schedule+0x55e/0x1b60 >>>> [ 39.416018] [<ffffffff81156a32>] ? __kthread_parkme+0x172/0x240 >>>> [ 39.416018] [<ffffffff81156d4a>] kthread+0x24a/0x2e0 >>>> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 >>>> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 >>>> [ 39.416018] [<ffffffff81171c9c>] ? finish_task_switch+0x14c/0x5b0 >>>> [ 39.416018] [<ffffffff8295773f>] ret_from_fork+0x1f/0x40 >>>> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 >>>> >>>> -- >>>> Kirill A. Shutemov >>> >> > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) 2016-07-20 15:33 ` Vlastimil Babka @ 2016-07-20 18:12 ` Alexander Potapenko 2016-07-20 22:34 ` Kirill A. Shutemov 0 siblings, 1 reply; 8+ messages in thread From: Alexander Potapenko @ 2016-07-20 18:12 UTC (permalink / raw) To: Vlastimil Babka Cc: Kirill A. Shutemov, Michal Hocko, Linux Memory Management List, Andrew Morton, riel, David Rientjes, mgorman On Wed, Jul 20, 2016 at 5:33 PM, Vlastimil Babka <vbabka@suse.cz> wrote: > On 07/20/2016 05:19 PM, Kirill A. Shutemov wrote: >> On Wed, Jul 20, 2016 at 03:39:24PM +0200, Vlastimil Babka wrote: >>> On 07/20/2016 01:53 PM, Michal Hocko wrote: >>>> On Wed 20-07-16 14:44:17, Kirill A. Shutemov wrote: >>>>> Hello, >>>>> >>>>> Looks like current mmotm is broken. See trace below. >>>> >>>> Why do you think it is broken? This is order-2 NOWAIT allocation. So we >>>> are relying on atomic highorder reserve and kcompactd to make sufficient >>>> progress. It is hard to find out more without the full log including the >>>> meminfo. >>> >>> Also it seems to come from kasan allocating stackdepot space to record >>> who freed a slab object, or something. >>> >>>>> It's easy to reproduce in my setup: virtual machine with some amount of >>>>> swap space and try allocate about the size of RAM in userspace (I used >>>>> usemem[1] for that). >>>> >>>> Have you tried to bisect it? >> >> Bisected to a590d2628f08 ("mm, kasan: switch SLUB to stackdepot, enable >> memory quarantine for SLUB"). >> >> I guess it's candidate for __GFP_WARN. Not sure if there's a better >> solution. > > An order-0 fallback maybe? > Agree with NOWARN, if stackdepot (or its users) are able to tell that a > trace is missing because allocation has failed - the precise allocation > trace isn't that useful I guess. Order-2 allocation that's potentially > atomic and frequent just can fail. > >> This helps: >> >> diff --git a/lib/stackdepot.c b/lib/stackdepot.c >> index 53ad6c0831ae..60f77f1d470a 100644 >> --- a/lib/stackdepot.c >> +++ b/lib/stackdepot.c >> @@ -242,6 +242,7 @@ depot_stack_handle_t depot_save_stack(struct stack_trace *trace, >> */ >> alloc_flags &= ~GFP_ZONEMASK; >> alloc_flags &= (GFP_ATOMIC | GFP_KERNEL); >> + alloc_flags |= __GFP_NOWARN; >> page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER); >> if (page) >> prealloc = page_address(page); >> >>>> Some of the recent compaction changes might have made a difference. >>> >>> AFAIK recent compaction changes are not in mmotm yet. The node-based lru >>> reclaim might have shifted some balances perhaps. >>> >>>>> Any clues? >>>>> >>>>> [1] http://www.spinics.net/lists/linux-mm/attachments/gtarazbJaHPaAT.gtar >>>>> >>>>> [ 39.413099] kswapd2: page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) >>>>> [ 39.414122] CPU: 2 PID: 64 Comm: kswapd2 Not tainted 4.7.0-rc7-mm1-00428-gc3e13e4dab1b-dirty #2878 >>>>> [ 39.416018] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 >>>>> [ 39.416018] 0000000000000002 ffff88002f807690 ffffffff81c8fb0d 1ffff10005f00ed6 >>>>> [ 39.416018] 0000000000000000 0000000000000002 ffff88002f807750 ffff88002f8077a8 >>>>> [ 39.416018] ffffffff813e728b ffff88002f8077a8 0200000000000000 0000000041b58ab3 >>>>> [ 39.416018] Call Trace: >>>>> [ 39.416018] <IRQ> [<ffffffff81c8fb0d>] dump_stack+0x95/0xe8 >>>>> [ 39.416018] [<ffffffff813e728b>] warn_alloc_failed+0x1cb/0x250 >>>>> [ 39.416018] [<ffffffff813e70c0>] ? zone_watermark_ok_safe+0x250/0x250 >>>>> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 >>>>> [ 39.416018] [<ffffffff813e7f4c>] __alloc_pages_nodemask+0x92c/0x1fe0 >>>>> [ 39.416018] [<ffffffff8119047c>] ? sched_clock_cpu+0x12c/0x1e0 >>>>> [ 39.416018] [<ffffffff81d24a17>] ? depot_save_stack+0x1b7/0x5b0 >>>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >>>>> [ 39.416018] [<ffffffff81313595>] ? is_ftrace_trampoline+0xe5/0x120 >>>>> [ 39.416018] [<ffffffff813e7620>] ? __free_pages+0x90/0x90 >>>>> [ 39.416018] [<ffffffff811f9870>] ? debug_show_all_locks+0x290/0x290 >>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >>>>> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 >>>>> [ 39.416018] [<ffffffff81245617>] ? debug_lockdep_rcu_enabled+0x77/0x90 >>>>> [ 39.416018] [<ffffffff81153788>] ? __kernel_text_address+0x78/0xa0 >>>>> [ 39.416018] [<ffffffff8105d05b>] ? print_context_stack+0x7b/0x100 >>>>> [ 39.416018] [<ffffffff814d42dc>] alloc_pages_current+0xbc/0x1f0 >>>>> [ 39.416018] [<ffffffff81d24d5f>] depot_save_stack+0x4ff/0x5b0 >>>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >>>>> [ 39.416018] [<ffffffff814eb4c7>] kasan_slab_free+0x157/0x180 >>>>> [ 39.416018] [<ffffffff8107c58b>] ? save_stack_trace+0x2b/0x50 >>>>> [ 39.416018] [<ffffffff814eb453>] ? kasan_slab_free+0xe3/0x180 >>>>> [ 39.416018] [<ffffffff814e73e5>] ? kmem_cache_free+0x95/0x300 >>>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >>>>> [ 39.416018] [<ffffffff813d59a9>] ? mempool_free+0xd9/0x1d0 >>>>> [ 39.416018] [<ffffffff81be96e5>] ? bio_free+0x145/0x220 >>>>> [ 39.416018] [<ffffffff81bea8bf>] ? bio_put+0x8f/0xb0 >>>>> [ 39.416018] [<ffffffff814a7bfe>] ? end_swap_bio_write+0x22e/0x310 >>>>> [ 39.416018] [<ffffffff81bf1687>] ? bio_endio+0x187/0x1f0 >>>>> [ 39.416018] [<ffffffff81c0e89b>] ? blk_update_request+0x1bb/0xc30 >>>>> [ 39.416018] [<ffffffff81c3238c>] ? blk_mq_end_request+0x4c/0x130 >>>>> [ 39.416018] [<ffffffff8208a330>] ? virtblk_request_done+0xb0/0x2a0 >>>>> [ 39.416018] [<ffffffff81c2d17d>] ? __blk_mq_complete_request_remote+0x5d/0x70 >>>>> [ 39.416018] [<ffffffff8129fe3c>] ? flush_smp_call_function_queue+0xdc/0x3a0 >>>>> [ 39.416018] [<ffffffff812a0548>] ? generic_smp_call_function_single_interrupt+0x18/0x20 >>>>> [ 39.416018] [<ffffffff8109c654>] ? smp_call_function_single_interrupt+0x64/0x90 >>>>> [ 39.416018] [<ffffffff829584a9>] ? call_function_single_interrupt+0x89/0x90 >>>>> [ 39.416018] [<ffffffff81c350f6>] ? blk_mq_map_request+0xe6/0xc00 >>>>> [ 39.416018] [<ffffffff81c36f6f>] ? blk_sq_make_request+0x9af/0xca0 >>>>> [ 39.416018] [<ffffffff81c0b05e>] ? generic_make_request+0x30e/0x660 >>>>> [ 39.416018] [<ffffffff81c0b540>] ? submit_bio+0x190/0x470 >>>>> [ 39.416018] [<ffffffff814a8fd8>] ? __swap_writepage+0x6e8/0x940 >>>>> [ 39.416018] [<ffffffff814a926a>] ? swap_writepage+0x3a/0x70 >>>>> [ 39.416018] [<ffffffff8141376b>] ? shrink_page_list+0x1bdb/0x2f00 >>>>> [ 39.416018] [<ffffffff81416038>] ? shrink_inactive_list+0x538/0xc70 >>>>> [ 39.416018] [<ffffffff81417a1b>] ? shrink_node_memcg+0xa1b/0x1160 >>>>> [ 39.416018] [<ffffffff81418436>] ? shrink_node+0x2d6/0xc60 >>>>> [ 39.416018] [<ffffffff8141bf1e>] ? kswapd+0x82e/0x1460 >>>>> [ 39.416018] [<ffffffff81156d4a>] ? kthread+0x24a/0x2e0 >>>>> [ 39.416018] [<ffffffff8295773f>] ? ret_from_fork+0x1f/0x40 >>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 >>>>> [ 39.416018] [<ffffffff814e73c5>] ? kmem_cache_free+0x75/0x300 >>>>> [ 39.416018] [<ffffffff81353e65>] ? time_hardirqs_off+0x45/0x2f0 >>>>> [ 39.416018] [<ffffffff814e747f>] ? kmem_cache_free+0x12f/0x300 >>>>> [ 39.416018] [<ffffffff813d5ac2>] ? mempool_free_slab+0x22/0x30 >>>>> [ 39.416018] [<ffffffff814e73e5>] kmem_cache_free+0x95/0x300 >>>>> [ 39.416018] [<ffffffff813d5aa0>] ? mempool_free+0x1d0/0x1d0 >>>>> [ 39.416018] [<ffffffff813d5ac2>] mempool_free_slab+0x22/0x30 >>>>> [ 39.416018] [<ffffffff813d59a9>] mempool_free+0xd9/0x1d0 >>>>> [ 39.416018] [<ffffffff81be96e5>] bio_free+0x145/0x220 >>>>> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 >>>>> [ 39.416018] [<ffffffff81bea8bf>] bio_put+0x8f/0xb0 >>>>> [ 39.416018] [<ffffffff814a7bfe>] end_swap_bio_write+0x22e/0x310 >>>>> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 >>>>> [ 39.416018] [<ffffffff81bf1687>] bio_endio+0x187/0x1f0 >>>>> [ 39.416018] [<ffffffff81c0e89b>] blk_update_request+0x1bb/0xc30 >>>>> [ 39.416018] [<ffffffff81c2d120>] ? blkdev_issue_zeroout+0x3f0/0x3f0 >>>>> [ 39.416018] [<ffffffff81c3238c>] blk_mq_end_request+0x4c/0x130 >>>>> [ 39.416018] [<ffffffff8208a330>] virtblk_request_done+0xb0/0x2a0 >>>>> [ 39.416018] [<ffffffff81c2d17d>] __blk_mq_complete_request_remote+0x5d/0x70 >>>>> [ 39.416018] [<ffffffff8129fe3c>] flush_smp_call_function_queue+0xdc/0x3a0 >>>>> [ 39.416018] [<ffffffff812a0548>] generic_smp_call_function_single_interrupt+0x18/0x20 >>>>> [ 39.416018] [<ffffffff8109c654>] smp_call_function_single_interrupt+0x64/0x90 >>>>> [ 39.416018] [<ffffffff829584a9>] call_function_single_interrupt+0x89/0x90 >>>>> [ 39.416018] <EOI> [<ffffffff8120138b>] ? lock_acquire+0x15b/0x340 >>>>> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 >>>>> [ 39.416018] [<ffffffff81c350f6>] blk_mq_map_request+0xe6/0xc00 >>>>> [ 39.416018] [<ffffffff81c350a9>] ? blk_mq_map_request+0x99/0xc00 >>>>> [ 39.416018] [<ffffffff81c8d434>] ? blk_integrity_merge_bio+0xb4/0x3b0 >>>>> [ 39.416018] [<ffffffff81c35010>] ? blk_mq_alloc_request+0x490/0x490 >>>>> [ 39.416018] [<ffffffff81c0dd66>] ? blk_attempt_plug_merge+0x226/0x2c0 >>>>> [ 39.416018] [<ffffffff81c36f6f>] blk_sq_make_request+0x9af/0xca0 >>>>> [ 39.416018] [<ffffffff81c365c0>] ? blk_mq_insert_requests+0x940/0x940 >>>>> [ 39.416018] [<ffffffff81c07d20>] ? blk_exit_rl+0x60/0x60 >>>>> [ 39.416018] [<ffffffff81c027b0>] ? handle_bad_sector+0x1e0/0x1e0 >>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81c0b05e>] generic_make_request+0x30e/0x660 >>>>> [ 39.416018] [<ffffffff81c0ad50>] ? blk_plug_queued_count+0x160/0x160 >>>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >>>>> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 >>>>> [ 39.416018] [<ffffffff8152890b>] ? unlock_page_memcg+0x7b/0x130 >>>>> [ 39.416018] [<ffffffff81c0b540>] submit_bio+0x190/0x470 >>>>> [ 39.416018] [<ffffffff811dff60>] ? woken_wake_function+0x60/0x60 >>>>> [ 39.416018] [<ffffffff81c0b3b0>] ? generic_make_request+0x660/0x660 >>>>> [ 39.416018] [<ffffffff813f769d>] ? __test_set_page_writeback+0x36d/0x8c0 >>>>> [ 39.416018] [<ffffffff814a8fd8>] __swap_writepage+0x6e8/0x940 >>>>> [ 39.416018] [<ffffffff814a79d0>] ? SyS_madvise+0x13c0/0x13c0 >>>>> [ 39.416018] [<ffffffff814a88f0>] ? generic_swapfile_activate+0x490/0x490 >>>>> [ 39.416018] [<ffffffff814abd45>] ? swap_info_get+0x165/0x240 >>>>> [ 39.416018] [<ffffffff814affda>] ? page_swapcount+0xba/0xf0 >>>>> [ 39.416018] [<ffffffff82956ba1>] ? _raw_spin_unlock+0x31/0x50 >>>>> [ 39.416018] [<ffffffff814affdf>] ? page_swapcount+0xbf/0xf0 >>>>> [ 39.416018] [<ffffffff814a926a>] swap_writepage+0x3a/0x70 >>>>> [ 39.416018] [<ffffffff8141376b>] shrink_page_list+0x1bdb/0x2f00 >>>>> [ 39.416018] [<ffffffff81411b90>] ? putback_lru_page+0x3b0/0x3b0 >>>>> [ 39.416018] [<ffffffff81cef9ac>] ? __this_cpu_preempt_check+0x1c/0x20 >>>>> [ 39.416018] [<ffffffff81438ed4>] ? __mod_node_page_state+0x94/0xe0 >>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff811ee78d>] ? get_lock_stats+0x1d/0x90 >>>>> [ 39.416018] [<ffffffff8140fcb0>] ? __isolate_lru_page+0x3b0/0x3b0 >>>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >>>>> [ 39.416018] [<ffffffff81353ba2>] ? time_hardirqs_on+0xb2/0x330 >>>>> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 >>>>> [ 39.416018] [<ffffffff81416038>] shrink_inactive_list+0x538/0xc70 >>>>> [ 39.416018] [<ffffffff81415b00>] ? putback_inactive_pages+0xaa0/0xaa0 >>>>> [ 39.416018] [<ffffffff81416770>] ? shrink_inactive_list+0xc70/0xc70 >>>>> [ 39.416018] [<ffffffff81190023>] ? sched_clock_local+0x43/0x120 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cd5780>] ? _find_next_bit.part.0+0xe0/0x120 >>>>> [ 39.416018] [<ffffffff8140e654>] ? pgdat_reclaimable_pages+0x764/0x9d0 >>>>> [ 39.416018] [<ffffffff8140f2ec>] ? pgdat_reclaimable+0x13c/0x1d0 >>>>> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 >>>>> [ 39.416018] [<ffffffff81417a1b>] shrink_node_memcg+0xa1b/0x1160 >>>>> [ 39.416018] [<ffffffff81417000>] ? shrink_active_list+0x890/0x890 >>>>> [ 39.416018] [<ffffffff81cef815>] ? check_preemption_disabled+0x35/0x190 >>>>> [ 39.416018] [<ffffffff81cef98c>] ? debug_smp_processor_id+0x1c/0x20 >>>>> [ 39.416018] [<ffffffff8151f208>] ? mem_cgroup_iter+0x1b8/0xd10 >>>>> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 >>>>> [ 39.416018] [<ffffffff81418436>] shrink_node+0x2d6/0xc60 >>>>> [ 39.416018] [<ffffffff81418160>] ? shrink_node_memcg+0x1160/0x1160 >>>>> [ 39.416018] [<ffffffff8140f3cc>] ? lruvec_lru_size+0x4c/0xa0 >>>>> [ 39.416018] [<ffffffff8141bf1e>] kswapd+0x82e/0x1460 >>>>> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 >>>>> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 >>>>> [ 39.416018] [<ffffffff82956c2c>] ? _raw_spin_unlock_irq+0x2c/0x50 >>>>> [ 39.416018] [<ffffffff811f6535>] ? trace_hardirqs_on_caller+0x405/0x590 >>>>> [ 39.416018] [<ffffffff82956c37>] ? _raw_spin_unlock_irq+0x37/0x50 >>>>> [ 39.416018] [<ffffffff81171cc8>] ? finish_task_switch+0x178/0x5b0 >>>>> [ 39.416018] [<ffffffff811de8f0>] ? __wake_up_common+0x150/0x150 >>>>> [ 39.416018] [<ffffffff82948c3e>] ? __schedule+0x55e/0x1b60 >>>>> [ 39.416018] [<ffffffff81156a32>] ? __kthread_parkme+0x172/0x240 >>>>> [ 39.416018] [<ffffffff81156d4a>] kthread+0x24a/0x2e0 >>>>> [ 39.416018] [<ffffffff8141b6f0>] ? mem_cgroup_shrink_node+0x600/0x600 >>>>> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 >>>>> [ 39.416018] [<ffffffff81171c9c>] ? finish_task_switch+0x14c/0x5b0 >>>>> [ 39.416018] [<ffffffff8295773f>] ret_from_fork+0x1f/0x40 >>>>> [ 39.416018] [<ffffffff81156b00>] ? __kthread_parkme+0x240/0x240 >>>>> >>>>> -- >>>>> Kirill A. Shutemov >>>> >>> >> > Am I understanding right that you're seeing allocation failures from the stack depot? How often do they happen? Are they reported under heavy load, or just when you boot the kernel? Allocating with __GFP_NOWARN will help here, but I think we'd better figure out what's gone wrong. I've sent https://lkml.org/lkml/2016/7/14/566, which should reduce the stack depot's memory consumption, for review - can you see if the bug is still reproducible with that? -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) 2016-07-20 18:12 ` Alexander Potapenko @ 2016-07-20 22:34 ` Kirill A. Shutemov 2016-07-21 11:33 ` Alexander Potapenko 0 siblings, 1 reply; 8+ messages in thread From: Kirill A. Shutemov @ 2016-07-20 22:34 UTC (permalink / raw) To: Alexander Potapenko Cc: Vlastimil Babka, Michal Hocko, Linux Memory Management List, Andrew Morton, riel, David Rientjes, mgorman On Wed, Jul 20, 2016 at 08:12:13PM +0200, Alexander Potapenko wrote: > >>>>> It's easy to reproduce in my setup: virtual machine with some amount of > >>>>> swap space and try allocate about the size of RAM in userspace (I used > >>>>> usemem[1] for that). > > Am I understanding right that you're seeing allocation failures from > the stack depot? How often do they happen? Are they reported under > heavy load, or just when you boot the kernel? As I described, it happens under memory pressure. > Allocating with __GFP_NOWARN will help here, but I think we'd better > figure out what's gone wrong. > I've sent https://lkml.org/lkml/2016/7/14/566, which should reduce the > stack depot's memory consumption, for review - can you see if the bug > is still reproducible with that? I was not able to trigger the failure with the same test case. Tested with v2 of the patch. (Links to http://lkml.kernel.org/ or other archive with message-id in url is prefered. lkml.org is garbage) -- Kirill A. Shutemov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) 2016-07-20 22:34 ` Kirill A. Shutemov @ 2016-07-21 11:33 ` Alexander Potapenko 0 siblings, 0 replies; 8+ messages in thread From: Alexander Potapenko @ 2016-07-21 11:33 UTC (permalink / raw) To: Kirill A. Shutemov Cc: Vlastimil Babka, Michal Hocko, Linux Memory Management List, Andrew Morton, riel, David Rientjes, mgorman On Thu, Jul 21, 2016 at 12:34 AM, Kirill A. Shutemov <kirill@shutemov.name> wrote: > On Wed, Jul 20, 2016 at 08:12:13PM +0200, Alexander Potapenko wrote: >> >>>>> It's easy to reproduce in my setup: virtual machine with some amount of >> >>>>> swap space and try allocate about the size of RAM in userspace (I used >> >>>>> usemem[1] for that). >> >> Am I understanding right that you're seeing allocation failures from >> the stack depot? How often do they happen? Are they reported under >> heavy load, or just when you boot the kernel? > > As I described, it happens under memory pressure. > >> Allocating with __GFP_NOWARN will help here, but I think we'd better >> figure out what's gone wrong. >> I've sent https://lkml.org/lkml/2016/7/14/566, which should reduce the >> stack depot's memory consumption, for review - can you see if the bug >> is still reproducible with that? > > I was not able to trigger the failure with the same test case. > Tested with v2 of the patch. When the allocation happens in IRQ handler, we try to be clever and cut everything below EOI, because the lower frames don't really matter, but prevent stack deduplication. But since the stack pointers aren't preserved in the stack trace, the only way to do so is to check whether each frame is an IRQ entry point. That patch adds several entry points to the .irqentry.text section, thus allowing the stack depot to filter them out as well. If that works for you, we'd better not add the __GFP_NOWARN flag - that way we'll be able to detect similar problems in the future. > (Links to http://lkml.kernel.org/ or other archive with message-id in url > is prefered. lkml.org is garbage) Does gmane.org work (e.g. http://article.gmane.org/gmane.linux.kernel/2266971) It is surprisingly tricky to figure out a message id from the subject. > -- > Kirill A. Shutemov -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2016-07-21 11:33 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-07-20 11:44 [mmotm-2016-07-18-16-40] page allocation failure: order:2, mode:0x2000000(GFP_NOWAIT) Kirill A. Shutemov 2016-07-20 11:53 ` Michal Hocko 2016-07-20 13:39 ` Vlastimil Babka 2016-07-20 15:19 ` Kirill A. Shutemov 2016-07-20 15:33 ` Vlastimil Babka 2016-07-20 18:12 ` Alexander Potapenko 2016-07-20 22:34 ` Kirill A. Shutemov 2016-07-21 11:33 ` Alexander Potapenko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).