From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753037AbcHONc2 (ORCPT ); Mon, 15 Aug 2016 09:32:28 -0400
Received: from parrot.pmhahn.de ([88.198.50.102]:55421 "EHLO parrot.pmhahn.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752783AbcHONc1 (ORCPT ); Mon, 15 Aug 2016 09:32:27 -0400
X-Greylist: delayed 358 seconds by postgrey-1.27 at vger.kernel.org; Mon, 15 Aug 2016 09:32:26 EDT
To: "linux-kernel@vger.kernel.org", Rusty Russell, qemu-devel
From: Philipp Hahn
X-Enigmail-Draft-Status: N1110
Subject: virnet: page allocation failure: order:0
Message-ID: <8838f4e4-cb93-9c78-f446-7b1e2cb639fa@pmhahn.de>
Date: Mon, 15 Aug 2016 15:26:25 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.2.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

this Sunday one of our virtual servers running linux-4.1.16 inside
OpenStack using qemu "crashed" while doing a backup using rsync to a slow
NFS server.

Crash here means that the server became unresponsive to network traffic:
- it was no longer able to contact the two LDAP servers
- no ssh login was possible
- the backup got stuck
- crond was still running and added process after process, leading to
  ~1.5k processes running after one day.

What I don't know is whether the network problem was the cause or the
consequence. Because of that I want to understand why the following
order=0 allocation failed:

> swapper/0: page allocation failure: order:0, mode:0x120

order:0 is a single 4 KiB page. mode:0x120 decodes to
__GFP_HIGH | __GFP_COLD according to
src/extern/linux/include/linux/gfp.h:

#define ___GFP_HIGH     0x20u
#define ___GFP_COLD     0x100u
#define __GFP_HIGH      ((__force gfp_t)___GFP_HIGH)    /* Should access emergency pools? */
#define __GFP_COLD      ((__force gfp_t)___GFP_COLD)    /* Cache-cold page required */

> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.0-ucs190-amd64 #1 Debian 4.1.6-1.190.201604142226

The kernel is 4.1.16 with some patches from Debian and some others.

> Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
> 0000000000000000 0000000000000000 ffffffff81597807 0000000000000120
> ffffffff8116b6e9 0000000000018200 ffff88021fffbb00 0000000000000000
> ffff88021fffbb08 0000000000000000 ffff88021fffab50 ffffffff818164e0
> Call Trace:
> [] ? dump_stack+0x40/0x50
> [] ? warn_alloc_failed+0xf9/0x150
> [] ? __alloc_pages_nodemask+0x65a/0x9d0
> [] ? alloc_pages_current+0xa4/0x120
> [] ? skb_page_frag_refill+0xb7/0xe0
> [] ? try_fill_recv+0x31b/0x610 [virtio_net]
> [] ? virtnet_receive+0x580/0x890 [virtio_net]

A network packet was received, but copying it from VirtIO to local kernel
memory failed.

> [] ? virtnet_poll+0x26/0x90 [virtio_net]
> [] ? net_rx_action+0x159/0x330
> [] ? __do_softirq+0xde/0x260
> [] ? irq_exit+0x95/0xa0
> [] ? do_IRQ+0x64/0x110
> [] ? common_interrupt+0x6e/0x6e
> [] ? mwait_idle+0x150/0x150
> [] ? native_safe_halt+0x2/0x10
> [] ? default_idle+0x1c/0xb0
> [] ? cpu_startup_entry+0x314/0x3e0
> [] ? start_kernel+0x48d/0x498
> [] ? set_init_arg+0x50/0x50
> [] ? early_idt_handler_array+0x117/0x120
> [] ? early_idt_handler_array+0x117/0x120
> [] ? x86_64_start_kernel+0x14a/0x159
> Mem-Info:
> active_anon:34861 inactive_anon:3120 isolated_anon:0
>  active_file:941577 inactive_file:946489 isolated_file:0
>  unevictable:974 dirty:2953 writeback:282206 unstable:1277
>  slab_reclaimable:55077 slab_unreclaimable:31365
>  mapped:3979 shmem:3986 pagetables:3163 bounce:0
>  free:9437 free_pcp:740 free_cma:0

Looks like enough memory is free.

> Node 0 DMA free:15908kB min:20kB low:24kB high:28kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> lowmem_reserve[]: 0 3492 7971 7971
> Node 0 DMA32 free:19532kB min:4968kB low:6208kB high:7452kB active_anon:62328kB inactive_anon:5336kB active_file:1659704kB inactive_file:1671236kB unevictable:2056kB isolated(anon):0kB isolated(file):0kB present:3653624kB managed:3578476kB mlocked:2056kB dirty:10472kB writeback:479692kB mapped:11892kB shmem:7132kB slab_reclaimable:79832kB slab_unreclaimable:46784kB kernel_stack:2000kB pagetables:4660kB unstable:2596kB bounce:0kB free_pcp:1080kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 4478 4478
> Node 0 Normal free:2308kB min:6372kB low:7964kB high:9556kB active_anon:77116kB inactive_anon:7144kB active_file:2106604kB inactive_file:2114720kB unevictable:1840kB isolated(anon):0kB isolated(file):0kB present:4718592kB managed:4586204kB mlocked:1840kB dirty:1340kB writeback:649132kB mapped:4024kB shmem:8812kB slab_reclaimable:140476kB slab_unreclaimable:78676kB kernel_stack:1888kB pagetables:7992kB unstable:2512kB bounce:0kB free_pcp:1880kB local_pcp:164kB free_cma:0kB writeback_tmp:0kB pages_scanned:768 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0

This also shows enough pages being available.

> Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (EM) = 15908kB
> Node 0 DMA32: 4632*4kB (UEM) 81*8kB (UMR) 4*16kB (R) 1*32kB (R) 0*64kB 1*128kB (R) 1*256kB (R) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 19656kB
> Node 0 Normal: 315*4kB (UM) 1*8kB (R) 1*16kB (R) 1*32kB (R) 1*64kB (R) 1*128kB (R) 2*256kB (R) 1*512kB (R) 0*1024kB 0*2048kB 0*4096kB = 2532kB
> Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB

Here too.

> 1892811 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap  = 0kB
> Total swap = 0kB

Okay, no swap for defragmentation, but that should be no problem for
order=0.

> 2097052 pages RAM
> 0 pages HighMem/MovableOnly
> 51905 pages reserved
> 0 pages hwpoisoned

Any explanation?

Thank you in advance
Philipp
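
The two manual steps in the mail, decoding mode:0x120 and comparing each
zone's free memory with its limits, can be sketched in Python. This is a
simplified model and not kernel code. The names decode_gfp, watermark_ok,
GFP_FLAGS, ZONES and PAGE_KB are hypothetical helpers; all numbers are
copied from the log. It assumes 4 KiB pages, assumes the request targets
the Normal zone (so the third lowmem_reserve[] entry of each zone applies),
and assumes that __GFP_HIGH halves the min watermark and that an atomic
request lowers it by a further quarter.

```python
# Flag values copied from the gfp.h excerpt quoted in the mail.
GFP_FLAGS = {
    0x20: "__GFP_HIGH",   # may dip into emergency pools
    0x100: "__GFP_COLD",  # cache-cold page requested
}

PAGE_KB = 4  # assumption: 4 KiB pages

# zone -> (free kB, min kB, lowmem_reserve in pages for a Normal-zone
# request), all copied from the Mem-Info dump in the mail.
ZONES = {
    "DMA": (15908, 20, 7971),
    "DMA32": (19532, 4968, 4478),
    "Normal": (2308, 6372, 0),
}


def decode_gfp(mode):
    """Return (known flag names set in mode, bits not listed in GFP_FLAGS)."""
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mode & bit]
    unknown = mode & ~sum(GFP_FLAGS)  # sum() adds up the dict keys (the bits)
    return names, unknown


def watermark_ok(free_kb, min_kb, reserve_pages, high=True, harder=True):
    """Simplified order-0 watermark test, computed in pages.

    Assumption: `high` halves the min watermark and `harder` removes a
    further quarter. This models the check; it is not the kernel's code.
    """
    free = free_kb // PAGE_KB
    wmark = min_kb // PAGE_KB
    if high:
        wmark -= wmark // 2
    if harder:
        wmark -= wmark // 4
    return free > wmark + reserve_pages


if __name__ == "__main__":
    print(decode_gfp(0x120))
    for name, zone in ZONES.items():
        print(name, "passes" if watermark_ok(*zone) else "fails")
```

Under these assumptions none of the three zones passes the check, and the
same holds with both reductions switched off: DMA and DMA32 are held back
by their lowmem_reserve, and Normal is below its min watermark. That would
be consistent with the reported failure, but it does not by itself
establish the cause.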