Subject: Re: RFH: virnet: page allocation failure: order:0
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
From: Philipp Hahn
Date: Wed, 21 Sep 2016 08:44:37 +0200
References: <8838f4e4-cb93-9c78-f446-7b1e2cb639fa@pmhahn.de>
In-Reply-To: <8838f4e4-cb93-9c78-f446-7b1e2cb639fa@pmhahn.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On 15.08.2016 at 15:26, Philipp Hahn wrote:
> this Sunday one of our virtual servers running linux-4.1.16 inside
> OpenStack using qemu "crashed" while doing a backup using rsync to a
> slow NFS server.

This happened again last weekend, with the same stack trace:

>> Call Trace:
>> [] ? dump_stack+0x40/0x50
>> [] ? warn_alloc_failed+0xf9/0x150
>> [] ? __alloc_pages_nodemask+0x65a/0x9d0
>> [] ? alloc_pages_current+0xa4/0x120
>> [] ? skb_page_frag_refill+0xb7/0xe0
>> [] ? try_fill_recv+0x31b/0x610 [virtio_net]
>> [] ? virtnet_receive+0x580/0x890 [virtio_net]
>> [] ? virtnet_poll+0x26/0x90 [virtio_net]
>> [] ? net_rx_action+0x159/0x330
>> [] ? __do_softirq+0xde/0x260
>> [] ? irq_exit+0x95/0xa0
>> [] ? do_IRQ+0x64/0x110
>> [] ? common_interrupt+0x6e/0x6e
>> [] ? mwait_idle+0x150/0x150
>> [] ? native_safe_halt+0x2/0x10
>> [] ? default_idle+0x1c/0xb0
>> [] ? cpu_startup_entry+0x314/0x3e0
>> [] ? _raw_spin_unlock_irqrestore+0x17/0x50
>> [] ? start_secondary+0x185/0x1b0

> What I don't know is if the network problem was the cause or the
> consequence. Because of that I want to understand why the following
> order=0 allocation failed:

What I still don't understand is how an "order=0" allocation can fail.

>> Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (EM) = 15908kB
>> Node 0 DMA32: 4701*4kB (UM) 0*8kB 9*16kB (R) 0*32kB 1*64kB (R) 0*128kB 0*256kB 1*512kB (R) 0*1024kB 0*2048kB 0*4096kB = 19524kB
>> Node 0 Normal: 352*4kB (UM) 5*8kB (UM) 2*16kB (R) 0*32kB 1*64kB (R) 1*128kB (R) 0*256kB 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 2696kB
>> Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB

In every zone there are plenty of free pages, so why doesn't allocating
4 KiB work?

I looked at the source code, the disassembly, and similar reports, but now
I'm lost. Can someone give me a hint where to look next, please?

Thank you in advance.

Philipp Hahn
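
PS: To double-check the free-list figures quoted above, here is a minimal Python sketch that re-adds the per-order block counts and compares them with the totals the kernel printed. The helper name `parse_free_lists` and the regular expressions are my own assumptions for illustration, not anything from the kernel. The sketch only checks the arithmetic of the report and does not model how the allocator decides whether a request may succeed.

```python
import re

# Free-list lines copied from the report above. Each "N*SIZEkB" term is the
# number of free blocks of that size in the zone.
REPORT = """\
Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (EM) = 15908kB
Node 0 DMA32: 4701*4kB (UM) 0*8kB 9*16kB (R) 0*32kB 1*64kB (R) 0*128kB 0*256kB 1*512kB (R) 0*1024kB 0*2048kB 0*4096kB = 19524kB
Node 0 Normal: 352*4kB (UM) 5*8kB (UM) 2*16kB (R) 0*32kB 1*64kB (R) 1*128kB (R) 0*256kB 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 2696kB
"""

# Hypothetical patterns, written to match the three lines above only.
LINE_RE = re.compile(r"^Node (\d+) (\w+): (.*) = (\d+)kB$")
TERM_RE = re.compile(r"(\d+)\*(\d+)kB")


def parse_free_lists(report):
    """Return {zone: {"blocks": {size_kb: count}, "computed_kb", "reported_kb"}}."""
    zones = {}
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a free-list line
        _node, zone, terms, reported = match.groups()
        blocks = {int(size): int(count) for count, size in TERM_RE.findall(terms)}
        zones[zone] = {
            "blocks": blocks,
            "computed_kb": sum(size * count for size, count in blocks.items()),
            "reported_kb": int(reported),
        }
    return zones


if __name__ == "__main__":
    for zone, info in parse_free_lists(REPORT).items():
        order0 = info["blocks"].get(4, 0)  # free 4 kB (order-0) blocks
        print(zone, "order-0 free:", order0,
              "computed:", info["computed_kb"], "kB",
              "reported:", info["reported_kb"], "kB")
```

The recomputed sums agree with the reported totals (15908 kB, 19524 kB, and 2696 kB), and every zone lists at least one free 4 kB block, which is why the failure puzzles me.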