Message-ID: <4A395119.5060108@msgid.tls.msk.ru>
Date: Thu, 18 Jun 2009 00:24:57 +0400
From: Michael Tokarev
To: "J. Bruce Fields"
CC: Justin Piszcz, linux-kernel@vger.kernel.org
Subject: Re: 2.6.29.1: nfsd: page allocation failure - nfsd or kernel problem?
References: <4A37FE48.6070306@msgid.tls.msk.ru> <4A38ACC0.3060501@msgid.tls.msk.ru> <4A38C7CA.7040005@msgid.tls.msk.ru> <20090617185139.GF24040@fieldses.org>
In-Reply-To: <20090617185139.GF24040@fieldses.org>

J. Bruce Fields wrote:
> On Wed, Jun 17, 2009 at 02:39:06PM +0400, Michael Tokarev wrote:
>> Justin Piszcz wrote:
>>>
>>> On Wed, 17 Jun 2009, Michael Tokarev wrote:
>>>
>>>> Michael Tokarev wrote:
>>>>> Justin Piszcz wrote:
>>>> ...
>>>>
>>>> Justin, by the way, what's the underlying filesystem on the server?
>>>>
>>>> I've seen this error on 2 machines already (both running 2.6.29.x
>>>> x86-64),
>>>> and in both cases the filesystem on the server was xfs. May this be
>>>> related somehow to http://bugzilla.kernel.org/show_bug.cgi?id=13375 ?
>>>> That one is different, but also about xfs and nfs. I'm trying to
>>>> reproduce the problem on different filesystem...
>>> Hello, I am also running XFS on 2.6.29.x x86-64.
>>>
>>> For me, the error happened when I was running an XFSDUMP from a client
>>> (and dumping) the stream over NFS to the XFS server/filesystem. This
>>> is typically when the error occurs or during heavy I/O.
>> Very similar load was here -- not xfsdump but tar and dump of an ext3
>> filesystems.
>>
>> And no, it's NOT xfs-related: I can trigger the same issue easily on

Note the NOT, in upper case ;)

>> ext4 as well. About 20 minutes of running 'dump' of another fs
>> to the nfs mount and voila, nfs server reports the same page allocation
>> failure. Note that all file operations are still working, i.e. it
>> produces good (not corrupted) files on the server.
>
> There's a possibly related report for 2.6.30 here:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=13518

Does not look similar.

I repeated the issue here. The slab which is growing here is buffer_head.
It's growing slowly -- right now, after ~5 minutes of constant writes over
nfs, its size is 428423 objects, growing at about 5000 objects/minute rate.
When stopping writing, the cache shrinks slowly back to an acceptable
size, probably when the data gets actually written to disk.

It looks like we need a bug entry for this :)

I'll re-try 2.6.30 hopefully tomorrow.

/mjt
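
For anyone who wants to watch the same slab on their own server, here is a
rough sketch of one way to sample it. The message above does not say how the
object count and growth rate were obtained, so this is only an assumption:
it reads /proc/slabinfo (usually root-only) and takes the second column of
the matching line as the active object count, as in the slabinfo 2.x layout.
The function names are made up for illustration.

```python
#!/usr/bin/env python3
"""Sample one slab cache from /proc/slabinfo and print its growth rate."""
import sys
import time


def active_objects(slabinfo_text, cache_name):
    """Return the active-object count for cache_name, or None if absent."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Assumed data-line layout: name active_objs num_objs objsize ...
        if len(fields) >= 2 and fields[0] == cache_name:
            return int(fields[1])
    return None


def rate_per_minute(count_then, count_now, seconds):
    """Objects gained (or lost, if negative) per minute between two samples."""
    return (count_now - count_then) * 60.0 / seconds


def main(cache_name="buffer_head", interval=60):
    previous = None  # (timestamp, count) of the last sample
    while True:
        with open("/proc/slabinfo") as f:  # usually readable by root only
            count = active_objects(f.read(), cache_name)
        if count is None:
            sys.exit("cache %s not found in /proc/slabinfo" % cache_name)
        now = time.monotonic()
        if previous is None:
            print("%s: %d objects" % (cache_name, count))
        else:
            rate = rate_per_minute(previous[1], count, now - previous[0])
            print("%s: %d objects (%+.0f/min)" % (cache_name, count, rate))
        previous = (now, count)
        time.sleep(interval)


if __name__ == "__main__":
    main()
```

Run it on the nfs server while the client is writing, then stop the writes
and leave it running: a rate that stays positive under load and turns
negative once the writes stop would match the behaviour described above.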