From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from relay.sgi.com (relay3.corp.sgi.com [198.149.34.15])
	by oss.sgi.com (Postfix) with ESMTP id 3AA007F3F
	for ; Fri, 12 Apr 2013 01:51:20 -0500 (CDT)
Received: from cuda.sgi.com (cuda1.sgi.com [192.48.157.11])
	by relay3.corp.sgi.com (Postfix) with ESMTP id C3957AC002
	for ; Thu, 11 Apr 2013 23:51:19 -0700 (PDT)
Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69])
	by cuda.sgi.com with ESMTP id FouS85lcONKnEKqs
	(version=TLSv1 cipher=AES256-SHA bits=256 verify=NO)
	for ; Thu, 11 Apr 2013 23:51:18 -0700 (PDT)
Message-ID: <5167AEDD.2050708@oracle.com>
Date: Fri, 12 Apr 2013 14:51:09 +0800
From: Jeff Liu
MIME-Version: 1.0
Subject: Re: New xfstests generic/308 causes XFS hang (high CPU use), at least on 32-bit
References: <51656D47.9010806@gmail.com>
In-Reply-To: <51656D47.9010806@gmail.com>
List-Id: XFS Filesystem from SGI
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
To: "Michael L. Semon"
Cc: xfs@oss.sgi.com

Hi Michael,

On 04/10/2013 09:46 PM, Michael L. Semon wrote:
> Hi! On my 32-bit Pentium III PC, xfstests generic/308 uses xfs_io, and
> that xfs_io hangs the XFS file system without causing a crash. In other
> words, the FS cannot be umounted, xfs_io can't be killed, and shutdown
> is handled by magic SysRq keys. In this time, there is about 90% CPU
> usage from xfs_io (top) but zero disk I/O (iostat).
>
> The PC uses kernel 3.8-rc4 + Dave's CRC v4 patches + J. Liu's bitness patch.

I think this bug affects x86 only and is unrelated to the patches above
(as you also mentioned in another email).

AFAICS, it is caused by the 2nd test case in 308, i.e.

    # Write to the block after the extent just created
    offset=$(((2**32 - 1) * $block_size))
    $XFS_IO_PROG -f -c "pwrite $offset $block_size" -c fsync $testfile >>$seqres.full 2>&1

Running xfs_io with this huge offset is enough to reproduce the issue
(the 'fsync' command is not needed); it hangs at the page writeback stage
soon afterwards.

Since we are doing buffered I/O writes, the code path should go through:

    xfs_buffered_aio_write
    xfs_file_aio_write_checks
    generic_write_checks

In this case, the given offset is larger than the s_maxbytes limit that
should apply on a 32-bit machine. I think we should not be allowed to
create this file at all, according to the following check in
generic_write_checks():

    if (likely(!isblk)) {
        if (unlikely(*pos >= inode->i_sb->s_maxbytes)) {
            if (*count || *pos > inode->i_sb->s_maxbytes) {
                return -EFBIG;
            }
            /* zero-length writes at ->s_maxbytes are OK */
        }

However, xfs_max_file_offset() computes s_maxbytes in the (__uint64_t)
range, which violates the MAX_LFS_FILESIZE limit. So the file can be
created, but the write fails at the page writeback phase after a little
while, just as the comment for MAX_LFS_FILESIZE warns:

    /* Page cache limit. The filesystems should put that into their s_maxbytes
       limits, otherwise bad things can happen in VM. */
    #if BITS_PER_LONG==32
    #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)

I'll try to fix it later.

Thanks,
-Jeff

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs