From: Andreas Dilger
Subject: Re: large fs testing
Date: Tue, 26 May 2009 15:21:32 -0600
Message-ID: <20090526212132.GE3218@webber.adilger.int>
In-reply-to: <4A1C2B40.30102@redhat.com>
References: <4A17FFD8.80401@redhat.com> <5971.1243359565@gamaville.dokosmarshall.org> <4A1C2B40.30102@redhat.com>
To: Ric Wheeler
Cc: nicholas.dokos@hp.com, linux-fsdevel@vger.kernel.org, Christoph Hellwig,
 Douglas Shakshober, Joshua Giles, Valerie Aurora, Eric Sandeen,
 Steven Whitehouse, Edward Shishkin, Josef Bacik, Jeff Moyer, Chris Mason,
 "Whitney, Eric", Theodore Tso

On May 26, 2009 13:47 -0400, Ric Wheeler wrote:
> These runs were without lazy init, so I would expect them to be a little
> more than twice as slow as your second run (not the three times I saw),
> assuming that it scales linearly.

Making lazy_itable_init the default formatting option for ext4 was
dependent upon the kernel doing the zeroing of the inode table blocks
at first mount time.  I'm not sure whether that has been implemented
yet.  (A sketch of the mke2fs side is below.)

> This run was with limited DRAM on the box (6GB) and only a single HBA,
> but I am afraid that I did not get any good insight into what the
> bottleneck was during my runs.

For a very large array (80TB) this could be 1TB or more of inode tables
being zeroed out at format time.  Beyond 64TB the default mke2fs options
cap the filesystem at 4 billion inodes.  1TB / 90 min ~= 200MB/s, so
this is probably your bottleneck.  (The arithmetic is worked out below.)

> Do you have any access to even larger storage, say the mythical 100TB :-)?
> Any insight on interesting workloads?

I would definitely be most interested in e2fsck performance at this
scale (RAM usage and elapsed time), because that will in the end be the
defining limit on how large a usable filesystem can actually be in
practice.  (A rough RAM estimate is below.)

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
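
PS: the mke2fs side of the lazy-init comparison looks roughly like
this.  A minimal sketch only: it assumes an e2fsprogs build with the
-E lazy_itable_init extended option, and /dev/sdX is a placeholder for
the real device:

    # Skip zeroing the inode tables at format time; the group
    # descriptors are marked so the tables are known to be
    # uninitialized (this relies on the uninit_bg feature).
    mke2fs -t ext4 -E lazy_itable_init=1 /dev/sdX

    # For comparison, force the inode tables to be zeroed by mke2fs:
    mke2fs -t ext4 -E lazy_itable_init=0 /dev/sdX

With lazy_itable_init=1 the zeroing is deferred, which is why making it
the default depends on the kernel cleaning up the tables after mount.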
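
The inode table arithmetic, worked out (assuming the stock mke2fs
defaults of one inode per 16KiB of space and 256-byte ext4 inodes):

    64TiB / 16KiB per inode       = 2^32 inodes (the 32-bit inode cap)
    2^32 inodes * 256 bytes each  = 2^40 bytes = 1TiB of inode tables
    1TiB / 90 minutes             = 2^40 bytes / 5400s ~= 200MB/s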
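
And a crude lower bound on the e2fsck memory footprint at this scale,
assuming 4KiB blocks and that pass 1 holds in-core bitmaps with one bit
per block and one bit per inode (the real number is higher once the
directory and link-count tracking is added):

    80TiB / 4KiB = ~21.5 billion blocks  -> ~2.5GiB per block bitmap
    2^32 inodes * 1 bit                  -> 512MiB per inode bitmap

With several such bitmaps live at once, RAM demands well past 10GB seem
plausible for a full check of an 80TB filesystem.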