From: Ric Wheeler <rwheeler@redhat.com>
To: nicholas.dokos@hp.com
Cc: linux-ext4@vger.kernel.org, Valerie Aurora <vaurora@redhat.com>
Subject: Re: 32TB ext4 fsck times
Date: Tue, 21 Apr 2009 15:31:18 -0400
Message-ID: <49EE1F06.5040508@redhat.com>
In-Reply-To: <10039.1240286799@gamaville.dokosmarshall.org>

Nick Dokos wrote:
> Now that 64-bit e2fsck can run to completion on a (newly-minted, never
> mounted) filesystem, here are some numbers. They must be taken with
> a large grain of salt, of course, given the unrealistic situation, but
> they might be reasonable lower bounds of what one might expect.
>
> First, the disks are 300GB SCSI 15K rpm - there are 28 disks per RAID
> controller and they are striped into 2TiB volumes (that's a limitation
> of the hardware). 16 of these volumes are striped together using LVM, to
> make a 32TiB volume.
>
> The machine is a four-slot quad core AMD box with 128GB of memory and
> dual-port FC adapters.
>   
Certainly a great configuration for this test....

> The filesystem was created with default values for everything, except
> that the resize_inode feature is turned off. I cleared caches before the
> run.
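
The exact volume-management and mkfs commands are not shown in the thread, but a setup of the kind described above would look roughly like the sketch below. The "bigvg"/"bigvol" names come from the device path in the e2fsck run; the underlying device names, the stripe size, and the cache-dropping step are assumptions.

  # assumption: the sixteen 2TiB hardware RAID volumes appear as /dev/sd[b-q]
  pvcreate /dev/sd[b-q]
  vgcreate bigvg /dev/sd[b-q]
  # stripe the logical volume across all 16 physical volumes
  lvcreate -i 16 -I 256 -L 32T -n bigvol bigvg
  # defaults everywhere, with the resize_inode feature turned off
  mkfs.ext4 -O ^resize_inode /dev/mapper/bigvg-bigvol
  # drop page/dentry/inode caches before timing the fsck
  echo 3 > /proc/sys/vm/drop_caches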
>
> # time e2fsck -n -f /dev/mapper/bigvg-bigvol
> e2fsck 1.41.4-64bit (17-Apr-2009)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> /dev/mapper/bigvg-bigvol: 11/2050768896 files (0.0% non-contiguous), 128808243/8203075584 blocks
>
> real	23m13.725s
> user	23m8.172s
> sys	0m4.323s
>   

I am a bit surprised to see it run so slowly on an empty file system.
It's not an apples-to-apples comparison, but on my F10 desktop with the older
fsck, I can fsck an empty 1TB S-ATA drive in just 23 seconds. An array
should get much better streaming bandwidth but be relatively slower for
random reads. I wonder if we are much seekier than we should be? Are we
not prefetching as much as we could?

ric


> Most of the time (about 22 minutes) is in pass 5. I was taking snapshots
> of
>
>      /proc/<pid of e2fsck>/statm
>
> every 10 seconds during the run[1]. It starts out like this:
>
>
> 27798 3293 217 42 0 3983 0
> 609328 585760 263 42 0 585506 0
> 752059 728469 272 42 0 728237 0
> 752059 728469 272 42 0 728237 0
> 752059 728469 272 42 0 728237 0
> 752059 728469 272 42 0 728237 0
> 752059 728469 272 42 0 728237 0
> 752059 728469 272 42 0 728237 0
> 752059 728469 272 42 0 728237 0
> 717255 693666 273 42 0 693433 0
> 717255 693666 273 42 0 693433 0
> 717255 693666 273 42 0 693433 0
> ....
>
> and stays at that level for most of the run (the drop occurs a short
> time after pass 5 starts). Here is what it looks like at the end:
>
> ....
> 717255 693666 273 42 0 693433 0
> 717255 693666 273 42 0 693433 0
> 717255 693666 273 42 0 693433 0
> 717499 693910 273 42 0 693677 0
> 717499 693910 273 42 0 693677 0
> 717499 693910 273 42 0 693677 0
>
>
> So in this very simple case, the memory required tops out at about 3 GB for the
> 32TiB filesystem, or about 0.4 bytes per block.
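
A quick sanity check on those figures, assuming 4 KiB pages (as on x86_64):

  # peak resident set seen in the statm samples above: ~728469 pages
  echo $(( 728469 * 4096 ))                      # 2983809024 bytes, i.e. just under 3 GB
  echo "scale=2; 2983809024 / 8203075584" | bc   # ~0.36 bytes per filesystem block

which is consistent with the roughly 3 GB / 0.4 bytes-per-block numbers quoted.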
>
> Nick
>
>
> [1] The numbers are numbers of pages. The format is described in
> Documentation/filesystems/proc.txt:
>
> Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
> ..............................................................................
>  Field    Content
>  size     total program size (pages)		(same as VmSize in status)
>  resident size of memory portions (pages)	(same as VmRSS in status)
>  shared   number of pages that are shared	(i.e. backed by a file)
>  trs      number of pages that are 'code'	(not including libs; broken,
> 							includes data segment)
>  lrs      number of pages of library		(always 0 on 2.6)
>  drs      number of pages of data/stack		(including libs; broken,
> 							includes library text)
>  dt       number of dirty pages			(always 0 on 2.6)
> ..............................................................................
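
The sampling loop itself is not included in the mail; something along these lines, given the e2fsck PID, would produce the snapshots shown above (the log file name is arbitrary):

  # sample /proc/<pid>/statm every 10 seconds until the process exits
  pid=$1
  while kill -0 "$pid" 2>/dev/null; do
      cat "/proc/$pid/statm" >> statm.log
      sleep 10
  done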


