From: Lukas Czerner <lczerner@redhat.com>
To: Chris Mason <chris.mason@oracle.com>
Cc: Jacek Luczak <difrost.kernel@gmail.com>,
	linux-ext4@vger.kernel.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-btrfs@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
	"Ted Ts'o" <tytso@mit.edu>
Subject: Re: getdents - ext4 vs btrfs performance
Date: Wed, 29 Feb 2012 15:00:12 +0100 (CET)
Message-ID: <alpine.LFD.2.00.1202291453450.23368@dhcp-27-109.brq.redhat.com>
In-Reply-To: <20120229135158.GA5054@shiny>

On Wed, 29 Feb 2012, Chris Mason wrote:

> On Wed, Feb 29, 2012 at 02:31:03PM +0100, Jacek Luczak wrote:
> > Hi All,
> > 
> > Long story short: We've found that operations on a directory structure
> > holding many dirs take ages on ext4.
> > 
> > The Question: Why is there such a huge difference between ext4 and
> > btrfs? See the test results below for real values.
> > 
> > Background: I had to back up a Jenkins directory holding workspaces for
> > a few projects which were checked out from svn (which implies a lot of
> > extra .svn dirs). The copy took a lot of time (at least more than I
> > expected) and the process was mostly in D (disk sleep). I dug deeper and
> > did some extra tests to see whether this is a regression on the block/fs
> > side. To isolate the issue I also performed the same tests on btrfs.
> > 
> > Test environment configuration:
> > 1) HW: HP ProLiant BL460 G6, 48 GB of memory, 2x 6 core Intel X5670 HT
> > enabled, Smart Array P410i, RAID 1 on top of 2x 10K RPM SAS HDDs.
> > 2) Kernels: All tests were done on the following kernels:
> >  - 2.6.39.4-3 -- the build ID (3) is used here mostly for internal
> > tracking of config changes. In -3 we introduced the "fix readahead
> > pipeline break caused by block plug" patch. Otherwise it's pure 2.6.39.4.
> >  - 3.2.7 -- latest kernel at the time of testing (3.2.8 has been
> > released recently).
> > 3) The subject of the tests, a directory holding:
> >  - 54GB of data (measured on ext4)
> >  - 1978149 files
> >  - 844008 directories
> > 4) Mount options:
> >  - ext4 -- errors=remount-ro,noatime,data=writeback
> >  - btrfs -- noatime,nodatacow and, for later investigation of the
> > compression effect: noatime,nodatacow,compress=lzo
> 
> For btrfs, nodatacow and compression don't really mix.  The compression
> will just override it. (Just FYI, not really related to these results).
> 
> > 
> > In all tests I measured the execution time. The following tests
> > were performed:
> > - find . -type d
> > - find . -type f
> > - cp -a
> > - rm -rf
> > 
> > Ext4 results:
> > | Type     | 2.6.39.4-3 | 3.2.7     |
> > | Dir cnt  | 17m 40sec  | 11m 20sec |
> > | File cnt | 17m 36sec  | 11m 22sec |
> > | Copy     | 1h 28m     | 1h 27m    |
> > | Remove   | 3m 43sec   |
> 
> Are the btrfs numbers missing? ;)
> 
> In order for btrfs to be faster for cp -a, the files probably didn't
> change much since creation.  Btrfs maintains extra directory indexes
> that help in sequential backup scans, but this usually means slower
> delete performance.

Exactly, and IIRC ext4 stores directory entries in hash order, which
does not really help sequential access.
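
To illustrate why hash order hurts: readdir order on ext4 is effectively
unrelated to inode number, so stat-ing or opening entries in that order
tends to touch the inode table in a scattered pattern. A common userspace
workaround is to read all entries first, sort them by inode number, and
only then touch the files. Below is a minimal Python sketch of that idea.
It is not taken from this thread, and the function names are made up for
illustration.

```python
import os


def entries_in_inode_order(path):
    """Return (name, inode) pairs for `path`, sorted by inode number."""
    with os.scandir(path) as it:
        # On Linux, DirEntry.inode() uses the d_ino value already returned
        # by readdir, so this loop does not stat() each entry.
        pairs = [(entry.name, entry.inode()) for entry in it]
    return sorted(pairs, key=lambda pair: pair[1])


def stat_sorted(path):
    """lstat every entry of `path` in inode order; return {name: size}."""
    sizes = {}
    for name, _ino in entries_in_inode_order(path):
        sizes[name] = os.lstat(os.path.join(path, name)).st_size
    return sizes
```

Whether this helps on a given filesystem depends on how closely inode
numbers track on-disk layout, so treat it as something to measure rather
than a guaranteed win.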

> 
> But, how exactly did you benchmark it?  If you compare a fresh
> mkfs.btrfs where you just copied all the data over with an ext4 FS that
> has been on the disk for a long time, it isn't quite fair to ext4.

I have the same question. Note that if the files on ext4 have been worked
with, it may very well be that the directory hash trees are not in very
good shape. You can attempt to optimize that with e2fsck (just run
fsck.ext4 -f <device>). That may take quite some time and memory, but it
is worth trying.
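
A small sketch of how one might script that check, with two assumptions
that go beyond the command above: the e2fsck man page documents a separate
-D option for optimizing directories, which is added here as an optional
extra, and the filesystem is assumed to be unmounted before the run. The
helper names and the device path are hypothetical.

```python
def build_e2fsck_argv(device, optimize_dirs=True):
    """Build an argv list for a forced e2fsck run on `device`."""
    argv = ["e2fsck", "-f"]   # -f: force a check even if the fs looks clean
    if optimize_dirs:
        argv.append("-D")     # -D: optimize directories (assumption, see above)
    argv.append(device)
    return argv


def is_mounted(device, mounts_text):
    """True if `device` is the source field of any /proc/mounts-style line."""
    return any(line.split()[:1] == [device]
               for line in mounts_text.splitlines())
```

For example, one could read /proc/mounts, refuse to continue if
is_mounted() returns True, and otherwise hand the list from
build_e2fsck_argv("/dev/sdXN") to subprocess.run().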

Thanks!
-Lukas

> 
> -chris
