From: Andreas Dilger <adilger@dilger.ca>
To: Ric Wheeler <rwheeler@redhat.com>
Cc: Eric Sandeen <sandeen@redhat.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Ric Wheeler <ricwheeler@gmail.com>, Fredrick <fjohnber@zoho.com>,
	linux-ext4@vger.kernel.org, wenqing.lz@taobao.com
Subject: Re: ext4_fallocate
Date: Fri, 29 Jun 2012 13:02:35 -0600
Message-ID: <27810AD9-FEA3-4C82-A5EC-A4B9B5F90071@dilger.ca>
In-Reply-To: <4FEC3FAA.1060503@redhat.com>

On 2012-06-28, at 5:27 AM, Ric Wheeler wrote:
> We need to keep in mind what the goal of pre-allocation is (should be?) - spend a bit of extra time doing the allocation call so we get really good, contiguous layout on disk which ultimately will help in streaming read/write workloads.
> 
> If you have a reasonably small file, pre-allocation is probably simply a waste of time - you would be better off overwriting the maximum file size with all zeros (even a 1GB file would take only a few seconds).
> 
> If the file is large enough to be interesting, I think that we might want to think about a scheme that would bring small random IO's more into line with the 1MB results Eric saw.
> 
> One way to do that might be to have a minimum "chunk" that we would zero out for any IO to an allocated but unwritten extent. If you write 4KB to the middle of such a region, we pad the write out and zero-fill up to the nearest MB boundaries.

There is already code for this in the ext4 uninitialized extent handling.
Currently the limit is 15(?) blocks, which inflates the write size to
zero-fill a 64kB chunk on 4kB blocksize filesystems.  I wouldn't
object to increasing this to zero out a full 1MB (aligned) chunk
for random IO.
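
To make the "1MB (aligned) chunk" idea concrete, here is a minimal sketch
of the range arithmetic.  It is an illustration only, not the ext4
implementation: the names zero_range and pad_to_chunk are made up for this
example, and clamping to the enclosing unwritten extent is my assumption
about how such padding would be bounded.

```c
#include <assert.h>
#include <stdint.h>

/* Byte range [start, end) that would be zero-filled around a small write. */
struct zero_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical helper (not ext4 code): expand the write [off, off + len)
 * to chunk-aligned boundaries, then clamp the result to the unwritten
 * extent [ext_start, ext_end) that contains the write.
 * chunk must be a power of two.
 */
static inline struct zero_range
pad_to_chunk(uint64_t off, uint64_t len, uint64_t chunk,
	     uint64_t ext_start, uint64_t ext_end)
{
	struct zero_range r;

	assert(chunk != 0 && (chunk & (chunk - 1)) == 0);
	assert(len > 0 && off >= ext_start && off + len <= ext_end);

	r.start = off & ~(chunk - 1);			/* round down */
	r.end = (off + len + chunk - 1) & ~(chunk - 1);	/* round up */

	/* Never zero outside the unwritten extent itself. */
	if (r.start < ext_start)
		r.start = ext_start;
	if (r.end > ext_end)
		r.end = ext_end;
	return r;
}
```

With a 1MB chunk, a 4kB write at offset 5MB + 12kB inside a 64MB unwritten
extent gives the range [5MB, 6MB), so the extent is split on 1MB boundaries
rather than around every 4kB write.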

Yes, it would hurt the latency a small amount for the first writes
(though not very much for disks, given the seek overhead), but it
would avoid clobbering the extent tree so drastically, which also
has long-term bad performance effects.

> Note for the target class of drives (S-ATA) that Ted mentioned earlier, doing a random 4KB write vs a 1MB write is not that much slower (you need to pay the head movement costs already).  Of course, the sweet spot might turn out to be a bit smaller or larger.

Right, at ~100-150 seeks/sec, 1MB writes work out to ~100-150 MB/s, which
is pretty close to the 150 MB/s write bandwidth limit, so the cost of doing
4kB vs. 1MB writes is a toss-up.  I expect the bandwidth of drives to
continue increasing, while the seek rate will stay the same.  The tradeoff
for SSD devices is not so clear, since inflated writes will hurt the cache,
but in that case, it makes more sense to use trim-with-zero (if supported)
instead of unallocated blocks anyway.
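
For the rotating-disk case, the estimate can be sketched with a very rough
cost model: each random write pays one seek plus the transfer time for its
payload.  The function name write_cost_ms and the model itself are
assumptions for illustration; it ignores rotational latency, caching and
queueing.

```c
/*
 * Hypothetical cost model (illustration only): time in milliseconds for
 * one random write of `bytes` bytes, given the drive's seek rate and its
 * sequential bandwidth in MB/s (1 MB = 10^6 bytes here).
 */
static inline double
write_cost_ms(double seeks_per_sec, double mb_per_sec, double bytes)
{
	double seek_ms = 1000.0 / seeks_per_sec;
	double xfer_ms = bytes / (mb_per_sec * 1e6) * 1000.0;

	return seek_ms + xfer_ms;
}
```

At 125 seeks/sec and 150 MB/s this gives about 8 ms for a 4kB write and
about 15 ms for a 1MB write, so under this model the padded write costs
less than twice as much while zeroing 256 times as much data.  If
bandwidth grows and the seek rate does not, the gap narrows further.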

Cheers, Andreas






