From: pg_xf2@xf2.for.sabi.co.UK (Peter Grandi)
To: Linux fs XFS <xfs@oss.sgi.com>
Subject: Re: GNU 'tar', Schilling's 'tar', write-cache/barrier
Date: Sat, 24 Mar 2012 18:35:46 +0000
Message-ID: <20334.5122.852478.206217@tree.ty.sabi.co.UK>
In-Reply-To: <20120324171150.GA69366@nsrc.org>

[ ... ]

>> #  (cd /tmp/ext4; rm -rf linux-2.6.32; sync; time star -no-fsync -x -f /tmp/linux-2.6.32.tar; egrep 'Dirty|Writeback' /proc/meminfo; time sync)
>> real    0m1.204s
>> Dirty:          419456 kB
>> real    0m5.012s

>> #  (cd /tmp/ext4; rm -rf linux-2.6.32; sync; time star -x -f /tmp/linux-2.6.32.tar; egrep 'Dirty|Writeback' /proc/meminfo; time sync)
>> real    23m29.346s
>> Dirty:             108 kB
>> real    0m0.236s

> But as a user, what guarantees do I *want* from tar?

Ahhhh, but that depends *a lot* on the application, which may or
may not be 'tar', and on what you are using it for. Consider for
example restoring a backup using RSYNC instead of 'tar'.

> I think the only meaningful guarantee I might want is: "if the
> tar returns successfully, I want to know that all the files
> are persisted to disk".

Perhaps in some cases, but perhaps in others not. For example if
you are restoring 20TB, having to redo the whole 20TB or a
significant fraction may be undesirable, and you would like to
change the guarantee as you write later:

  > On the flip side, does fsync()ing each individual file [
  > ... ] you could safely restart an aborted untar [ ... ] the
  > last file which was unpacked may only have been partially
  > written to disk [ ... ]

to add "if the tar does not return successfully, I want to know
that most or all the files are persisted, except the last one
that was only partially written, which I want to disappear, so I
can rerun 'tar -x -k' and only restore the rest of the files".
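
As a rough illustration of that stronger guarantee, here is a
minimal sketch in Python, not what either 'tar' actually does.
Each file is written under a temporary name, fsync()ed, and only
then renamed to its final name, so a partially written file never
appears under the final name, and a rerun skips what is already
there, much like 'tar -x -k'. The function name and the
'.partial' suffix are hypothetical, and only regular files are
handled.

```python
import os
import tarfile


def extract_restartable(archive_path, dest_dir):
    """Extract regular files so that an interrupted run can simply be rerun.

    Sketch only: directories, links, ownership and permissions are ignored,
    and member names are trusted (no protection against '../' paths).
    """
    restored = []
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            target = os.path.join(dest_dir, member.name)
            if os.path.exists(target):
                continue  # like 'tar -x -k': keep what an earlier run restored
            os.makedirs(os.path.dirname(target), exist_ok=True)
            partial = target + ".partial"  # hypothetical temporary suffix
            with archive.extractfile(member) as src, open(partial, "wb") as dst:
                while chunk := src.read(1 << 20):
                    dst.write(chunk)
                dst.flush()
                os.fsync(dst.fileno())  # file data reaches disk before the rename
            os.replace(partial, target)  # the final name only ever holds a complete file
            restored.append(member.name)
    return restored
```

A stricter version would also fsync() the containing directory
after the rename, so that the new name itself is persisted; that
is left out here to keep the sketch short.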

> And of course that's what your final "sync" does, although
> with the unfortunate side-effect of syncing all other dirty
> blocks in the system too.

Just to be clear: that was on a quiescent system, so in the
particular case of my tests the 'sync' flushed only what 'tar'
had written.

[ ... ]

> I think what's needed is a group fsync which says "please
> ensure this set of files is all persisted to disk", which is
> done at the end, or after every N files.  If such an API
> exists I don't know of it.

That's in part what was mentioned here:

[ ... ]
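
Lacking a real group fsync API, the "after every N files" idea
can be approximated from user space. The following is a sketch
under assumptions, not an existing interface: it writes each file
without syncing, keeps it open, and fsync()s the whole batch once
N files are pending. The function name and the 'batch_size'
parameter are hypothetical; whether this is much faster than
per-file fsync() depends on the filesystem and on how it orders
writeback.

```python
import os


def write_with_group_fsync(files, dest_dir, batch_size=100):
    """Write (name, data) pairs, calling fsync() per batch instead of per write.

    Returns the number of batches flushed. Keep batch_size below the
    process's open file descriptor limit, since files stay open until
    their batch is flushed.
    """
    pending = []
    batches = 0

    def flush_batch():
        nonlocal batches
        for handle in pending:
            os.fsync(handle.fileno())  # by now all writes in the batch are queued
            handle.close()
        pending.clear()
        batches += 1

    for name, data in files:
        handle = open(os.path.join(dest_dir, name), "wb")
        handle.write(data)
        handle.flush()  # push Python's buffer to the kernel, not yet to disk
        pending.append(handle)
        if len(pending) >= batch_size:
            flush_batch()
    if pending:
        flush_batch()
    return batches
```

After a crash, at most the files of the last unflushed batch are
in doubt, which bounds how much has to be redone.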

> If the above benchmark is typical, it suggests that fsyncing
> after every file is 4 times slower than untar followed by
> sync.

That depends on how often the flusher runs, how aggressively it
runs, and how much memory you have. In the comparison quoted above, GNU
'tar' on 'ext4' dumps 410MB into RAM in just over 1 second plus
5 seconds for 'sync', and Schilling's 'tar' persists the lot to
disk, incrementally, in 1409 seconds. The ratio is 227 times.

That is because a typical disk drive can do either around
100MB/s with bulk sequential IO (hence the 5 seconds for 'sync')
or around 0.5-4MB/s with small random IO.
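
For those who want to check the arithmetic, here it is as a short
Python calculation using only the timings and the 'Dirty:' figure
quoted at the top. The 'Dirty:' value is assumed to approximate
the amount of data written, which ignores metadata and anything
already flushed, so the implied transfer rates are rough.

```python
def seconds(minutes, secs):
    """Convert a 'time' style XmY.YYYs reading to seconds."""
    return minutes * 60 + secs


dirty_mb = 419456 / 1024       # 'Dirty:' after the no-fsync untar, kB -> MB
unsafe_s = 1.204 + 5.012       # untar without fsync, plus the final 'sync'
safe_s = seconds(23, 29.346)   # untar with fsync after every file

ratio = safe_s / unsafe_s      # how much slower the per-file fsync run was
bulk_mb_s = dirty_mb / 5.012   # implied rate while 'sync' wrote everything out
small_mb_s = dirty_mb / safe_s # implied rate with per-file fsync

print(f"data written: about {dirty_mb:.0f} MB")
print(f"slowdown:     about {ratio:.0f} times")
print(f"bulk rate:    about {bulk_mb_s:.0f} MB/s")
print(f"fsync rate:   about {small_mb_s:.2f} MB/s")
```

The implied bulk rate comes out somewhat under 100MB/s and the
per-file fsync rate near the bottom of the small random IO
range, which is consistent with the explanation above.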

> So I reckon you would be better off using the fast/unsafe
> version, and simply restarting it from the beginning if the
> system crashed while you were running it. [ ... ]

That's in one very specific example with one application in one
context. As to this, for a better discussion, let's go back to
your original and very appropriate question:

  > But as a user, what guarantees do I *want* from tar?

The question is very sensible as far as it goes, but it does not
go far enough, because «from tar» and small 'tar' archives are
just happenstance; what you should ask yourself is:

  But as a user, what guarantees do I *want* from filesystems
  and the applications that use them?

That's in essence the O_PONIES question.

That question can have many answers, each of them addressing a
different aspect of the normative and positive situation, and
I'll try to list some.

The first answer is that you want to be able to choose different
guarantees and costs, and know which they are. In this respect
'delaylog', properly described as an improvement in both
unsafety and speed, is a good thing to have, because it is often
a useful option. So are 'sync', 'nobarrier', and 'eatmydata'.

The second answer is that as a rule users don't have the
knowledge or the desire to understand the tradeoffs offered by
filesystems and how they relate to the behavior of the programs
(including 'tar') that they use, so there needs to be a default
guarantee that most users would have chosen if they could, and
this should be about more safety rather than more speed, and
this was what «XFS @ 2009-2010» was doing.

More to follow...

