From: Pavel Machek <pavel@suse.cz>
To: Theodore Tso <tytso@mit.edu>,
	mikulas@artax.karlin.mff.cuni.cz, clock@atrey.karlin.mff.cuni.cz,
	kernel list <linux-kernel@vger.kernel.org>,
	aviro@redhat.com
Subject: Re: writing file to disk: not as easy as it looks
Date: Tue, 2 Dec 2008 16:26:18 +0100
Message-ID: <20081202152618.GA1646@ucw.cz>
In-Reply-To: <20081202140439.GF16172@mit.edu>

On Tue 2008-12-02 09:04:39, Theodore Tso wrote:
> On Tue, Dec 02, 2008 at 10:40:59AM +0100, Pavel Machek wrote:
> > Actually, it looks like POSIX file interface is on the lowest step of
> > Rusty's scale: one that is impossible to use correctly. Yes, it seems
> > impossible to reliably&safely write file to disk under Linux. Double
> > plus uncool.
> > 
> > So... how to write file to disk and wait for it to reach the stable
> > storage, with proper error handling?
> 
> Are you trying to do this in C or shell?  There is no "fsync" shell
> command as far as I know, which is what is confusing me.  And whether
> "> file" checks for errors or not obviously depends on the application
> which is writing to stdout.  Some might check for errors, some might
> not....

True. I'd prefer to use shell, but C is okay, too. An 'fsync' shell
command seems to exist on openSUSE; sorry for the confusion.

> Why do you feel the need to error check "fsync ../.." and "fsync
> ../../..", et al.?

> I can understand why you might want to fsync the containing directory
> to make sure the directory entry got written to disk --- but if you're
> that paranoid, many modern filesystems use some kind of tree
> structure

If I'm trying to write foo/bar/baz/file, and the inodes/dentries for
file and baz are written to disk but the one for foo is not, the file
still will not be found under its full name - and recovering it from
lost+found is hard to do automatically.
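
A minimal sketch of that sequence, in Python for brevity (os.open,
os.write and os.fsync are thin wrappers over the POSIX calls). The names
write_durably and fsync_dir and the stop_at parameter are made up for
illustration, and it assumes a system where a directory can be opened
read-only and fsync()ed, as on Linux. Any failing step raises OSError:

```python
import os


def fsync_dir(directory):
    """Open a directory read-only and fsync it; any failure raises OSError."""
    flags = os.O_RDONLY | getattr(os, "O_DIRECTORY", 0)
    fd = os.open(directory, flags)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def write_durably(path, data, stop_at=None):
    """Write data to path, fsync the file, then fsync each ancestor directory.

    stop_at (hypothetical parameter) is the topmost directory to sync;
    by default the walk continues up to the filesystem root.
    Returns the list of directories that were synced, innermost first.
    """
    path = os.path.abspath(path)
    if stop_at is not None:
        stop_at = os.path.abspath(stop_at)

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush the file's data and inode
    finally:
        os.close(fd)

    synced = []
    directory = os.path.dirname(path)
    while True:
        fsync_dir(directory)  # flush the entry that names the level below
        synced.append(directory)
        parent = os.path.dirname(directory)
        if directory == stop_at or parent == directory:
            break
        directory = parent
    return synced
```

For foo/bar/baz/file this syncs the file, then baz, bar and foo in turn,
so a failure at any level is reported to the caller instead of being
silently ignored.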

> for the directory, and there is always the chance that a second later,
> in a b-tree node split, due to a disk error the directory entry gets
> lost.

If the disk loses data after acknowledging the write, all hope is lost.
Otherwise I expect the filesystem to preserve data I successfully synced.

     (In the b-tree split failure case I'd expect the transaction commit
     to fail because new data could not be written; at that point
     disk+journal should still contain all the data needed for
     recovery of synced/old files, right?)

> What exactly are your requirements here, and what are you trying to
> do?  What are you worried about?  Most MTA's are quite happy
> settling

I'm trying to put my main filesystem on an SD card. The hp2133 has only
4GB of internal flash, so I got a 32GB SDHC card. Unfortunately, the SD
card on the hp is very easy to eject by mistake.

> with an fsync() to make sure the data made it to the disk safely and
> the super-paranoid might also keep an open fd on the spool directory
> and fsync that too.  That's been enough for most POSIX programs.

Well.. I believe those POSIX programs are unsafe on removable media.

mta #1                          mta #2

cat > mail1
fsync mail1
                                cat > mail2
                                fsync mail2
                (spool media removed)
                                fsync . -> ERROR
                                correctly reports
                                mail2 as undelivered
fsync . -> success; first fsync cleared
           error condition
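
The timeline above can be replayed with a toy model. This is only an
illustration of the failure mode as described in this message: ToyMedium
and deliver are invented names, and the single "pending error" flag that
is cleared by whoever sees it first is an assumption of the model, not a
statement about how any particular kernel version or filesystem tracks
writeback errors.

```python
import errno


class ToyMedium:
    """Toy spool medium with one pending-error flag shared by all callers."""

    def __init__(self):
        self.pending_error = False

    def remove(self):
        # Media pulled out: outstanding writes can no longer succeed.
        self.pending_error = True

    def fsync(self, who):
        if self.pending_error:
            # Assumption of this model: reporting the error clears it.
            self.pending_error = False
            raise OSError(errno.EIO, "fsync failed for " + who)
        # No pending error recorded, so the call reports success.


def deliver(medium, who):
    """Return the status an MTA would report after its final 'fsync .'."""
    try:
        medium.fsync(who)
    except OSError:
        return "undelivered"
    return "delivered"


medium = ToyMedium()
medium.remove()                       # spool media removed
status_mta2 = deliver(medium, "mta #2")
status_mta1 = deliver(medium, "mta #1")
print("mta #2:", status_mta2)
print("mta #1:", status_mta1)
```

In this model mta #2 reports its mail as undelivered, while mta #1, whose
fsync comes second, reports success even though mail1 was on the same
removed medium.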


I'm trying to figure out why I'm losing data on flash media. So far it
seems that both SD cards and USB flash disks have problems, and that
ext2/3 have problems... and that the combination of ext2/3+flash can't
even work in theory :-(.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
