* UBIFS robustness questions
@ 2009-07-24  4:00 Charles Manning
  2009-07-24  6:03 ` Artem Bityutskiy
  2009-07-24  6:43 ` Adrian Hunter
  0 siblings, 2 replies; 9+ messages in thread
From: Charles Manning @ 2009-07-24  4:00 UTC (permalink / raw)
  To: linux-mtd

This is probably documented somewhere but I could not find it...

What operations in UBIFS are robust to power failure and which are not?

I know for example that writing a file into flash does not mean it has been 
completely written to flash until after a sync, but what about other 
operations such as mv?

The reason I'm asking this is that I want to be able to "hot-swap" a 
directory of files without losing any file state.

What I'm considering doing is something like:

Start with ~/runtime having a sane set of files

untar etc into ~/updated
sync
mv ~/updated ~/run-time
sync

What is unacceptable is that, at any time, a power failure/reboot results in 
~/runtime having a non-sane set of files.

* Does the above sequence look safe?
* Is the second sync required?


TIA

-- Charles


* Re: UBIFS robustness questions
  2009-07-24  4:00 UBIFS robustness questions Charles Manning
@ 2009-07-24  6:03 ` Artem Bityutskiy
  2009-07-24  6:43 ` Adrian Hunter
  1 sibling, 0 replies; 9+ messages in thread
From: Artem Bityutskiy @ 2009-07-24  6:03 UTC (permalink / raw)
  To: Charles Manning; +Cc: linux-mtd

On 07/24/2009 07:00 AM, Charles Manning wrote:
> This is probably documented somewhere but I could not find it...
>
> What operations in UBIFS are robust to power failure and which are not?

Hi, did you look through these:

http://www.linux-mtd.infradead.org/doc/ubifs.html#L_writeback
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_writebuffer
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_sync_exceptions
http://www.linux-mtd.infradead.org/faq/ubifs.html#L_empty_file

>
> I know for example that writing a file into flash does not mean it has been
> completely written to flash until after a sync, but what about other
> operations such as mv?
>
> The reason I'm asking this is that I want to be able to "hot-swap" a
> directory of files without losing any file state.

Err, if you do sync() and the like properly, you should not lose anything.

> What I'm considering doing is something like:
>
> Start with ~/runtime having a sane set of files
>
> untar etc into ~/updated
> sync
> mv ~/updated ~/run-time
> sync
>
> What is unacceptable is that, at any time, a power failure/reboot results in
> ~/runtime having a non-sane set of files.

Err, this will just move "updated" into the "runtime" directory. Is this what
you mean? But the above must be safe.

> * Does the above sequence look safe?
> * Is the second sync required?

It is required if you want to make sure that the directory has really been renamed;
otherwise the rename will sit in the write-buffer for some time, and in case
of a power cut you end up with "updated" at the old place, but nothing should be
corrupted. IOW, you do not have to, but may want to.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)


* Re: UBIFS robustness questions
  2009-07-24  4:00 UBIFS robustness questions Charles Manning
  2009-07-24  6:03 ` Artem Bityutskiy
@ 2009-07-24  6:43 ` Adrian Hunter
  2009-07-24  9:24   ` Adrian Hunter
  1 sibling, 1 reply; 9+ messages in thread
From: Adrian Hunter @ 2009-07-24  6:43 UTC (permalink / raw)
  To: Charles Manning; +Cc: linux-mtd

Charles Manning wrote:
> This is probably documented somewhere but I could not find it...
> 
> What operations in UBIFS are robust to power failure and which are not?

Only sync operations guarantee that changes have reached the flash.
There are all the usual ways to sync:
	fsync/fdatasync a file/directory
	open a file as synchronous
	mark a file with the sync flag
	sync the filesystem
	mount the file system as synchronous
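
As a rough illustration of the narrower options in that list: GNU coreutils sync (8.24 and later) accepts file arguments, so a single file or directory can be flushed without syncing everything. The paths below are hypothetical, and whether the sync on a given embedded system takes arguments at all is an assumption to check first.

```shell
#!/bin/sh
# Sketch only: assumes a sync(1) that accepts file arguments
# (GNU coreutils 8.24 or later). All paths are hypothetical.
set -e

workdir=$(mktemp -d)
echo "new settings" > "$workdir/config"

sync "$workdir/config"      # fsync() just this file
sync -d "$workdir/config"   # fdatasync(): file data plus the metadata needed to read it
sync "$workdir"             # fsync() the directory that holds the file's name
sync -f "$workdir/config"   # syncfs(): the whole file system containing the file
sync                        # everything, on every mounted file system
```

The per-file forms are cheaper than a global sync, at the cost of having to remember every file and directory that was touched.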

> I know for example that writing a file into flash does not mean it has been 
> completely written to flash until after a sync, but what about other 
> operations such as mv?

After mv, the containing directory must be sync'd to be sure the change reaches the
flash.  But rename is atomic so there will always be either the old
naming or the new naming
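
A minimal sketch of that ordering for a single file, assuming a sync(1) that accepts file arguments (GNU coreutils 8.24 or later); the file names are made up for illustration:

```shell
#!/bin/sh
# Sketch only: file names are hypothetical; assumes sync(1) accepts
# file arguments (GNU coreutils 8.24 or later).
set -e

dir=$(mktemp -d)
echo "old" > "$dir/settings"
echo "new" > "$dir/settings.tmp"

sync "$dir/settings.tmp"                    # 1. new contents reach the flash first
mv -f "$dir/settings.tmp" "$dir/settings"   # 2. rename(2) replaces the old name atomically
sync "$dir"                                 # 3. sync the containing directory so the rename is durable
```

A power loss before step 3 should leave either the old or the new "settings", never a mixture; step 3 only decides when the new one is guaranteed to be there.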

> The reason I'm asking this is that I want to be able to "hot-swap" a 
> directory of files without losing any file state.

Should be no problem if you sync correctly.

> What I'm considering doing is something like:
> 
> Start with ~/runtime having a sane set of files
> 
> untar etc into ~/updated
> sync
> mv ~/updated ~/run-time
> sync
> 
> What is unacceptable is that, at any time, a power failure/reboot results in 
> ~/runtime having a non-sane set of files.
> 
> * Does the above sequence look safe?

Yes

> * Is the second sync required?

It is required to guarantee that the mv has reached the flash at that
point in time i.e. power loss before the second sync => same as if mv
was not done


* Re: UBIFS robustness questions
  2009-07-24  6:43 ` Adrian Hunter
@ 2009-07-24  9:24   ` Adrian Hunter
  2009-07-24 10:03     ` Adrian Hunter
  0 siblings, 1 reply; 9+ messages in thread
From: Adrian Hunter @ 2009-07-24  9:24 UTC (permalink / raw)
  To: Charles Manning; +Cc: linux-mtd

Hunter Adrian (Nokia-D/Helsinki) wrote:
> Charles Manning wrote:
>> This is probably documented somewhere but I could not find it...
>>
>> What operations in UBIFS are robust to power failure and which are not?
> 
> Only sync operations guarantee that changes have reached the flash.
> There are all the usual ways to sync:
> 	fsync/fdatasync a file/directory
> 	open a file as synchronous
> 	mark a file with the sync flag
> 	sync the filesystem
> 	mount the file system as synchronous
> 
>> I know for example that writing a file into flash does not mean it has been 
>> completely written to flash until after a sync, but what about other 
>> operations such as mv?
> 
> After mv, the containing directory must be sync'd to be sure the change reaches the
> flash.  But rename is atomic so there will always be either the old
> naming or the new naming
> 
>> The reason I'm asking this is that I want to be able to "hot-swap" a 
>> directory of files without losing any file state.
> 
> Should be no problem if you sync correctly.
> 
>> What I'm considering doing is something like:
>>
>> Start with ~/runtime having a sane set of files
>>
>> untar etc into ~/updated
>> sync
>> mv ~/updated ~/run-time
>> sync
>>
>> What is unacceptable is that, at any time, a power failure/reboot results in 
>> ~/runtime having a non-sane set of files.
>>
>> * Does the above sequence look safe?
> 
> Yes

Well, safe but not possible. You cannot rename over the top
of a non-empty directory. Sorry I was misleading.

>> * Is the second sync required?
> 
> It is required to guarantee that the mv has reached the flash at that
> point in time i.e. power loss before the second sync => same as if mv
> was not done


* Re: UBIFS robustness questions
  2009-07-24  9:24   ` Adrian Hunter
@ 2009-07-24 10:03     ` Adrian Hunter
  2009-07-24 23:39       ` Jamie Lokier
  0 siblings, 1 reply; 9+ messages in thread
From: Adrian Hunter @ 2009-07-24 10:03 UTC (permalink / raw)
  To: Charles Manning; +Cc: linux-mtd

Adrian Hunter wrote:
> Hunter Adrian (Nokia-D/Helsinki) wrote:
>> Charles Manning wrote:
>>> This is probably documented somewhere but I could not find it...
>>>
>>> What operations in UBIFS are robust to power failure and which are not?
>> Only sync operations guarantee that changes have reached the flash.
>> There are all the usual ways to sync:
>> 	fsync/fdatasync a file/directory
>> 	open a file as synchronous
>> 	mark a file with the sync flag
>> 	sync the filesystem
>> 	mount the file system as synchronous
>>
>>> I know for example that writing a file into flash does not mean it has been 
>>> completely written to flash until after a sync, but what about other 
>>> operations such as mv?
>> After mv, the containing directory must be sync'd to be sure the change reaches the
>> flash.  But rename is atomic so there will always be either the old
>> naming or the new naming
>>
>>> The reason I'm asking this is that I want to be able to "hot-swap" a 
>>> directory of files without losing any file state.
>> Should be no problem if you sync correctly.
>>
>>> What I'm considering doing is something like:
>>>
>>> Start with ~/runtime having a sane set of files
>>>
>>> untar etc into ~/updated
>>> sync
>>> mv ~/updated ~/run-time
>>> sync
>>>
>>> What is unacceptable is that, at any time, a power failure/reboot results in 
>>> ~/runtime having a non-sane set of files.
>>>
>>> * Does the above sequence look safe?
>> Yes
> 
> Well, safe but not possible. You cannot rename over the top
> of a non-empty directory. Sorry I was misleading.

Sorry to drag this out but it seems like it can be done with symlinks

e.g.

/ # mkdir test
/ # cd test
/test # mkdir version1
/test # mkdir version2
/test # echo "This is version 1" > version1/afile
/test # echo "This is version 2" > version2/afile
/test # ln -s version1 current
/test # ln -s version2 next
/test # cat current/afile
This is version 1
/test # cat next/afile
This is version 2
/test # mv -T next current
/test # ls -al
drwxr-xr-x    4 root     root          432 Jan  2 01:57 .
drwxrwxrwx   25 root     root         1704 Jan  2 01:44 ..
lrwxrwxrwx    1 root     root            8 Jan  2 01:46 current -> version2
-rwxr-xr-x    1 root     root       261307 Jul 24  2009 mv
drwxr-xr-x    2 root     root          224 Jan  2 01:47 version1
drwxr-xr-x    2 root     root          224 Jan  2 01:45 version2
/test # cat current/afile
This is version 2
/test # 


Note that busybox's 'mv' does not support the -T option
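
The session above could be wrapped up roughly as follows. This is only a sketch: switch_version and the current.new temporary name are hypothetical, it assumes GNU ln and mv (both with -T), and the two syncs mirror the sync / mv / sync sequence from the original question.

```shell
#!/bin/sh
# Sketch only. switch_version is a hypothetical helper, not part of any tool.
# Assumes GNU ln and mv; busybox mv does not support -T.
set -e

# switch_version BASE TARGET: atomically repoint BASE/current at TARGET.
switch_version() {
    base=$1
    target=$2
    sync                                        # the new version's files reach the flash first
    ln -sfT "$target" "$base/current.new"       # build the replacement link under a temporary name
    mv -T "$base/current.new" "$base/current"   # rename(2) over the old link: old or new, nothing between
    sync                                        # make the switch itself durable
}

# Example, matching the session above:
#   switch_version /test version2
```

Building the link under a temporary name first matters: removing "current" and recreating it would leave a window with no "current" at all.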

>>> * Is the second sync required?
>> It is required to guarantee that the mv has reached the flash at that
>> point in time i.e. power loss before the second sync => same as if mv
>> was not done
> 


* Re: UBIFS robustness questions
  2009-07-24 10:03     ` Adrian Hunter
@ 2009-07-24 23:39       ` Jamie Lokier
  2009-07-26  6:29         ` Adrian Hunter
  0 siblings, 1 reply; 9+ messages in thread
From: Jamie Lokier @ 2009-07-24 23:39 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Charles Manning, linux-mtd

Adrian Hunter wrote:
> Sorry to drag this out but it seems like it can be done with symlinks

That's right.  It should be powerfail safe.
Don't forget to "rm -fr version1" at the end :-)

However, if you are looking to use this for atomic update of a
directory while there are programs still running which use the
directory, it won't work.

You can't delete the old directory, because programs might still be
inside it...

It's not even always safe to kill and restart the programs after
renaming the symlink, because they might read some files from the new
directory before they've finished reading other files from the old
directory.

Regarding powerfail safety, it means you might have to defer deleting
the old directory until some major system action, like the next reboot.
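
One way to defer the deletion, sketched under assumptions: prune_old_versions is a hypothetical helper meant to run early at boot, before anything has opened files under the base directory, and it assumes the version directories are named version* and sit next to the "current" symlink.

```shell
#!/bin/sh
# Sketch only. prune_old_versions is a hypothetical boot-time helper; the
# version* naming and the "current" symlink follow the earlier example.
set -e

# prune_old_versions BASE: remove every BASE/version* directory except the
# one that BASE/current points at.
prune_old_versions() {
    base=$1
    live=$(readlink "$base/current") || return 1   # refuse to delete anything without a live link
    [ -n "$live" ] || return 1
    for dir in "$base"/version*; do
        [ -d "$dir" ] || continue
        if [ "$(basename "$dir")" = "$live" ]; then
            continue                               # keep the live version
        fi
        rm -rf "$dir"
    done
}

# Example: prune_old_versions /test
```

If power fails halfway through the rm, the next boot simply runs the same cleanup again, since "current" never points at a directory being removed.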

-- Jamie


* Re: UBIFS robustness questions
  2009-07-24 23:39       ` Jamie Lokier
@ 2009-07-26  6:29         ` Adrian Hunter
  2009-07-26 19:21           ` Jamie Lokier
  0 siblings, 1 reply; 9+ messages in thread
From: Adrian Hunter @ 2009-07-26  6:29 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Charles Manning, linux-mtd

Jamie Lokier wrote:
> Adrian Hunter wrote:
>> Sorry to drag this out but it seems like it can be done with symlinks
> 
> That's right.  It should be powerfail safe.
> Don't forget to "rm -fr version1" at the end :-)
> 
> However, if you are looking to use this for atomic update of a
> directory while there are programs still running which use the
> directory, it won't work.
> 
> You can't delete the old directory, because programs might still be
> inside it...

Are you sure about that?  I can do this:

/ # mkdir test2
/ # cd test2
/test2 # cp /bin/bash .
/test2 # ls -al
drwxr-xr-x    2 root     root          224 Jan  3 22:20 .
drwxrwxrwx   25 root     root         1768 Jan  3 22:20 ..
-rwxr-xr-x    1 root     root       612764 Jan  3 22:20 bash
/test2 # ./bash -c "sleep 30;echo Done" &
/test2 # rm bash
/test2 # cd ..
/ # rmdir test2
/ # ps | grep bash
 1261 root      2500 S    ./bash -c sleep 30;echo Done 
/ # 
/ # 
/ # Done

[2] + Done                       ./bash -c "sleep 30;echo Done"


* Re: UBIFS robustness questions
  2009-07-26  6:29         ` Adrian Hunter
@ 2009-07-26 19:21           ` Jamie Lokier
  2009-07-27  8:09             ` Adrian Hunter
  0 siblings, 1 reply; 9+ messages in thread
From: Jamie Lokier @ 2009-07-26 19:21 UTC (permalink / raw)
  To: Adrian Hunter; +Cc: Charles Manning, linux-mtd

Adrian Hunter wrote:
> Jamie Lokier wrote:
> >Adrian Hunter wrote:
> >>Sorry to drag this out but it seems like it can be done with symlinks
> >
> >That's right.  It should be powerfail safe.
> >Don't forget to "rm -fr version1" at the end :-)
> >
> >However, if you are looking to use this for atomic update of a
> >directory while there are programs still running which use the
> >directory, it won't work.
> >
> >You can't delete the old directory, because programs might still be
> >inside it...
> 
> Are you sure about that?  I can do this:
> 
> / # mkdir test2
> / # cd test2
> /test2 # cp /bin/bash .
> /test2 # ls -al
> drwxr-xr-x    2 root     root          224 Jan  3 22:20 .
> drwxrwxrwx   25 root     root         1768 Jan  3 22:20 ..
> -rwxr-xr-x    1 root     root       612764 Jan  3 22:20 bash
> /test2 # ./bash -c "sleep 30;echo Done" &
> /test2 # rm bash
> /test2 # cd ..
> / # rmdir test2
> / # ps | grep bash
> 1261 root      2500 S    ./bash -c sleep 30;echo Done 
> / # 
> / # 
> / # Done
> 
> [2] + Done                       ./bash -c "sleep 30;echo Done"

(By the way, Linux has not always allowed an empty but in-use directory
to be rmdir'd, but it does these days).

What I mean is, you can delete the old directory, but it's not always
safe because you might break programs which are depending on the
directory's contents when you do.

For example:

$ mkdir dir1
$ echo "message1" > dir1/message
$ ln -sfT dir1 new
$ mv -T new current

$ sh -c 'cd current; while :; do cat message > /dev/ttyAM0; sleep 1; done' &

==> Writes "message1" to the serial port every second.

$ mkdir dir2
$ echo "message2" > dir2/message
$ ln -sfT dir2 new
$ mv -T new current   # Looks atomic

==> Still writes "message1" to the serial port every second.
==> Maybe that's ok, maybe not.

$ rm -fr dir1         # Old version, no longer in use?

==> The background script writes a "File not found" error every second...
==> Clearly not ok.

If the script is written differently as

  $ sh -c 'while :; do cat current/message > /dev/ttyAM0; sleep 1; done' &

then it works better, changing the message in this example most of the time.

It's not obvious, but even that version has an extremely rare race
condition: "cat current/message" does path traversal in the kernel,
which may open "current" just before the symlink changes, then (due to
preemptive scheduling or SMP) look up "message" after that's been
deleted.  It is probably very hard to trigger, but it's a race condition.

And even without that race condition, the method doesn't work in
general.  If it was reading two different files, it could easily see
one file from the old version and one file from the new version for a
moment.  The inconsistency could be harmless or fatal depending on the
application.
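
A partial mitigation for the mixed-version case, as a sketch only: resolve the symlink once per pass and use the resolved directory for every read. read_pair and the header/body file names are hypothetical. This has the same pinning effect as the "cd current" in the first script, limited to one pass, and it does nothing about the old directory being deleted while a pass is in progress.

```shell
#!/bin/sh
# Sketch only. read_pair is a hypothetical reader that needs two files from
# the same version; the file names are made up for illustration.
set -e

# read_pair BASE: print header and body from whichever version BASE/current
# pointed at when the pass started.
read_pair() {
    base=$1
    # Resolve the symlink once, then use the resolved directory for every
    # access, so both reads come from the same version directory.
    dir=$(readlink -f "$base/current") || return 1
    cat "$dir/header" "$dir/body"
}

# Example: read_pair /test
```

A switch that lands between the two reads no longer mixes versions; the reader just finishes with the old one and picks up the new one on the next pass.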

It's a hard problem to solve properly, unless you analyse each
application or kill each application before the change and restart
them afterwards.  In which case maybe you don't need the change to be
atomic :-)

Databases solve it with transactions, which are nice to use and
understand, but they introduce coordination problems in a different
way if they aren't used consistently and correctly.

This is why every Linux distro has occasional glitches when package
managers update a running system, and reports of things going wrong
which are too rare to fix, too transient to repeat, and go away on the
next reboot.

-- Jamie


* Re: UBIFS robustness questions
  2009-07-26 19:21           ` Jamie Lokier
@ 2009-07-27  8:09             ` Adrian Hunter
  0 siblings, 0 replies; 9+ messages in thread
From: Adrian Hunter @ 2009-07-27  8:09 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: Charles Manning, linux-mtd

Jamie Lokier wrote:
> Adrian Hunter wrote:
>> Jamie Lokier wrote:
>>> Adrian Hunter wrote:
>>>> Sorry to drag this out but it seems like it can be done with symlinks
>>> That's right.  It should be powerfail safe.
>>> Don't forget to "rm -fr version1" at the end :-)
>>>
>>> However, if you are looking to use this for atomic update of a
>>> directory while there are programs still running which use the
>>> directory, it won't work.
>>>
>>> You can't delete the old directory, because programs might still be
>>> inside it...
>> Are you sure about that?  I can do this:
>>
>> / # mkdir test2
>> / # cd test2
>> /test2 # cp /bin/bash .
>> /test2 # ls -al
>> drwxr-xr-x    2 root     root          224 Jan  3 22:20 .
>> drwxrwxrwx   25 root     root         1768 Jan  3 22:20 ..
>> -rwxr-xr-x    1 root     root       612764 Jan  3 22:20 bash
>> /test2 # ./bash -c "sleep 30;echo Done" &
>> /test2 # rm bash
>> /test2 # cd ..
>> / # rmdir test2
>> / # ps | grep bash
>> 1261 root      2500 S    ./bash -c sleep 30;echo Done 
>> / # 
>> / # 
>> / # Done
>>
>> [2] + Done                       ./bash -c "sleep 30;echo Done"
> 
> (By the way, Linux has not always allowed an empty but in-use directory
> to be rmdir'd, but it does these days).
> 
> What I mean is, you can delete the old directory, but it's not always
> safe because you might break programs which are depending on the
> directory's contents when you do.
> 
> For example:
> 
> $ mkdir dir1
> $ echo "message1" > dir1/message
> $ ln -sfT dir1 new
> $ mv -T new current
> 
> $ sh -c 'cd current; while :; do cat message > /dev/ttyAM0; sleep 1; done' &
> 
> ==> Writes "message1" to the serial port every second.
> 
> $ mkdir dir2
> $ echo "message2" > dir2/message
> $ ln -sfT dir2 new
> $ mv -T new current   # Looks atomic
> 
> ==> Still writes "message1" to the serial port every second.
> ==> Maybe that's ok, maybe not.
> 
> $ rm -fr dir1         # Old version, no longer in use?
> 
> ==> The background script writes a "File not found" error every second...
> ==> Clearly not ok.
> 
> If the script is written differently as
> 
>   $ sh -c 'while :; do cat current/message > /dev/ttyAM0; sleep 1; done' &
> 
> then it works better, changing the message in this example most of the time.
> 
> It's not obvious, but even that version has an extremely rare race
> condition: "cat current/message" does path traversal in the kernel,
> which may open "current" just before the symlink changes, then (due to
> preemptive scheduling or SMP) look up "message" after that's been
> deleted.  It is probably very hard to trigger, but it's a race condition.
> 
> And even without that race condition, the method doesn't work in
> general.  If it was reading two different files, it could easily see
> one file from the old version and one file from the new version for a
> moment.  The inconsistency could be harmless or fatal depending on the
> application.
> 
> It's a hard problem to solve properly, unless you analyse each
> application or kill each application before the change and restart
> them afterwards.  In which case maybe you don't need the change to be
> atomic :-)
> 
> Databases solve it with transactions, which are nice to use and
> understand, but they introduce coordination problems in a different
> way if they aren't used consistently and correctly.
> 
> This is why every Linux distro has occasional glitches when package
> managers update a running system, and reports of things going wrong
> which are too rare to fix, too transient to repeat, and go away on the
> next reboot.

Another problem is that unlinked files which are still open are not
actually deleted, so they continue to consume file system space.  On a
little embedded system, you can unexpectedly run out of space.
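
A rough way to see which processes are still pinning such files, as a Linux-specific sketch: it relies on /proc/PID/fd, where the link target of an unlinked file ends in " (deleted)". list_deleted_open is a hypothetical helper name, and without root it only sees the caller's own processes.

```shell
#!/bin/sh
# Sketch only, Linux-specific: relies on /proc/PID/fd link targets ending in
# " (deleted)" for unlinked files. list_deleted_open is a hypothetical name.
set -e

# list_deleted_open: print every open file descriptor that refers to a file
# which has already been unlinked.
list_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        # Processes come and go, and some are unreadable; skip those quietly.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *" (deleted)") echo "$fd -> $target" ;;
        esac
    done
}

# Example: list_deleted_open
```

The space is only released once the last such descriptor is closed, which in practice means restarting the processes listed.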

