* [Bug 15910] New: zero-length files and performance degradation
@ 2010-05-05 13:49 bugzilla-daemon
  2010-05-05 18:54 ` [Bug 15910] " bugzilla-daemon
                   ` (10 more replies)
  0 siblings, 11 replies; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-05 13:49 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910

           Summary: zero-length files and performance degradation
           Product: File System
           Version: 2.5
    Kernel Version: 2.6.32-21-generic
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
        AssignedTo: fs_ext4@kernel-bugs.osdl.org
        ReportedBy: jeanbaptiste.lallement@gmail.com
        Regression: No


Hi,

I'm raising this topic again because of the large number of users hitting the
zero-length file issue on ext4 filesystems after a system crash or power
failure. We have collected hundreds of reports from users who can no longer
update their system after a crash during, or shortly after, package
operations, because the maintainer (control) scripts were left zero-length
(see the references in [1]).

To reproduce it:
* install a fresh Ubuntu Lucid system on an ext4 filesystem, or Debian with
dpkg < 1.15.6, or Ubuntu Karmic
* install a package, wait a few seconds, and simulate a crash:
$ sudo apt-get install some-package; sleep 5; echo b | sudo tee /proc/sysrq-trigger
* reboot
$ ls -l /var/lib/dpkg/info/some-package.* will list empty maintainer scripts.
$ ls -l /var/cache/apt/archives/some-package.* will show the empty archive
you've just downloaded.
At this stage, the package manager is unusable and an ordinary user cannot
update anything anymore.

This behavior is observed with data=ordered, with or without the mount
option auto_da_alloc.
The problem is caused by:
1) rename(), which should act as a barrier with data=ordered but doesn't;
auto_da_alloc does not detect the replace-via-rename pattern (at least not
in dpkg's case).
2) file creation followed by a crash, resulting in an empty file.

To work around and mitigate this issue, the 'dpkg' package manager in Debian
and Ubuntu has been patched to fsync() extracted files (Debian dpkg 1.15.6
and Ubuntu 1.15.5.6ubuntu2).

We first added an fsync() call for each extracted file, but the scattered
fsyncs caused a massive performance degradation during package installation
(a factor of 10 or more; some users reported that unpacking a
linux-headers-* package took over an hour!).
To reduce the I/O degradation, the fsync calls were then deferred so that
the writes and the fsyncs are serialized into separate passes. The
performance loss is now a factor of 2 to 5, depending on the package.

So we currently have a choice between filesystem corruption and a major
performance loss. Neither is satisfactory.

What is expected is simply that a file is either there or not, never
something in between.

[1] references:
http://bugs.debian.org/430958
http://bugs.debian.org/567089
http://bugs.debian.org/578635
https://bugs.launchpad.net/ubuntu/+bug/512096
https://bugs.launchpad.net/ubuntu/+bug/537241
https://bugs.launchpad.net/ubuntu/+bug/559915
https://bugs.launchpad.net/ubuntu/+bug/570805

-- 
: JB

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
@ 2010-05-05 18:54 ` bugzilla-daemon
  2010-05-06  4:06 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-05 18:54 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910


Theodore Tso <tytso@mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu




--- Comment #1 from Theodore Tso <tytso@mit.edu>  2010-05-05 18:54:23 ---
Why can't you, #1, just fsync() after writing the control file, if that's
the primary problem?

Or, #2, make dpkg recover more gracefully if it finds that the control file
has been truncated down to zero?

The reality is that all of the newer file systems are going to have this
property.  XFS has always behaved this way, and Btrfs will as well.  We are
_all_ using the same heuristic to force out a file which is replaced via a
rename() system call, but that's really considered a workaround for buggy
application programs that don't call fsync(), because there are more stupid
application programmers than there are of us file system developers.

As far as the rest of the files are concerned, what I would suggest is
setting a sentinel value to indicate that a package is being installed;
if the system crashes, either the init scripts or the next dpkg run should
reinstall that package.  That way you're not fsync()'ing every single file
in the package, and you're not optimizing for the exceptional condition.
You just have appropriate application-level retries in case of a crash.

So Debian and Ubuntu have a choice.  You can just stick with ext3 and not
upgrade, but this is one place where you can't blackmail file system
developers by saying, "if you don't do this, I'll go use some other file
system" --- because we are *all* doing delayed allocation.  It's allowed by
POSIX, and it's the only way to get much better file system performance ---
and there are intelligent ways you can design your applications so the right
thing happens on a power failure.  Programmers used to be familiar with
these in the days before ext3, because that's how the world had always
worked in Unix.

Ext3 has lousy performance precisely because it guaranteed stronger
semantics than what was promised by POSIX, and unfortunately, people have
gotten flabby (think: the humans in the movie Wall-E) and lazy about writing
programs that write to the file system defensively.  So if people are upset
about the performance of ext3, great, upgrade to newer file systems.  But
then you will need to be careful about how you code applications like dpkg.

In retrospect, I really wish we hadn't given programmers the data=ordered
guarantees in ext3, because they both trashed ext3's performance and gave
application programmers the wrong idea about how the world worked.
Unfortunately, the damage has been done....



* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
  2010-05-05 18:54 ` [Bug 15910] " bugzilla-daemon
@ 2010-05-06  4:06 ` bugzilla-daemon
  2010-05-06  4:18 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-06  4:06 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910


Dmitry Monakhov <dmonakhov@openvz.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmonakhov@openvz.org




--- Comment #2 from Dmitry Monakhov <dmonakhov@openvz.org>  2010-05-06 04:06:07 ---
Maybe this is not the best place to ask, but still:

Most script/app developers have been accustomed to ordered mode for too
long, so they no longer call fsync() before rename() in the usual
create-a-new-copy-and-rename scenario for configs/init scripts. Most
developers do not even know that it is necessary (mandatory).
And in fact the consequences are usually fatal, because the files involved
are usually important, but the old version has already been unlinked.
This affects both filesystems, because ext3 now uses writeback by default,
and ext4 uses writeback+delalloc.

Maybe it would be useful to introduce a compat mount option which forces an
fsync() internally inside rename(). Renames are not that frequent an
operation, so this would have a much smaller performance penalty than real
ordered mode.



* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
  2010-05-05 18:54 ` [Bug 15910] " bugzilla-daemon
  2010-05-06  4:06 ` bugzilla-daemon
@ 2010-05-06  4:18 ` bugzilla-daemon
  2010-05-09 18:19 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-06  4:18 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910


Eric Sandeen <sandeen@redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandeen@redhat.com




--- Comment #3 from Eric Sandeen <sandeen@redhat.com>  2010-05-06 04:18:57 ---
(In reply to comment #2)

> May be it is useful to introduce compat mount option which force fsync()
> internaly inside rename(). Renames is not what frequent operation so it has
> much less performance penalty as real ordered mode.

ext4 does already have an allocate-on-rename heuristic, though it is not
exactly an fsync():

        if (retval == 0 && force_da_alloc)
                ext4_alloc_da_blocks(old_inode);
from

commit 8750c6d5fcbd3342b3d908d157f81d345c5325a7
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Mon Feb 23 23:05:27 2009 -0500

    ext4: Automatically allocate delay allocated blocks on rename

Still, more mount options don't seem to solve the problem to me; in the end,
applications can't rely on them...



* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
                   ` (2 preceding siblings ...)
  2010-05-06  4:18 ` bugzilla-daemon
@ 2010-05-09 18:19 ` bugzilla-daemon
  2010-05-10  2:56   ` tytso
  2010-05-10  3:49 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-09 18:19 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910


Guillem Jover <guillem@hadrons.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guillem@hadrons.org




--- Comment #4 from Guillem Jover <guillem@hadrons.org>  2010-05-09 18:19:07 ---
Hi!

(In reply to comment #1)
> Why can't you #1, just fsync after writing the control file, if that's the
> primary problem?
> 
> Or #2, make the dpkg recover more gracefully if it finds that the control file
> has been truncated down to zero?

dpkg now fsync()s after all internal db changes, control file extractions,
*and* the to-be-installed files extracted from the .deb package. It also
fsync()s directories, at least for all db directory changes.

As background info, dpkg used to fsync() all db files except for the newly
extracted control files.

> The reality is that all of the newer file systems are going to have this
> property.  XFS has always behaved this way.  Btrfs will as well.  We are _all_
> using the same hueristic to force sync a file which is replaced via a rename()
> system call, but that's really considered a workaround buggy application
> programs that don't call fsync(), because there are more stupid application
> programmers than there are of us file system developers.

I don't have any problem with that, and I personally consider the previous
dpkg behaviour buggy. And as you say, it's bound to cause problems on other
file systems eventually.

> As far as the rest of the files are concerned, what I would suggest doing is
> set a sentinel value which is used to indicate that package is being installed,
> and if the system crashes, either in the init scripts or the next time dpkg
> runs, it should reinstall that package.   That way you're not fsync()'ing every
> single file in the package, and you're also not optimizing for the exception
> condition.   You just have appropriate application-level retries in case of a
> crash.

dpkg already marks packages which failed to unpack as such, and as needing
to be reinstalled; it can also recover from such situations by rolling back
to the previous files, which it keeps as backups until it has finished the
current package operation.

The problem is that dpkg needs to guarantee the system is always usable.
When a crash occurs, say while it's unpacking libc, it's not acceptable for
dpkg to skip the fsync() before rename(), as it might end up with an empty
libc.so file even though it may have marked the package as correctly
unpacked (wrongly, but unknowingly, as there are no guarantees), which is
not true until the changes have been fully committed to the file system.

If any file of the many packages required for the system to boot properly,
or for dpkg itself to operate correctly, ends up zero-length, then neither
the user nor the system will be able to recover from the situation. Worse,
this might require recovering from different media, for example, which
end-users should not be required to do, and might not know how to do.

I guess in this regard dpkg is special, and it cannot be compared to
something like firefox fsync()ing too much: if dpkg fails to operate
properly, your entire system might get hosed.

> So Debian and Ubuntu have a choice.  You can just stick with the ext3, and not
> upgrade, but this is one place where you can't blackmail file system developers
> by saying, "if you don't do this, I'll go use some other file system" ---
> because we are *all* doing delayed allocation.   It's allowed by POSIX, and
> it's the only way to get much better file system performance --- and there are
> intelligent ways you can design your applications so the right thing happens on
> a power failure.   Programmers used to be familiar with these in the days
> before ext3, because that's how the world has always worked in Unix.  
> 
> Ext3 has lousy performance precisely because it guaranteed more semantics that
> what was promised by POSIX, and unfortunately, people have gotten flabby
> (think: the humans in the movie Wall-E) and lazy about how to write programs
> that write to the file system defensively.   So if people are upset about the
> performance of ext3, great, upgrade to newer file systems.   But then you will
> need to be careful about how you code applications like dpkg.

The main problem is that doing the right thing (fsync() + rename()) does not
really penalize ext3 users, but it does penalize ext4, which is the one that
really needs it. So we end up with lots of users (mostly from Ubuntu, as the
distribution that has already switched to ext4 as the default) complaining
that the slowdown is unacceptable, and I don't see many options besides
adding a --force-unsafe-io option or similar, which those users would add to
their dpkg.cfg file with the acknowledgment that they might lose data in
case of an abrupt halt.

Something in between that we have talked about is doing fsync() on extracted
files only for a subset of the packages, say only for priority important or
higher, which, besides being the wrong solution, does not cover packages as
important as the kernel or boot loaders. That is obviously better than no
fsync() at all but still not right; this could be added as --force-unsafe-io
and the previous one as --force-unsafer-io, but still.



* Re: [Bug 15910] zero-length files and performance degradation
  2010-05-09 18:19 ` bugzilla-daemon
@ 2010-05-10  2:56   ` tytso
  2010-05-10 14:22     ` Peng Tao
  0 siblings, 1 reply; 15+ messages in thread
From: tytso @ 2010-05-10  2:56 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: linux-ext4

> The problem is, dpkg needs to guarantee the system is always usable,
> and when a crash occurs, say when it's unpacking libc, it's not
> acceptable for dpkg not to fsync() before rename() as it might end
> up with an empty libc.so file, even if it might have marked the
> package as correctly unpacked (wrongly but unknowingly as there's no
> guarantees), which is not true until the changes have been fully
> committed to the file system.

Why not unpack all of the files as "foo.XXXXXX" (where XXXXXX is a
mkstemp filename template), do a sync call (which in Linux is
synchronous and won't return until all the files have been written),
and only then rename the files?  That's going to be the fastest
and most efficient way to guarantee safety under Linux; the downside
is that you need enough free space to store the old and the
new files in the package simultaneously.  But this is also a win,
because it means you don't actually start overwriting files in a
package until you know that the package installation is most likely
going to succeed.  (Well, it could fail in the postinstall script, but
at least you don't have to worry about disk-full errors.)

   	     	   	   	       	    - Ted


* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
                   ` (3 preceding siblings ...)
  2010-05-09 18:19 ` bugzilla-daemon
@ 2010-05-10  3:49 ` bugzilla-daemon
  2010-05-10 14:36 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-10  3:49 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910





--- Comment #5 from Theodore Tso <tytso@mit.edu>  2010-05-10 03:49:25 ---
> The problem is, dpkg needs to guarantee the system is always usable,
> and when a crash occurs, say when it's unpacking libc, it's not
> acceptable for dpkg not to fsync() before rename() as it might end
> up with an empty libc.so file, even if it might have marked the
> package as correctly unpacked (wrongly but unknowingly as there's no
> guarantees), which is not true until the changes have been fully
> committed to the file system.

Why not unpack all of the files as "foo.XXXXXX" (where XXXXXX is a
mkstemp filename template), do a sync call (which in Linux is
synchronous and won't return until all the files have been written),
and only then rename the files?  That's going to be the fastest
and most efficient way to guarantee safety under Linux; the downside
is that you need enough free space to store the old and the
new files in the package simultaneously.  But this is also a win,
because it means you don't actually start overwriting files in a
package until you know that the package installation is most likely
going to succeed.  (Well, it could fail in the postinstall script, but
at least you don't have to worry about disk-full errors.)

                                             - Ted



* Re: [Bug 15910] zero-length files and performance degradation
  2010-05-10  2:56   ` tytso
@ 2010-05-10 14:22     ` Peng Tao
  2010-05-10 14:34       ` tytso
  0 siblings, 1 reply; 15+ messages in thread
From: Peng Tao @ 2010-05-10 14:22 UTC (permalink / raw)
  To: tytso; +Cc: bugzilla-daemon, linux-ext4

On Mon, May 10, 2010 at 10:56 AM,  <tytso@mit.edu> wrote:
>> The problem is, dpkg needs to guarantee the system is always usable,
>> and when a crash occurs, say when it's unpacking libc, it's not
>> acceptable for dpkg not to fsync() before rename() as it might end
>> up with an empty libc.so file, even if it might have marked the
>> package as correctly unpacked (wrongly but unknowingly as there's no
>> guarantees), which is not true until the changes have been fully
>> committed to the file system.
>
> Why not unpack all of the files as "foo.XXXXXX" (where XXXXXX is a
> mkstemp filename template) do a sync call (which in Linux is
> synchronous and won't return until all the files have been written),
> and only then, rename the files?  That's going to be the most fastest
> and most efficient way to guarantee safety under Linux; the downside
> is that you need to have enough free space to store the old and the
> new files in the package simultaneously.  But this also is a win,
> because it means you don't actually start overwriting files in a
> package until you know that the package installation is most likely
> going to succeed.  (Well, it could fail in the postinstall script, but
> at least you don't have to worry about disk full errors.)
What about letting fsync() on a directory recursively fsync() all
files/sub-dirs in that directory?
Then apps could unpack the package into a temp dir, fsync(), and rename.
>
>                                            - Ted



-- 
Thanks,
-Bergwolf
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [Bug 15910] zero-length files and performance degradation
  2010-05-10 14:22     ` Peng Tao
@ 2010-05-10 14:34       ` tytso
  0 siblings, 0 replies; 15+ messages in thread
From: tytso @ 2010-05-10 14:34 UTC (permalink / raw)
  To: Peng Tao; +Cc: bugzilla-daemon, linux-ext4

On Mon, May 10, 2010 at 10:22:47PM +0800, Peng Tao wrote:
> What about letting fsync() on dir recursively fsync() all
> files/sub-dirs in the dir?
> Then apps can unpack package in a temp dir, fsync(), and rename.

There are programs that execute fsync() on a directory, and they do not
expect a recursive fsync() of all files/subdirectories in the directory.

At least on Linux, sync() is synchronous and will do what you want.
There is unfortunately no portable way to do what you want short of
fsync'ing all of the files after they are written.  This case is
mostly optimized under ext3/4 (we could do a bit better for ext4, but
the performance shouldn't be disastrous --- certainly much better than
write a file, fsync, rename the file, repeat).

					- Ted


* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
                   ` (4 preceding siblings ...)
  2010-05-10  3:49 ` bugzilla-daemon
@ 2010-05-10 14:36 ` bugzilla-daemon
  2010-05-10 14:52 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-10 14:36 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910





--- Comment #6 from Peng Tao <bergwolf@gmail.com>  2010-05-10 14:24:44 ---
On Mon, May 10, 2010 at 10:56 AM,  <tytso@mit.edu> wrote:
>> The problem is, dpkg needs to guarantee the system is always usable,
>> and when a crash occurs, say when it's unpacking libc, it's not
>> acceptable for dpkg not to fsync() before rename() as it might end
>> up with an empty libc.so file, even if it might have marked the
>> package as correctly unpacked (wrongly but unknowingly as there's no
>> guarantees), which is not true until the changes have been fully
>> committed to the file system.
>
> Why not unpack all of the files as "foo.XXXXXX" (where XXXXXX is a
> mkstemp filename template) do a sync call (which in Linux is
> synchronous and won't return until all the files have been written),
> and only then, rename the files?  That's going to be the most fastest
> and most efficient way to guarantee safety under Linux; the downside
> is that you need to have enough free space to store the old and the
> new files in the package simultaneously.  But this also is a win,
> because it means you don't actually start overwriting files in a
> package until you know that the package installation is most likely
> going to succeed.  (Well, it could fail in the postinstall script, but
> at least you don't have to worry about disk full errors.)
What about letting fsync() on a directory recursively fsync() all
files/sub-dirs in that directory?
Then apps could unpack the package into a temp dir, fsync(), and rename.
>
>                                            - Ted



-- 
Thanks,
-Bergwolf



* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
                   ` (5 preceding siblings ...)
  2010-05-10 14:36 ` bugzilla-daemon
@ 2010-05-10 14:52 ` bugzilla-daemon
  2010-05-10 17:23 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-10 14:52 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910





--- Comment #7 from Theodore Tso <tytso@mit.edu>  2010-05-10 14:42:13 ---
On Mon, May 10, 2010 at 10:22:47PM +0800, Peng Tao wrote:
> What about letting fsync() on dir recursively fsync() all
> files/sub-dirs in the dir?
> Then apps can unpack package in a temp dir, fsync(), and rename.

There are programs that execute fsync() on a directory, and they do not
expect a recursive fsync() of all files/subdirectories in the directory.

At least on Linux, sync() is synchronous and will do what you want.
There is unfortunately no portable way to do what you want short of
fsync'ing all of the files after they are written.  This case is
mostly optimized under ext3/4 (we could do a bit better for ext4, but
the performance shouldn't be disastrous --- certainly much better than
write a file, fsync, rename the file, repeat).

                    - Ted



* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
                   ` (6 preceding siblings ...)
  2010-05-10 14:52 ` bugzilla-daemon
@ 2010-05-10 17:23 ` bugzilla-daemon
  2010-05-10 22:33 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-10 17:23 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910





--- Comment #8 from Guillem Jover <guillem@hadrons.org>  2010-05-10 17:23:31 ---
(In reply to comment #5)
> Why not unpack all of the files as "foo.XXXXXX" (where XXXXXX is a
> mkstemp filename template) do a sync call (which in Linux is
> synchronous and won't return until all the files have been written),
> and only then, rename the files? That's going to be the most fastest
> and most efficient way to guarantee safety under Linux; the downside
> is that you need to have enough free space to store the old and the
> new files in the package simultaneously. But this also is a win,
> because it means you don't actually start overwriting files in a
> package until you know that the package installation is most likely
> going to succeed.  (Well, it could fail in the postinstall script, but
> at least you don't have to worry about disk full errors.)

Ah, I forgot to mention that we also discussed using sync(), but the
problem, as you say, is that sync() is not portable, so we need the
deferred fsync() + rename() code anyway for unpacked files on non-Linux
systems. Another possible issue is that if there has been lots of I/O in
parallel or just before running dpkg, the sync() might take much longer
than expected, although depending on the implementation fsync() might show
similar slowdowns anyway (not, though, if the other I/O was on a different
"disk" and file system).

Regarding the downsides and wins you mention, they already apply to the
current implementation. As I mentioned before, dpkg has always supported
rolling back, by making a hardlinked backup of the old file as .dpkg-tmp,
extracting the new file as .dpkg-new and then doing an atomic rename() over
the current file, and in case of error (from dpkg itself or the appropriate
maintainer script) restoring all the old file backups for the package
(either in the current run or in a subsequent dpkg run). Only once
the unpack stage has been successful does it remove the backups in one pass.
So the need for rollback already makes dpkg take (approximately) twice the
space per package, and thus there are no unsafe overwrites that cannot be
reverted (except for the zero-length ones).
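The rollback scheme described above can be sketched as follows. The helper names are illustrative, not dpkg's internals; only the .dpkg-tmp backup suffix comes from the description above:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void backup_name(const char *path, char *buf, size_t n)
{
    snprintf(buf, n, "%s.dpkg-tmp", path);
}

/* Before unpacking: keep the old file reachable under a backup name.
 * A hard link shares the inode, so the replacement must later arrive
 * via rename() (a new directory entry), never by truncating the file
 * in place, or the backup would be clobbered too. */
static int backup_old(const char *path)
{
    char bak[4096];
    backup_name(path, bak, sizeof bak);
    if (link(path, bak) != 0 && errno != ENOENT)
        return -1;              /* ENOENT: no old file to back up */
    return 0;
}

/* On error (from dpkg or a maintainer script): restore the old file. */
static int rollback(const char *path)
{
    char bak[4096];
    backup_name(path, bak, sizeof bak);
    return rename(bak, path);   /* atomically puts the old content back */
}

/* After the whole unpack succeeded: drop the backup in one pass. */
static int drop_backup(const char *path)
{
    char bak[4096];
    backup_name(path, bak, sizeof bak);
    return unlink(bak);
}
```

Because the backup is a hard link rather than a copy, it costs a directory entry, not a second copy of the data; the "twice the space" comes from holding both the old inode and the new .dpkg-new file at once.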

I've now added the conditional code for Linux to do the sync() and then
rename() all files in one pass; it's just a few lines of code (thanks to
the deferred fsync() changes which are now in place). I'll request some
testing from ext4 users, and if it improves something and does not make
matters worse on ext3 and other file systems, then I guess we might
use that on Linux. It still looks like a workaround to me.

As a side remark, I don't think it's fair, though, that you complain about
application developers not doing the right thing when, at the same time,
you expect them not to use the proper portable tool for the job, and you
seem to see no problem in that tool implying a performance penalty on a
file system that really needs it. That there are lots of users willing
to sacrifice safety for performance tells me the penalty is significant
enough. Isn't there anything that could be improved to make the correct
fsync()+rename() case a bit faster? In this particular case those calls are
already batched after all writes have been performed.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
                   ` (7 preceding siblings ...)
  2010-05-10 17:23 ` bugzilla-daemon
@ 2010-05-10 22:33 ` bugzilla-daemon
  2011-03-07  0:30 ` bugzilla-daemon
  2011-03-09 19:09 ` bugzilla-daemon
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2010-05-10 22:33 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910





--- Comment #9 from Theodore Tso <tytso@mit.edu>  2010-05-10 22:33:09 ---
I'll grant that using sync(2) is non-portable, but relying on _not_ needing an
fsync(2) at all was also just as non-portable, if not worse (it only really
worked on ext3, and no other file system, and of course only on Linux).

Trying to make (fsync, rename)**N --- that is, alternating fsync and rename
calls --- fast is always going to be difficult for nearly all file systems.
The fundamental problem is that file systems are optimized for throughput
when you're _not_ calling fsync all the time; that's a very different sort of
thing from what databases need to do --- and databases generally solve the
problem by having _two_ logs, a redo and an undo log.  I don't know of any
filesystem which has that kind of complexity, and so pretty much any
filesystem where you have a series of interleaved fsync() and rename() calls
is going to run into pain.   Some filesystems will be better at it than
others, but it's always going to be faster to write all the files, do a
single sync, and then do all of the renames.   Yes, that's non-portable; the
problem is that the only synchronization primitive which POSIX gives us is
fsync(), and so we just don't have a lot of options in terms of what we
communicate between userspace and the kernel.
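For reference, the portable per-file pattern being contrasted here looks like the sketch below (the function name and .tmp suffix are illustrative): write the new content under a temporary name, fsync() it, then rename() over the target. It is safe on any POSIX filesystem, but pays one synchronous flush per file, which is exactly the cost the batched sync() approach avoids.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Atomically replace `path` with `data`, surviving a crash at any point:
 * either the old content or the complete new content is visible,
 * never a zero-length file. */
static int replace_file_safely(const char *path, const char *data)
{
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp", path);

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    ssize_t len = (ssize_t)strlen(data);
    if (write(fd, data, len) != len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }
    /* The data is durable before the rename makes it visible. */
    return rename(tmp, path);
}
```

Doing this in a loop over hundreds of package files serializes each flush behind the previous one, which is why the thread converges on write-everything, one sync(), then rename-everything for the Linux case.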

One of the things I wonder about is why users' systems are crashing so often
that this is a problem.  I can't remember the last time I've had a system
crash while I've been doing an "apt-get dist-upgrade" or "apt-get upgrade".
Is this a common problem or an uncommon problem?    And if it's not that
common (and I hope it isn't, but maybe Ubuntu is shipping too many unstable,
crappy binary device drivers), maybe the right answer is to have rescue CDs
or rescue partitions which will automatically repair a damaged libc package
if the system just happened to crash while upgrading glibc.   Again, let's
optimize for the common case; I hope we haven't entered the Windows world
where blue screens of death are so common that that is the case we have to
optimize for....


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
                   ` (8 preceding siblings ...)
  2010-05-10 22:33 ` bugzilla-daemon
@ 2011-03-07  0:30 ` bugzilla-daemon
  2011-03-09 19:09 ` bugzilla-daemon
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2011-03-07  0:30 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910


Phillip Susi <psusi@cfl.rr.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |psusi@cfl.rr.com




--- Comment #10 from Phillip Susi <psusi@cfl.rr.com>  2011-03-07 00:30:36 ---
Since it was decided that this is not a bug in the kernel, shouldn't this
report be closed?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug 15910] zero-length files and performance degradation
  2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
                   ` (9 preceding siblings ...)
  2011-03-07  0:30 ` bugzilla-daemon
@ 2011-03-09 19:09 ` bugzilla-daemon
  10 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2011-03-09 19:09 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=15910


Theodore Tso <tytso@mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID




--- Comment #11 from Theodore Tso <tytso@mit.edu>  2011-03-09 19:09:22 ---
Agreed, closing.


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2011-03-09 19:10 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-05 13:49 [Bug 15910] New: zero-length files and performance degradation bugzilla-daemon
2010-05-05 18:54 ` [Bug 15910] " bugzilla-daemon
2010-05-06  4:06 ` bugzilla-daemon
2010-05-06  4:18 ` bugzilla-daemon
2010-05-09 18:19 ` bugzilla-daemon
2010-05-10  2:56   ` tytso
2010-05-10 14:22     ` Peng Tao
2010-05-10 14:34       ` tytso
2010-05-10  3:49 ` bugzilla-daemon
2010-05-10 14:36 ` bugzilla-daemon
2010-05-10 14:52 ` bugzilla-daemon
2010-05-10 17:23 ` bugzilla-daemon
2010-05-10 22:33 ` bugzilla-daemon
2011-03-07  0:30 ` bugzilla-daemon
2011-03-09 19:09 ` bugzilla-daemon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.