From: Theodore Ts'o <tytso@mit.edu>
To: Jim Meyering <jim@meyering.net>
Cc: "Andreas Dilger" <adilger@dilger.ca>,
"Niklas Hambüchen" <niklas@nh2.me>,
"Linux FS-devel Mailing List" <linux-fsdevel@vger.kernel.org>,
"Paul Eggert" <eggert@cs.ucla.edu>,
"Pádraig Brady" <P@draigbrady.com>
Subject: Re: O(n^2) deletion performance
Date: Thu, 4 Jan 2018 02:16:55 -0500 [thread overview]
Message-ID: <20180104071655.GG23371@thunk.org> (raw)
In-Reply-To: <CA+8g5KFGDG6=R3vY4jvWuSZjap4sCxrSnK1sfTsPnjjJzbYEQw@mail.gmail.com>
On Wed, Jan 03, 2018 at 08:16:58PM -0800, Jim Meyering wrote:
>
> Still wondering how this happened... deliberate optimization for
> something else, probably.
> And wishing I'd written a relative (not absolute) test for it in 2008,
> so I would have noticed sooner.
> In 2008 when I wrote this coreutils extN performance test:
>
> https://git.savannah.gnu.org/cgit/coreutils.git/tree/tests/rm/ext3-perf.sh
>
> there was no O(N^2) or even "just" O(N^1.5) component when using the
> then-just-improved rm. Many of us plotted the curves.
How high (how many files) did you plot the curves back in 2008? Do
you remember?
> Any idea when ext4's unlink became more expensive?
Ext4's unlink hasn't really changed. What *has* changed over time is
the block and inode allocation algorithms. 2008 is before we
optimized allocation algorithms to read/write operations to large
files more efficient, and to try to avoid free space fragmention as
the file system aged.
Given that on large / fast NVMe device everything looks linear, it's
pretty clear that what you're seeing is essentially seek time on
spinning rust, and probably SSD GC overhead on flash devices.
> I've just run a test on the spinning-disk file system mentioned above,
> and it took 75 minutes to delete 12.8M entries. That's rather nasty.
Yes, but how long did it take to *create* the 12.8M entries? My guess
is that it was roughly comparable?
- Ted
next prev parent reply other threads:[~2018-01-04 7:17 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-02 0:21 O(n^2) deletion performance Niklas Hambüchen
2018-01-02 1:20 ` Niklas Hambüchen
2018-01-02 1:59 ` Theodore Ts'o
2018-01-02 2:49 ` Andreas Dilger
2018-01-02 4:27 ` Jim Meyering
2018-01-02 6:22 ` Theodore Ts'o
2018-01-04 4:16 ` Jim Meyering
2018-01-04 7:16 ` Theodore Ts'o [this message]
2018-01-04 11:42 ` Dave Chinner
2018-01-02 4:33 ` Theodore Ts'o
2018-01-02 4:54 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180104071655.GG23371@thunk.org \
--to=tytso@mit.edu \
--cc=P@draigbrady.com \
--cc=adilger@dilger.ca \
--cc=eggert@cs.ucla.edu \
--cc=jim@meyering.net \
--cc=linux-fsdevel@vger.kernel.org \
--cc=niklas@nh2.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).