From: Andy Isaacson <adi@hexapodia.org>
To: linux-kernel@vger.kernel.org
Subject: sparse file performance (was Re: Is there a "make hole" (truncate in middle) syscall?)
Date: Fri, 5 Dec 2003 15:00:08 -0600
Message-ID: <20031205150008.B14054@hexapodia.org>
In-Reply-To: <20031204172348.A14054@hexapodia.org>; from adi@hexapodia.org on Thu, Dec 04, 2003 at 05:23:48PM -0600
On Thu, Dec 04, 2003 at 05:23:48PM -0600, Andy Isaacson wrote:
> On Thu, Dec 04, 2003 at 02:32:23PM -0600, Rob Landley wrote:
> > What are the downsides of holes? (How big do they have to be to
> > actually save space, is there a performance penalty to having a file
> > with 1000 4k holes in it, etc...)
>
> It's filesystem-dependent; some filesystems don't implement sparse
> files. The lower bound is one block; on extents-based filesystems like
> XFS it might be bigger. (If you've got 1GB of data, then a 1MB block of
> zeros, then another GB of data, you're probably better off allocating a
> single 2GB extent rather than two smaller extents with a hole.)
>
> There's no inherent downside to holey files; in fact they can be a
> straight-up performance win -- that's a block that doesn't need to be
> read from disk, just hand the user a COW pointer to your zero page. And
> if you're lucky and the preceding and following blocks are allocated
> adjacent on disk, you can do it all as a single streaming IO.
I got curious enough to run some tests, and was surprised at the results.
My machine (Athlon XP 2400+, 2030 MHz, 512 MB, KT400, 2.4.22) can read
out of buffer cache at 234 MB/s, and off of its IDE disk at 40 MB/s.
I'd assumed that read(2)ing a holey file would go faster than reading
out of buffer cache; in theory you could do it completely in L1 cache
(with a 4KB buffer, it's just a ton of syscalls, some page table
manipulation, and a bunch of memcpy() out of a single zero page). But
it turns out that reading a hole is *slower* than reading data from
buffer cache, just 195 MB/s.
    200 MB file         234 MB/s   (with warm caches)
    1 GB file            40 MB/s   (exceeds physical memory)
    1 GB sparse file    195 MB/s
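As a rough cross-check, the throughput figures can be turned into a cost per read() call. This is only back-of-the-envelope arithmetic, assuming the 4096-byte buffer used by the test program and MB = 10^6 bytes (which is how the program reports it). The helper names are made up for illustration, and the per-call time includes the histogram loop, not just the syscall.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read() calls per second needed to sustain
 * mb_per_sec when each call returns buf_bytes (MB = 1e6 bytes). */
double calls_per_sec(double mb_per_sec, double buf_bytes)
{
    assert(buf_bytes > 0);
    return mb_per_sec * 1e6 / buf_bytes;
}

/* Hypothetical helper: average wall-clock microseconds per call. */
double usec_per_call(double mb_per_sec, double buf_bytes)
{
    return 1e6 / calls_per_sec(mb_per_sec, buf_bytes);
}

/* Print one summary line for a measured throughput, 4096-byte reads. */
void report(const char *label, double mb_per_sec)
{
    printf("%-14s %4.0f MB/s  %7.0f calls/s  %5.1f us/call\n",
           label, mb_per_sec,
           calls_per_sec(mb_per_sec, 4096),
           usec_per_call(mb_per_sec, 4096));
}
```

Calling report("buffer cache", 234) and report("sparse file", 195) works out to roughly 57,100 versus 47,600 calls per second, or about 17.5 us versus 21.0 us per 4 KB read. So each page read from the hole costs on the order of 3.5 us more than a page read from buffer cache on this machine.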
The 1 GB sparse file was created with

    dd if=file of=1gsparse bs=1M count=1 seek=1023

and the filesystem is ext3.
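For anyone who wants to build the same kind of file without dd, here is a minimal sketch in C: seek past the end of an empty file, then write one chunk of data. The function names are hypothetical, and the data written is just filler bytes rather than the contents of "file" as in the dd command. The second helper reports how much space is really allocated, which is a quick way to check that the filesystem actually left a hole.

```c
#define _FILE_OFFSET_BITS 64   /* allow large offsets on 32-bit systems */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` as a hole of hole_bytes followed
 * by data_bytes of filler data. Returns 0 on success, -1 on error. */
int make_sparse(const char *path, off_t hole_bytes, size_t data_bytes)
{
    char *data;
    ssize_t w;
    int fd;

    assert(data_bytes > 0);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    /* Seeking past EOF allocates nothing; the skipped range is the hole. */
    if (lseek(fd, hole_bytes, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    data = malloc(data_bytes);
    if (data == NULL) {
        close(fd);
        return -1;
    }
    memset(data, 'x', data_bytes);
    w = write(fd, data, data_bytes);   /* a short write counts as failure */
    free(data);

    if (close(fd) == -1 || w != (ssize_t)data_bytes)
        return -1;
    return 0;
}

/* Hypothetical helper: bytes actually allocated to the file.
 * st_blocks is counted in 512-byte units. Returns -1 on error. */
long long allocated_bytes(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    return (long long)st.st_blocks * 512;
}
```

A call like make_sparse("1gsparse", (off_t)1023 << 20, 1 << 20) should give a file of the same shape as the dd command. On a filesystem that supports holes, allocated_bytes() should then come out close to 1 MB rather than 1 GB; if it reports the full size, the test is not measuring hole reads at all.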
Here's 'vmstat 5' while reading the 200MB file in a loop:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  0  50968   4468   4872 410424    0    0     0     9  102    46 62 38  0  0
 1  0  50968   4448   4892 410424    0    0     0     6  101    41 62 38  0  0
 1  0  50968   4428   4912 410424    0    0     0     6  101    40 62 38  0  0
 1  0  50968   4404   4936 410424    0    0     0     6  101    37 61 39  0  0
 1  0  50968   4384   4956 410424    0    0     0     8  105   117 60 40  0  0
 1  0  50968   4484   4984 410296    0    0     0     9  103    81 62 38  0  0
Here's 'vmstat 5' while reading the 1GB sparse file in a loop:
 1  0  55448   4460   2464 417320    0    0   217     6  144  3117 45 49  6  0
 1  0  55448   4444   2480 417304    0    0   219     6  204  3237 50 44  6  0
 1  0  55448   4444   2488 417288    0    0   218     9  181  3200 49 45  6  0
 1  0  55460   4456   2468 417140   30    0   249     6  182  3193 46 48  6  0
 1  0  55460   4396   2484 417300    0    2   220    12  140  3084 46 48  6  0
 1  0  55460   4356   2464 417360    0    0   216     2  145  3101 47 48  6  0
The code is simply doing
    while((n = read(fd, buf, sizeof(buf))) > 0) {
        c += n;
        for(i=0; i < n; i++) {
            /* index as unsigned so bytes >= 0x80 don't go negative */
            hist[(unsigned char)buf[i]]++;
        }
    }
compiled with gcc 3.3.2 -O2.
Code appended.
-andy
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/time.h>
#include <fcntl.h>
#include <ctype.h>

/* Print a formatted message to stderr and exit. */
static void
die(const char *fmt, ...)
{
    va_list a;

    va_start(a, fmt);
    vfprintf(stderr, fmt, a);
    va_end(a);
    exit(1);
}

/* The first call starts the clock and returns 0; each later call
 * returns the seconds elapsed since the previous call. */
double tod(void)
{
    static struct timeval tv1;
    struct timeval tv2;
    double r;

    if(tv1.tv_sec == 0) {
        gettimeofday(&tv1, 0);
        return 0;
    }
    gettimeofday(&tv2, 0);
    r = (tv2.tv_sec - tv1.tv_sec) + (tv2.tv_usec - tv1.tv_usec) / 1e6;
    memcpy(&tv1, &tv2, sizeof(tv1));
    return r;
}

int main(int argc, char **argv)
{
    char buf[4096];
    int fd, i, n, m;
    long long c = 0;            /* total bytes read */
    double t1, t2;
    int hist[256] = { 0 };      /* count of each byte value */
    /* view the buffer as unsigned so it can index hist[] */
    unsigned char *p = (unsigned char *)buf;

    if(argc != 2) die("usage: %s file\n", argv[0]);
    if((fd = open(argv[1], O_RDONLY)) == -1)
        die("%s: %s\n", argv[1], strerror(errno));

    t1 = tod();
    while((n = read(fd, buf, sizeof(buf))) > 0) {
        c += n;
        for(i=0; i < n; i++) {
            hist[p[i]]++;
        }
    }
    t2 = tod();
    if(n == -1) die("read: %s\n", strerror(errno));

    /* find the most frequent byte value */
    m = 0;
    for(i=1; i<256; i++)
        if(hist[i] > hist[m]) m = i;

    printf("%lld characters read, mode at %d '%c' with %d\n",
           c, m, isprint(m) ? m : '?', hist[m]);
    printf("%f seconds, %f MB/sec\n", t2-t1, c / (t2-t1) / 1e6);
    return 0;
}
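One thing the program above cannot separate is how much of the time goes to read() itself and how much to the histogram loop (the vmstat output shows around 60% user time for the cached file). Below is a minimal sketch of a variant that times only the read() calls and never touches the data. It assumes clock_gettime(CLOCK_MONOTONIC) is available; the function name and signature are hypothetical and not part of the test reported above.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: read `path` to EOF in 4096-byte chunks without
 * looking at the data. Stores the byte count in *bytes and the elapsed
 * wall-clock seconds in *secs. Returns 0 on success, -1 on error. */
int time_reads(const char *path, long long *bytes, double *secs)
{
    char buf[4096];
    struct timespec a, b;
    long long c = 0;
    ssize_t n;
    int fd;

    assert(bytes != NULL && secs != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &a);
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        c += n;
    clock_gettime(CLOCK_MONOTONIC, &b);
    close(fd);

    if (n == -1)
        return -1;
    *bytes = c;
    *secs = (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
    return 0;
}
```

Throughput in MB/s is then bytes / secs / 1e6, as in the original program. Comparing that figure with the histogram version on the same file should give a rough idea of how much of the measured rate is the loop and how much is the kernel.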