From: Andy Isaacson <adi@hexapodia.org>
To: linux-kernel@vger.kernel.org
Subject: sparse file performance (was Re: Is there a "make hole" (truncate in middle) syscall?)
Date: Fri, 5 Dec 2003 15:00:08 -0600
Message-ID: <20031205150008.B14054@hexapodia.org>
In-Reply-To: <20031204172348.A14054@hexapodia.org>; from adi@hexapodia.org on Thu, Dec 04, 2003 at 05:23:48PM -0600

On Thu, Dec 04, 2003 at 05:23:48PM -0600, Andy Isaacson wrote:
> On Thu, Dec 04, 2003 at 02:32:23PM -0600, Rob Landley wrote:
> > What are the downsides of holes?  (How big do they have to be to
> > actually save space, is there a performance penalty to having a file
> > with 1000 4k holes in it, etc...)
> 
> It's filesystem-dependent; some filesystems don't implement sparse
> files.  The lower bound is one block; on extents-based filesystems like
> XFS it might be bigger.  (If you've got 1GB of data, then a 1MB block of
> zeros, then another GB of data, you're probably better off allocating a
> single 2GB extent rather than two smaller extents with a hole.)
> 
> There's no inherent downside to holey files; in fact they can be a
> straight-up performance win -- that's a block that doesn't need to be
> read from disk, just hand the user a COW pointer to your zero page.  And
> if you're lucky and the preceding and following blocks are allocated
> adjacent on disk, you can do it all as a single streaming IO.

I got curious enough to run some tests, and was surprised at the results.
My machine (Athlon XP 2400+, 2030 MHz, 512 MB, KT400, 2.4.22) can read
out of buffer cache at 234 MB/s, and off of its IDE disk at 40 MB/s.
I'd assumed that read(2)ing a holey file would go faster than reading
out of buffer cache; in theory you could do it completely in L1 cache
(with a 4KB buffer, it's just a ton of syscalls, some page table
manipulation, and a bunch of memcpy() out of a single zero page).  But
it turns out that reading a hole is *slower* than reading data from
buffer cache, just 195 MB/s.

200 MB file       234 MB/s  (with warm caches)
1 GB file          40 MB/s  (exceeds physical memory)
1 GB sparse file  195 MB/s
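
As a rough way to read those figures, they can be converted into time
per read() call.  The sketch below assumes the 4KB buffer used by the
test program and 1 MB = 1e6 bytes (as in its own MB/sec calculation);
the function names are made up for illustration.

```c
#include <stdio.h>

/* Microseconds per read() call at a given throughput.
 * mb_per_s uses 1 MB = 1e6 bytes, matching the test program's
 * "c / (t2-t1) / 1e6" calculation. */
double usec_per_read(double mb_per_s, double buf_bytes)
{
	double reads_per_s = mb_per_s * 1e6 / buf_bytes;

	return 1e6 / reads_per_s;
}

/* Print the per-call cost for the three measurements above. */
void print_per_read_costs(void)
{
	printf("warm cache:  %.1f us per 4KB read\n", usec_per_read(234, 4096));
	printf("from disk:   %.1f us per 4KB read\n", usec_per_read(40, 4096));
	printf("sparse file: %.1f us per 4KB read\n", usec_per_read(195, 4096));
}
```

By this arithmetic a 4KB read of a hole costs about 21.0 us against
about 17.5 us from buffer cache, so the gap is roughly 3.5 us per call.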

The 1GB sparse file was created with "dd if=file of=1gsparse bs=1M
count=1 seek=1023"; the filesystem is ext3.
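
For anyone wanting to reproduce this without dd, a minimal sketch of
the same idea in C follows: seek past the hole, write the data, then
compare apparent size with allocated space.  The names make_sparse()
and sparse_usage() are hypothetical, the data written is filler rather
than a copy of "file", and the 512-byte unit for st_blocks is the Linux
convention rather than a portable guarantee.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path` with a hole of hole_bytes followed by data_bytes of
 * filler data.  Returns 0 on success, -1 on error with errno set. */
int make_sparse(const char *path, off_t hole_bytes, size_t data_bytes)
{
	char buf[4096];
	size_t left = data_bytes;
	int saved;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd == -1)
		return -1;
	memset(buf, 'x', sizeof(buf));
	/* Seeking past EOF allocates nothing; on filesystems that
	 * support sparse files the skipped range becomes a hole. */
	if (lseek(fd, hole_bytes, SEEK_SET) == (off_t)-1)
		goto fail;
	while (left > 0) {
		size_t want = left < sizeof(buf) ? left : sizeof(buf);
		ssize_t n = write(fd, buf, want);

		if (n <= 0)
			goto fail;
		left -= (size_t)n;
	}
	return close(fd);
fail:
	saved = errno;		/* close() may overwrite errno */
	close(fd);
	errno = saved;
	return -1;
}

/* Report apparent size and allocated bytes for `path`.
 * Assumes st_blocks is in 512-byte units, as on Linux. */
int sparse_usage(const char *path, long long *apparent, long long *allocated)
{
	struct stat st;

	if (stat(path, &st) == -1)
		return -1;
	*apparent = (long long)st.st_size;
	*allocated = (long long)st.st_blocks * 512;
	return 0;
}
```

Calling make_sparse("1gsparse", (off_t)1023 << 20, 1 << 20) gives a file
with the same layout as the dd command above; if the filesystem stores
the hole sparsely, sparse_usage() should report far fewer allocated
bytes than the 1GB apparent size.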

Here's 'vmstat 5' while reading the 200MB file in a loop:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa

 1  0  50968   4468   4872 410424    0    0     0     9  102    46 62 38  0  0
 1  0  50968   4448   4892 410424    0    0     0     6  101    41 62 38  0  0
 1  0  50968   4428   4912 410424    0    0     0     6  101    40 62 38  0  0
 1  0  50968   4404   4936 410424    0    0     0     6  101    37 61 39  0  0
 1  0  50968   4384   4956 410424    0    0     0     8  105   117 60 40  0  0
 1  0  50968   4484   4984 410296    0    0     0     9  103    81 62 38  0  0

Here's 'vmstat 5' while reading the 1GB sparse file in a loop:

 1  0  55448   4460   2464 417320    0    0   217     6  144  3117 45 49  6  0
 1  0  55448   4444   2480 417304    0    0   219     6  204  3237 50 44  6  0
 1  0  55448   4444   2488 417288    0    0   218     9  181  3200 49 45  6  0
 1  0  55460   4456   2468 417140   30    0   249     6  182  3193 46 48  6  0
 1  0  55460   4396   2484 417300    0    2   220    12  140  3084 46 48  6  0
 1  0  55460   4356   2464 417360    0    0   216     2  145  3101 47 48  6  0

The code is simply doing

        while((n = read(fd, buf, sizeof(buf))) > 0) {
                c += n;
                for(i=0; i < n; i++) {
                        hist[(unsigned char)buf[i]]++;
                }
        }

compiled with gcc 3.3.2 -O2.

Code appended.

-andy

#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>

#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/time.h>
#include <fcntl.h>
#include <ctype.h>

static void
die(char *fmt, ...)
{
	va_list a;
	va_start(a, fmt);
	vfprintf(stderr, fmt, a);
	va_end(a);
	exit(1);
}

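/* The first call latches the start time and returns 0; each later
 * call returns the seconds elapsed since the previous call. */
double tod(void)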
{
	static struct timeval tv1;
	struct timeval tv2;
	double r;

	if(tv1.tv_sec == 0) {
		gettimeofday(&tv1, 0);
		return 0;
	}
	gettimeofday(&tv2, 0);
	r = (tv2.tv_sec - tv1.tv_sec) + (tv2.tv_usec - tv1.tv_usec) / 1e6;
	memcpy(&tv1, &tv2, sizeof(tv1));
	return r;
}

int main(int argc, char **argv)
{
	char buf[4096];
	int fd, i, n, m;
	long long c = 0;
	double t1, t2;
	int hist[256] = { 0 };
	unsigned char *p = (unsigned char *)buf;	/* index hist[] with unsigned bytes */

	if(argc != 2) die("usage: %s file\n", argv[0]);

	if((fd = open(argv[1], O_RDONLY)) == -1)
		die("%s: %s\n", argv[1], strerror(errno));

	t1 = tod();
	while((n = read(fd, buf, sizeof(buf))) > 0) {
		c += n;
		for(i=0; i < n; i++) {
			hist[p[i]]++;
		}
	}
	t2 = tod();
	if(n == -1) die("read: %s\n", strerror(errno));

	m = 0;
	for(i=1; i<256; i++)
		if(hist[i] > hist[m]) m = i;
	printf("%lld characters read, mode at %d '%c' with %d\n",
			c, m, isprint(m) ? m : '?', hist[m]);
	printf("%f seconds, %f MB/sec\n", t2-t1, c / (t2-t1) / 1e6);
	return 0;
}
