linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@debian.org>
To: Erez Zadok <ezk@cs.sunysb.edu>
Cc: "Jörn Engel" <joern@wohnheim.fh-wedel.de>,
	"Phillip Lougher" <phillip@lougher.demon.co.uk>,
	"Kallol Biswas" <kbiswas@neoscale.com>,
	linux-kernel@vger.kernel.org,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: partially encrypted filesystem
Date: Fri, 5 Dec 2003 19:14:47 +0000	[thread overview]
Message-ID: <20031205191447.GC29469@parcelfarce.linux.theplanet.co.uk> (raw)
In-Reply-To: <200312051616.hB5GGpef027492@agora.fsl.cs.sunysb.edu>

On Fri, Dec 05, 2003 at 11:16:51AM -0500, Erez Zadok wrote:
> We compress each chunk separately; currently chunk==PAGE_CACHE_SIZE.  For
> each file foo we keep an index file foo.idx that records the offsets in the
> main file of where you might find the decompressed data for page N.  Then we
> hook it up into the page read/write ops of the VFS.  It works great for the
> most common file access patterns: small files, sequential/random reads, and
> sequential writes.  But, it works poorly for random writes into large files,
> b/c we have to decompress and re-compress the data past the point of
> writing.  Our paper provides a lot of benchmarks results showing performance
> and resulting space consumption under various scenarios.
> 
> We've got some ideas on how to improve performance for writes-in-the-middle,
> but they may hurt performance for common cases.  Essentially we have to go
> for some sort of O(log n)-like data structure, which'd make random writes
> much better.  But since it may hurt performance for other access patterns,
> we've been thinking about some way to support both modes and be able to
> switch b/t the two modes on the fly (or at least let users "mark" a file as
> one for which you'd expect a lot of random writes to happen).
> 
> If anyone has some comments or suggestions, we'd love to hear them.

Sure.  I've described it before on this list, but here goes:

What Acorn did for their RISCiX product (4.3BSD based, ran on an ARM box
in late 80s/early 90s) was compress each 32k page individually and write
it to a 1k block size filesystem (discs were around 50MB at the time,
1k was the right size).  This left the file full of holes, and wasted
on average around 512/32k = 1/64 of the compression that could have been
attained, but it was very quick to seek to the right place.

Now 4k block size filesystems are the rule, and page size is also
4k so you'd need to be much more clever to achieve the same effect.
Compressing 256k chunks at a time would give you the same wastage, but
I don't think Linux has very good support for filesystems that want to
drop 64 pages into the page cache when the VM/VFS only asked for one.

If it did, that would allow ext2/3 to grow block sizes beyond the current
4k limit on i386, which would be a good thing to do.  Or perhaps we just
need to bite the bullet and increase PAGE_CACHE_SIZE to something bigger,
like 64k.  People are going to want that on 32-bit systems soon anyway.

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain

  reply	other threads:[~2003-12-05 19:15 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-12-03 21:07 partially encrypted filesystem Kallol Biswas
2003-12-03 21:44 ` Richard B. Johnson
2003-12-03 23:20   ` bill davidsen
2003-12-03 21:44 ` Jörn Engel
2003-12-04  0:08   ` Linus Torvalds
2003-12-04  1:25     ` Jeff Garzik
2003-12-04  2:08       ` Linus Torvalds
2003-12-04  3:59       ` H. Peter Anvin
2003-12-04  2:37     ` Charles Manning
2003-12-04 14:17     ` Jörn Engel
2003-12-04 15:20       ` Linus Torvalds
2003-12-04 16:07         ` Phillip Lougher
2003-12-04 17:26         ` Jörn Engel
2003-12-04 18:20           ` Phillip Lougher
2003-12-04 18:40             ` Jörn Engel
2003-12-04 19:41             ` Erez Zadok
2003-12-05 11:20               ` Jörn Engel
2003-12-05 16:16                 ` Erez Zadok
2003-12-05 19:14                   ` Matthew Wilcox [this message]
2003-12-05 19:47                     ` Erez Zadok
2003-12-05 20:28                       ` Matthew Wilcox
2003-12-05 21:38                         ` Pat LaVarre
2003-12-06  0:15                         ` Maciej Zenczykowski
2003-12-06  1:35                           ` Pat LaVarre
2003-12-06  2:39                             ` Valdis.Kletnieks
2003-12-06 11:43                             ` Maciej Zenczykowski
2003-12-07  0:04                               ` Shaya Potter
2003-12-08 14:08                               ` Jörn Engel
2003-12-06  0:50                         ` Phillip Lougher
2003-12-08 11:37                           ` David Woodhouse
2003-12-08 13:44                             ` phillip
2003-12-08 14:07                               ` David Woodhouse
2003-12-10  1:16                               ` [OT?]Re: " Charles Manning
2003-12-10 17:45                                 ` Phillip Lougher
2003-12-09 23:40                             ` Pat LaVarre
2003-12-10  0:07                             ` Pavel Machek
2003-12-10  1:28                               ` Pat LaVarre
2003-12-10  2:13                               ` Charles Manning
2003-12-05 19:58                     ` Pat LaVarre
2003-12-08 11:28             ` David Woodhouse
2003-12-08 13:49               ` phillip
2003-12-04 19:18           ` David Wagner
2003-12-05 13:02             ` Jörn Engel
2003-12-05 17:28               ` Frank v Waveren
2003-12-05 23:59               ` David Wagner
2003-12-19 15:01     ` Rik van Riel
2003-12-04  3:10 ` Valdis.Kletnieks
2003-12-04 18:16 ` Hans Reiser
2003-12-06 19:56 Pat LaVarre
2003-12-06 22:07 ` Maciej Zenczykowski
2003-12-10  3:22 Valient Gough

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20031205191447.GC29469@parcelfarce.linux.theplanet.co.uk \
    --to=willy@debian.org \
    --cc=ezk@cs.sunysb.edu \
    --cc=joern@wohnheim.fh-wedel.de \
    --cc=kbiswas@neoscale.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=phillip@lougher.demon.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).