linux-kernel.vger.kernel.org archive mirror
* CacheFS
@ 2001-06-07 11:37 Jan Kasprzak
  2001-06-07 15:44 ` CacheFS Jan Harkes
  2001-06-07 22:23 ` CacheFS Albert D. Cahalan
  0 siblings, 2 replies; 7+ messages in thread
From: Jan Kasprzak @ 2001-06-07 11:37 UTC (permalink / raw)
  To: linux-kernel; +Cc: xgajda, kron

	Hello,

	a friend of mine has developed a CacheFS for Linux. His work
is a prototype read-only implementation for Linux 2.2.11 or so. I am
thinking about adapting (or partly rewriting) his work for Linux 2.4.
But before I start working, I'd like to ask you for comments on the
proposed CacheFS architecture.

	The goal is to speed up reading of potentially slow filesystems
(NFS, maybe even CD-based ones) with a local on-disk cache, in the same
way IRIX or Solaris CacheFS works. I would expect this to be used on
clusters of computers or university computer labs with NFS-mounted /usr
or some other read-only filesystems. Another goal is to use a Linux
filesystem as the backing store (as opposed to the block device or
single large file used by Coda).

	The CacheFS architecture would consist of two components:
- a kernel module, implementing a filesystem of type "cachefs"
	and a character device /dev/cachefs
- a user-space daemon (cachefsd), which would communicate with the kernel
	over /dev/cachefs and manage the backing store in a given directory.

	Every file on the front filesystem (NFS or the like) would be cached
in two local files by cachefsd: the first would contain (parts of) the
real file content, and the second would contain the file's metadata and a
bitmap of the valid blocks (or pages) of the first file. All files in
cachefsd's backing store would live in a per-volume directory and would be
named after the inode number from the front filesystem.
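
	As a sketch of what the second (metadata) file could hold - the
layout and all names below are invented for illustration, assuming one
bit per page of the content file:

    #include <stdint.h>
    #include <sys/stat.h>

    /* Hypothetical on-disk header of the per-file ".attr" companion.
     * A bitmap follows the header; bit N set means page N of the
     * content file holds valid data. */
    struct cachefs_attr {
            struct stat front_stat;   /* stat(2) data from the front fs  */
            uint32_t    bitmap_bytes; /* size of the bitmap that follows */
            /* unsigned char bitmap[]; */
    };

    static int cachefs_page_valid(const unsigned char *bitmap,
                                  uint64_t page)
    {
            return bitmap[page / 8] & (1 << (page % 8));
    }

(Serializing a raw struct stat is of course not portable between
architectures; a real format would use fixed-width fields.)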

	Now here are some questions:

* Should cachefsd be in user space (as it is in the prototype
implementation), or should it be moved into kernel space? The former
probably allows better configuration (maybe a deeper directory structure
in the backing store), but the latter is faster, as it avoids copying
data between user and kernel space.

* Would you suggest having only one instance of cachefsd, or one per
	volume, or a multithreaded implementation with a configurable
	number of threads?

* Is communication over a character device appropriate? Another
	alternative would be a new syscall, a /proc file, or maybe even
	an ioctl on the root directory of the filesystem (ugh!).

* Can the kernel part of Coda be used for this?

	Thanks,

-Yenya

-- 
\ Jan "Yenya" Kasprzak <kas at fi.muni.cz>       http://www.fi.muni.cz/~kas/
\\ PGP: finger kas at aisa.fi.muni.cz   0D99A7FB206605D7 8B35FCDE05B18A5E //
\\\             Czech Linux Homepage:  http://www.linux.cz/              ///
It is a very bad idea to feed negative numbers to memcpy.         --Alan Cox


* Re: CacheFS
  2001-06-07 11:37 CacheFS Jan Kasprzak
@ 2001-06-07 15:44 ` Jan Harkes
  2001-06-08 11:48   ` CacheFS Jan Kasprzak
  2001-06-09  8:29   ` CacheFS Pavel Machek
  2001-06-07 22:23 ` CacheFS Albert D. Cahalan
  1 sibling, 2 replies; 7+ messages in thread
From: Jan Harkes @ 2001-06-07 15:44 UTC (permalink / raw)
  To: Jan Kasprzak; +Cc: linux-kernel, xgajda, kron

On Thu, Jun 07, 2001 at 01:37:50PM +0200, Jan Kasprzak wrote:
> 	The goal is to speed up reading of potentially slow filesystems
> (NFS, maybe even CD-based ones) with a local on-disk cache, in the same
> way IRIX or Solaris CacheFS works. I would expect this to be used on
> clusters of computers or university computer labs with NFS-mounted /usr
> or some other read-only filesystems. Another goal is to use a Linux
> filesystem as the backing store (as opposed to the block device or
> single large file used by Coda).

Coda definitely doesn't use a block device or a single large file, but a
regular filesystem as its backing store. Currently ext2, reiserfs and
ramfs are known to work, and at least tmpfs is known to be broken. I can
easily fix tmpfs, but it isn't urgent, so I'm delaying that work until
2.5 unless there is sufficient interest.

> 	Every file on the front filesystem (NFS or the like) would be cached
> in two local files by cachefsd: the first would contain (parts of) the
> real file content, and the second would contain the file's metadata and a
> bitmap of the valid blocks (or pages) of the first file. All files in
> cachefsd's backing store would live in a per-volume directory and would be
> named after the inode number from the front filesystem.

- InterMezzo uses 'holes' in files to indicate that content isn't
  available (a coarse user-space check for sparseness is sketched after
  this list).
- You might want a more hierarchical backing store; directory operations
  in large directories are not very efficient.
- I believe you are switching the meaning of the front and back-end
  filesystems around a lot in your description. Who exactly assigns the
  inode numbers?
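
On the 'holes' point, one aside: a user process on these kernels cannot
locate individual holes (SEEK_HOLE only appeared much later), but it can
at least detect that a file is sparse by comparing st_blocks against
st_size, e.g.:

    #include <sys/types.h>
    #include <sys/stat.h>

    /* Returns 1 if the file appears sparse (fewer blocks allocated
     * than the nominal size needs), 0 if not, -1 on error. st_blocks
     * counts 512-byte units on Linux, independent of the filesystem
     * block size. */
    static int is_sparse(const char *path)
    {
            struct stat st;

            if (stat(path, &st) < 0)
                    return -1;
            return (long long)st.st_blocks * 512 < (long long)st.st_size;
    }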

> 	Now here are some questions:
> 
> * Should cachefsd be in user space (as it is in the prototype
> implementation), or should it be moved into kernel space? The former
> probably allows better configuration (maybe a deeper directory
> structure in the backing store), but the latter is faster, as it avoids
> copying data between user and kernel space.

If you only allow whole-file access, the Coda solution minimizes data
copying between user and kernel space. The file is fetched into the
cache when opened, and every subsequent access is transparently
redirected to the container file, without contacting userspace until the
file is closed.
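
In rough terms, the redirection amounts to forwarding file operations to
the container file obtained at open time. This is a simplified sketch
with invented names, not the actual coda module code:

    #include <linux/fs.h>

    /* Sketch: the local container file is attached at open time by
     * the userspace cache manager; reads then bypass userspace. */
    struct cfs_file {
            struct file *container;  /* local cache file */
    };

    static ssize_t cfs_read(struct cfs_file *cf, char *buf,
                            size_t count, loff_t *ppos)
    {
            /* no upcall here: data comes straight from the container */
            return cf->container->f_op->read(cf->container, buf,
                                             count, ppos);
    }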

I am not considering ever adding a bitmap-based 'these parts are ok and
those aren't' file access implementation to the Coda kernel module.
However, I do consider a 'data is valid up to this point' offset field a
possible future extension. Basically, the open would return early once
the first N pages have been streamed in from the server. Whenever the
client wants to read or write a page beyond this point, the kernel makes
a request to userspace to extend the limit. This way quotas can be
enforced and access to large files that are read sequentially is faster,
while kernel-userspace interactions are minimized.
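
A sketch of the read-side check such an offset field implies - the
structure and the upcall helper below are assumptions for illustration,
not existing Coda interfaces:

    #include <linux/fs.h>

    struct cfs_inode {
            loff_t valid_offset;  /* data valid up to this byte */
    };

    /* assumed upcall: ask the cache manager to stream in more data
     * and raise valid_offset to at least 'want'; 0 or -errno */
    extern int cfs_upcall_extend(struct cfs_inode *ci, loff_t want);

    static int cfs_check_range(struct cfs_inode *ci, loff_t pos,
                               size_t count)
    {
            if (pos + (loff_t)count <= ci->valid_offset)
                    return 0;               /* already cached */
            return cfs_upcall_extend(ci, pos + count);
    }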

> * Can the kernel part of Coda be used for this?

Not if you want to intercept and redirect every single read and write
call. That's a whole other can of worms, and I'd advise you to let the
userspace cache manager act as an NFS daemon instead. In my opinion, the
Coda kernel module fills a specific niche, and should not become yet
another kernel NFS client implementation that happens to bounce requests
to userspace using read/write on a character device instead of RPC/UDP
packets on a socket.

If you are willing to work within the confines of the Coda semantics,
sure. I'd even be willing to push a bit harder on supporting more
underlying filesystems and adding the 'valid data offset' logic.

Some references,

UserFS,
    AFAIK one of the first userfs implementations for Linux,

    http://www.goop.org/~jeremy/userfs/
    http://www.penguin.cz/~jim/userfs/ 		     (same one ported to 2.2?)


PodFuk,
    Went from an NFS daemon implementation to using the Coda kernel
    module,

    http://atrey.karlin.mff.cuni.cz/~pavel/podfuk/podfuk.html
    http://sourceforge.net/projects/uservfs/ 			(aka UserVFS?)


AVFS,
    Another userfs implementation that went from a shared library hack
    to using the Coda kernel module,

    http://sourceforge.net/projects/avfs


Jan



* Re: CacheFS
  2001-06-07 11:37 CacheFS Jan Kasprzak
  2001-06-07 15:44 ` CacheFS Jan Harkes
@ 2001-06-07 22:23 ` Albert D. Cahalan
  2001-06-08  5:15   ` CacheFS Michael Clark
  1 sibling, 1 reply; 7+ messages in thread
From: Albert D. Cahalan @ 2001-06-07 22:23 UTC (permalink / raw)
  To: Jan Kasprzak; +Cc: linux-kernel, xgajda, kron

Jan Kasprzak writes:

>                             Another goal is to use a Linux filesystem
> as the backing store (as opposed to the block device or single large
> file used by Coda).
...
> - a kernel module, implementing a filesystem of type "cachefs"
> 	and a character device /dev/cachefs
> - a user-space daemon (cachefsd), which would communicate with the
>       kernel over /dev/cachefs and manage the backing store
>       in a given directory.
>
> 	Every file on the front filesystem (NFS or the like) would be cached
> in two local files by cachefsd: the first would contain (parts of) the
...
> * Should cachefsd be in user space (as it is in the prototype
> implementation), or should it be moved into kernel space? The
> former probably allows better configuration (maybe a deeper
> directory structure in the backing store), but the latter is
> faster, as it avoids copying data between user and kernel space.

I think that, if speed is your goal, you should have the kernel
code use swap space for the cache. Look at what tmpfs does; but
running on top of tmpfs would leave you with the overhead of running
two filesystems and a daemon. It is better to be direct.

Maybe this shouldn't even be a filesystem. You could have a general
way to flag a filesystem as being significantly slower than swap.



* Re: CacheFS
  2001-06-07 22:23 ` CacheFS Albert D. Cahalan
@ 2001-06-08  5:15   ` Michael Clark
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Clark @ 2001-06-08  5:15 UTC (permalink / raw)
  To: Albert D. Cahalan; +Cc: Jan Kasprzak, linux-kernel, xgajda, kron

"Albert D. Cahalan" wrote:
> 
> Jan Kasprzak writes:
> 
> >                             Another goal is to use a Linux filesystem
> > as the backing store (as opposed to the block device or single large
> > file used by Coda).
> ...
> > - a kernel module, implementing a filesystem of type "cachefs"
> >       and a character device /dev/cachefs
> > - a user-space daemon (cachefsd), which would communicate with the
> >       kernel over /dev/cachefs and manage the backing store
> >       in a given directory.
> >
> >       Every file on the front filesystem (NFS or the like) would be cached
> > in two local files by cachefsd: the first would contain (parts of) the
> ...
> > * Should cachefsd be in user space (as it is in the prototype
> > implementation), or should it be moved into kernel space? The
> > former probably allows better configuration (maybe a deeper
> > directory structure in the backing store), but the latter is
> > faster, as it avoids copying data between user and kernel space.
> 
> I think that, if speed is your goal, you should have the kernel
> code use swap space for the cache. Look at what tmpfs does; but
> running on top of tmpfs would leave you with the overhead of running
> two filesystems and a daemon. It is better to be direct.

So how would you get persistent caching across reboots, which is one of
the major advantages of a cachefs-type filesystem? I guess you could tar
the cache up on shutdown and restore it on startup, although that would
be a little slow :).

I think 'speed' here means faster than NFS or other network filesystems
- you obviously have the overhead of network traffic for cache coherency,
but you can avoid a lot of data transfer (even after a reboot).

~mc


* Re: CacheFS
  2001-06-07 15:44 ` CacheFS Jan Harkes
@ 2001-06-08 11:48   ` Jan Kasprzak
  2001-06-09  8:29   ` CacheFS Pavel Machek
  1 sibling, 0 replies; 7+ messages in thread
From: Jan Kasprzak @ 2001-06-08 11:48 UTC (permalink / raw)
  To: linux-kernel, xgajda, kron

Jan Harkes wrote:
: > 	Every file on the front filesystem (NFS or the like) would be cached
: > in two local files by cachefsd: the first would contain (parts of) the
: > real file content, and the second would contain the file's metadata and a
: > bitmap of the valid blocks (or pages) of the first file. All files in
: > cachefsd's backing store would live in a per-volume directory and would be
: > named after the inode number from the front filesystem.
: 
: - InterMezzo uses 'holes' in files to indicate that content isn't
:   available.

	Well, but can you see the hole from the user-space daemon?

: - You might want a more hierarchical backing store; directory operations
:   in large directories are not very efficient.

	Yes, of course. But this is an implementation detail of cachefsd.

: - I believe you are switching the meaning of the front and back-end
:   filesystems around a lot in your description. Who exactly assigns the
:   inode numbers?

	Well, let's speak about NFS, locally cached on ext2. The present
implementation takes the inode number from NFS and creates an ext2 file
named - for example - /cache/%d (for the file contents) and /cache/%d.attr
for stat(2) data and the valid-blocks bitmap. The %d is the inode number
from the NFS volume.
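
	In code, the present scheme amounts to something like this (the
helper name is invented; a deeper fan-out such as /cache/%02x/%02x/%d
could later replace the flat layout to keep directories small):

    #include <stdio.h>

    /* Build the two backing-store names for a front-fs inode:
     * /cache/<ino> for content, /cache/<ino>.attr for metadata. */
    static void cache_names(char *data, char *attr, size_t len,
                            unsigned long ino)
    {
            snprintf(data, len, "/cache/%lu", ino);
            snprintf(attr, len, "/cache/%lu.attr", ino);
    }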

: Some references,
: 
: UserFS, PodFuk, AVFS,

	Thanks,

-Yenya

-- 
\ Jan "Yenya" Kasprzak <kas at fi.muni.cz>       http://www.fi.muni.cz/~kas/
\\ PGP: finger kas at aisa.fi.muni.cz   0D99A7FB206605D7 8B35FCDE05B18A5E //
\\\             Czech Linux Homepage:  http://www.linux.cz/              ///
It is a very bad idea to feed negative numbers to memcpy.         --Alan Cox


* Re: CacheFS
  2001-06-07 15:44 ` CacheFS Jan Harkes
  2001-06-08 11:48   ` CacheFS Jan Kasprzak
@ 2001-06-09  8:29   ` Pavel Machek
  1 sibling, 0 replies; 7+ messages in thread
From: Pavel Machek @ 2001-06-09  8:29 UTC (permalink / raw)
  To: Jan Kasprzak, linux-kernel, xgajda, kron

Hi!

> > * Can the kernel part of Coda be used for this?
> 
> Not if you want to intercept and redirect every single read and write
> call. That's a whole other can of worms, and I'd advise you to let the
> userspace cache manager act as an NFS daemon instead. In my opinion, the
> Coda kernel module fills a specific niche, and should not become yet
> another kernel NFS client implementation that happens to bounce requests
> to userspace using read/write on a character device instead of RPC/UDP
> packets on a socket.

Forget NFS if you want it to be read/write: there are nasty deadlocks
out there (classically, memory pressure forces writeback through the
very userspace daemon that needs memory to make progress).

> AVFS,
>     Another userfs implementation that went from a shared library hack
>     to using the Coda kernel module,
> 
>     http://sourceforge.net/projects/avfs

avfs moved to sourceforge? Wow!
								Pavel
-- 
I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org


* cachefs
       [not found] <200603232313.k2NNDEW2006900@shell0.pdx.osdl.net>
@ 2006-03-23 23:20 ` sv
  0 siblings, 0 replies; 7+ messages in thread
From: sv @ 2006-03-23 23:20 UTC (permalink / raw)
  To: linux-kernel

Hello Andrew,

Are there any plans to include cachefs (fscache) from David Howells in -mm?

Thanks,
Sergey

