archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/5] Union Mount: A Directory listing approach with lseek support
@ 2007-12-05 14:37 Bharata B Rao
  2007-12-05 14:38 ` [RFC PATCH 1/5] Remove existing directory listing implementation Bharata B Rao
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Bharata B Rao @ 2007-12-05 14:37 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel
  Cc: Jan Blunck, Erez Zadok, viro, Christoph Hellwig, Dave Hansen


In Union Mount, the merged view of directories of the union is obtained
by enhancing readdir(2)/getdents(2) to read and merge the entries of
all the directories  by eliminating the duplicates. While we have tried
a few approaches for this, none of them could perfectly solve all the problems.
One of the challenges has been to provide a correct directory seek support for
the union mounted directories. Sometime back when I posted an
RFC ( on this problem, one of the
suggestions I got was to have the dirents stored in a cache (which we
already do for duplicate elimination) and define the directory seek
behaviour on this cache constructed out of unioned directories.

I set out to try this and the result is this set of patches. I am myself
not impressed by the implementation complexity in these patches but posting
them here only to get further comments and suggestions. Moreover I haven't
debugged them thoroughly to uncover all the problems. While I don't expect
anybody try these patches, for the completeness sake I have to mention that
these apply on top of Jan Blunck's patches on 2.6.24-rc2-mm1

In this approach, the cached dirents are given offsets in the form of
linearly increasing indices/cookies (like 0, 1, 2,...). This helps us to
uniformly define offsets across all the directories of the union
irrespective of the type of filesystem involved. Also this is needed to
define a seek behaviour on the union mounted directory. This cache is stored
as part of the struct file of the topmost directory of the union and will
remain as long as the directory is kept open.

While this approach works, it has the following downsides:

- The cache can grow arbitrarily large in size for big directories thereby
consuming lots of memory. Pruning individual cache entries is out of question
as entire cache is needed for subsequent readdirs for duplicate elimination.

- When an exising union gets overlaid by a new directory, there is a
possibility of the cache getting duplicated for the new union, thereby wasting
more space.

- Whenever _any_ directory that is part of the union gets
modified (addition/deletion of entries), the dirent cache of all the unions
which this directory is part of, needs to be purged and rebuilt. This is
expensive not only due to re-reads of dirents but also because
readdir(2)/getdents(2) needs to be synchronized with other operations
like mkdir/mknod/link/unlink etc.

- Since lseek(2) of the unioned directory also works on the same dirent
cache, it too needs to be invalidated when the directory gets modified.

- Supporting SEEK_END of lseek(2) is expensive as it involves reading in
all the dirents to know the EOF of the directory file.

After all this, I am beginning to think if it would be better to delegate
this readdir and whiteout processing to userspace. Can this be better handled
by readdir(3) in glibc ? But even with this, I am not sure if correct seek
behaviour (from lseek(2) or seekdir(3))  can be achieved in an easy manner.
Any thoughts on this ?

Earlier Erez Zodak had suggested that things will become easier if readdir
state is stored as a disk file ( This
approach simplifies directory seek support in Unionfs. But I am not sure if
such an approach would go well with VFS based unification approach like
Union Mount.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2007-12-07  2:14 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-12-05 14:37 [RFC PATCH 0/5] Union Mount: A Directory listing approach with lseek support Bharata B Rao
2007-12-05 14:38 ` [RFC PATCH 1/5] Remove existing directory listing implementation Bharata B Rao
2007-12-05 16:27   ` Dave Hansen
2007-12-05 14:39 ` [RFC PATCH 2/5] Add New directory listing approach Bharata B Rao
2007-12-05 14:40 ` [RFC PATCH 4/5] Directory seek support Bharata B Rao
2007-12-05 14:41 ` [RFC PATCH 5/5] Directory cache invalidation Bharata B Rao
2007-12-05 15:01 ` [RFC PATCH 3/5] Add list_for_each_entry_reverse_from() Bharata B Rao
2007-12-05 17:21 ` [RFC PATCH 0/5] Union Mount: A Directory listing approach with lseek support Dave Hansen
2007-12-06 10:01   ` Jan Blunck
2007-12-06 15:10     ` Bharata B Rao
2007-12-06 17:54     ` Dave Hansen
2007-12-07  1:48 ` sfjro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).