From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Daniel Phillips <daniel@phunq.net>
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Subject: Re: [RFC] Thing 1: Shardmap fox Ext4
Date: Wed, 27 Nov 2019 21:28:17 -0500
Message-ID: <20191128022817.GE22921@mit.edu>
In-Reply-To: <c3636a43-6ae9-25d4-9483-34770b6929d0@phunq.net>

On Wed, Nov 27, 2019 at 02:27:27PM -0800, Daniel Phillips wrote:
> > (2) It's implemented as userspace code (e.g., it uses open(2),
> > mmap(2), et al.) and using C++, so it would need to be reimplemented
> > from scratch for use in the kernel.
> 
> Right. Some of these details, like open, are obviously trivial, others
> less so. Reimplementing from scratch is an overstatement because the
> actual intrusions of user space code are just a small portion of the code
> and nearly all abstracted behind APIs that can be implemented as needed
> for userspace or kernel in out of line helpers, so that the main source
> is strictly unaware of the difference.

The use of C++ with templates is presumably one of the "less so"
parts, and it was that which I had in mind when I said,
"reimplementing from scratch".

> Also, most of this work is already being done for Tux3,

Great, when that work is done, we can take a look at the code and
see....

> > (5) The claim is made that readdir() accesses files sequentially; but
> > there is also mention in Shardmap of compressing shards (e.g.,
> > rewriting them) to squeeze out deleted and tombstone entries.  This
> > pretty much guarantees that it will not be possible to satisfy POSIX
> > requirements of telldir(3)/seekdir(3) (using a 32-bit or 64-bit
> > cookie), NFS (which also requires use of a 32-bit or 64-bit cookie
> > while doing readdir scan), or readdir() semantics in the face of
> > directory entries getting inserted or removed from the directory.
> 
> No problem, the data blocks are completely separate from the index so
> readdir just walks through them in linear order a la classic UFS/Ext2.
> What could possibly be simpler, faster or more POSIX compliant?
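
To make the cookie concern in point (5) concrete, here is a minimal
sketch in Python. It is a toy model only: the slot list, read_batch()
and compact() are hypothetical names invented for illustration, and
neither ext4 nor Shardmap stores entries this way. A cookie is modeled
as a plain slot offset, and None stands for a deleted (tombstone)
entry.

```python
def read_batch(slots, cookie, count):
    """Return the live names in slots[cookie:cookie+count] and the next cookie."""
    end = min(cookie + count, len(slots))
    names = [name for name in slots[cookie:end] if name is not None]
    return names, end


def compact(slots):
    """Squeeze out tombstones, shifting every later entry to a lower offset."""
    return [name for name in slots if name is not None]


# A directory with one tombstone; a reader consumes the first three slots.
slots = ["a", None, "b", "c", "d"]
first, cookie = read_batch(slots, 0, 3)

# Resuming against the untouched layout returns every remaining entry.
rest_stable, _ = read_batch(slots, cookie, 10)

# Resuming after compaction reuses the same offset, but entries have moved.
rest_compacted, _ = read_batch(compact(slots), cookie, 10)
```

In this model the reader that resumes after compaction never sees
"c", because the saved offset now points one entry too far. Offsets
into storage that is only ever walked linearly and never rewritten do
not have that problem, which is the distinction the reply above is
drawing between the index and the data blocks.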

OK, so what you're saying then is that every single directory entry
addition or removal must modify at least two blocks: at least one
index block and a data block, no?  That
makes it worse than htree, where most of the time we only need to
modify a single leaf node.  We only have to touch an index block when
a leaf node gets full and it needs to be split.
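
The block-count comparison can be sketched as a toy model in Python.
Everything here is an assumption made for illustration: the function
names are hypothetical, all inserts land in one leaf, a split is
charged as three block modifications (old leaf, new leaf, one index
block), and journaling, caching and write batching are ignored. It is
not a measurement of ext4 or Shardmap.

```python
def htree_blocks_touched(n_inserts, leaf_capacity):
    """Count block modifications when inserts normally dirty a single leaf."""
    touched = 0
    fill = 0  # entries in the leaf currently receiving inserts
    for _ in range(n_inserts):
        if fill == leaf_capacity:
            # Leaf is full: split it, which also updates an index block.
            touched += 3
            # Half the entries stay in this leaf, plus the new one.
            fill = leaf_capacity // 2 + 1
        else:
            touched += 1
            fill += 1
    return touched


def separate_index_blocks_touched(n_inserts):
    """Count block modifications when every insert dirties index + data."""
    return 2 * n_inserts


htree_total = htree_blocks_touched(1000, 100)
separate_total = separate_index_blocks_touched(1000)
```

Under these assumptions, 1000 inserts with 100 entries per leaf cost
1036 block modifications in the htree-like case and 2000 when an index
block and a data block are both modified every time. The gap depends
entirely on how often a real implementation writes the index, which
the model does not attempt to capture.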

Anyway, let's wait and see how you and Hirofumi-san work out those
details for Tux3, and we can look at that and consider next steps at
that time.

Cheers,

						- Ted