Linux-ext4 Archive on
 help / color / Atom feed
From: Daniel Phillips <>
To: Vyacheslav Dubeyko <>,
	"Theodore Y. Ts'o" <>
	OGAWA Hirofumi <>
Subject: Re: [RFC] Thing 1: Shardmap fox Ext4
Date: Thu, 5 Dec 2019 01:46:27 -0800
Message-ID: <> (raw)
In-Reply-To: <>

On 2019-12-04 7:55 a.m., Vyacheslav Dubeyko wrote:
>> That is it for media format. Very simple, is it not? My next post
>> will explain the Shardmap directory block format, with a focus on
>> deficiencies of the traditional Ext2 format that were addressed.
> I've tried to take a look into the source code. And it was not easy
> try. :)

Let's see what we can do about that, starting with removing the duopack
(media index entry) and tripack (cache index entry) templates. Now that
the design has settled down we don't need that level of generality so
much any more. The replacements are mostly C-style now and by the time
the Tux3 kernel port is done, will be officially C.

So far I only described the media format, implemented in define_layout(),
which I hope is self explanatory. You should be able to tie it back to
this diagram pretty easily.

> I expected to have the bird-fly understanding from shardmap.h
> file. My expectation was to find the initial set of structure
> declarations with the good comments.

Our wiki is slowly getting populated with design documentation. Most of
what you see in shardmap.h is concerned with the Shardmap cache form,
where all the action happens. I have not said much about that yet, but
there is a post on the way. The main structures are struct shard (a
self contained hash table) and struct keymap (a key value store
populated with shards). Those are obvious I think, please correct me
if I am wrong. A more tricky one is struct tier, which implements our
incremental hash table expansion. You might expect that to be a bit
subtle, and it is.

Before getting into those details, there is an upcoming post about
the record block format, which is pretty non-abstract and, I think,
easy enough to understand from the API declaration in shardmap.h and
the code in recops.c.

There is a diagram here:

but the post this belongs to is not quite ready to go out yet. That one
will be an interlude before for the cache form discussion, which is
where the magic happens, things like rehash and reshard and add_tier,
and the way the hash code gets chopped up as it runs through the access

Here is a diagram of the cache structures, very simple:

And here is a diagram of the Shardmap three level hashing scheme,
which ties everything together:

This needs explanation. It is something new that you won't find in any
textbook, this is the big reveal right here.

> But, frankly speaking, it's very
> complicated path for the concept understanding. Even from C++ point of
> view, the class declarations look very complicated if there are mixing
> of fields with methods declarations.

In each class, fields are declared first, then methods. In the kernel
port of course we will not have classes, and the function names will be
longer as usual.

> So, I believe it makes sense to declare the necessary set of structures
> in the file's beginning with the good comments. Even it will be good to
> split the structure declarations and methods in different files. I
> believe it will ease the way to understand the concept. Otherwise, it
> will be tough to review such code.

Declaring structures and functions in the same file is totally normal
for kernel code, you don't really want these in separate files unless
they break out naturally that way.

This code is dense, there is a lot going on in not very many lines. So
we need lots of lines of documentation to make up for that, which has
not been a priority until now, so please bear with me. And please do
not hesitate to ask specific questions - the answers may well end up in
the wiki.



  reply index

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-27  1:47 Daniel Phillips
2019-11-27  7:40 ` Vyacheslav Dubeyko
2019-11-27  8:28   ` Daniel Phillips
2019-11-27 19:35     ` Viacheslav Dubeyko
2019-11-28  2:54       ` Daniel Phillips
2019-11-28  9:15         ` Andreas Dilger
2019-11-28 10:03           ` Daniel Phillips
2019-11-27 14:25 ` Theodore Y. Ts'o
2019-11-27 22:27   ` Daniel Phillips
2019-11-28  2:28     ` Theodore Y. Ts'o
2019-11-28  4:27       ` Daniel Phillips
2019-11-30 17:50         ` Theodore Y. Ts'o
2019-12-01  8:21           ` Daniel Phillips
2019-12-04 18:31             ` Andreas Dilger
2019-12-04 21:44               ` Daniel Phillips
2019-12-05  0:36                 ` Andreas Dilger
2019-12-05  2:27                   ` [RFC] Thing 1: Shardmap for Ext4 Daniel Phillips
2019-12-04 23:41               ` [RFC] Thing 1: Shardmap fox Ext4 Theodore Y. Ts'o
2019-12-06  1:16                 ` Dave Chinner
2019-12-06  5:09                   ` [RFC] Thing 1: Shardmap for Ext4 Daniel Phillips
2019-12-08 22:42                     ` Dave Chinner
2019-11-28 21:17       ` [RFC] Thing 1: Shardmap fox Ext4 Daniel Phillips
2019-12-08 10:25       ` Daniel Phillips
2019-12-02  1:45   ` Daniel Phillips
2019-12-04 15:55     ` Vyacheslav Dubeyko
2019-12-05  9:46       ` Daniel Phillips [this message]
2019-12-06 11:47         ` Vyacheslav Dubeyko
2019-12-07  0:46           ` [RFC] Thing 1: Shardmap for Ext4 Daniel Phillips
2019-12-04 18:03     ` [RFC] Thing 1: Shardmap fox Ext4 Andreas Dilger
2019-12-04 20:47       ` Daniel Phillips
2019-12-04 20:53         ` Daniel Phillips
2019-12-05  5:59           ` Daniel Phillips

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ext4 Archive on

Archives are clonable:
	git clone --mirror linux-ext4/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-ext4 linux-ext4/ \
	public-inbox-index linux-ext4

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone