Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Daniel Phillips <daniel@phunq.net>
To: Vyacheslav Dubeyko <slava@dubeyko.com>,
	"Theodore Y. Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Subject: Re: [RFC] Thing 1: Shardmap for Ext4
Date: Fri, 6 Dec 2019 16:46:05 -0800
Message-ID: <a05837a0-a352-fc8d-5c9c-28d8065961fd@phunq.net> (raw)
In-Reply-To: <37c9494c40998d23d0d68afaa5a7f942a23e8986.camel@dubeyko.com>

On 2019-12-06 3:47 a.m., Vyacheslav Dubeyko wrote:
> On Thu, 2019-12-05 at 01:46 -0800, Daniel Phillips wrote:
>> On 2019-12-04 7:55 a.m., Vyacheslav Dubeyko wrote:
>>>>
> 
> <snipped and reoredered>
> 
>> And here is a diagram of the Shardmap three level hashing scheme,
>> which ties everything together:
>>
>>     https://github.com/danielbot/Shardmap/wiki/Shardmap-hashing-scheme
>>
>> This needs explanation. It is something new that you won't find in
>> any
>> textbook, this is the big reveal right here.
>>
> 
> This diagram is pretty good and provides the high-level view of the
> whole scheme. But, maybe, it makes sense to show the granularity of
> hash code. It looks like the low hash is the hash of a name. Am I
> correct?

Not quite. A 64 bit hash code is computed per name, then divided up into
three parts as shown in the diagram. Each part of the hash addresses a
different level of the Shardmap index hierarchy: high bits address the
top level shard array, giving a pointer to a shard; middle bits address
a hash bucket within that shard; low bits are used to resolve collisions
within the hash bucket (and collisions still may occur even when the low
bits are considered, forcing a record block access and full string
compare.

> But how the mid- and high- parts of the hash code are defined?

Given the above description, does the diagram make sense? If so I will
add this description to the wiki.

> It looks like that cached shard stores LBAs of record entry blocks are
> associated with the low hash values.

Rather, associated with the entire hash value.

> But what does it mean that shard is cached?

This is the cache form of the shard, meaning that the unordered hash/lba
index pairs (duopack) were read in from media and loaded into this cache
object (or newly created by recent directory operations.)

> Here is a diagram of the cache structures, very simple:
>>
>>     https://github.com/danielbot/Shardmap/wiki/Shardmap-cache-format
> 
> This diagram is not easy to relate with the previous one. So, shard
> table and shard array are the same entities or not?

They are, and I have updated the hashing scheme diagram to refer to both
as "array". I will similarly update the code, which currently calls the
shard array field "map".

> Or do you mean that
> shard table is storeed on the volume but shard array is constructed in
> memory?

Sorry about that, it should be clear now. On the volume, a simple
unordered collection of hash:lba pairs is stored per shard, which is
reorganized into shard cache form (a hash table object) at demand-load
time.

>> There is a diagram here:
>>
> https://github.com/danielbot/Shardmap/wiki/Shardmap-record-block-format
> 
> I am slightly confused here. Does header be located at the bottom of
> the record block?

The header (just 32 bytes at the moment, possibly to be expanded to 48
or 64) is stored at the top of the zeroth record entry block, which is
therefore a little smaller than any other record entry block.

> My understanding is that records grow from top of the
> block down to the header direction. Am I correct? Why header is not
> located at the top of the block with entry dictionary? Any special
> purpose here?

That should be clear now. I will add the above descriptive text to the
wiki.

Regards,

Daniel

  reply index

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-27  1:47 [RFC] Thing 1: Shardmap fox Ext4 Daniel Phillips
2019-11-27  7:40 ` Vyacheslav Dubeyko
2019-11-27  8:28   ` Daniel Phillips
2019-11-27 19:35     ` Viacheslav Dubeyko
2019-11-28  2:54       ` Daniel Phillips
2019-11-28  9:15         ` Andreas Dilger
2019-11-28 10:03           ` Daniel Phillips
2019-11-27 14:25 ` Theodore Y. Ts'o
2019-11-27 22:27   ` Daniel Phillips
2019-11-28  2:28     ` Theodore Y. Ts'o
2019-11-28  4:27       ` Daniel Phillips
2019-11-30 17:50         ` Theodore Y. Ts'o
2019-12-01  8:21           ` Daniel Phillips
2019-12-04 18:31             ` Andreas Dilger
2019-12-04 21:44               ` Daniel Phillips
2019-12-05  0:36                 ` Andreas Dilger
2019-12-05  2:27                   ` [RFC] Thing 1: Shardmap for Ext4 Daniel Phillips
2019-12-04 23:41               ` [RFC] Thing 1: Shardmap fox Ext4 Theodore Y. Ts'o
2019-12-06  1:16                 ` Dave Chinner
2019-12-06  5:09                   ` [RFC] Thing 1: Shardmap for Ext4 Daniel Phillips
2019-12-08 22:42                     ` Dave Chinner
2019-11-28 21:17       ` [RFC] Thing 1: Shardmap fox Ext4 Daniel Phillips
2019-12-08 10:25       ` Daniel Phillips
2019-12-02  1:45   ` Daniel Phillips
2019-12-04 15:55     ` Vyacheslav Dubeyko
2019-12-05  9:46       ` Daniel Phillips
2019-12-06 11:47         ` Vyacheslav Dubeyko
2019-12-07  0:46           ` Daniel Phillips [this message]
2019-12-04 18:03     ` Andreas Dilger
2019-12-04 20:47       ` Daniel Phillips
2019-12-04 20:53         ` Daniel Phillips
2019-12-05  5:59           ` Daniel Phillips

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a05837a0-a352-fc8d-5c9c-28d8065961fd@phunq.net \
    --to=daniel@phunq.net \
    --cc=hirofumi@mail.parknet.co.jp \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=slava@dubeyko.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git