From: "Theodore Ts'o" <tytso@mit.edu>
To: Avi Deitcher <avi@deitcher.net>
Cc: linux-ext4@vger.kernel.org
Subject: Re: algorithm for half-md4 used in htree directories
Date: Fri, 15 Oct 2021 15:10:41 -0400
Message-ID: <YWnSMXcR5anaYTEU@mit.edu>
In-Reply-To: <CAF1vpkggQpYrg7Z2VVK69pPBo0rSjDUsm8nB8dyES27cmDEf2g@mail.gmail.com>

On Fri, Oct 15, 2021 at 11:43:07AM -0700, Avi Deitcher wrote:
> I am absolutely stumped. I tried the seed as four u32 as is on disk
> (i.e. big-endian); four u32 little-endian; one long little-endian
> array of bytes (I have no idea why that would make sense, but worth
> trying); zeroed out so it gets the default. No one gives a consistent
> solution.
> 
> As far as I can tell: hash tells you which intermediate block to look
> in, minor hash tells you which leaf block to look in, and then you
> scan. So it is pretty easy to see in what range the minor and major
> hash should be, but no luck.
> 
> I put up a gist with debugfs and source and output.
> https://gist.github.com/deitch/53b01a90635449e7674babfe7e7dd002
> 
> Anyone who feels like a look-see, I would much appreciate it (and if
> they figure it out, owe a beer if ever in the same city).

I'm really curious *why* you are trying to reverse engineer the
implementation.  What are you trying to do?

In any case, you're mostly right about what hash and minor_hash are
for.  The 32-bit hash value is the only thing that we use in the hashed
B+ tree which is used for hash directories.  The 32-bit minor hash is
used to form a 64-bit number that gets used when we need to support
things like NFSv3 directory cursors, and POSIX telldir/seekdir (which
is a massive headache for modern file systems, since it assumes that a
64-bit "offset" is all you get to reliably provide the POSIX
telldir/seekdir/readdir guarantees even when the directory has a
large number of directory entries added and deleted without limit
between the telldir and the seekdir call).

As far as what you are doing wrong, I'll point out that *this* program
(attached below) provides the correct result.  Running this through a
debugger and comparing it with your implementation is left as an
exercise for the reader --- but why do you want to try to make your
own implementation, when you could just pull down e2fsprogs, compile
it, and then *use* the provided ext2_fs library for whatever the heck
you are trying to do?

    	     	      	  	       - Ted

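#include <stdio.h>
#include <stdlib.h>
#include <string.h>	/* strlen() */

#include <et/com_err.h>
#include <uuid/uuid.h>
#include <ext2fs/ext2_fs.h>
#include <ext2fs/ext2fs.h>

int main(int argc, char **argv)
{
	uuid_t	buf;
	unsigned int *p;
	int i;
	ext2_dirhash_t hash, minor_hash;
	errcode_t retval;

	/* Parse the hash seed, given as UUID text, into 16 raw bytes. */
	uuid_parse("d64563bc-ea93-4aaf-a943-4657711ed153", buf);

	/* View those bytes as four host-endian 32-bit words. */
	p = (unsigned int *) buf;
	for (i=0; i < 4; i++) {
		printf("buf[%d] = 0x%08x\n", i, p[i]);
	}

	/* Hash version 1 selects half-MD4. */
	retval = ext2fs_dirhash(1, "dir478", strlen("dir478"), p,
				&hash, &minor_hash);

	/*
	 * Note: the first value printed is i (4 after the loop above),
	 * not retval; the output below reflects that.
	 */
	printf("dirhash results: retval=%u, hash=0x%08x, minor_hash=0x%08x\n",
	       i, hash, minor_hash);

	exit(0);
}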

% gcc -g -o /tmp/foo /tmp/foo.c -luuid -lext2fs -lcom_err
% /tmp/foo
buf[0] = 0xbc6345d6
buf[1] = 0xaf4a93ea
buf[2] = 0x574643a9
buf[3] = 0x53d11e71
dirhash results: retval=4, hash=0x012225e2, minor_hash=0x3f08755d
