All of lore.kernel.org
 help / color / mirror / Atom feed
From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org
Subject: Re: [RFC PATCH] parallel-checkout: send the new object_id algo field to the workers
Date: Fri, 14 May 2021 19:53:07 +0000	[thread overview]
Message-ID: <YJ7VI5kHAAk126YJ@camp.crustytoothpaste.net> (raw)
In-Reply-To: <a4225bc79d963c5a411105e2b75f9ca4b80de835.1621000916.git.matheus.bernardino@usp.br>

[-- Attachment #1: Type: text/plain, Size: 2339 bytes --]

On 2021-05-14 at 14:36:00, Matheus Tavares wrote:
> An object_id storing a SHA-1 name has some unused bytes at the end of
> the hash array. Since these bytes are not used, they are usually not
> initialized to any value either. However, at
> parallel_checkout.c:send_one_item() the object_id of a cache entry is
> copied into a buffer which is later sent to a checkout worker through a
> pipe write(). This makes Valgrind complain about passing uninitialized
> bytes to a syscall. The worker won't use these uninitialized bytes
> either, but the warning could confuse someone trying to debug this code;
> So instead of using oidcpy(), send_one_item() uses hashcpy() to only
> copy the used/initialized bytes of the object_id, and leave the
> remaining part with zeros.
> 
> However, since cf0983213c ("hash: add an algo member to struct
> object_id", 2021-04-26), using hashcpy() is no longer sufficient here as
> it won't copy the new algo field from the object_id. Let's add and use a
> new function which meets both our requirements of copying all the
> important object_id data while still avoiding the uninitialized bytes,
> by padding the end of the hash array in the destination object_id. With
> this change, we also no longer need the destination buffer from
> send_one_item() to be initialized with zeros, so let's switch from
> xcalloc() to xmalloc() to make this clear.
> 
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
> 
> Hi, brian
> 
> I've read the hash transition plan, but I'm not confident to say that I
> fully understand it yet, so maybe this patch is not exactly what we need
> here. Mainly, I'm not sure I understand in which cases we will have an
> object_id.algo that is not the_hash_algo. Is it for the early transition
> phases, where we have a SHA-256 repo that accepts user input as SHA-1? 

Yes, that's correct, as well as for interoperability with remotes using
a different hash algorithm.

> Also, the object_id's copied here at send_one_item() always come from a
> `struct cache_entry`. In this case, can they still have different
> `algo`s or do we expect them to be the_hash_algo?

No, things in the index should always use the same algorithm..

The patch looks fine to me.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

  reply	other threads:[~2021-05-14 19:53 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-14 14:36 [RFC PATCH] parallel-checkout: send the new object_id algo field to the workers Matheus Tavares
2021-05-14 19:53 ` brian m. carlson [this message]
2021-05-17 16:54   ` Derrick Stolee
2021-05-17 19:32     ` Junio C Hamano
2021-05-17 19:49 ` [PATCH v2] " Matheus Tavares

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YJ7VI5kHAAk126YJ@camp.crustytoothpaste.net \
    --to=sandals@crustytoothpaste.net \
    --cc=git@vger.kernel.org \
    --cc=matheus.bernardino@usp.br \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.