From: Jeff King <peff@peff.net>
To: "brian m. carlson" <sandals@crustytoothpaste.net>,
	git@vger.kernel.org, "Martin Ågren" <martin.agren@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>
Subject: Re: [PATCH 3/5] match-trees: use hashcpy to splice trees
Date: Fri, 11 Jan 2019 09:51:06 -0500
Message-ID: <20190111145106.GB16754@sigill.intra.peff.net>
In-Reply-To: <20190110235551.GM423984@genre.crustytoothpaste.net>

On Thu, Jan 10, 2019 at 11:55:51PM +0000, brian m. carlson wrote:

> > I think the only reason they are "struct object_id" is because that's
> > what tree_entry_extract() returns. Should we be providing another
> > function to allow more byte-oriented access?
> 
> The reason is we recursively call splice_tree, passing rewrite_here as
> the first argument. That argument is then used for read_object_file,
> which requires a struct object_id * (because it is, logically, an object
> ID).
> 
> I *think* we could fix this by copying an unsigned char *rewrite_here
> into a temporary struct object_id before we recurse, but it's not
> obvious to me if that's safe to do.

I think rewrite_here needs to be a direct pointer into the buffer, since
we plan to modify it.

I think rewrite_with is correct to be an object_id. It's either the oid
passed in from the caller, or the subtree we generated (in which case
it's the result from write_object_file).

So the "most correct" thing is probably something like this:

diff --git a/match-trees.c b/match-trees.c
index 03e81b56e1..129b13a970 100644
--- a/match-trees.c
+++ b/match-trees.c
@@ -179,7 +179,7 @@ static int splice_tree(const struct object_id *oid1, const char *prefix,
 	char *buf;
 	unsigned long sz;
 	struct tree_desc desc;
-	struct object_id *rewrite_here;
+	unsigned char *rewrite_here;
 	const struct object_id *rewrite_with;
 	struct object_id subtree;
 	enum object_type type;
@@ -206,9 +206,16 @@ static int splice_tree(const struct object_id *oid1, const char *prefix,
 			if (!S_ISDIR(mode))
 				die("entry %s in tree %s is not a tree", name,
 				    oid_to_hex(oid1));
-			rewrite_here = (struct object_id *)(desc.entry.path +
-							    strlen(desc.entry.path) +
-							    1);
+			/*
+			 * We cast here for two reasons:
+			 *
+			 *   - to flip the "char *" (for the path) to "unsigned
+			 *     char *" (for the hash stored after it)
+			 *
+			 *   - to discard the "const"; this is OK because we
+			 *     know it points into our non-const "buf"
+			 */
+			rewrite_here = (unsigned char *)desc.entry.path + strlen(desc.entry.path) + 1;
 			break;
 		}
 		update_tree_entry(&desc);
@@ -224,7 +231,7 @@ static int splice_tree(const struct object_id *oid1, const char *prefix,
 	} else {
 		rewrite_with = oid2;
 	}
-	hashcpy(rewrite_here->hash, rewrite_with->hash);
+	hashcpy(rewrite_here, rewrite_with->hash);
 	status = write_object_file(buf, sz, tree_type, result);
 	free(buf);
 	return status;
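
The pointer arithmetic in the patch above can be tried outside of git. What follows is only a minimal sketch, not code from match-trees.c: hash_after_path() and splice_hash() are hypothetical helper names, and RAWSZ is fixed at 20 bytes on the assumption of SHA-1 sized hashes. It shows why the cast is needed and that the overwrite touches only the bytes after the path's NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RAWSZ 20 /* assumption: SHA-1 sized raw hashes */

/*
 * Hypothetical helper: given the NUL-terminated path of a tree entry
 * that lives inside a writable buffer, return the address of the raw
 * hash stored directly after the path's NUL byte.
 */
static unsigned char *hash_after_path(const char *path)
{
	/*
	 * Drop the "const" and flip "char *" to "unsigned char *".
	 * This is only valid because the caller owns the underlying
	 * non-const buffer.
	 */
	return (unsigned char *)path + strlen(path) + 1;
}

/* Overwrite the entry's hash in place with RAWSZ bytes of new_hash. */
static void splice_hash(const char *path, const unsigned char *new_hash)
{
	assert(path && new_hash);
	memcpy(hash_after_path(path), new_hash, RAWSZ);
}
```

The sketch copies a fixed RAWSZ bytes, so it carries the same assumption about the hash length as the in-place approach in the patch.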

I think if I were trying to write this in a less-subtle way, I'd
probably stop trying to do it in-place, and have a copy loop more like:

  for entry in src_tree
    if match_entry(entry, prefix)
      entry = rewrite_entry(entry) /* either oid2 or subtree */
    push_entry(dst_tree, entry)

We may even have to go that way eventually if we ever need to rewrite
to a tree with a different hash size (the in-place version has a hidden
assumption that rewrite_here points to exactly the_hash_algo->rawsz
bytes of hash). But I think for now it's not necessary, and it's way
outside the scope of what you're trying to do here.
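
For illustration, the copy loop could look roughly like the sketch below. It is a self-contained approximation, not git's tree-walk API: copy_tree_rewriting() and all of its parameters are hypothetical, it matches on a single entry name rather than a prefix, and it assumes the raw tree layout of "<mode> <name>\0<hash>" entries with a caller-supplied hash length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: walk the raw tree buffer "src", append each
 * entry to "dst", and substitute "new_hash" for the entry whose name
 * equals "match_name". Returns the number of bytes written to dst;
 * 0 means malformed input or a too-small destination (an empty input
 * also yields 0).
 */
static size_t copy_tree_rewriting(const unsigned char *src, size_t src_len,
				  size_t rawsz, const char *match_name,
				  const unsigned char *new_hash,
				  unsigned char *dst, size_t dst_cap)
{
	size_t in = 0, out = 0;

	assert(src && match_name && new_hash && dst);
	while (in < src_len) {
		/* The header "<mode> <name>" runs up to the NUL byte. */
		const unsigned char *nul = memchr(src + in, '\0', src_len - in);
		const unsigned char *sp;
		size_t hdr_len;

		if (!nul)
			return 0;
		hdr_len = (size_t)(nul - (src + in)) + 1; /* include NUL */
		if (src_len - in - hdr_len < rawsz)
			return 0; /* truncated hash */
		if (dst_cap - out < hdr_len + rawsz)
			return 0; /* destination too small */
		sp = memchr(src + in, ' ', hdr_len);
		if (!sp)
			return 0; /* no mode/name separator */

		/* Copy the header unchanged, then pick which hash follows. */
		memcpy(dst + out, src + in, hdr_len);
		out += hdr_len;
		if (!strcmp((const char *)sp + 1, match_name))
			memcpy(dst + out, new_hash, rawsz);
		else
			memcpy(dst + out, src + in + hdr_len, rawsz);
		out += rawsz;
		in += hdr_len + rawsz;
	}
	return out;
}
```

Because every entry is written out fresh, nothing here depends on overwriting bytes inside the source buffer; a variant that took separate source and destination hash lengths would be one way to drop the fixed-size assumption.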

-Peff
