From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Cc: Patrick Steinhardt <ps@pks.im>
Subject: [PATCH v3 3/5] builtin/patch-id: fix uninitialized hash function
Date: Mon, 13 May 2024 15:41:25 -0700
Message-ID: <20240513224127.2042052-4-gitster@pobox.com>
In-Reply-To: <20240513224127.2042052-1-gitster@pobox.com>

From: Patrick Steinhardt <ps@pks.im>

In c8aed5e8da (repository: stop setting SHA1 as the default object hash,
2024-05-07), we have adapted `initialize_repository()` to no longer set
up a default hash function. As this function is also used to set up
`the_repository`, the consequence is that `the_hash_algo` will now by
default be a `NULL` pointer unless the hash algorithm was configured
properly. This is done as a mechanism to detect cases where we may be
using the wrong hash function by accident.

This change now causes git-patch-id(1) to segfault when it's run outside
of a repository. As this command can read diffs from stdin, it does not
necessarily need a repository, but then relies on `the_hash_algo` to
compute the patch ID itself.

It is somewhat dubious that git-patch-id(1) relies on `the_hash_algo` in
the first place. Quoting its manpage:

    A "patch ID" is nothing but a sum of SHA-1 of the file diffs
    associated with a patch, with line numbers ignored. As such, it’s
    "reasonably stable", but at the same time also reasonably unique,
    i.e., two patches that have the same "patch ID" are almost
    guaranteed to be the same thing.

We explicitly document patch IDs to be using SHA-1. Furthermore, patch
IDs are supposed to be stable for the most part. But even with the
same input, the patch IDs will now be different depending on the repo's
configured object hash.

Work around the issue for now by setting up SHA-1 when there was no
startup repository. This is arguably not the correct fix, but for the
time being we would rather focus on getting the segfault fixed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/patch-id.c      | 13 +++++++++++++
 t/t1517-outside-repo.sh |  2 +-
 t/t4204-patch-id.sh     | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/builtin/patch-id.c b/builtin/patch-id.c
index 3894d2b970..e6ae89beab 100644
--- a/builtin/patch-id.c
+++ b/builtin/patch-id.c
@@ -5,6 +5,7 @@
 #include "hash.h"
 #include "hex.h"
 #include "parse-options.h"
+#include "setup.h"
 
 static void flush_current_id(int patchlen, struct object_id *id, struct object_id *result)
 {
@@ -237,6 +238,18 @@ int cmd_patch_id(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, builtin_patch_id_options,
 			     patch_id_usage, 0);
 
+	/*
+	 * We rely on `the_hash_algo` to compute patch IDs. This is dubious as
+	 * it means that the hash algorithm now depends on the object hash of
+	 * the repository, even though git-patch-id(1) clearly defines that
+	 * patch IDs always use SHA1.
+	 *
+	 * NEEDSWORK: This hack should be removed in favor of converting
+	 * the code that computes patch IDs to always use SHA1.
+	 */
+	if (!startup_info->have_repository)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	generate_id_list(opts ? opts > 1 : config.stable,
 			 opts ? opts == 3 : config.verbatim);
 	return 0;
diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh
index 16d9714c27..f1fd5c9888 100755
--- a/t/t1517-outside-repo.sh
+++ b/t/t1517-outside-repo.sh
@@ -24,7 +24,7 @@ test_expect_success 'set up a non-repo directory and test file' '
 	git diff >sample.patch
 '
 
-test_expect_failure 'compute a patch-id outside repository' '
+test_expect_success 'compute a patch-id outside repository' '
 	git patch-id <sample.patch >patch-id.expect &&
 	(
 		cd non-repo &&
diff --git a/t/t4204-patch-id.sh b/t/t4204-patch-id.sh
index a7fa94ce0a..605faea0c7 100755
--- a/t/t4204-patch-id.sh
+++ b/t/t4204-patch-id.sh
@@ -310,4 +310,38 @@ test_expect_success 'patch-id handles diffs with one line of before/after' '
 	test_config patchid.stable true &&
 	calc_patch_id diffu1stable <diffu1
 '
+
+test_expect_failure 'patch-id computes same ID with different object hashes' '
+	test_when_finished "rm -rf repo-sha1 repo-sha256" &&
+
+	cat >diff <<-\EOF &&
+	diff --git a/bar b/bar
+	index bdaf90f..31051f6 100644
+	--- a/bar
+	+++ b/bar
+	@@ -2 +2,2 @@
+	 b
+	+c
+	EOF
+
+	git init --object-format=sha1 repo-sha1 &&
+	git -C repo-sha1 patch-id <diff >patch-id-sha1 &&
+	git init --object-format=sha256 repo-sha256 &&
+	git -C repo-sha256 patch-id <diff >patch-id-sha256 &&
+	test_cmp patch-id-sha1 patch-id-sha256
+'
+
+test_expect_success 'patch-id without repository' '
+	cat >diff <<-\EOF &&
+	diff --git a/bar b/bar
+	index bdaf90f..31051f6 100644
+	--- a/bar
+	+++ b/bar
+	@@ -2 +2,2 @@
+	 b
+	+c
+	EOF
+	nongit git patch-id <diff
+'
+
 test_done
-- 
2.45.0-145-g3e4a232f6e

