git.vger.kernel.org archive mirror
* SHA256 support not experimental, or?
@ 2023-06-28 16:28 Adam Majer
  2023-06-29  1:59 ` brian m. carlson
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Adam Majer @ 2023-06-28 16:28 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 685 bytes --]

Hi all,

Is sha256 still considered experimental or can it be assumed to be stable?

The use case here is that we are planning to move to sha256 repositories,
mostly due to integrity guarantees, hypothetical or otherwise. What is
important is not the initial interop challenges with sha1 repos, but
whether the on-disk format will remain compatible with future versions
of git. At minimum, the on-disk format would be converted by some future
version(s) of git into another one, and not be a dead end because
it was "experimental", where data loss is an implied risk.

Attached is a patch that removes the scary text, if indeed sha256 should 
be viewed as stable.

Cheers,
- Adam

[-- Attachment #2: 0001-doc-sha256-is-no-longer-experimantal.patch --]
[-- Type: text/x-patch, Size: 1580 bytes --]

---
 Documentation/git.txt                      | 4 ++--
 Documentation/object-format-disclaimer.txt | 8 ++------
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index f0cafa2290..7c150a473c 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -553,8 +553,8 @@ double-quotes and respecting backslash escapes. E.g., the value
 	If this variable is set, the default hash algorithm for new
 	repositories will be set to this value. This value is
 	ignored when cloning and the setting of the remote repository
-	is always used. The default is "sha1". THIS VARIABLE IS
-	EXPERIMENTAL! See `--object-format` in linkgit:git-init[1].
+	is always used. The default is "sha1".
+    See `--object-format` in linkgit:git-init[1].
 
 Git Commits
 ~~~~~~~~~~~
diff --git a/Documentation/object-format-disclaimer.txt b/Documentation/object-format-disclaimer.txt
index 4cb106f0d1..dccee9c400 100644
--- a/Documentation/object-format-disclaimer.txt
+++ b/Documentation/object-format-disclaimer.txt
@@ -1,6 +1,2 @@
-THIS OPTION IS EXPERIMENTAL! SHA-256 support is experimental and still
-in an early stage.  A SHA-256 repository will in general not be able to
-share work with "regular" SHA-1 repositories.  It should be assumed
-that, e.g., Git internal file formats in relation to SHA-256
-repositories may change in backwards-incompatible ways.  Only use
-`--object-format=sha256` for testing purposes.
+Note: SHA-256 repository will in general not be able to
+share work with "regular" SHA-1 repositories.
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: SHA256 support not experimental, or?
  2023-06-28 16:28 SHA256 support not experimental, or? Adam Majer
@ 2023-06-29  1:59 ` brian m. carlson
  2023-06-29 10:42   ` Adam Majer
  2023-06-29  5:59 ` Junio C Hamano
  2023-07-20 15:07 ` Adam Majer
  2 siblings, 1 reply; 21+ messages in thread
From: brian m. carlson @ 2023-06-29  1:59 UTC (permalink / raw)
  To: Adam Majer; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 923 bytes --]

On 2023-06-28 at 16:28:28, Adam Majer wrote:
> Hi all,
> 
> Is sha256 still considered experimental or can it be assumed to be stable?
> 
> The usecase here is we are planning on moving to sha256 repositories mostly
> due to integrity guarantees, hypothetical or otherwise. What is important is
> not the initial interop challenges with sha1 repos, but whether the on-disk
> format will remain compatible with future versions of git. At minimum, the
> on-disk format would be converted by some future version(s) of git into
> another one and not be an end-of-the-road because it was "experimental"
> where dataloss is an implied risk.

I have no intention of changing things at this point.  I think it should
be viewed as stable by now, and I'd support this patch, although to get
it picked up it will need a commit message and a sign-off.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA


* Re: SHA256 support not experimental, or?
  2023-06-28 16:28 SHA256 support not experimental, or? Adam Majer
  2023-06-29  1:59 ` brian m. carlson
@ 2023-06-29  5:59 ` Junio C Hamano
  2023-06-29 10:53   ` Adam Majer
  2023-06-29 21:17   ` brian m. carlson
  2023-07-20 15:07 ` Adam Majer
  2 siblings, 2 replies; 21+ messages in thread
From: Junio C Hamano @ 2023-06-29  5:59 UTC (permalink / raw)
  To: Adam Majer; +Cc: git

Adam Majer <adamm@zombino.com> writes:

> Is sha256 still considered experimental or can it be assumed to be stable?

I do not think we would officially label SHA-256 support as "stable"
until we have good interoperability with SHA-1 repositories, but the
expectation is that we will make a reasonable effort to keep a migration
path for the current SHA-256 repositories, even if it turns out that
their on-disk format needs to be updated, to keep the end-user data safe.

So while a "no-longer-experimental" patch is probably a bit premature,
the warning in flashing red letters to caution against any use other
than testing may want to be toned down.

Thanks.


* Re: SHA256 support not experimental, or?
  2023-06-29  1:59 ` brian m. carlson
@ 2023-06-29 10:42   ` Adam Majer
  0 siblings, 0 replies; 21+ messages in thread
From: Adam Majer @ 2023-06-29 10:42 UTC (permalink / raw)
  To: brian m. carlson, git

[-- Attachment #1: Type: text/plain, Size: 288 bytes --]

On 6/29/23 03:59, brian m. carlson wrote:
> I have no intention of changing things at this point.  I think it should
> be viewed as stable by now, and I'd support this patch, although to get
> it picked up it will need a commit message and a sign-off.

Sounds good. Patch follows.

- Adam

[-- Attachment #2: 0001-doc-sha256-is-no-longer-experimental.patch --]
[-- Type: text/x-patch, Size: 2210 bytes --]

From 90be51143e741053390810720ba4a639c3b0b74c Mon Sep 17 00:00:00 2001
From: Adam Majer <adamm@zombino.com>
Date: Wed, 28 Jun 2023 14:46:02 +0200
Subject: [PATCH] doc: sha256 is no longer experimental

The purpose of this patch is to remove scary wording that basically
stops people using sha256 repositories not because of interoperability
issues with sha1 repositories, but from fear that their work will
suddenly become incompatible in some future version of git.

We should be clear that currently sha256 repositories will not work with
sha1 repositories but stop the scary words.

Signed-off-by: Adam Majer <adamm@zombino.com>
---
 Documentation/git.txt                      | 4 ++--
 Documentation/object-format-disclaimer.txt | 8 ++------
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index f0cafa2290..666dbdb55c 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -553,8 +553,8 @@ double-quotes and respecting backslash escapes. E.g., the value
 	If this variable is set, the default hash algorithm for new
 	repositories will be set to this value. This value is
 	ignored when cloning and the setting of the remote repository
-	is always used. The default is "sha1". THIS VARIABLE IS
-	EXPERIMENTAL! See `--object-format` in linkgit:git-init[1].
+	is always used. The default is "sha1". See `--object-format`
+	in linkgit:git-init[1].
 
 Git Commits
 ~~~~~~~~~~~
diff --git a/Documentation/object-format-disclaimer.txt b/Documentation/object-format-disclaimer.txt
index 4cb106f0d1..1e976688be 100644
--- a/Documentation/object-format-disclaimer.txt
+++ b/Documentation/object-format-disclaimer.txt
@@ -1,6 +1,2 @@
-THIS OPTION IS EXPERIMENTAL! SHA-256 support is experimental and still
-in an early stage.  A SHA-256 repository will in general not be able to
-share work with "regular" SHA-1 repositories.  It should be assumed
-that, e.g., Git internal file formats in relation to SHA-256
-repositories may change in backwards-incompatible ways.  Only use
-`--object-format=sha256` for testing purposes.
+Note: SHA-256 repositories currently will not be able to share work
+with "regular" SHA-1 repositories.
-- 
2.41.0


* Re: SHA256 support not experimental, or?
  2023-06-29  5:59 ` Junio C Hamano
@ 2023-06-29 10:53   ` Adam Majer
  2023-06-29 20:56     ` Junio C Hamano
  2023-06-29 21:17   ` brian m. carlson
  1 sibling, 1 reply; 21+ messages in thread
From: Adam Majer @ 2023-06-29 10:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 6/29/23 07:59, Junio C Hamano wrote:
> Adam Majer <adamm@zombino.com> writes:
> 
>> Is sha256 still considered experimental or can it be assumed to be stable?
> 
> I do not think we would officially label SHA-256 support as "stable"
> until we have good interoperability with SHA-1 repositories, but the
> expectation is that we will make reasonable effort to keep migration
> path for the current SHA-256 repositories, even if it turns out that
> its on-disk format need to be updated, to keep the end-user data safe.

That could be a different definition of stable. But I'm satisfied that
current sha256 repositories will not end up incompatible with some
future version of git without a migration path (talking about the on-disk
format).

So maybe my question should be reworded to "is sha256 still considered
early stage, for testing purposes only with possible data loss, or can it
be relied on for actual long-lived repositories?"


> So while "no-longer-experimental" patch is probably a bit premature,
> the warning in flashing red letters to caution against any use other
> than testing may want to be toned down.

Agreed. I think it should be clear that SHA256 and SHA1 repositories
cannot share data at this point. The scary wording should be removed
though, as currently it sounds like "data loss incoming and it's your
fault" if one chooses sha256.

- Adam

* Re: SHA256 support not experimental, or?
  2023-06-29 10:53   ` Adam Majer
@ 2023-06-29 20:56     ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2023-06-29 20:56 UTC (permalink / raw)
  To: Adam Majer; +Cc: git

Adam Majer <adamm@zombino.com> writes:

> So maybe my question should be reworded to "is sha256 still considered
> early stage, for testing purposes only with possible data-loss or can
> it be relied on for actual long lived repositories?"

My understanding is that they are in a happy place where they are
just as usable as SHA-1 based repositories have been.  As we have
a well-worked-out interoperability design but no implementation, it
may have to change once we discover something missing in the design,
though.  But without such clarification, you already know the answer
to the above question in the message you are responding to.  Having
a migration path means "possible data loss" is not in the picture.

> The scary wording should be removed
> though, as currently it sounds like "data loss incoming and it's your
> fault" if one chooses sha256

Good.

Thanks.


* Re: SHA256 support not experimental, or?
  2023-06-29  5:59 ` Junio C Hamano
  2023-06-29 10:53   ` Adam Majer
@ 2023-06-29 21:17   ` brian m. carlson
  2023-06-29 22:22     ` Junio C Hamano
  1 sibling, 1 reply; 21+ messages in thread
From: brian m. carlson @ 2023-06-29 21:17 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Adam Majer, git

[-- Attachment #1: Type: text/plain, Size: 1187 bytes --]

On 2023-06-29 at 05:59:11, Junio C Hamano wrote:
> Adam Majer <adamm@zombino.com> writes:
> 
> > Is sha256 still considered experimental or can it be assumed to be stable?
> 
> I do not think we would officially label SHA-256 support as "stable"
> until we have good interoperability with SHA-1 repositories, but the
> expectation is that we will make reasonable effort to keep migration
> path for the current SHA-256 repositories, even if it turns out that
> its on-disk format need to be updated, to keep the end-user data safe.

I don't think that's a good position to have.  I'm not working on
interop more than incidentally at the moment, and to my knowledge,
nobody else is, either.  Absent me having substantially more free time
or having my employer pay me to work on it, it is probably not
happening.

We desperately do want people to move away from SHA-1 to SHA-256, and as
soon as there's tooling and forges to do so, we should encourage them to
do so.  Just because people can't interop existing SHA-1 repositories
doesn't mean people can't or shouldn't build new SHA-256 repositories.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA


* Re: SHA256 support not experimental, or?
  2023-06-29 21:17   ` brian m. carlson
@ 2023-06-29 22:22     ` Junio C Hamano
  2023-06-30  1:21       ` brian m. carlson
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2023-06-29 22:22 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Adam Majer, git

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2023-06-29 at 05:59:11, Junio C Hamano wrote:
>> Adam Majer <adamm@zombino.com> writes:
>> 
>> > Is sha256 still considered experimental or can it be assumed to be stable?
>> 
>> I do not think we would officially label SHA-256 support as "stable"
>> until we have good interoperability with SHA-1 repositories, but the
>> expectation is that we will make reasonable effort to keep migration
>> path for the current SHA-256 repositories, even if it turns out that
>> its on-disk format need to be updated, to keep the end-user data safe.
>
> I don't think that's a good position to have.
> We desperately do want people to move away from SHA-1 to SHA-256, and as
> soon as there's tooling and forges to do so, we should encourage them to
> do so.

I agree that it is good to ensure that SHA-256 support is good
enough to start new projects with.

> Just because people can't interop existing SHA-1 repositories
> doesn't mean people can't or shouldn't build new SHA-256 repositories.

True, and our messaging should avoid scaring them away from doing
so.  But isn't the lack of interoperability one of the reasons why
GitHub and GitLab do not yet offer a choice of hash?  There
certainly is a chicken-and-egg problem here.


* Re: SHA256 support not experimental, or?
  2023-06-29 22:22     ` Junio C Hamano
@ 2023-06-30  1:21       ` brian m. carlson
  2023-06-30  9:31         ` Patrick Steinhardt
  0 siblings, 1 reply; 21+ messages in thread
From: brian m. carlson @ 2023-06-30  1:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Adam Majer, git

[-- Attachment #1: Type: text/plain, Size: 1191 bytes --]

On 2023-06-29 at 22:22:51, Junio C Hamano wrote:
> True, and our messaging should avoid scaring them away from doing
> so.  But isn't the lack of interoperability one of the reasons why
> GitHub and Gitlab do not yet offer choice of the hash?  There
> certainly is a chicken-and-egg problem here.

There are a lot of necessary changes for a forge to adopt SHA-256.  For
example, at GitHub, we have a single null OID constant in some code that
has to be addressed, libgit2 has to be taught about SHA-256 or removed,
and UI changes need to be done to accommodate the larger IDs.  I'm
sure that GitLab has very similar situations, as do all of the other
forges.  After all, think about the extensive number of patches that
went into Git itself to get us there.  Everyone has made all of those
same assumptions in their forges.

I'm certain that whether or not interoperability were available would
not influence the forges' desire to support SHA-256.  It's simply a lot
of work to fix all of those spots that need it and requires a lot of
communication and discussions across teams, all of which takes time.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA


* Re: SHA256 support not experimental, or?
  2023-06-30  1:21       ` brian m. carlson
@ 2023-06-30  9:31         ` Patrick Steinhardt
  2023-06-30 11:25           ` Adam Majer
  0 siblings, 1 reply; 21+ messages in thread
From: Patrick Steinhardt @ 2023-06-30  9:31 UTC (permalink / raw)
  To: brian m. carlson, Junio C Hamano, Adam Majer, git

[-- Attachment #1: Type: text/plain, Size: 2516 bytes --]

On Fri, Jun 30, 2023 at 01:21:45AM +0000, brian m. carlson wrote:
> On 2023-06-29 at 22:22:51, Junio C Hamano wrote:
> > True, and our messaging should avoid scaring them away from doing
> > so.  But isn't the lack of interoperability one of the reasons why
> > GitHub and Gitlab do not yet offer choice of the hash?  There
> > certainly is a chicken-and-egg problem here.
> 
> There are a lot of necessary changes for a forge to adopt SHA-256.  For
> example, at GitHub, we have a single null OID constant in some code that
> has to be addressed, libgit2 has to be taught about SHA-256 or removed,
> and UI changes need to be done to accommodate the larger IDs.  I'm
> sure that GitLab has very similar situations, as do all of the other
> forges.  After all, think about the extensive number of patches that
> went into Git itself to get us there.  Everyone has made all of those
> same assumptions in their forges.

Indeed, supporting SHA256 is a major effort on our side at GitLab. Most
of the work isn't really adapting our production code, but it's rather
that tons of tests were written with seed repositories and hardcoded
object hashes. Converting all of that isn't all that hard in the general
case, but it's a tedious job.

In the Gitaly team we have already started to put significant time into
this problem and are slowly chipping away at it. We are at a state where
most of our codebase works with SHA256 alright, and we in fact continue
down that road as a low-priority side project where we convert a handful
of tests every release.

> I'm certain that whether or not interoperability were available would
> not influence the forges' desire to support SHA-256.  It's simply a lot
> of work to fix all of those spots that need it and requires a lot of
> communication and discussions across teams, all of which takes time.

True as well. Even though Gitaly will likely be SHA256-ready in the not
too distant future, that doesn't mean that GitLab as a whole is. The
frontend will need investments as well, and there's likely a long tail
of other stuff that needs to be done that I ain't yet got on my radar
right now.

In any case I'm fully supportive of relaxing the current warning. Except
for the recently discussed edge case where cloning empty repositories
didn't create a SHA256 repository I have found the SHA256 code to be
stable and working as advertised. We should caution people that many
services will not work with SHA256 yet though.

Patrick


* Re: SHA256 support not experimental, or?
  2023-06-30  9:31         ` Patrick Steinhardt
@ 2023-06-30 11:25           ` Adam Majer
  2023-06-30 11:38             ` Patrick Steinhardt
  2023-06-30 12:20             ` Son Luong Ngoc
  0 siblings, 2 replies; 21+ messages in thread
From: Adam Majer @ 2023-06-30 11:25 UTC (permalink / raw)
  To: Patrick Steinhardt, brian m. carlson, Junio C Hamano, git

On 6/30/23 11:31, Patrick Steinhardt wrote:
> Indeed, supporting SHA256 is a major effort on our side at GitLab. Most
> of the work isn't really adapting our production code, but it's rather
> that tons of tests were written with seed repositories and hardcoded
> object hashes. Converting all of that isn't all that hard in the general
> case, but it's a tedious job.

Hi!

This actually reminds me of a funny story from my side.

Earlier this year, I was testing various frontends and how they would
handle SHA256 repositories. All of them failed, which is not surprising.
I even managed to lock myself out of GitLab by importing a SHA256 private
repo into my home project: every time this project became visible, it
would result in Error 500 from the UI. As of a few weeks ago, this appears
to be fixed. The UI is still broken, so you can't see anything in a sha256
repository, but at least I was able to delete the project.

The repository was correctly imported and I could clone from GitLab, so
the problem is mostly "just" UI. :-)

The most likely frontend we'll use for our internal project is Gitea. 
The sha256 support is in progress

https://github.com/go-gitea/gitea/pull/23894

From the size of this patch, you can see how ingrained the SHA1 assumption
was. Most of the patch is just to remove the hardcoded elements,
including hardcoded SHA1 empty-tree hashes and the assumption that 20 bytes
is enough to hold a hash. And I didn't even add sha256 test cases...
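
The two hardcoded assumptions mentioned above can be illustrated with a
minimal Python sketch. It uses only the standard hashlib module; the helper
name empty_tree_id and the RAW_SIZE table are made up for illustration and
are not taken from Gitea or Git. It relies on the fact that Git names an
object by hashing a "<type> <size>\0" header followed by the content, so the
empty tree is the hash of "tree 0\0" alone.

```python
import hashlib

# Raw digest sizes in bytes for the two object formats Git supports.
RAW_SIZE = {"sha1": 20, "sha256": 32}

def empty_tree_id(algo: str) -> str:
    """Return the hex object ID of the empty tree for the given algorithm.

    Hypothetical helper for illustration: an empty tree has no content,
    so its ID is the hash of the bare object header "tree 0\\0".
    """
    return hashlib.new(algo, b"tree 0\0").hexdigest()

for algo, raw_size in RAW_SIZE.items():
    oid = empty_tree_id(algo)
    # A hex object ID is twice as long as the raw digest.
    assert len(oid) == 2 * raw_size
    print(algo, raw_size, oid)
```

A tool that stores the SHA-1 value as a constant, or sizes its buffers for
20 raw bytes, will break on a SHA-256 repository; deriving both from the
repository's object format avoids that.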

But I have to say that on at least one occasion, people have brought up
the experimental nature of git's sha256 support (per current wording) as
a reason not to make their tools sha256 compliant.

> In any case I'm fully supportive of relaxing the current warning. Except
> for the recently discussed edge case where cloning empty repositories
> didn't create a SHA256 repository I have found the SHA256 code to be
> stable and working as advertised. We should caution people that many
> services will not work with SHA256 yet though.

That is exactly true. But this is also a chicken-and-egg problem. Services
are not adapted for sha256 repositories because there is simply no demand
for them. Only when people start using sha256 repos will some demand be
generated.

- Adam

* Re: SHA256 support not experimental, or?
  2023-06-30 11:25           ` Adam Majer
@ 2023-06-30 11:38             ` Patrick Steinhardt
  2023-06-30 12:20             ` Son Luong Ngoc
  1 sibling, 0 replies; 21+ messages in thread
From: Patrick Steinhardt @ 2023-06-30 11:38 UTC (permalink / raw)
  To: Adam Majer; +Cc: brian m. carlson, Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 3981 bytes --]

On Fri, Jun 30, 2023 at 01:25:06PM +0200, Adam Majer wrote:
> On 6/30/23 11:31, Patrick Steinhardt wrote:
> > Indeed, supporting SHA256 is a major effort on our side at GitLab. Most
> > of the work isn't really adapting our production code, but it's rather
> > that tons of tests were written with seed repositories and hardcoded
> > object hashes. Converting all of that isn't all that hard in the general
> > case, but it's a tedious job.
> 
> Hi!
> 
> This actually reminds me of a funny story from my side.
> 
> Earlier this year, I was testing various frontends and how they would handle
> SHA256 repositories. All of them failed, not surprising. I even managed to
> lock myself out of Gitlab by importing a SHA256 private repo into my home
> project -- every time this project became visible, it would result in Error
> 500 from the UI. Today (few weeks ago), this appears to be fixed -- the UI
> is just broken, so you can't see anything in sha256 repository, but at least
> I was able to delete the project.

Yeah, things gradually start to work. It's kind of satisfying to see how
more and more things start to fall into place.

> The repository was correctly imported and I could clone from gitlab, so the
> problem is mostly "just" UI. :-)

The UI is significantly broken right now, mostly because the request
routing logic still has a maximum object ID length of 40 characters
hardcoded. So indeed, most of the stuff in the UI doesn't work unless
you do a few changes in the frontend. I should probably just create the
merge request to fix these as I already have those changes available
locally anyway.
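
As a hedged illustration of the kind of change involved (this is not
GitLab's actual routing code, and the names OID_HEX_LENGTHS and
is_full_object_id are hypothetical), a check that hardcodes 40 hex
characters can instead take its length from the repository's object
format:

```python
import re

# Hex lengths of full object IDs: 40 for SHA-1, 64 for SHA-256.
OID_HEX_LENGTHS = {"sha1": 40, "sha256": 64}

def is_full_object_id(value: str, object_format: str) -> bool:
    """Check that value is a full lowercase-hex object ID for object_format.

    Hypothetical helper: the expected length is looked up per object
    format instead of being hardcoded to 40 characters.
    """
    length = OID_HEX_LENGTHS[object_format]
    return re.fullmatch(r"[0-9a-f]{%d}" % length, value) is not None

print(is_full_object_id("0" * 40, "sha1"))    # 40 hex digits, SHA-1 repository
print(is_full_object_id("0" * 64, "sha256"))  # 64 hex digits, SHA-256 repository
print(is_full_object_id("0" * 64, "sha1"))    # too long for a SHA-1 repository
```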

But there's other parts that are in the Gitaly backend that don't yet
work. There's some RPCs that parse object IDs, but still use the
hardcoded SHA1 hash. Updating them is trivial, but as mentioned updating
their tests is tedious work.

> The most likely frontend we'll use for our internal project is Gitea. The
> sha256 support is in progress
> 
> https://github.com/go-gitea/gitea/pull/23894
> 
> From the size of this patch, you can see how ingrained SHA1 assumption was.
> Most of the patch is just to remove the hardcoded elements, including
> hardcoded SHA1 empty-tree hashes and assumption that 20 bytes is enough to
> hold a hash. And I didn't even add sha256 test cases...

I guess most projects that started a long time ago made the same error
of taking SHA1 for granted, so they didn't bother writing either the
production code or the tests with swapping out the object format in
mind. I guess we've learned our lesson here, which also means that the
next transition (if there ever will be one) should go a lot faster as
the codebases should be prepared then.

> But I have to say that in at least one occasion, people are bringing up the
> experimental nature of git's sha256 support (per current wording) as reason
> not to make their tools sha256 compliant.

Yeah, it's this chicken-and-egg problem. Things are experimental as most
tools ain't got support, but because most things ain't got support we
never get any testers and thus are stuck in that state.

> > In any case I'm fully supportive of relaxing the current warning. Except
> > for the recently discussed edge case where cloning empty repositories
> > didn't create a SHA256 repository I have found the SHA256 code to be
> > stable and working as advertised. We should caution people that many
> > services will not work with SHA256 yet though.
> 
> That is exactly true. But this is also chicken-egg problem. Services are not
> adapted for sha256 repositories because there is simply no demand for them.
> Only when people will start using sha256 repos, will there be some demand
> generated.

Yup, and that is why I have been pushing for SHA256 support internally
at GitLab for quite a while -- our efforts here started almost exactly a
year ago, but have gained more steam in recent months.

Patrick


* Re: SHA256 support not experimental, or?
  2023-06-30 11:25           ` Adam Majer
  2023-06-30 11:38             ` Patrick Steinhardt
@ 2023-06-30 12:20             ` Son Luong Ngoc
  2023-06-30 16:45               ` Junio C Hamano
  1 sibling, 1 reply; 21+ messages in thread
From: Son Luong Ngoc @ 2023-06-30 12:20 UTC (permalink / raw)
  To: Adam Majer; +Cc: Patrick Steinhardt, brian m. carlson, Junio C Hamano, git

Hi,

> On 30 Jun 2023, at 13:25, Adam Majer <adamm@zombino.com> wrote:...
> On 6/30/23 11:31, Patrick Steinhardt wrote:
...
> > In any case I'm fully supportive of relaxing the current warning. Except
> > for the recently discussed edge case where cloning empty repositories
> > didn't create a SHA256 repository I have found the SHA256 code to be
> > stable and working as advertised. We should caution people that many
> > services will not work with SHA256 yet though.
>
> That is exactly true. But this is also chicken-egg problem. Services are not adapted for sha256 repositories because there is simply no demand for them. Only when people will start using sha256 repos, will there be some demand generated.

FWIW, in the Bazel ecosystem where SHA256 is very popular, there has
been an increasing appetite for a FUSE file system to lazily fetch contents
of a git repository.

Build tools such as Bazel would often need to hash the content of the
source files to build a dependency graph.  And in a FUSE setup, it would
be ideal if the FUSE server could supply the hash via an xattr, so that
the FUSE client does not need to fetch the whole file content, only the
metadata.

Most tools in this space (Bazel, Buck2) are using SHA256 and are exploring
faster hashes such as Blake3, Aegis, KangarooTwelve for larger file
support.  As these mature build tools gain popularity, so will the usage
of SHA256 (and newer hash algorithms).

Another point I think might help motivate different forges to
move would be switching from the object's hash to a digest (hash and
file size).  The additional file size information would help tremendously
in predicting compute resources when serving files of a repository.

So I think Git would simply need a bit more time for these related
ecosystems to reach a critical mass and help fuel the transition to a
<new-hasher>.

> - Adam

Regards,
Son Luong.

References:

- https://buck2.build/docs/rfcs/drafts/digest-kinds/#use-cases
- https://github.com/bazelbuild/bazel/pull/18784

* Re: SHA256 support not experimental, or?
  2023-06-30 12:20             ` Son Luong Ngoc
@ 2023-06-30 16:45               ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2023-06-30 16:45 UTC (permalink / raw)
  To: Son Luong Ngoc; +Cc: Adam Majer, Patrick Steinhardt, brian m. carlson, git

Son Luong Ngoc <sluongng@gmail.com> writes:

> Build tools such as Bazel would often need to hash the content of the
> source files to build a dependency graph.  And in a FUSE setup, it would
> be ideal if the FUSE server could supply the hash via an xattr, so that
> FUSE client does not need to fetch the whole file content and only the
> metadata.

This is an unrelated tangent, but the implementation of a virtual
filesystem on top of Git's object store will be able to give such a
SHA-256 hash only by computing the hash itself, if the "hash the
content of the source files" has to be exactly SHA-256.  Using a Git
repository that uses SHA-256 would *not* help.

    $ git init --object-format sha256
    $ echo hello | git hash-object --stdin
    2cf8d83d9ee29543b34a87727421fdecb7e3f3a183d337639025de576db9ebb4
    $ echo hello | sha256sum
    5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03  -

This is because the object name used by Git is not the hash of the
content.  It is a hash of an object header (object type and byte
count) followed by its contents.

    $ printf "blob 6\0hello\n" | sha256sum
    2cf8d83d9ee29543b34a87727421fdecb7e3f3a183d337639025de576db9ebb4  -

The build systems can choose to tell the FUSE server to expose the Git
object names via xattr, but if they need to see whether some contents (not
in FUSE) they have on hand are the same as what is stored in the FUSE
server, they need to use the "slightly modified SHA-256" that matches
what Git uses.  They would still be using some hash that has the same
strength as the underlying SHA-256, but it is *not* SHA-256.
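
For a tool that needs to compute this "slightly modified SHA-256" itself,
the shell transcript above translates into a few lines of Python. This is
a minimal sketch using only the standard hashlib module; the function name
git_object_id is hypothetical and is not part of any Git API.

```python
import hashlib

def git_object_id(obj_type: str, content: bytes, algo: str = "sha256") -> str:
    """Hash an object the way Git names it: header first, then content.

    The header is "<type> <byte count>\\0", for example "blob 6\\0".
    """
    header = f"{obj_type} {len(content)}\0".encode("ascii")
    return hashlib.new(algo, header + content).hexdigest()

content = b"hello\n"
# Git-style object name, comparable to `git hash-object --stdin`.
print(git_object_id("blob", content))
# Plain SHA-256 of the content alone, comparable to `sha256sum`.
print(hashlib.sha256(content).hexdigest())
```

For the input "hello\n" the two printed values should match the
`git hash-object` and `sha256sum` results shown in the transcript above.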


* Re: SHA256 support not experimental, or?
  2023-06-28 16:28 SHA256 support not experimental, or? Adam Majer
  2023-06-29  1:59 ` brian m. carlson
  2023-06-29  5:59 ` Junio C Hamano
@ 2023-07-20 15:07 ` Adam Majer
  2023-07-20 18:18   ` Junio C Hamano
  2 siblings, 1 reply; 21+ messages in thread
From: Adam Majer @ 2023-07-20 15:07 UTC (permalink / raw)
  To: git

I'll try again with an inline patch. I think it wasn't picked up since it
was MIME-encoded by the mail client.

- Adam


From 90be51143e741053390810720ba4a639c3b0b74c Mon Sep 17 00:00:00 2001
From: Adam Majer <adamm@zombino.com>
Date: Wed, 28 Jun 2023 14:46:02 +0200
Subject: [PATCH] doc: sha256 is no longer experimental

The purpose of this patch is to remove scary wording that basically
stops people using sha256 repositories not because of interoperability
issues with sha1 repositories, but from fear that their work will
suddenly become incompatible in some future version of git.

We should be clear that currently sha256 repositories will not work with
sha1 repositories but stop the scary words.

Signed-off-by: Adam Majer <adamm@zombino.com>
---
 Documentation/git.txt                      | 4 ++--
 Documentation/object-format-disclaimer.txt | 8 ++------
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index f0cafa2290..666dbdb55c 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -553,8 +553,8 @@ double-quotes and respecting backslash escapes. E.g., the value
 	If this variable is set, the default hash algorithm for new
 	repositories will be set to this value. This value is
 	ignored when cloning and the setting of the remote repository
-	is always used. The default is "sha1". THIS VARIABLE IS
-	EXPERIMENTAL! See `--object-format` in linkgit:git-init[1].
+	is always used. The default is "sha1". See `--object-format`
+	in linkgit:git-init[1].
 
 Git Commits
 ~~~~~~~~~~~
diff --git a/Documentation/object-format-disclaimer.txt b/Documentation/object-format-disclaimer.txt
index 4cb106f0d1..1e976688be 100644
--- a/Documentation/object-format-disclaimer.txt
+++ b/Documentation/object-format-disclaimer.txt
@@ -1,6 +1,2 @@
-THIS OPTION IS EXPERIMENTAL! SHA-256 support is experimental and still
-in an early stage.  A SHA-256 repository will in general not be able to
-share work with "regular" SHA-1 repositories.  It should be assumed
-that, e.g., Git internal file formats in relation to SHA-256
-repositories may change in backwards-incompatible ways.  Only use
-`--object-format=sha256` for testing purposes.
+Note: SHA-256 repositories currently will not be able to share work
+with "regular" SHA-1 repositories.
-- 
2.41.0



* Re: SHA256 support not experimental, or?
  2023-07-20 15:07 ` Adam Majer
@ 2023-07-20 18:18   ` Junio C Hamano
  2023-07-26 16:44     ` Junio C Hamano
  2023-07-31 13:38     ` Adam Majer
  0 siblings, 2 replies; 21+ messages in thread
From: Junio C Hamano @ 2023-07-20 18:18 UTC (permalink / raw)
  To: Adam Majer; +Cc: git

Adam Majer <adamm@zombino.com> writes:

> I'll try again with an inline patch. I think it wasn't picked up since it
> was MIME-encoded by the mail client.
>
> - Adam
>
>
> From 90be51143e741053390810720ba4a639c3b0b74c Mon Sep 17 00:00:00 2001

Remove all the above lines (including the "From <commit object
name>").  If you want to add a note that should not be recorded in
the message of the resulting commit, write it _after_ the three-dash
line after your sign-off.

> From: Adam Majer <adamm@zombino.com>
> Date: Wed, 28 Jun 2023 14:46:02 +0200
> Subject: [PATCH] doc: sha256 is no longer experimental

It is not technically incorrect to have these three lines here, but
when you are presenting your own work, it is preferable to do
without them.  The "From:" address line and "Subject:" text line do
not have to be here---most people should be able to make the
corresponding e-mail headers have the values they want to use,
and while the above "Date:" might be the time you wrote the commit,
it is way earlier than the time the contents of the commit were
presented for consideration to the general public, which is recorded
in the e-mail header of the message you are sending.

So, the body of the message usually should start from here (below).

In general, please follow [[describe-changes]] part of the
Documentation/SubmittingPatches document, and also "git log
--no-merges" of recent contributions by others.  "The purpose of
this patch is" is not how we usually talk about our work.

> The purpose of this patch is to remove scary wording that basically
> stops people using sha256 repositories not because of interoperability
> issues with sha1 repositories, but from fear that their work will
> suddenly become incompatible in some future version of git.
>
> We should be clear that currently sha256 repositories will not work with
> sha1 repositories but stop the scary words.
>
> Signed-off-by: Adam Majer <adamm@zombino.com>
> ---
>  Documentation/git.txt                      | 4 ++--
>  Documentation/object-format-disclaimer.txt | 8 ++------
>  2 files changed, 4 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git.txt b/Documentation/git.txt
> index f0cafa2290..666dbdb55c 100644
> --- a/Documentation/git.txt
> +++ b/Documentation/git.txt
> @@ -553,8 +553,8 @@ double-quotes and respecting backslash escapes. E.g., the value
>  	If this variable is set, the default hash algorithm for new
>  	repositories will be set to this value. This value is
>  	ignored when cloning and the setting of the remote repository
> -	is always used. The default is "sha1". THIS VARIABLE IS
> -	EXPERIMENTAL! See `--object-format` in linkgit:git-init[1].
> +	is always used. The default is "sha1". See `--object-format`
> +	in linkgit:git-init[1].

This side looks OK (just removing the single sentence).

>  Git Commits
>  ~~~~~~~~~~~
> diff --git a/Documentation/object-format-disclaimer.txt b/Documentation/object-format-disclaimer.txt
> index 4cb106f0d1..1e976688be 100644
> --- a/Documentation/object-format-disclaimer.txt
> +++ b/Documentation/object-format-disclaimer.txt
> @@ -1,6 +1,2 @@
> -THIS OPTION IS EXPERIMENTAL! SHA-256 support is experimental and still
> -in an early stage.  A SHA-256 repository will in general not be able to
> -share work with "regular" SHA-1 repositories.  It should be assumed
> -that, e.g., Git internal file formats in relation to SHA-256
> -repositories may change in backwards-incompatible ways.  Only use
> -`--object-format=sha256` for testing purposes.
> +Note: SHA-256 repositories currently will not be able to share work
> +with "regular" SHA-1 repositories.

The original did not have this problem because it had enough
surrounding context, but the updated text now risks getting misread
as if there are "regular" and "special" SHA-1 repositories, the
latter of which might work better with SHA-256.

And the message about SHA-256's non-experimental status can probably
be a lot stronger, after the discussion we had recently.  How about
saying something like:

    Note: there is no interoperability between SHA-256 repositories
    and SHA-1 repositories right now.  We historically warned that
    SHA-256 repositories may need backward incompatible changes
    later when we introduce such interoperability features, but at
    this point we do not expect that we need to make such a change
    when we do so, and the users can expect that their SHA-256
    repositories they create with today's Git will be usable by
    future versions of Git without losing information.

which would probably be much closer to what you wanted to hear?

Thanks.


* Re: SHA256 support not experimental, or?
  2023-07-20 18:18   ` Junio C Hamano
@ 2023-07-26 16:44     ` Junio C Hamano
  2023-07-31 13:38     ` Adam Majer
  1 sibling, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2023-07-26 16:44 UTC (permalink / raw)
  To: Adam Majer; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Adam Majer <adamm@zombino.com> writes:
>
>> I'll try again with an inline patch.
>>
>> From 90be51143e741053390810720ba4a639c3b0b74c Mon Sep 17 00:00:00 2001
>
> Remove all the above lines (including the "From <commit object
> ...
>> Signed-off-by: Adam Majer <adamm@zombino.com>
>> ---
>>  Documentation/git.txt                      | 4 ++--
>>  Documentation/object-format-disclaimer.txt | 8 ++------
>>  2 files changed, 4 insertions(+), 8 deletions(-)
>> ...
> This side looks OK (just removing the single sentence).
>
>>  Git Commits
>>  ~~~~~~~~~~~
>> diff --git a/Documentation/object-format-disclaimer.txt b/Documentation/object-format-disclaimer.txt
>> index 4cb106f0d1..1e976688be 100644
>> --- a/Documentation/object-format-disclaimer.txt
>> +++ b/Documentation/object-format-disclaimer.txt
>> @@ -1,6 +1,2 @@
>> ...
>
> The original did not have this problem because it had enough
> surrounding context, but the updated text now risks getting misread
> as if there are "regular" and "special" SHA-1 repositories, the
> latter of which might work better with SHA-256.
>
> And the message about SHA-256's non-experimental status can probably
> be a lot stronger, after the discussion we had recently.  How about
> saying something like:
>
>     Note: there is no interoperability between SHA-256 repositories
>     and SHA-1 repositories right now.  We historically warned that
>     SHA-256 repositories may need backward incompatible changes
>     later when we introduce such interoperability features, but at
>     this point we do not expect that we need to make such a change
>     when we do so, and the users can expect that their SHA-256
>     repositories they create with today's Git will be usable by
>     future versions of Git without losing information.
>
> which would probably be much closer to what you wanted to hear?

It has been a week.  Any news on this topic?

Thanks.


* Re: SHA256 support not experimental, or?
  2023-07-20 18:18   ` Junio C Hamano
  2023-07-26 16:44     ` Junio C Hamano
@ 2023-07-31 13:38     ` Adam Majer
  2023-07-31 13:42       ` [PATCH] doc: sha256 is no longer experimental Adam Majer
  1 sibling, 1 reply; 21+ messages in thread
From: Adam Majer @ 2023-07-31 13:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 7/20/23 20:18, Junio C Hamano wrote:
> Adam Majer <adamm@zombino.com> writes:
> 
>>  From 90be51143e741053390810720ba4a639c3b0b74c Mon Sep 17 00:00:00 2001
> 
> Remove all the above lines (including the "From <commit object
> name>").  If you want to add a note that should not be recorded in
> the message of the resulting commit, write it _after_ the three-dash
> line after your sign-off.

Will do. I think the problem was running `git format-patch` and then
pasting its output inline instead of using it as the basis of an email.

I will try again.

> So, the body of the message usually should start from here (below).

+1

> In general, please follow [[describe-changes]] part of the
> Documentation/SubmittingPatches document, and also "git log
> --no-merges" of recent contributions by others.  "The purpose of
> this patch is" is not how we usually talk about our work.

+1

> And the message about SHA-256's non-experimental status can probably
> be a lot stronger, after the discussion we had recently.  How about
> saying something like:
> 
>      Note: there is no interoperability between SHA-256 repositories
>      and SHA-1 repositories right now.  We historically warned that
>      SHA-256 repositories may need backward incompatible changes
>      later when we introduce such interoperability features, but at
>      this point we do not expect that we need to make such a change
>      when we do so, and the users can expect that their SHA-256
>      repositories they create with today's Git will be usable by
>      future versions of Git without losing information.
> 
> which would probably be much closer to what you wanted to hear?

Thanks, I've included the additional context now, rebased on top of the
next branch, and will send it as a reply to this message.

- Adam


* [PATCH] doc: sha256 is no longer experimental
  2023-07-31 13:38     ` Adam Majer
@ 2023-07-31 13:42       ` Adam Majer
  2023-07-31 16:01         ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Adam Majer @ 2023-07-31 13:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Remove scary wording that basically stops people using sha256
repositories not because of interoperability issues with sha1
repositories, but from fear that their work will suddenly become
incompatible in some future version of git.

We should be clear that currently sha256 repositories will not work with
sha1 repositories but stop the scary words.

Signed-off-by: Adam Majer <adamm@zombino.com>
---
 Documentation/git.txt                      |  4 ++--
 Documentation/object-format-disclaimer.txt | 15 +++++++++------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index f0cafa2290..11228956cd 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -553,8 +553,8 @@ double-quotes and respecting backslash escapes. E.g., the value
 	If this variable is set, the default hash algorithm for new
 	repositories will be set to this value. This value is
 	ignored when cloning and the setting of the remote repository
-	is always used. The default is "sha1". THIS VARIABLE IS
-	EXPERIMENTAL! See `--object-format` in linkgit:git-init[1].
+	is always used. The default is "sha1".
+	See `--object-format` in linkgit:git-init[1].
 
 Git Commits
 ~~~~~~~~~~~
diff --git a/Documentation/object-format-disclaimer.txt b/Documentation/object-format-disclaimer.txt
index 4cb106f0d1..359f393ec9 100644
--- a/Documentation/object-format-disclaimer.txt
+++ b/Documentation/object-format-disclaimer.txt
@@ -1,6 +1,9 @@
-THIS OPTION IS EXPERIMENTAL! SHA-256 support is experimental and still
-in an early stage.  A SHA-256 repository will in general not be able to
-share work with "regular" SHA-1 repositories.  It should be assumed
-that, e.g., Git internal file formats in relation to SHA-256
-repositories may change in backwards-incompatible ways.  Only use
-`--object-format=sha256` for testing purposes.
+Note: At present, there is no interoperability between SHA-256
+repositories and SHA-1 repositories.
+
+Historically, we warned that SHA-256 repositories may later need
+backward incompatible changes when we introduce such interoperability
+features. Today, we only expect compatible changes. Furthermore, if such
+changes prove to be necessary, it can expected that SHA-256 repositories
+created with today's Git will be usable by future versions of Git
+without data loss.
-- 
2.41.0



* Re: [PATCH] doc: sha256 is no longer experimental
  2023-07-31 13:42       ` [PATCH] doc: sha256 is no longer experimental Adam Majer
@ 2023-07-31 16:01         ` Junio C Hamano
  2023-07-31 16:44           ` Adam Majer
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2023-07-31 16:01 UTC (permalink / raw)
  To: Adam Majer; +Cc: git

Adam Majer <adamm@zombino.com> writes:

> +Note: At present, there is no interoperability between SHA-256
> +repositories and SHA-1 repositories.
> +
> +Historically, we warned that SHA-256 repositories may later need
> +backward incompatible changes when we introduce such interoperability
> +features. Today, we only expect compatible changes. Furthermore, if such
> +changes prove to be necessary, it can expected that SHA-256 repositories

"can BE expected" (will tweak locally while queueing; no need to resend).

> +created with today's Git will be usable by future versions of Git
> +without data loss.

Will queue.  Thanks!


* Re: [PATCH] doc: sha256 is no longer experimental
  2023-07-31 16:01         ` Junio C Hamano
@ 2023-07-31 16:44           ` Adam Majer
  0 siblings, 0 replies; 21+ messages in thread
From: Adam Majer @ 2023-07-31 16:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On July 31, 2023 6:01:10 p.m. GMT+02:00, Junio C Hamano <gitster@pobox.com> wrote:
>Adam Majer <adamm@zombino.com> writes:
>> +changes prove to be necessary, it can expected that SHA-256 repositories
>
>"can BE expected" (will tweak locally while queueing; no need to resend).

+1

>
>Will queue.  Thanks!

Thanks!

-Adam

