git.vger.kernel.org archive mirror
* Re: [PATCH 1/6] doc hash-function-transition: fix asciidoc output
       [not found] ` <3efe3392e9de6d4446665a8e6ae5a06b86bdccae.1612093734.git.gitgitgadget@gmail.com>
@ 2021-01-31 20:23   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-31 20:23 UTC (permalink / raw)
  To: Thomas Ackermann via GitGitGadget; +Cc: git, Junio C Hamano, Thomas Ackermann

On Sun, Jan 31 2021, Thomas Ackermann via GitGitGadget wrote:

> From: Thomas Ackermann <th.acker@arcor.de>
>
> fix asciidoc output for lists, special characters and verbatim text while retaining the readabilty of the original text file

This goes for the whole series: commit messages should use full
sentences, so start with a capital letter. Also word-wrap them; see
Documentation/SubmittingPatches.
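
The two points above can be checked mechanically. What follows is only a minimal sketch: the function name is hypothetical and the 72-column limit is an assumed convention, not a value taken from this thread.

```python
def check_commit_body(body, max_width=72):
    """Return a list of style problems found in a commit message body.

    max_width=72 is an assumption for illustration; adjust to taste.
    """
    problems = []
    stripped = body.strip()
    # A full sentence should start with a capital letter.
    if stripped and not stripped[0].isupper():
        problems.append("body should start with a capital letter")
    # Every line should fit within the wrap width.
    for number, line in enumerate(body.splitlines(), start=1):
        if len(line) > max_width:
            problems.append(
                f"line {number} is {len(line)} columns wide (limit {max_width})"
            )
    return problems
```

Run against the one-line v1 message quoted above, this reports both a lowercase start and an over-long line.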

It would also help if there was some detail about how this "fixes" the
output, e.g. is "---" in the middle of a paragraph magically converted
somehow but "--" isn't?
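
A rough illustration of the distinction being asked about, assuming the default asciidoc replacement only applies to a double hyphen with a space on each side and turns it into an em dash with thin spaces around it. This imitates that rule for the mid-sentence case; it is not asciidoc's actual code, and the function name is made up.

```python
EM_DASH = "\u2014"
THIN_SPACE = "\u2009"


def replace_dashes(text):
    """Imitate the assumed replacement of ' -- ' inside a line of text."""
    # " --- " never contains the substring " -- ", so a triple hyphen
    # is left alone and shows up literally in the rendered output.
    return text.replace(" -- ", THIN_SPACE + EM_DASH + THIN_SPACE)
```

Under that assumption, "content -- both" gets a typographic dash while "content --- both" keeps its three literal hyphens.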

> Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
> ---
>  .../technical/hash-function-transition.txt    | 81 +++++++++++--------
>  1 file changed, 46 insertions(+), 35 deletions(-)
>
> diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
> index 6fd20ebbc25..4b04829d537 100644
> --- a/Documentation/technical/hash-function-transition.txt
> +++ b/Documentation/technical/hash-function-transition.txt
> @@ -94,7 +94,7 @@ Overview
>  --------
>  We introduce a new repository format extension. Repositories with this
>  extension enabled use SHA-256 instead of SHA-1 to name their objects.
> -This affects both object names and object content --- both the names
> +This affects both object names and object content -- both the names
>  of objects and all references to other objects within an object are
>  switched to the new hash function.
>  
> @@ -191,21 +191,21 @@ hash functions. They have the following format (all integers are in
>  network byte order):
>  
>  - A header appears at the beginning and consists of the following:
> -  - The 4-byte pack index signature: '\377t0c'
> -  - 4-byte version number: 3
> -  - 4-byte length of the header section, including the signature and
> +  * The 4-byte pack index signature: '\377t0c'
> +  * 4-byte version number: 3
> +  * 4-byte length of the header section, including the signature and
>      version number
> -  - 4-byte number of objects contained in the pack
> -  - 4-byte number of object formats in this pack index: 2
> -  - For each object format:
> -    - 4-byte format identifier (e.g., 'sha1' for SHA-1)
> -    - 4-byte length in bytes of shortened object names. This is the
> +  * 4-byte number of objects contained in the pack
> +  * 4-byte number of object formats in this pack index: 2
> +  * For each object format:
> +    ** 4-byte format identifier (e.g., 'sha1' for SHA-1)
> +    ** 4-byte length in bytes of shortened object names. This is the
>        shortest possible length needed to make names in the shortened
>        object name table unambiguous.
> -    - 4-byte integer, recording where tables relating to this format
> +    ** 4-byte integer, recording where tables relating to this format
>        are stored in this index file, as an offset from the beginning.
> -  - 4-byte offset to the trailer from the beginning of this file.
> -  - Zero or more additional key/value pairs (4-byte key, 4-byte
> +  * 4-byte offset to the trailer from the beginning of this file.
> +  * Zero or more additional key/value pairs (4-byte key, 4-byte
>      value). Only one key is supported: 'PSRC'. See the "Loose objects
>      and unreachable objects" section for supported values and how this
>      is used.  All other keys are reserved. Readers must ignore
> @@ -213,37 +213,36 @@ network byte order):
>  - Zero or more NUL bytes. This can optionally be used to improve the
>    alignment of the full object name table below.
>  - Tables for the first object format:
> -  - A sorted table of shortened object names.  These are prefixes of
> +  * A sorted table of shortened object names.  These are prefixes of
>      the names of all objects in this pack file, packed together
>      without offset values to reduce the cache footprint of the binary
>      search for a specific object name.
>  
> -  - A table of full object names in pack order. This allows resolving
> +  * A table of full object names in pack order. This allows resolving
>      a reference to "the nth object in the pack file" (from a
>      reachability bitmap or from the next table of another object
>      format) to its object name.
>  
> -  - A table of 4-byte values mapping object name order to pack order.
> +  * A table of 4-byte values mapping object name order to pack order.
>      For an object in the table of sorted shortened object names, the
>      value at the corresponding index in this table is the index in the
>      previous table for that same object.
> -
>      This can be used to look up the object in reachability bitmaps or
>      to look up its name in another object format.
>  
> -  - A table of 4-byte CRC32 values of the packed object data, in the
> +  * A table of 4-byte CRC32 values of the packed object data, in the
>      order that the objects appear in the pack file. This is to allow
>      compressed data to be copied directly from pack to pack during
>      repacking without undetected data corruption.
>  
> -  - A table of 4-byte offset values. For an object in the table of
> +  * A table of 4-byte offset values. For an object in the table of
>      sorted shortened object names, the value at the corresponding
>      index in this table indicates where that object can be found in
>      the pack file. These are usually 31-bit pack file offsets, but
>      large offsets are encoded as an index into the next table with the
>      most significant bit set.
>  
> -  - A table of 8-byte offset entries (empty for pack files less than
> +  * A table of 8-byte offset entries (empty for pack files less than
>      2 GiB). Pack files are organized with heavily used objects toward
>      the front, so most object references should not need to refer to
>      this table.
> @@ -252,10 +251,10 @@ network byte order):
>    up to and not including the table of CRC32 values.
>  - Zero or more NUL bytes.
>  - The trailer consists of the following:
> -  - A copy of the 20-byte SHA-256 checksum at the end of the
> +  * A copy of the 20-byte SHA-256 checksum at the end of the
>      corresponding packfile.
>  
> -  - 20-byte SHA-256 checksum of all of the above.
> +  * 20-byte SHA-256 checksum of all of the above.
>  
>  Loose object index
>  ~~~~~~~~~~~~~~~~~~
> @@ -350,8 +349,8 @@ the following steps:
>     they will be discarded.)
>  3. convert to sha256: open a new (sha256) packfile. Read the topologically
>     sorted list just generated. For each object, inflate its
> -   sha1-content, convert to sha256-content, and write it to the sha256
> -   pack. Record the new sha1<->sha256 mapping entry for use in the idx.
> +   SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
> +   pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
>  4. sort: reorder entries in the new pack to match the order of objects
>     in the pack the server generated and include blobs. Write a sha256 idx
>     file
> @@ -391,6 +390,7 @@ existing "gpgsig" field. Its signed payload is the sha256-content of the
>  commit object with any "gpgsig" and "gpgsig-sha256" fields removed.
>  
>  This means commits can be signed
> +
>  1. using SHA-1 only, as in existing signed commit objects
>  2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
>     fields.
> @@ -408,6 +408,7 @@ sha256-content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
>  SIGNATURE-----" delimited in-body signature removed.
>  
>  This means tags can be signed
> +
>  1. using SHA-1 only, as in existing signed tag objects
>  2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
>     signature.
> @@ -598,7 +599,7 @@ The user can also explicitly specify which format to use for a
>  particular revision specifier and for output, overriding the mode. For
>  example:
>  
> -git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
> +    git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
>  
>  Choice of Hash
>  --------------
> @@ -636,6 +637,7 @@ We choose SHA-256.
>  Transition plan
>  ---------------
>  Some initial steps can be implemented independently of one another:
> +
>  - adding a hash function API (vtable)
>  - teaching fsck to tolerate the gpgsig-sha256 field
>  - excluding gpgsig-* from the fields copied by "git commit --amend"
> @@ -647,9 +649,9 @@ Some initial steps can be implemented independently of one another:
>  - introducing index v3
>  - adding support for the PSRC field and safer object pruning
>  
> -
>  The first user-visible change is the introduction of the objectFormat
>  extension (without compatObjectFormat). This requires:
> +
>  - teaching fsck about this mode of operation
>  - using the hash function API (vtable) when computing object names
>  - signing objects and verifying signatures
> @@ -657,6 +659,7 @@ extension (without compatObjectFormat). This requires:
>    repository
>  
>  Next comes introduction of compatObjectFormat:
> +
>  - implementing the loose-object-idx
>  - translating object names between object formats
>  - translating object content between object formats
> @@ -669,6 +672,7 @@ Next comes introduction of compatObjectFormat:
>    "Object names on the command line" above)
>  
>  The next step is supporting fetches and pushes to SHA-1 repositories:
> +
>  - allow pushes to a repository using the compat format
>  - generate a topologically sorted list of the SHA-1 names of fetched
>    objects
> @@ -734,6 +738,7 @@ Using hash functions in parallel
>  Objects newly created would be addressed by the new hash, but inside
>  such an object (e.g. commit) it is still possible to address objects
>  using the old hash function.
> +
>  * You cannot trust its history (needed for bisectability) in the
>    future without further work
>  * Maintenance burden as the number of supported hash functions grows
> @@ -749,6 +754,7 @@ sha1-content based signatures.
>  
>  In other words, a single signature was used to attest to the object
>  content using both hash functions. This had some advantages:
> +
>  * Using one signature instead of two speeds up the signing process.
>  * Having one signed payload with both hashes allows the signer to
>    attest to the sha1-name and sha256-name referring to the same object.
> @@ -756,6 +762,7 @@ content using both hash functions. This had some advantages:
>    to be detected quickly using current versions of git.
>  
>  However, it also came with disadvantages:
> +
>  * Verifying a signed object requires access to the sha1-names of all
>    objects it references, even after the transition is complete and
>    translation table is no longer needed for anything else. To support
> @@ -782,16 +789,17 @@ Document History
>  bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
>  sbeller@google.com
>  
> -Initial version sent to
> -http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
> +* Initial version sent to http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
>  
>  2017-03-03 jrnieder@gmail.com
>  Incorporated suggestions from jonathantanmy and sbeller:
> +
>  * describe purpose of signed objects with each hash type
>  * redefine signed object verification using object content under the
>    first hash function
>  
>  2017-03-06 jrnieder@gmail.com
> +
>  * Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
>  * Make sha3-based signatures a separate field, avoiding the need for
>    "hash" and "nohash" fields (thanks to peff[3]).
> @@ -805,6 +813,7 @@ Incorporated suggestions from jonathantanmy and sbeller:
>    especially Junio).
>  
>  2017-09-27 jrnieder@gmail.com, sbeller@google.com
> +
>  * use placeholder NewHash instead of SHA3-256
>  * describe criteria for picking a hash function.
>  * include a transition plan (thanks especially to Brandon Williams
> @@ -816,12 +825,14 @@ Incorporated suggestions from jonathantanmy and sbeller:
>  
>  Later history:
>  
> - See the history of this file in git.git for the history of subsequent
> - edits. This document history is no longer being maintained as it
> - would now be superfluous to the commit log
> +* See the history of this file in git.git for the history of subsequent
> +  edits. This document history is no longer being maintained as it
> +  would now be superfluous to the commit log
> +
> +References:
>  
> -[1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
> -[2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
> -[3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
> -[4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
> -[5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
> + [1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
> + [2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
> + [3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
> + [4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
> + [5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
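
The pack index header quoted in the patch above can be sketched as follows. This is a minimal illustration of the field layout only (all integers in network byte order); the function names are hypothetical, the optional key/value pairs are left out, and the b"s256" identifier in the usage note is an assumption, since the quoted text only names 'sha1'.

```python
import struct

SIGNATURE = b"\377t0c"  # 4-byte pack index signature
VERSION = 3
FIXED = ">4sIIII"       # signature, version, header length, objects, formats
PER_FORMAT = ">4sII"    # format id, shortened name length, tables offset


def pack_header(num_objects, formats, trailer_offset):
    """Serialize the header; formats is a list of (id, short_len, offset)."""
    header_len = (struct.calcsize(FIXED)
                  + struct.calcsize(PER_FORMAT) * len(formats)
                  + 4)  # 4-byte trailer offset; no key/value pairs here
    out = struct.pack(FIXED, SIGNATURE, VERSION, header_len,
                      num_objects, len(formats))
    for ident, short_len, tables_offset in formats:
        out += struct.pack(PER_FORMAT, ident, short_len, tables_offset)
    return out + struct.pack(">I", trailer_offset)


def parse_header(data):
    """Parse the fields written by pack_header()."""
    sig, version, header_len, num_objects, num_formats = struct.unpack_from(
        FIXED, data, 0)
    if sig != SIGNATURE or version != VERSION:
        raise ValueError("not a version 3 pack index header")
    pos = struct.calcsize(FIXED)
    formats = []
    for _ in range(num_formats):
        formats.append(struct.unpack_from(PER_FORMAT, data, pos))
        pos += struct.calcsize(PER_FORMAT)
    (trailer_offset,) = struct.unpack_from(">I", data, pos)
    return {
        "header_len": header_len,
        "num_objects": num_objects,
        "formats": formats,
        "trailer_offset": trailer_offset,
    }
```

For example, `parse_header(pack_header(3, [(b"sha1", 4, 100), (b"s256", 4, 200)], 300))` round-trips the two per-format entries and the trailer offset.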



* Re: [PATCH 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently
       [not found] ` <62ca087d4ebaa5f3a7efba6a2865e89284fcd98d.1612093734.git.gitgitgadget@gmail.com>
@ 2021-01-31 20:24   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-31 20:24 UTC (permalink / raw)
  To: Thomas Ackermann via GitGitGadget; +Cc: git, Junio C Hamano, Thomas Ackermann


On Sun, Jan 31 2021, Thomas Ackermann via GitGitGadget wrote:

> From: Thomas Ackermann <th.acker@arcor.de>
>
> use SHA-1 and SHA-256 instead of sha1 and sha256  when referring to the hash type

Aside from the comment I had on 1/6, this looks good to me.


* Re: [PATCH 4/6] doc hash-function-transition: use https links consistently
       [not found] ` <d4abf1cf78e2e59e49b81bd458d85848bd3d7ff3.1612093734.git.gitgitgadget@gmail.com>
@ 2021-01-31 20:25   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-31 20:25 UTC (permalink / raw)
  To: Thomas Ackermann via GitGitGadget; +Cc: git, Junio C Hamano, Thomas Ackermann


On Sun, Jan 31 2021, Thomas Ackermann via GitGitGadget wrote:

> From: Thomas Ackermann <th.acker@arcor.de>
>
> use only https links in References

Per my grepping this leaves just 2 more such links, in
t/t0021-conversion.sh, in the whole source tree. Might as well convert
them while we're at it...
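
A search-and-convert pass of this kind could look like the sketch below. It works on a string rather than on the source tree, and the helper names are hypothetical.

```python
import re

# Plain-http links to lore.kernel.org; https links do not match.
HTTP_LORE = re.compile(r"http://lore\.kernel\.org/")


def find_http_lore_links(text):
    """Return (line number, line) pairs that still use plain http."""
    return [(number, line)
            for number, line in enumerate(text.splitlines(), start=1)
            if HTTP_LORE.search(line)]


def to_https(text):
    """Rewrite every plain-http lore.kernel.org link to https."""
    return HTTP_LORE.sub("https://lore.kernel.org/", text)
```

Running `find_http_lore_links()` on each file's contents before and after `to_https()` shows which lines were affected.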

> Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
> ---
>  Documentation/technical/hash-function-transition.txt | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
> index 2eba25cf87c..dc0c4976a62 100644
> --- a/Documentation/technical/hash-function-transition.txt
> +++ b/Documentation/technical/hash-function-transition.txt
> @@ -831,8 +831,8 @@ Later history:
>  
>  References:
>  
> - [1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
> - [2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
> - [3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
> - [4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
> + [1] https://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
> + [2] https://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
> + [3] https://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
> + [4] https://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
>   [5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/



* Re: [PATCH 5/6] doc hash-function-transition: move rationale upwards
       [not found] ` <2cdb0f8e2edc4416c5dfb88722aa05be35afba7d.1612093734.git.gitgitgadget@gmail.com>
@ 2021-01-31 20:37   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-01-31 20:37 UTC (permalink / raw)
  To: Thomas Ackermann via GitGitGadget; +Cc: git, Junio C Hamano, Thomas Ackermann


On Sun, Jan 31 2021, Thomas Ackermann via GitGitGadget wrote:

> diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
> index dc0c4976a62..c9b57a125e2 100644
> --- a/Documentation/technical/hash-function-transition.txt
> +++ b/Documentation/technical/hash-function-transition.txt
> @@ -27,13 +27,17 @@ advantages:
>    methods have a short reliable string that can be used to reliably
>    address stored content.
>  
> -Over time some flaws in SHA-1 have been discovered by security
> -researchers. On 23 February 2017 the SHAttered attack
> +Over time some flaws in SHA-1 have been discovered by security researchers.
> +In early 2005, around the time that Git was written, Xiaoyun Wang,
> +Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
> +collisions in 2^69 operations. In August they published details.
> +Luckily, no practical demonstrations of a collision in full SHA-1 were
> +published until 10 years later: on 23 February 2017 the SHAttered attack
>  (https://shattered.io) demonstrated a practical SHA-1 hash collision.
>  
>  Git v2.13.0 and later subsequently moved to a hardened SHA-1
> -implementation by default, which isn't vulnerable to the SHAttered
> -attack.
> +implementation by default that mitigates the SHAttered attack, but
> +SHA-1 is still believed to be weak.
>  
>  Thus Git has in effect already migrated to a new hash that isn't SHA-1
>  and doesn't share its vulnerabilities, its new hash function just
> @@ -57,6 +61,29 @@ SHA-1 still possesses the other properties such as fast object lookup
>  and safe error checking, but other hash functions are equally suitable
>  that are believed to be cryptographically secure.

I don't think this is an improvement. Why would someone trying to learn
about Git's SHA-256 transition care about early SHA-1 flaws that didn't
prompt the transition?

I'm probably biased since the current intro is mine from 5988eb631a3
(doc hash-function-transition: clarify what SHAttered means,
2018-03-26), but this really feels too much like going into the weeds.

I think the document would be improved by just removing the whole
mention of early 2005 and the naming of several researchers. I think
the current prose of "Over time some flaws in SHA-1 have been
discovered by security researchers" suffices; if people are curious
about SHA-1's vulnerability history, there are plenty of good, easily
found sources for that.

> +Choice of Hash
> +--------------
> +The hash to replace the hardened SHA-1 should be stronger than SHA-1
> +was: we would like it to be trustworthy and useful in practice for at
> +least 10 years.
> +
> +Some other relevant properties:
> +
> +1. A 256-bit hash (long enough to match common security practice; not
> +   excessively long to hurt performance and disk usage).
> +
> +2. High quality implementations should be widely available (e.g., in
> +   OpenSSL and Apple CommonCrypto).
> +
> +3. The hash function's properties should match Git's needs (e.g. Git
> +   requires collision and 2nd preimage resistance and does not require
> +   length extension resistance).
> +
> +4. As a tiebreaker, the hash should be fast to compute (fortunately
> +   many contenders are faster than SHA-1).
> +
> +We choose SHA-256.
> +
>  Goals
>  -----
>  1. The transition to SHA-256 can be done one local repository at a time.
> @@ -601,39 +628,6 @@ example:
>  
>      git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
>  
> -Choice of Hash
> ---------------
> -In early 2005, around the time that Git was written, Xiaoyun Wang,
> -Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
> -collisions in 2^69 operations. In August they published details.
> -Luckily, no practical demonstrations of a collision in full SHA-1 were
> -published until 10 years later, in 2017.
> -
> -Git v2.13.0 and later subsequently moved to a hardened SHA-1
> -implementation by default that mitigates the SHAttered attack, but
> -SHA-1 is still believed to be weak.
> -
> -The hash to replace this hardened SHA-1 should be stronger than SHA-1
> -was: we would like it to be trustworthy and useful in practice for at
> -least 10 years.
> -
> -Some other relevant properties:
> -
> -1. A 256-bit hash (long enough to match common security practice; not
> -   excessively long to hurt performance and disk usage).
> -
> -2. High quality implementations should be widely available (e.g., in
> -   OpenSSL and Apple CommonCrypto).
> -
> -3. The hash function's properties should match Git's needs (e.g. Git
> -   requires collision and 2nd preimage resistance and does not require
> -   length extension resistance).
> -
> -4. As a tiebreaker, the hash should be fast to compute (fortunately
> -   many contenders are faster than SHA-1).
> -
> -We choose SHA-256.
> -

Same here. We're going into the weeds about what hashes we didn't pick
before talking about what we're going to do with SHA-256? Much of that
wording is just historical, and pre-dates the SHA-256 pick. I think
something like this would be much better at this point:
    
    diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
    index 6fd20ebbc25..a4222eb0a6c 100644
    --- a/Documentation/technical/hash-function-transition.txt
    +++ b/Documentation/technical/hash-function-transition.txt
    @@ -602,36 +602,17 @@ git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
     
     Choice of Hash
     --------------
    -In early 2005, around the time that Git was written, Xiaoyun Wang,
    -Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
    -collisions in 2^69 operations. In August they published details.
    -Luckily, no practical demonstrations of a collision in full SHA-1 were
    -published until 10 years later, in 2017.
     
    -Git v2.13.0 and later subsequently moved to a hardened SHA-1
    -implementation by default that mitigates the SHAttered attack, but
    -SHA-1 is still believed to be weak.
    -
    -The hash to replace this hardened SHA-1 should be stronger than SHA-1
    -was: we would like it to be trustworthy and useful in practice for at
    -least 10 years.
    -
    -Some other relevant properties:
    -
    -1. A 256-bit hash (long enough to match common security practice; not
    -   excessively long to hurt performance and disk usage).
    -
    -2. High quality implementations should be widely available (e.g., in
    -   OpenSSL and Apple CommonCrypto).
    -
    -3. The hash function's properties should match Git's needs (e.g. Git
    -   requires collision and 2nd preimage resistance and does not require
    -   length extension resistance).
    +There were several contenders for a successor hash to SHA-1, including
    +SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
     
    -4. As a tiebreaker, the hash should be fast to compute (fortunately
    -   many contenders are faster than SHA-1).
    +In late 2018 the project picked SHA-256 as its successor hash.
     
    -We choose SHA-256.
    +See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
    +NewHash, 2018-08-04) and numerous mailing list threads at the time,
    +particularly the one starting at
    +https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/
    +for more information.
     
     Transition plan
     ---------------


* [PATCH v2 0/6] doc: improvements for hash-function-transition
       [not found] <pull.858.git.1612093734.gitgitgadget@gmail.com>
                   ` (3 preceding siblings ...)
       [not found] ` <2cdb0f8e2edc4416c5dfb88722aa05be35afba7d.1612093734.git.gitgitgadget@gmail.com>
@ 2021-02-02 16:19 ` Thomas Ackermann via GitGitGadget
  2021-02-02 16:19   ` [PATCH v2 1/6] doc hash-function-transition: fix asciidoc output Thomas Ackermann via GitGitGadget
                     ` (7 more replies)
  4 siblings, 8 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-02 16:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason, Thomas Ackermann

Some asciidoc formatting errors and some minor formatting inconsistencies in
hash-function-transition.txt were fixed.

Content-wise the rationale for choosing SHA-256 was shortened and moved to
the beginning of the document and an incomplete sentence was corrected.

Changes since v1:

 * Better commit messages.
 * Details on SHA-1 weaknesses were removed from the rationale.
 * All http links to lore.kernel.org in the tree were changed to https
   links.

Thanks to Ævar for his suggestions and help.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>

Thomas Ackermann (6):
  doc hash-function-transition: fix asciidoc output
  doc hash-function-transition: use SHA-1 and SHA-256 consistently
  doc hash-function-transition: use upper case consistently
  doc hash-function-transition: fix incomplete sentence
  doc hash-function-transition: move rationale upwards
  doc: use https links

 .../technical/hash-function-transition.txt    | 279 ++++++++----------
 t/t0021-conversion.sh                         |   4 +-
 2 files changed, 132 insertions(+), 151 deletions(-)


base-commit: e6362826a0409539642a5738db61827e5978e2e4
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-858%2Ftacker66%2Fdoc_hash_function_transition-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-858/tacker66/doc_hash_function_transition-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/858

Range-diff vs v1:

 1:  3efe3392e9d ! 1:  f36c5dd4c1e doc hash-function-transition: fix asciidoc output
     @@ Metadata
       ## Commit message ##
          doc hash-function-transition: fix asciidoc output
      
     -    fix asciidoc output for lists, special characters and verbatim text while retaining the readabilty of the original text file
     +    Asciidoc requires lists to start with an empty line and uses
     +    different characters for indentation levels ("-", "*", "**", ...).
     +    For special symbols like a dash "--" has to be used and there is
     +    no double arrow "<->", so a left and right arrow "<-->" has to be
     +    combined for that. Lastly for verbatim output a newline followed
     +    by an indentation has to be used.
     +
     +    Fix asciidoc output for lists, special characters and verbatim
     +    text while retaining the readabilty of the original text file.
      
          Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
      
 2:  62ca087d4eb ! 2:  681ce4129dc doc hash-function-transition: use SHA-1 and SHA-256 consistently
     @@ Metadata
       ## Commit message ##
          doc hash-function-transition: use SHA-1 and SHA-256 consistently
      
     -    use SHA-1 and SHA-256 instead of sha1 and sha256  when referring to the hash type
     +    Use SHA-1 and SHA-256 instead of sha1 and sha256  when referring
     +    to the hash type.
      
          Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
      
 3:  37e3fd6aaa0 ! 3:  4f622fffcc5 doc hash-function-transition: use upper case consistently
     @@ Metadata
       ## Commit message ##
          doc hash-function-transition: use upper case consistently
      
     -    use upper case consistently in Document History
     +    Use upper case consistently in Document History.
      
          Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
      
 6:  302c7b8dce0 = 4:  58295cadffe doc hash-function-transition: fix incomplete sentence
 5:  2cdb0f8e2ed ! 5:  711a37969b6 doc hash-function-transition: move rationale upwards
     @@ Metadata
       ## Commit message ##
          doc hash-function-transition: move rationale upwards
      
     -    move rationale for new hash function to beginning of document
     +    Move rationale for new hash function to beginning of document
     +    so that it appears before the concrete move to SHA-256 is described.
      
     -    rationale now appears before the concrete move to SHA-256 is described
     +    Remove details about SHA-1 weaknesses. Instead add references
     +    to the details of how the new hash function was chosen.
      
          Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
      
     @@ Documentation/technical/hash-function-transition.txt: advantages:
       
      -Over time some flaws in SHA-1 have been discovered by security
      -researchers. On 23 February 2017 the SHAttered attack
     +-(https://shattered.io) demonstrated a practical SHA-1 hash collision.
      +Over time some flaws in SHA-1 have been discovered by security researchers.
     -+In early 2005, around the time that Git was written, Xiaoyun Wang,
     -+Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
     -+collisions in 2^69 operations. In August they published details.
     -+Luckily, no practical demonstrations of a collision in full SHA-1 were
     -+published until 10 years later: on 23 February 2017 the SHAttered attack
     - (https://shattered.io) demonstrated a practical SHA-1 hash collision.
       
       Git v2.13.0 and later subsequently moved to a hardened SHA-1
      -implementation by default, which isn't vulnerable to the SHAttered
      -attack.
     -+implementation by default that mitigates the SHAttered attack, but
     -+SHA-1 is still believed to be weak.
     ++implementation by default, but SHA-1 is still believed to be weak.
       
     - Thus Git has in effect already migrated to a new hash that isn't SHA-1
     - and doesn't share its vulnerabilities, its new hash function just
     +-Thus Git has in effect already migrated to a new hash that isn't SHA-1
     +-and doesn't share its vulnerabilities, its new hash function just
     +-happens to produce exactly the same output for all known inputs,
     +-except two PDFs published by the SHAttered researchers, and the new
     +-implementation (written by those researchers) claims to detect future
     +-cryptanalytic collision attacks.
     +-
     +-Regardless, it's considered prudent to move past any variant of SHA-1
     ++Thus it's considered prudent to move past any variant of SHA-1
     + to a new hash. There's no guarantee that future attacks on SHA-1 won't
     + be published in the future, and those attacks may not have viable
     + mitigations.
      @@ Documentation/technical/hash-function-transition.txt: SHA-1 still possesses the other properties such as fast object lookup
       and safe error checking, but other hash functions are equally suitable
       that are believed to be cryptographically secure.
       
      +Choice of Hash
      +--------------
     -+The hash to replace the hardened SHA-1 should be stronger than SHA-1
     -+was: we would like it to be trustworthy and useful in practice for at
     -+least 10 years.
     -+
     -+Some other relevant properties:
     -+
     -+1. A 256-bit hash (long enough to match common security practice; not
     -+   excessively long to hurt performance and disk usage).
     -+
     -+2. High quality implementations should be widely available (e.g., in
     -+   OpenSSL and Apple CommonCrypto).
     -+
     -+3. The hash function's properties should match Git's needs (e.g. Git
     -+   requires collision and 2nd preimage resistance and does not require
     -+   length extension resistance).
     ++There were several contenders for a successor hash to SHA-1, including
     ++SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
      +
     -+4. As a tiebreaker, the hash should be fast to compute (fortunately
     -+   many contenders are faster than SHA-1).
     ++In late 2018 the project picked SHA-256 as its successor hash.
      +
     -+We choose SHA-256.
     ++See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
     ++NewHash, 2018-08-04) and numerous mailing list threads at the time,
     ++particularly the one starting at
     ++https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/
     ++for more information.
      +
       Goals
       -----
 4:  d4abf1cf78e ! 6:  d6041b7e9e8 doc hash-function-transition: use https links consistently
     @@ Metadata
      Author: Thomas Ackermann <th.acker@arcor.de>
      
       ## Commit message ##
     -    doc hash-function-transition: use https links consistently
     +    doc: use https links
      
     -    use only https links in References
     +    Use only https links for lore.kernel.org.
      
          Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
      
       ## Documentation/technical/hash-function-transition.txt ##
     +@@ Documentation/technical/hash-function-transition.txt: Document History
     + bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
     + sbeller@google.com
     + 
     +-* Initial version sent to http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
     ++* Initial version sent to https://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
     + 
     + 2017-03-03 jrnieder@gmail.com
     + Incorporated suggestions from jonathantanmy and sbeller:
      @@ Documentation/technical/hash-function-transition.txt: Later history:
       
       References:
     @@ Documentation/technical/hash-function-transition.txt: Later history:
      + [3] https://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
      + [4] https://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
        [5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
     +
     + ## t/t0021-conversion.sh ##
     +@@ t/t0021-conversion.sh: filter_git () {
     + # Compare two files and ensure that `clean` and `smudge` respectively are
     + # called at least once if specified in the `expect` file. The actual
     + # invocation count is not relevant because their number can vary.
     +-# c.f. http://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
     ++# c.f. https://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
     + test_cmp_count () {
     + 	expect=$1
     + 	actual=$2
     +@@ t/t0021-conversion.sh: test_cmp_count () {
     + 
     + # Compare two files but exclude all `clean` invocations because Git can
     + # call `clean` zero or more times.
     +-# c.f. http://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
     ++# c.f. https://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
     + test_cmp_exclude_clean () {
     + 	expect=$1
     + 	actual=$2

-- 
gitgitgadget


* [PATCH v2 1/6] doc hash-function-transition: fix asciidoc output
  2021-02-02 16:19 ` [PATCH v2 0/6] doc: improvements for hash-function-transition Thomas Ackermann via GitGitGadget
@ 2021-02-02 16:19   ` Thomas Ackermann via GitGitGadget
  2021-02-02 16:19   ` [PATCH v2 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently Thomas Ackermann via GitGitGadget
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-02 16:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Asciidoc requires lists to start with an empty line and uses
different characters for indentation levels ("-", "*", "**", ...).
For special symbols such as a dash, "--" has to be used. There is
no double arrow "<->", so a left and a right arrow have to be
combined as "<-->" instead. Lastly, verbatim output requires a
newline followed by an indentation.

Fix asciidoc output for lists, special characters and verbatim
text while retaining the readability of the original text file.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 .../technical/hash-function-transition.txt    | 81 +++++++++++--------
 1 file changed, 46 insertions(+), 35 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 6fd20ebbc25..4b04829d537 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -94,7 +94,7 @@ Overview
 --------
 We introduce a new repository format extension. Repositories with this
 extension enabled use SHA-256 instead of SHA-1 to name their objects.
-This affects both object names and object content --- both the names
+This affects both object names and object content -- both the names
 of objects and all references to other objects within an object are
 switched to the new hash function.
 
@@ -191,21 +191,21 @@ hash functions. They have the following format (all integers are in
 network byte order):
 
 - A header appears at the beginning and consists of the following:
-  - The 4-byte pack index signature: '\377t0c'
-  - 4-byte version number: 3
-  - 4-byte length of the header section, including the signature and
+  * The 4-byte pack index signature: '\377t0c'
+  * 4-byte version number: 3
+  * 4-byte length of the header section, including the signature and
     version number
-  - 4-byte number of objects contained in the pack
-  - 4-byte number of object formats in this pack index: 2
-  - For each object format:
-    - 4-byte format identifier (e.g., 'sha1' for SHA-1)
-    - 4-byte length in bytes of shortened object names. This is the
+  * 4-byte number of objects contained in the pack
+  * 4-byte number of object formats in this pack index: 2
+  * For each object format:
+    ** 4-byte format identifier (e.g., 'sha1' for SHA-1)
+    ** 4-byte length in bytes of shortened object names. This is the
       shortest possible length needed to make names in the shortened
       object name table unambiguous.
-    - 4-byte integer, recording where tables relating to this format
+    ** 4-byte integer, recording where tables relating to this format
       are stored in this index file, as an offset from the beginning.
-  - 4-byte offset to the trailer from the beginning of this file.
-  - Zero or more additional key/value pairs (4-byte key, 4-byte
+  * 4-byte offset to the trailer from the beginning of this file.
+  * Zero or more additional key/value pairs (4-byte key, 4-byte
     value). Only one key is supported: 'PSRC'. See the "Loose objects
     and unreachable objects" section for supported values and how this
     is used.  All other keys are reserved. Readers must ignore
@@ -213,37 +213,36 @@ network byte order):
 - Zero or more NUL bytes. This can optionally be used to improve the
   alignment of the full object name table below.
 - Tables for the first object format:
-  - A sorted table of shortened object names.  These are prefixes of
+  * A sorted table of shortened object names.  These are prefixes of
     the names of all objects in this pack file, packed together
     without offset values to reduce the cache footprint of the binary
     search for a specific object name.
 
-  - A table of full object names in pack order. This allows resolving
+  * A table of full object names in pack order. This allows resolving
     a reference to "the nth object in the pack file" (from a
     reachability bitmap or from the next table of another object
     format) to its object name.
 
-  - A table of 4-byte values mapping object name order to pack order.
+  * A table of 4-byte values mapping object name order to pack order.
     For an object in the table of sorted shortened object names, the
     value at the corresponding index in this table is the index in the
     previous table for that same object.
-
     This can be used to look up the object in reachability bitmaps or
     to look up its name in another object format.
 
-  - A table of 4-byte CRC32 values of the packed object data, in the
+  * A table of 4-byte CRC32 values of the packed object data, in the
     order that the objects appear in the pack file. This is to allow
     compressed data to be copied directly from pack to pack during
     repacking without undetected data corruption.
 
-  - A table of 4-byte offset values. For an object in the table of
+  * A table of 4-byte offset values. For an object in the table of
     sorted shortened object names, the value at the corresponding
     index in this table indicates where that object can be found in
     the pack file. These are usually 31-bit pack file offsets, but
     large offsets are encoded as an index into the next table with the
     most significant bit set.
 
-  - A table of 8-byte offset entries (empty for pack files less than
+  * A table of 8-byte offset entries (empty for pack files less than
     2 GiB). Pack files are organized with heavily used objects toward
     the front, so most object references should not need to refer to
     this table.
@@ -252,10 +251,10 @@ network byte order):
   up to and not including the table of CRC32 values.
 - Zero or more NUL bytes.
 - The trailer consists of the following:
-  - A copy of the 20-byte SHA-256 checksum at the end of the
+  * A copy of the 20-byte SHA-256 checksum at the end of the
     corresponding packfile.
 
-  - 20-byte SHA-256 checksum of all of the above.
+  * 20-byte SHA-256 checksum of all of the above.
 
 Loose object index
 ~~~~~~~~~~~~~~~~~~
@@ -350,8 +349,8 @@ the following steps:
    they will be discarded.)
 3. convert to sha256: open a new (sha256) packfile. Read the topologically
    sorted list just generated. For each object, inflate its
-   sha1-content, convert to sha256-content, and write it to the sha256
-   pack. Record the new sha1<->sha256 mapping entry for use in the idx.
+   SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
+   pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
 4. sort: reorder entries in the new pack to match the order of objects
    in the pack the server generated and include blobs. Write a sha256 idx
    file
@@ -391,6 +390,7 @@ existing "gpgsig" field. Its signed payload is the sha256-content of the
 commit object with any "gpgsig" and "gpgsig-sha256" fields removed.
 
 This means commits can be signed
+
 1. using SHA-1 only, as in existing signed commit objects
 2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
    fields.
@@ -408,6 +408,7 @@ sha256-content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
 SIGNATURE-----" delimited in-body signature removed.
 
 This means tags can be signed
+
 1. using SHA-1 only, as in existing signed tag objects
 2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
    signature.
@@ -598,7 +599,7 @@ The user can also explicitly specify which format to use for a
 particular revision specifier and for output, overriding the mode. For
 example:
 
-git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
+    git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
 
 Choice of Hash
 --------------
@@ -636,6 +637,7 @@ We choose SHA-256.
 Transition plan
 ---------------
 Some initial steps can be implemented independently of one another:
+
 - adding a hash function API (vtable)
 - teaching fsck to tolerate the gpgsig-sha256 field
 - excluding gpgsig-* from the fields copied by "git commit --amend"
@@ -647,9 +649,9 @@ Some initial steps can be implemented independently of one another:
 - introducing index v3
 - adding support for the PSRC field and safer object pruning
 
-
 The first user-visible change is the introduction of the objectFormat
 extension (without compatObjectFormat). This requires:
+
 - teaching fsck about this mode of operation
 - using the hash function API (vtable) when computing object names
 - signing objects and verifying signatures
@@ -657,6 +659,7 @@ extension (without compatObjectFormat). This requires:
   repository
 
 Next comes introduction of compatObjectFormat:
+
 - implementing the loose-object-idx
 - translating object names between object formats
 - translating object content between object formats
@@ -669,6 +672,7 @@ Next comes introduction of compatObjectFormat:
   "Object names on the command line" above)
 
 The next step is supporting fetches and pushes to SHA-1 repositories:
+
 - allow pushes to a repository using the compat format
 - generate a topologically sorted list of the SHA-1 names of fetched
   objects
@@ -734,6 +738,7 @@ Using hash functions in parallel
 Objects newly created would be addressed by the new hash, but inside
 such an object (e.g. commit) it is still possible to address objects
 using the old hash function.
+
 * You cannot trust its history (needed for bisectability) in the
   future without further work
 * Maintenance burden as the number of supported hash functions grows
@@ -749,6 +754,7 @@ sha1-content based signatures.
 
 In other words, a single signature was used to attest to the object
 content using both hash functions. This had some advantages:
+
 * Using one signature instead of two speeds up the signing process.
 * Having one signed payload with both hashes allows the signer to
   attest to the sha1-name and sha256-name referring to the same object.
@@ -756,6 +762,7 @@ content using both hash functions. This had some advantages:
   to be detected quickly using current versions of git.
 
 However, it also came with disadvantages:
+
 * Verifying a signed object requires access to the sha1-names of all
   objects it references, even after the transition is complete and
   translation table is no longer needed for anything else. To support
@@ -782,16 +789,17 @@ Document History
 bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
 sbeller@google.com
 
-Initial version sent to
-http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
+* Initial version sent to http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
 
 2017-03-03 jrnieder@gmail.com
 Incorporated suggestions from jonathantanmy and sbeller:
+
 * describe purpose of signed objects with each hash type
 * redefine signed object verification using object content under the
   first hash function
 
 2017-03-06 jrnieder@gmail.com
+
 * Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
 * Make sha3-based signatures a separate field, avoiding the need for
   "hash" and "nohash" fields (thanks to peff[3]).
@@ -805,6 +813,7 @@ Incorporated suggestions from jonathantanmy and sbeller:
   especially Junio).
 
 2017-09-27 jrnieder@gmail.com, sbeller@google.com
+
 * use placeholder NewHash instead of SHA3-256
 * describe criteria for picking a hash function.
 * include a transition plan (thanks especially to Brandon Williams
@@ -816,12 +825,14 @@ Incorporated suggestions from jonathantanmy and sbeller:
 
 Later history:
 
- See the history of this file in git.git for the history of subsequent
- edits. This document history is no longer being maintained as it
- would now be superfluous to the commit log
+* See the history of this file in git.git for the history of subsequent
+  edits. This document history is no longer being maintained as it
+  would now be superfluous to the commit log
+
+References:
 
-[1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
-[2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
-[3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
-[4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
-[5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
+ [1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
+ [2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
+ [3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
+ [4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
+ [5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
-- 
gitgitgadget
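
The pack index header layout quoted in the hunks above (signature, version,
header length, object count, per-format entries, trailer offset, all in
network byte order) can be sketched in Python as follows. This is only an
illustration under stated assumptions: the function name, the `s256`
identifier for SHA-256, and the example lengths and offsets are hypothetical
and not taken from the patch, and no key/value pairs are emitted.

```python
import struct

PACK_IDX_SIGNATURE = b"\377t0c"  # 4-byte pack index signature
PACK_IDX_VERSION = 3


def build_idx_v3_header(num_objects, formats, trailer_offset):
    """Serialize the header fields in the order the document lists them.

    `formats` is a list of (format_id, short_name_len, tables_offset)
    tuples; format_id is a 4-byte identifier such as b"sha1".
    """
    # Five fixed 4-byte fields (signature, version, header length,
    # object count, number of formats), 12 bytes per object format,
    # then the 4-byte trailer offset.
    header_len = 4 * 5 + 12 * len(formats) + 4
    out = [
        PACK_IDX_SIGNATURE,
        struct.pack("!III", PACK_IDX_VERSION, header_len, num_objects),
        struct.pack("!I", len(formats)),
    ]
    for format_id, short_len, tables_offset in formats:
        if len(format_id) != 4:
            raise ValueError("format identifier must be exactly 4 bytes")
        out.append(format_id + struct.pack("!II", short_len, tables_offset))
    out.append(struct.pack("!I", trailer_offset))
    return b"".join(out)


header = build_idx_v3_header(
    num_objects=2,
    formats=[(b"sha1", 4, 48), (b"s256", 4, 1024)],  # hypothetical values
    trailer_offset=4096,                             # hypothetical value
)
```

With two object formats and no key/value pairs, the header in this sketch
is 48 bytes long, which is the value stored in its own length field.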



* [PATCH v2 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently
  2021-02-02 16:19 ` [PATCH v2 0/6] doc: improvements for hash-function-transition Thomas Ackermann via GitGitGadget
  2021-02-02 16:19   ` [PATCH v2 1/6] doc hash-function-transition: fix asciidoc output Thomas Ackermann via GitGitGadget
@ 2021-02-02 16:19   ` Thomas Ackermann via GitGitGadget
  2021-02-02 19:39     ` Junio C Hamano
  2021-02-02 16:19   ` [PATCH v2 3/6] doc hash-function-transition: use upper case consistently Thomas Ackermann via GitGitGadget
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-02 16:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Use SHA-1 and SHA-256 instead of sha1 and sha256 when referring
to the hash type.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 .../technical/hash-function-transition.txt    | 122 +++++++++---------
 1 file changed, 61 insertions(+), 61 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 4b04829d537..51acf2c10b7 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -107,7 +107,7 @@ mapping to allow naming objects using either their SHA-1 and SHA-256 names
 interchangeably.
 
 "git cat-file" and "git hash-object" gain options to display an object
-in its sha1 form and write an object given its sha1 form. This
+in its SHA-1 form and write an object given its SHA-1 form. This
 requires all objects referenced by that object to be present in the
 object database so that they can be named using the appropriate name
 (using the bidirectional hash mapping).
@@ -115,7 +115,7 @@ object database so that they can be named using the appropriate name
 Fetches from a SHA-1 based server convert the fetched objects into
 SHA-256 form and record the mapping in the bidirectional mapping table
 (see below for details). Pushes to a SHA-1 based server convert the
-objects being pushed into sha1 form so the server does not have to be
+objects being pushed into SHA-1 form so the server does not have to be
 aware of the hash function the client is using.
 
 Detailed Design
@@ -151,38 +151,38 @@ repository extensions.
 
 Object names
 ~~~~~~~~~~~~
-Objects can be named by their 40 hexadecimal digit sha1-name or 64
-hexadecimal digit sha256-name, plus names derived from those (see
+Objects can be named by their 40 hexadecimal digit SHA-1 name or 64
+hexadecimal digit SHA-256 name, plus names derived from those (see
 gitrevisions(7)).
 
-The sha1-name of an object is the SHA-1 of the concatenation of its
-type, length, a nul byte, and the object's sha1-content. This is the
+The SHA-1 name of an object is the SHA-1 of the concatenation of its
+type, length, a nul byte, and the object's SHA-1 content. This is the
 traditional <sha1> used in Git to name objects.
 
-The sha256-name of an object is the SHA-256 of the concatenation of its
-type, length, a nul byte, and the object's sha256-content.
+The SHA-256 name of an object is the SHA-256 of the concatenation of its
+type, length, a nul byte, and the object's SHA-256 content.
 
 Object format
 ~~~~~~~~~~~~~
 The content as a byte sequence of a tag, commit, or tree object named
-by sha1 and sha256 differ because an object named by sha256-name refers to
-other objects by their sha256-names and an object named by sha1-name
-refers to other objects by their sha1-names.
+by SHA-1 and SHA-256 differ because an object named by SHA-256 name refers to
+other objects by their SHA-256 names and an object named by SHA-1 name
+refers to other objects by their SHA-1 names.
 
-The sha256-content of an object is the same as its sha1-content, except
-that objects referenced by the object are named using their sha256-names
-instead of sha1-names. Because a blob object does not refer to any
-other object, its sha1-content and sha256-content are the same.
+The SHA-256-content of an object is the same as its SHA-1 content, except
+that objects referenced by the object are named using their SHA-256 names
+instead of SHA-1 names. Because a blob object does not refer to any
+other object, its SHA-1 content and SHA-256 content are the same.
 
-The format allows round-trip conversion between sha256-content and
-sha1-content.
+The format allows round-trip conversion between SHA-256 content and
+SHA-1 content.
 
 Object storage
 ~~~~~~~~~~~~~~
 Loose objects use zlib compression and packed objects use the packed
 format described in Documentation/technical/pack-format.txt, just like
-today. The content that is compressed and stored uses sha256-content
-instead of sha1-content.
+today. The content that is compressed and stored uses SHA-256 content
+instead of SHA-1 content.
 
 Pack index
 ~~~~~~~~~~
@@ -287,18 +287,18 @@ To remove entries (e.g. in "git pack-refs" or "git-prune"):
 
 Translation table
 ~~~~~~~~~~~~~~~~~
-The index files support a bidirectional mapping between sha1-names
-and sha256-names. The lookup proceeds similarly to ordinary object
-lookups. For example, to convert a sha1-name to a sha256-name:
+The index files support a bidirectional mapping between SHA-1 names
+and SHA-256 names. The lookup proceeds similarly to ordinary object
+lookups. For example, to convert a SHA-1 name to a SHA-256 name:
 
  1. Look for the object in idx files. If a match is present in the
-    idx's sorted list of truncated sha1-names, then:
-    a. Read the corresponding entry in the sha1-name order to pack
+    idx's sorted list of truncated SHA-1 names, then:
+    a. Read the corresponding entry in the SHA-1 name order to pack
        name order mapping.
-    b. Read the corresponding entry in the full sha1-name table to
+    b. Read the corresponding entry in the full SHA-1 name table to
        verify we found the right object. If it is, then
-    c. Read the corresponding entry in the full sha256-name table.
-       That is the object's sha256-name.
+    c. Read the corresponding entry in the full SHA-256 name table.
+       That is the object's SHA-256 name.
  2. Check for a loose object. Read lines from loose-object-idx until
     we find a match.
 
@@ -312,10 +312,10 @@ Since all operations that make new objects (e.g., "git commit") add
 the new objects to the corresponding index, this mapping is possible
 for all objects in the object store.
 
-Reading an object's sha1-content
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The sha1-content of an object can be read by converting all sha256-names
-its sha256-content references to sha1-names using the translation table.
+Reading an object's SHA-1 content
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The SHA-1 content of an object can be read by converting all SHA-256 names
+its SHA-256 content references to SHA-1 names using the translation table.
 
 Fetch
 ~~~~~
@@ -338,7 +338,7 @@ the following steps:
 1. index-pack: inflate each object in the packfile and compute its
    SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against
    objects the client has locally. These objects can be looked up
-   using the translation table and their sha1-content read as
+   using the translation table and their SHA-1 content read as
    described above to resolve the deltas.
 2. topological sort: starting at the "want"s from the negotiation
    phase, walk through objects in the pack and emit a list of them,
@@ -347,12 +347,12 @@ the following steps:
    (This list only contains objects reachable from the "wants". If the
    pack from the server contained additional extraneous objects, then
    they will be discarded.)
-3. convert to sha256: open a new (sha256) packfile. Read the topologically
+3. convert to SHA-256: open a new SHA-256 packfile. Read the topologically
    sorted list just generated. For each object, inflate its
    SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
    pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
 4. sort: reorder entries in the new pack to match the order of objects
-   in the pack the server generated and include blobs. Write a sha256 idx
+   in the pack the server generated and include blobs. Write a SHA-256 idx
    file
 5. clean up: remove the SHA-1 based pack file, index, and
    topologically sorted list obtained from the server in steps 1
@@ -377,16 +377,16 @@ experimenting to get this to perform well.
 Push
 ~~~~
 Push is simpler than fetch because the objects referenced by the
-pushed objects are already in the translation table. The sha1-content
+pushed objects are already in the translation table. The SHA-1 content
 of each object being pushed can be read as described in the "Reading
-an object's sha1-content" section to generate the pack written by git
+an object's SHA-1 content" section to generate the pack written by git
 send-pack.
 
 Signed Commits
 ~~~~~~~~~~~~~~
 We add a new field "gpgsig-sha256" to the commit object format to allow
 signing commits without relying on SHA-1. It is similar to the
-existing "gpgsig" field. Its signed payload is the sha256-content of the
+existing "gpgsig" field. Its signed payload is the SHA-256 content of the
 commit object with any "gpgsig" and "gpgsig-sha256" fields removed.
 
 This means commits can be signed
@@ -404,7 +404,7 @@ Signed Tags
 ~~~~~~~~~~~
 We add a new field "gpgsig-sha256" to the tag object format to allow
 signing tags without relying on SHA-1. Its signed payload is the
-sha256-content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
+SHA-256 content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
 SIGNATURE-----" delimited in-body signature removed.
 
 This means tags can be signed
@@ -416,11 +416,11 @@ This means tags can be signed
 
 Mergetag embedding
 ~~~~~~~~~~~~~~~~~~
-The mergetag field in the sha1-content of a commit contains the
-sha1-content of a tag that was merged by that commit.
+The mergetag field in the SHA-1 content of a commit contains the
+SHA-1 content of a tag that was merged by that commit.
 
-The mergetag field in the sha256-content of the same commit contains the
-sha256-content of the same tag.
+The mergetag field in the SHA-256 content of the same commit contains the
+SHA-256 content of the same tag.
 
 Submodules
 ~~~~~~~~~~
@@ -495,7 +495,7 @@ Caveats
 -------
 Invalid objects
 ~~~~~~~~~~~~~~~
-The conversion from sha1-content to sha256-content retains any
+The conversion from SHA-1 content to SHA-256 content retains any
 brokenness in the original object (e.g., tree entry modes encoded with
 leading 0, tree objects whose paths are not sorted correctly, and
 commit objects without an author or committer). This is a deliberate
@@ -514,15 +514,15 @@ allow lifting this restriction.
 
 Alternates
 ~~~~~~~~~~
-For the same reason, a sha256 repository cannot borrow objects from a
-sha1 repository using objects/info/alternates or
+For the same reason, a SHA-256 repository cannot borrow objects from a
+SHA-1 repository using objects/info/alternates or
 $GIT_ALTERNATE_OBJECT_REPOSITORIES.
 
 git notes
 ~~~~~~~~~
-The "git notes" tool annotates objects using their sha1-name as key.
+The "git notes" tool annotates objects using their SHA-1 name as key.
 This design does not describe a way to migrate notes trees to use
-sha256-names. That migration is expected to happen separately (for
+SHA-256 names. That migration is expected to happen separately (for
 example using a file at the root of the notes tree to describe which
 hash it uses).
 
@@ -556,7 +556,7 @@ unclear:
 
 	Git 2.12
 
-Does this mean Git v2.12.0 is the commit with sha1-name
+Does this mean Git v2.12.0 is the commit with SHA-1 name
 e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with
 new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7?
 
@@ -676,7 +676,7 @@ The next step is supporting fetches and pushes to SHA-1 repositories:
 - allow pushes to a repository using the compat format
 - generate a topologically sorted list of the SHA-1 names of fetched
   objects
-- convert the fetched packfile to sha256 format and generate an idx
+- convert the fetched packfile to SHA-256 format and generate an idx
   file
 - re-sort to match the order of objects in the fetched packfile
 
@@ -748,38 +748,38 @@ using the old hash function.
 Signed objects with multiple hashes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Instead of introducing the gpgsig-sha256 field in commit and tag objects
-for sha256-content based signatures, an earlier version of this design
-added "hash sha256 <sha256-name>" fields to strengthen the existing
-sha1-content based signatures.
+for SHA-256 content based signatures, an earlier version of this design
+added "hash sha256 <SHA-256 name>" fields to strengthen the existing
+SHA-1 content based signatures.
 
 In other words, a single signature was used to attest to the object
 content using both hash functions. This had some advantages:
 
 * Using one signature instead of two speeds up the signing process.
 * Having one signed payload with both hashes allows the signer to
-  attest to the sha1-name and sha256-name referring to the same object.
+  attest to the SHA-1 name and SHA-256 name referring to the same object.
 * All users consume the same signature. Broken signatures are likely
   to be detected quickly using current versions of git.
 
 However, it also came with disadvantages:
 
-* Verifying a signed object requires access to the sha1-names of all
+* Verifying a signed object requires access to the SHA-1 names of all
   objects it references, even after the transition is complete and
   translation table is no longer needed for anything else. To support
-  this, the design added fields such as "hash sha1 tree <sha1-name>"
-  and "hash sha1 parent <sha1-name>" to the sha256-content of a signed
+  this, the design added fields such as "hash sha1 tree <SHA-1 name>"
+  and "hash sha1 parent <SHA-1 name>" to the SHA-256 content of a signed
   commit, complicating the conversion process.
-* Allowing signed objects without a sha1 (for after the transition is
+* Allowing signed objects without a SHA-1 (for after the transition is
   complete) complicated the design further, requiring a "nohash sha1"
-  field to suppress including "hash sha1" fields in the sha256-content
+  field to suppress including "hash sha1" fields in the SHA-256 content
   and signed payload.
 
 Lazily populated translation table
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Some of the work of building the translation table could be deferred to
 push time, but that would significantly complicate and slow down pushes.
-Calculating the sha1-name at object creation time at the same time it is
-being streamed to disk and having its sha256-name calculated should be
+Calculating the SHA-1 name at object creation time at the same time it is
+being streamed to disk and having its SHA-256 name calculated should be
 an acceptable cost.
 
 Document History
@@ -801,7 +801,7 @@ Incorporated suggestions from jonathantanmy and sbeller:
 2017-03-06 jrnieder@gmail.com
 
 * Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
-* Make sha3-based signatures a separate field, avoiding the need for
+* Make SHA3-based signatures a separate field, avoiding the need for
   "hash" and "nohash" fields (thanks to peff[3]).
 * Add a sorting phase to fetch (thanks to Junio for noticing the need
   for this).
-- 
gitgitgadget



* [PATCH v2 3/6] doc hash-function-transition: use upper case consistently
  2021-02-02 16:19 ` [PATCH v2 0/6] doc: improvements for hash-function-transition Thomas Ackermann via GitGitGadget
  2021-02-02 16:19   ` [PATCH v2 1/6] doc hash-function-transition: fix asciidoc output Thomas Ackermann via GitGitGadget
  2021-02-02 16:19   ` [PATCH v2 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently Thomas Ackermann via GitGitGadget
@ 2021-02-02 16:19   ` Thomas Ackermann via GitGitGadget
  2021-02-02 16:19   ` [PATCH v2 4/6] doc hash-function-transition: fix incomplete sentence Thomas Ackermann via GitGitGadget
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-02 16:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Use upper case consistently in Document History.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 .../technical/hash-function-transition.txt         | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 51acf2c10b7..2eba25cf87c 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -794,8 +794,8 @@ sbeller@google.com
 2017-03-03 jrnieder@gmail.com
 Incorporated suggestions from jonathantanmy and sbeller:
 
-* describe purpose of signed objects with each hash type
-* redefine signed object verification using object content under the
+* Describe purpose of signed objects with each hash type
+* Redefine signed object verification using object content under the
   first hash function
 
 2017-03-06 jrnieder@gmail.com
@@ -814,13 +814,13 @@ Incorporated suggestions from jonathantanmy and sbeller:
 
 2017-09-27 jrnieder@gmail.com, sbeller@google.com
 
-* use placeholder NewHash instead of SHA3-256
-* describe criteria for picking a hash function.
-* include a transition plan (thanks especially to Brandon Williams
+* Use placeholder NewHash instead of SHA3-256
+* Describe criteria for picking a hash function.
+* Include a transition plan (thanks especially to Brandon Williams
   for fleshing these ideas out)
-* define the translation table (thanks, Shawn Pearce[5], Jonathan
+* Define the translation table (thanks, Shawn Pearce[5], Jonathan
   Tan, and Masaya Suzuki)
-* avoid loose object overhead by packing more aggressively in
+* Avoid loose object overhead by packing more aggressively in
   "git gc --auto"
 
 Later history:
-- 
gitgitgadget



* [PATCH v2 4/6] doc hash-function-transition: fix incomplete sentence
  2021-02-02 16:19 ` [PATCH v2 0/6] doc: improvements for hash-function-transition Thomas Ackermann via GitGitGadget
                     ` (2 preceding siblings ...)
  2021-02-02 16:19   ` [PATCH v2 3/6] doc hash-function-transition: use upper case consistently Thomas Ackermann via GitGitGadget
@ 2021-02-02 16:19   ` Thomas Ackermann via GitGitGadget
  2021-02-02 16:19   ` [PATCH v2 5/6] doc hash-function-transition: move rationale upwards Thomas Ackermann via GitGitGadget
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-02 16:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 Documentation/technical/hash-function-transition.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 2eba25cf87c..86b09ea0f21 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -315,7 +315,7 @@ for all objects in the object store.
 Reading an object's SHA-1 content
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 The SHA-1 content of an object can be read by converting all SHA-256 names
-its SHA-256 content references to SHA-1 names using the translation table.
+of its SHA-256 content references to SHA-1 names using the translation table.
 
 Fetch
 ~~~~~
-- 
gitgitgadget



* [PATCH v2 5/6] doc hash-function-transition: move rationale upwards
  2021-02-02 16:19 ` [PATCH v2 0/6] doc: improvements for hash-function-transition Thomas Ackermann via GitGitGadget
                     ` (3 preceding siblings ...)
  2021-02-02 16:19   ` [PATCH v2 4/6] doc hash-function-transition: fix incomplete sentence Thomas Ackermann via GitGitGadget
@ 2021-02-02 16:19   ` Thomas Ackermann via GitGitGadget
  2021-02-02 19:54     ` Junio C Hamano
  2021-02-02 16:19   ` [PATCH v2 6/6] doc: use https links Thomas Ackermann via GitGitGadget
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-02 16:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Move rationale for new hash function to beginning of document
so that it appears before the concrete move to SHA-256 is described.

Remove details about SHA-1 weaknesses. Instead add references
to the details of how the new hash function was chosen.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 .../technical/hash-function-transition.txt    | 62 +++++--------------
 1 file changed, 16 insertions(+), 46 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 86b09ea0f21..475f2f501a6 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -27,22 +27,12 @@ advantages:
   methods have a short reliable string that can be used to reliably
   address stored content.
 
-Over time some flaws in SHA-1 have been discovered by security
-researchers. On 23 February 2017 the SHAttered attack
-(https://shattered.io) demonstrated a practical SHA-1 hash collision.
+Over time some flaws in SHA-1 have been discovered by security researchers.
 
 Git v2.13.0 and later subsequently moved to a hardened SHA-1
-implementation by default, which isn't vulnerable to the SHAttered
-attack.
+implementation by default, but SHA-1 is still believed to be weak.
 
-Thus Git has in effect already migrated to a new hash that isn't SHA-1
-and doesn't share its vulnerabilities, its new hash function just
-happens to produce exactly the same output for all known inputs,
-except two PDFs published by the SHAttered researchers, and the new
-implementation (written by those researchers) claims to detect future
-cryptanalytic collision attacks.
-
-Regardless, it's considered prudent to move past any variant of SHA-1
+Thus it's considered prudent to move past any variant of SHA-1
 to a new hash. There's no guarantee that future attacks on SHA-1 won't
 be published in the future, and those attacks may not have viable
 mitigations.
@@ -57,6 +47,19 @@ SHA-1 still possesses the other properties such as fast object lookup
 and safe error checking, but other hash functions are equally suitable
 that are believed to be cryptographically secure.
 
+Choice of Hash
+--------------
+There were several contenders for a successor hash to SHA-1, including
+SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
+
+In late 2018 the project picked SHA-256 as its successor hash.
+
+See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
+NewHash, 2018-08-04) and numerous mailing list threads at the time,
+particularly the one starting at
+https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/
+for more information.
+
 Goals
 -----
 1. The transition to SHA-256 can be done one local repository at a time.
@@ -601,39 +604,6 @@ example:
 
     git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
 
-Choice of Hash
---------------
-In early 2005, around the time that Git was written, Xiaoyun Wang,
-Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
-collisions in 2^69 operations. In August they published details.
-Luckily, no practical demonstrations of a collision in full SHA-1 were
-published until 10 years later, in 2017.
-
-Git v2.13.0 and later subsequently moved to a hardened SHA-1
-implementation by default that mitigates the SHAttered attack, but
-SHA-1 is still believed to be weak.
-
-The hash to replace this hardened SHA-1 should be stronger than SHA-1
-was: we would like it to be trustworthy and useful in practice for at
-least 10 years.
-
-Some other relevant properties:
-
-1. A 256-bit hash (long enough to match common security practice; not
-   excessively long to hurt performance and disk usage).
-
-2. High quality implementations should be widely available (e.g., in
-   OpenSSL and Apple CommonCrypto).
-
-3. The hash function's properties should match Git's needs (e.g. Git
-   requires collision and 2nd preimage resistance and does not require
-   length extension resistance).
-
-4. As a tiebreaker, the hash should be fast to compute (fortunately
-   many contenders are faster than SHA-1).
-
-We choose SHA-256.
-
 Transition plan
 ---------------
 Some initial steps can be implemented independently of one another:
-- 
gitgitgadget



* [PATCH v2 6/6] doc: use https links
  2021-02-02 16:19 ` [PATCH v2 0/6] doc: improvements for hash-function-transition Thomas Ackermann via GitGitGadget
                     ` (4 preceding siblings ...)
  2021-02-02 16:19   ` [PATCH v2 5/6] doc hash-function-transition: move rationale upwards Thomas Ackermann via GitGitGadget
@ 2021-02-02 16:19   ` Thomas Ackermann via GitGitGadget
  2021-02-02 19:57   ` [PATCH v2 0/6] doc: improvements for hash-function-transition Junio C Hamano
  2021-02-05 18:22   ` [PATCH v3 " Thomas Ackermann via GitGitGadget
  7 siblings, 0 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-02 16:19 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Use only https links for lore.kernel.org.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 Documentation/technical/hash-function-transition.txt | 10 +++++-----
 t/t0021-conversion.sh                                |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 475f2f501a6..e7e6bd95ff9 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -759,7 +759,7 @@ Document History
 bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
 sbeller@google.com
 
-* Initial version sent to http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
+* Initial version sent to https://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
 
 2017-03-03 jrnieder@gmail.com
 Incorporated suggestions from jonathantanmy and sbeller:
@@ -801,8 +801,8 @@ Later history:
 
 References:
 
- [1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
- [2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
- [3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
- [4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
+ [1] https://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
+ [2] https://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
+ [3] https://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
+ [4] https://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
  [5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index e4c4de5c745..e828ee964c1 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -34,7 +34,7 @@ filter_git () {
 # Compare two files and ensure that `clean` and `smudge` respectively are
 # called at least once if specified in the `expect` file. The actual
 # invocation count is not relevant because their number can vary.
-# c.f. http://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
+# c.f. https://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
 test_cmp_count () {
 	expect=$1
 	actual=$2
@@ -49,7 +49,7 @@ test_cmp_count () {
 
 # Compare two files but exclude all `clean` invocations because Git can
 # call `clean` zero or more times.
-# c.f. http://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
+# c.f. https://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
 test_cmp_exclude_clean () {
 	expect=$1
 	actual=$2
-- 
gitgitgadget


* Re: [PATCH v2 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently
  2021-02-02 16:19   ` [PATCH v2 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently Thomas Ackermann via GitGitGadget
@ 2021-02-02 19:39     ` Junio C Hamano
  2021-02-02 23:19       ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2021-02-02 19:39 UTC (permalink / raw)
  To: Thomas Ackermann via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Thomas Ackermann

"Thomas Ackermann via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Thomas Ackermann <th.acker@arcor.de>
>
> Use SHA-1 and SHA-256 instead of sha1 and sha256 when referring
> to the hash type.

Ahh.  [1/6] was supposed to be only about formatting, and I found it
a bit irritating that it had some of these changes mixed in, as it
was not entirely clear to me that [1/6] covered all those lowercase
sha1 and sha256 instances, or just some of them.

Moving them from [1/6] to this step would help future readers by
reducing such irritation (I do not know if it is worth it until
I read through the series to the end).

> Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
> ---
>  .../technical/hash-function-transition.txt    | 122 +++++++++---------
>  1 file changed, 61 insertions(+), 61 deletions(-)

Thanks.


* Re: [PATCH v2 5/6] doc hash-function-transition: move rationale upwards
  2021-02-02 16:19   ` [PATCH v2 5/6] doc hash-function-transition: move rationale upwards Thomas Ackermann via GitGitGadget
@ 2021-02-02 19:54     ` Junio C Hamano
  2021-02-02 23:23       ` brian m. carlson
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2021-02-02 19:54 UTC (permalink / raw)
  To: Thomas Ackermann via GitGitGadget
  Cc: brian m. carlson, Jonathan Nieder, git,
	Ævar Arnfjörð Bjarmason, Thomas Ackermann

"Thomas Ackermann via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Thomas Ackermann <th.acker@arcor.de>
>
> Move rationale for new hash function to beginning of document
> so that it appears before the concrete move to SHA-256 is described.
>
> Remove details about SHA-1 weaknesses. Instead add references
> to the details of how the new hash function was chosen.
>
> Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
> ---
>  .../technical/hash-function-transition.txt    | 62 +++++--------------
>  1 file changed, 16 insertions(+), 46 deletions(-)

Hmph, this might turn out to be a bit more controversial than it's
worth.  I'd summon/cc a few people from the original discussion.

> -Over time some flaws in SHA-1 have been discovered by security
> -researchers. On 23 February 2017 the SHAttered attack
> -(https://shattered.io) demonstrated a practical SHA-1 hash collision.
> +Over time some flaws in SHA-1 have been discovered by security researchers.
>  
>  Git v2.13.0 and later subsequently moved to a hardened SHA-1
> -implementation by default, which isn't vulnerable to the SHAttered
> -attack.
> +implementation by default, but SHA-1 is still believed to be weak.

Even if we've hardened against one particular form of attack, we
still have incentive to switch away from SHA-1.  It is unclear why
we just do not add ", but ..." to the original and instead remove
the half-sentence about sha1dc.

> @@ -57,6 +47,19 @@ SHA-1 still possesses the other properties such as fast object lookup
>  and safe error checking, but other hash functions are equally suitable
>  that are believed to be cryptographically secure.
>  
> +Choice of Hash
> +--------------
> +There were several contenders for a successor hash to SHA-1, including
> +SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
> +
> +In late 2018 the project picked SHA-256 as its successor hash.
> +
> +See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
> +NewHash, 2018-08-04) and numerous mailing list threads at the time,
> +particularly the one starting at
> +https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/
> +for more information.

I personally think this refers too much to external documents
for typical readers, and loses too much relative to the original.  I
do not mind losing the history of how we reached the conclusion that
SHA-1 is no longer viable at all, but I am not sure if we want to
lose the list of criteria we used when choosing (i.e. stronger than
SHA-1, 256-bit, quality implementations, etc.) from this section.

> -The hash to replace this hardened SHA-1 should be stronger than SHA-1
> -was: we would like it to be trustworthy and useful in practice for at
> -least 10 years.
> -
> -Some other relevant properties:
> -
> -1. A 256-bit hash (long enough to match common security practice; not
> -   excessively long to hurt performance and disk usage).
> -
> -2. High quality implementations should be widely available (e.g., in
> -   OpenSSL and Apple CommonCrypto).
> -
> -3. The hash function's properties should match Git's needs (e.g. Git
> -   requires collision and 2nd preimage resistance and does not require
> -   length extension resistance).
> -
> -4. As a tiebreaker, the hash should be fast to compute (fortunately
> -   many contenders are faster than SHA-1).


* Re: [PATCH v2 0/6] doc: improvements for hash-function-transition
  2021-02-02 16:19 ` [PATCH v2 0/6] doc: improvements for hash-function-transition Thomas Ackermann via GitGitGadget
                     ` (5 preceding siblings ...)
  2021-02-02 16:19   ` [PATCH v2 6/6] doc: use https links Thomas Ackermann via GitGitGadget
@ 2021-02-02 19:57   ` Junio C Hamano
  2021-02-05 18:22   ` [PATCH v3 " Thomas Ackermann via GitGitGadget
  7 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2021-02-02 19:57 UTC (permalink / raw)
  To: Thomas Ackermann via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Thomas Ackermann

"Thomas Ackermann via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Some asciidoc formatting errors and some minor formatting inconsistencies in
> hash-function-transition.txt were fixed.
>
> Content-wise the rationale for choosing SHA-256 was shortened and moved to
> the beginning of the document and an incomplete sentence was corrected.

The series was more-or-less a pleasant read, except for a minor nit
about the sha1 => SHA-1 and sha256 => SHA-256 changes in 1/6 that
should have been in 2/6, and some changes in 5/6 that I found
questionable.

Thanks.


* Re: [PATCH v2 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently
  2021-02-02 19:39     ` Junio C Hamano
@ 2021-02-02 23:19       ` Junio C Hamano
  0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2021-02-02 23:19 UTC (permalink / raw)
  To: Thomas Ackermann via GitGitGadget
  Cc: git, Ævar Arnfjörð Bjarmason, Thomas Ackermann

Junio C Hamano <gitster@pobox.com> writes:

> "Thomas Ackermann via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Thomas Ackermann <th.acker@arcor.de>
>>
>> Use SHA-1 and SHA-256 instead of sha1 and sha256 when referring
>> to the hash type.
>
> Ahh.  [1/6] was supposed to be only about formatting, and I found it
> a bit irritating that it had some of these changes mixed in, as it
> was not entirely clear to me that [1/6] covered all those lowercase
> sha1 and sha256 instances, or just some of them.
>
> Moving them from [1/6] to this step would help future readers by
> reducing such irritation (I do not know if it is worth it until
> I read through the series to the end).

It seems that only one hunk in 1/6 had premature conversion.
I've tried to locally move it around before queuing.

Thanks.

1:  eea107fb0e ! 1:  5df3cc249d doc hash-function-transition: fix asciidoc output
    @@ Documentation/technical/hash-function-transition.txt: network byte order):
      Loose object index
      ~~~~~~~~~~~~~~~~~~
     @@ Documentation/technical/hash-function-transition.txt: the following steps:
    -    they will be discarded.)
      3. convert to sha256: open a new (sha256) packfile. Read the topologically
         sorted list just generated. For each object, inflate its
    --   sha1-content, convert to sha256-content, and write it to the sha256
    +    sha1-content, convert to sha256-content, and write it to the sha256
     -   pack. Record the new sha1<->sha256 mapping entry for use in the idx.
    -+   SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
    -+   pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
    ++   pack. Record the new sha1<-->sha256 mapping entry for use in the idx.
      4. sort: reorder entries in the new pack to match the order of objects
         in the pack the server generated and include blobs. Write a sha256 idx
         file
2:  58934c8b43 ! 2:  8488d4f5f1 doc hash-function-transition: use SHA-1 and SHA-256 consistently
    @@ Documentation/technical/hash-function-transition.txt: the following steps:
     -3. convert to sha256: open a new (sha256) packfile. Read the topologically
     +3. convert to SHA-256: open a new SHA-256 packfile. Read the topologically
         sorted list just generated. For each object, inflate its
    -    SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
    -    pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
    +-   sha1-content, convert to sha256-content, and write it to the sha256
    +-   pack. Record the new sha1<-->sha256 mapping entry for use in the idx.
    ++   SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
    ++   pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
      4. sort: reorder entries in the new pack to match the order of objects
     -   in the pack the server generated and include blobs. Write a sha256 idx
     +   in the pack the server generated and include blobs. Write a SHA-256 idx
3:  4a710c8715 = 3:  454a9437cf doc hash-function-transition: use upper case consistently
4:  7e690524ac = 4:  b2c881b66b doc hash-function-transition: fix incomplete sentence
5:  80089fe818 = 5:  4ee4775ca3 doc hash-function-transition: move rationale upwards
6:  b221eae801 = 6:  c27c52ca0c doc: use https links


* Re: [PATCH v2 5/6] doc hash-function-transition: move rationale upwards
  2021-02-02 19:54     ` Junio C Hamano
@ 2021-02-02 23:23       ` brian m. carlson
  0 siblings, 0 replies; 25+ messages in thread
From: brian m. carlson @ 2021-02-02 23:23 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Thomas Ackermann via GitGitGadget, Jonathan Nieder, git,
	Ævar Arnfjörð Bjarmason, Thomas Ackermann


On 2021-02-02 at 19:54:20, Junio C Hamano wrote:
> "Thomas Ackermann via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > -Over time some flaws in SHA-1 have been discovered by security
> > -researchers. On 23 February 2017 the SHAttered attack
> > -(https://shattered.io) demonstrated a practical SHA-1 hash collision.
> > +Over time some flaws in SHA-1 have been discovered by security researchers.
> > 
> >  Git v2.13.0 and later subsequently moved to a hardened SHA-1
> > -implementation by default, which isn't vulnerable to the SHAttered
> > -attack.
> > +implementation by default, but SHA-1 is still believed to be weak.
> 
> Even if we've hardened against one particular form of attack, we
> still have incentive to switch away from SHA-1.  It is unclear why
> we just do not add ", but ..." to the original and instead remove
> the half-sentence about sha1dc.

I think we should keep the original statement about the attack since
it's relevant to why we want to change.  I also think we should say,
"but SHA-1 is still weak".  Saying "is still believed to be" implies
doubt or uncertainty, and the fact that multiple collisions have been
performed and can be trivially verified should remove any doubt.

Even if SHA-1 were still perfectly secure (which it is not), it can only
by design provide an 80-bit security level, which is inadequate by
today's standards.
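
As a rough illustration of where the 80-bit figure comes from: a generic
birthday attack finds a collision in about 2^(n/2) operations for an
n-bit digest, so the best-case collision security level is half the
digest length.  The sketch below is only that arithmetic, using
Python's hashlib to look up the digest sizes; the helper name
collision_security_bits is made up for this example and says nothing
about the practical attacks on SHA-1, which are cheaper than the
generic bound.

```python
import hashlib


def collision_security_bits(name: str) -> int:
    """Generic birthday-bound collision security level, in bits.

    This is the design-time upper bound (half the digest length),
    not a statement about any known cryptanalytic attack.
    """
    digest_bits = hashlib.new(name).digest_size * 8
    return digest_bits // 2


for name in ("sha1", "sha256"):
    print(name, collision_security_bits(name))
# sha1 80
# sha256 128
```

So even an unbroken 160-bit hash tops out at 80 bits against
collisions, while a 256-bit hash gives 128 bits.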

> > @@ -57,6 +47,19 @@ SHA-1 still possesses the other properties such as fast object lookup
> >  and safe error checking, but other hash functions are equally suitable
> >  that are believed to be cryptographically secure.
> > 
> > +Choice of Hash
> > +--------------
> > +There were several contenders for a successor hash to SHA-1, including
> > +SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
> > +
> > +In late 2018 the project picked SHA-256 as its successor hash.
> > +
> > +See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
> > +NewHash, 2018-08-04) and numerous mailing list threads at the time,
> > +particularly the one starting at
> > +https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/
> > +for more information.
> 
> I personally think this refers too much to external documents
> for typical readers, and loses too much relative to the original.  I
> do not mind losing the history of how we reached the conclusion that
> SHA-1 is no longer viable at all, but I am not sure if we want to
> lose the list of criteria we used when choosing (i.e. stronger than
> SHA-1, 256-bit, quality implementations, etc.) from this section.

I don't have a problem including this and in fact I think it might be
valuable, but I think we should keep the below data as well.

> > -The hash to replace this hardened SHA-1 should be stronger than SHA-1
> > -was: we would like it to be trustworthy and useful in practice for at
> > -least 10 years.
> > -
> > -Some other relevant properties:
> > -
> > -1. A 256-bit hash (long enough to match common security practice; not
> > -   excessively long to hurt performance and disk usage).
> > -
> > -2. High quality implementations should be widely available (e.g., in
> > -   OpenSSL and Apple CommonCrypto).
> > -
> > -3. The hash function's properties should match Git's needs (e.g. Git
> > -   requires collision and 2nd preimage resistance and does not require
> > -   length extension resistance).
> > -
> > -4. As a tiebreaker, the hash should be fast to compute (fortunately
> > -   many contenders are faster than SHA-1).

I'd prefer to keep the original criteria here, because I think it's
useful to document what they were and why we wanted to change.  For
example, we occasionally get random users asking why we didn't pick a
hash with length extension resistance or why we didn't pick SHA-3.

The fact that we don't need length extension attack resistance and that
most non-Linux crypto libraries provide an extremely limited set of
crypto primitives are essential to our decision.  I think if SHA-3-256
had been more widely available, it would have been the winning candidate.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US



* [PATCH v3 0/6] doc: improvements for hash-function-transition
  2021-02-02 16:19 ` [PATCH v2 0/6] doc: improvements for hash-function-transition Thomas Ackermann via GitGitGadget
                     ` (6 preceding siblings ...)
  2021-02-02 19:57   ` [PATCH v2 0/6] doc: improvements for hash-function-transition Junio C Hamano
@ 2021-02-05 18:22   ` Thomas Ackermann via GitGitGadget
  2021-02-05 18:22     ` [PATCH v3 1/6] doc hash-function-transition: fix asciidoc output Thomas Ackermann via GitGitGadget
                       ` (5 more replies)
  7 siblings, 6 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-05 18:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Thomas Ackermann

Some asciidoc formatting errors and some minor formatting inconsistencies in
hash-function-transition.txt were fixed.

Content-wise the rationale for choosing SHA-256 was shortened a little bit
and moved to the beginning of the document. Also an incomplete sentence was
corrected.

Changes since v2:

 * Move a stray change from 1/6 back to 2/6; fix an incomplete conversion in
   2/6.
 * Rework rationale based on the comments from Junio and Brian.
 * Rebased on current master.

Changes since v1:

 * Better commit messages.

 * Details on SHA-1 weaknesses were removed from the rationale.

 * All http links to lore.kernel.org in the tree were changed to https
   links.
   
Thanks to Ævar, Junio and Brian for their suggestions and help.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>

Thomas Ackermann (6):
  doc hash-function-transition: fix asciidoc output
  doc hash-function-transition: use SHA-1 and SHA-256 consistently
  doc hash-function-transition: use upper case consistently
  doc hash-function-transition: fix incomplete sentence
  doc hash-function-transition: move rationale upwards
  doc: use https links

 .../technical/hash-function-transition.txt    | 293 +++++++++---------
 t/t0021-conversion.sh                         |   4 +-
 2 files changed, 150 insertions(+), 147 deletions(-)


base-commit: 30b29f044a2b30f0667eb21559959e03eb1bd04f
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-858%2Ftacker66%2Fdoc_hash_function_transition-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-858/tacker66/doc_hash_function_transition-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/858

Range-diff vs v2:

 1:  f36c5dd4c1e3 ! 1:  7c78d0c1c30a doc hash-function-transition: fix asciidoc output
     @@ Documentation/technical/hash-function-transition.txt: network byte order):
       Loose object index
       ~~~~~~~~~~~~~~~~~~
      @@ Documentation/technical/hash-function-transition.txt: the following steps:
     -    they will be discarded.)
       3. convert to sha256: open a new (sha256) packfile. Read the topologically
          sorted list just generated. For each object, inflate its
     --   sha1-content, convert to sha256-content, and write it to the sha256
     +    sha1-content, convert to sha256-content, and write it to the sha256
      -   pack. Record the new sha1<->sha256 mapping entry for use in the idx.
     -+   SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
     -+   pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
     ++   pack. Record the new sha1<-->sha256 mapping entry for use in the idx.
       4. sort: reorder entries in the new pack to match the order of objects
          in the pack the server generated and include blobs. Write a sha256 idx
          file
 2:  681ce4129dc3 ! 2:  69ebc9a8f19a doc hash-function-transition: use SHA-1 and SHA-256 consistently
     @@ Documentation/technical/hash-function-transition.txt: repository extensions.
      -that objects referenced by the object are named using their sha256-names
      -instead of sha1-names. Because a blob object does not refer to any
      -other object, its sha1-content and sha256-content are the same.
     -+The SHA-256-content of an object is the same as its SHA-1 content, except
     ++The SHA-256 content of an object is the same as its SHA-1 content, except
      +that objects referenced by the object are named using their SHA-256 names
      +instead of SHA-1 names. Because a blob object does not refer to any
      +other object, its SHA-1 content and SHA-256 content are the same.
     @@ Documentation/technical/hash-function-transition.txt: the following steps:
      -3. convert to sha256: open a new (sha256) packfile. Read the topologically
      +3. convert to SHA-256: open a new SHA-256 packfile. Read the topologically
          sorted list just generated. For each object, inflate its
     -    SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
     -    pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
     +-   sha1-content, convert to sha256-content, and write it to the sha256
     +-   pack. Record the new sha1<-->sha256 mapping entry for use in the idx.
     ++   SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
     ++   pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
       4. sort: reorder entries in the new pack to match the order of objects
      -   in the pack the server generated and include blobs. Write a sha256 idx
      +   in the pack the server generated and include blobs. Write a SHA-256 idx
 3:  4f622fffcc5d = 3:  06b781206e4c doc hash-function-transition: use upper case consistently
 4:  58295cadffe5 = 4:  7a29f06c3f25 doc hash-function-transition: fix incomplete sentence
 5:  711a37969b6f ! 5:  ee0fa2ec1d0f doc hash-function-transition: move rationale upwards
     @@ Commit message
          Move rationale for new hash function to beginning of document
          so that it appears before the concrete move to SHA-256 is described.
      
     -    Remove details about SHA-1 weaknesses. Instead add references
     -    to the details of how the new hash function was chosen.
     +    Remove some of the details about SHA-1 weaknesses and add references
     +    to the details on how the new hash function was chosen instead.
      
          Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
      
       ## Documentation/technical/hash-function-transition.txt ##
     -@@ Documentation/technical/hash-function-transition.txt: advantages:
     -   methods have a short reliable string that can be used to reliably
     -   address stored content.
     - 
     --Over time some flaws in SHA-1 have been discovered by security
     --researchers. On 23 February 2017 the SHAttered attack
     --(https://shattered.io) demonstrated a practical SHA-1 hash collision.
     -+Over time some flaws in SHA-1 have been discovered by security researchers.
     +@@ Documentation/technical/hash-function-transition.txt: researchers. On 23 February 2017 the SHAttered attack
       
       Git v2.13.0 and later subsequently moved to a hardened SHA-1
     --implementation by default, which isn't vulnerable to the SHAttered
     + implementation by default, which isn't vulnerable to the SHAttered
      -attack.
     -+implementation by default, but SHA-1 is still believed to be weak.
     ++attack, but SHA-1 is still weak.
       
      -Thus Git has in effect already migrated to a new hash that isn't SHA-1
      -and doesn't share its vulnerabilities, its new hash function just
     @@ Documentation/technical/hash-function-transition.txt: SHA-1 still possesses the
       
      +Choice of Hash
      +--------------
     ++The hash to replace the hardened SHA-1 should be stronger than SHA-1
     ++was: we would like it to be trustworthy and useful in practice for at
     ++least 10 years.
     ++
     ++Some other relevant properties:
     ++
     ++1. A 256-bit hash (long enough to match common security practice; not
     ++   excessively long to hurt performance and disk usage).
     ++
     ++2. High quality implementations should be widely available (e.g., in
     ++   OpenSSL and Apple CommonCrypto).
     ++
     ++3. The hash function's properties should match Git's needs (e.g. Git
     ++   requires collision and 2nd preimage resistance and does not require
     ++   length extension resistance).
     ++
     ++4. As a tiebreaker, the hash should be fast to compute (fortunately
     ++   many contenders are faster than SHA-1).
     ++
      +There were several contenders for a successor hash to SHA-1, including
      +SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
      +
 6:  d6041b7e9e87 = 6:  c31d6e258fd0 doc: use https links

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3 1/6] doc hash-function-transition: fix asciidoc output
  2021-02-05 18:22   ` [PATCH v3 " Thomas Ackermann via GitGitGadget
@ 2021-02-05 18:22     ` Thomas Ackermann via GitGitGadget
  2021-02-05 18:22     ` [PATCH v3 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently Thomas Ackermann via GitGitGadget
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-05 18:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Asciidoc requires lists to start with an empty line and uses
different characters for indentation levels ("-", "*", "**", ...).
Special symbols such as a dash have to be written as "--", and
because there is no double arrow "<->", a left and a right arrow
have to be combined as "<-->". Lastly, verbatim output requires a
newline followed by an indentation.

Fix asciidoc output for lists, special characters and verbatim
text while retaining the readability of the original text file.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 .../technical/hash-function-transition.txt    | 79 +++++++++++--------
 1 file changed, 45 insertions(+), 34 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 6fd20ebbc254..b23d23151a57 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -94,7 +94,7 @@ Overview
 --------
 We introduce a new repository format extension. Repositories with this
 extension enabled use SHA-256 instead of SHA-1 to name their objects.
-This affects both object names and object content --- both the names
+This affects both object names and object content -- both the names
 of objects and all references to other objects within an object are
 switched to the new hash function.
 
@@ -191,21 +191,21 @@ hash functions. They have the following format (all integers are in
 network byte order):
 
 - A header appears at the beginning and consists of the following:
-  - The 4-byte pack index signature: '\377t0c'
-  - 4-byte version number: 3
-  - 4-byte length of the header section, including the signature and
+  * The 4-byte pack index signature: '\377t0c'
+  * 4-byte version number: 3
+  * 4-byte length of the header section, including the signature and
     version number
-  - 4-byte number of objects contained in the pack
-  - 4-byte number of object formats in this pack index: 2
-  - For each object format:
-    - 4-byte format identifier (e.g., 'sha1' for SHA-1)
-    - 4-byte length in bytes of shortened object names. This is the
+  * 4-byte number of objects contained in the pack
+  * 4-byte number of object formats in this pack index: 2
+  * For each object format:
+    ** 4-byte format identifier (e.g., 'sha1' for SHA-1)
+    ** 4-byte length in bytes of shortened object names. This is the
       shortest possible length needed to make names in the shortened
       object name table unambiguous.
-    - 4-byte integer, recording where tables relating to this format
+    ** 4-byte integer, recording where tables relating to this format
       are stored in this index file, as an offset from the beginning.
-  - 4-byte offset to the trailer from the beginning of this file.
-  - Zero or more additional key/value pairs (4-byte key, 4-byte
+  * 4-byte offset to the trailer from the beginning of this file.
+  * Zero or more additional key/value pairs (4-byte key, 4-byte
     value). Only one key is supported: 'PSRC'. See the "Loose objects
     and unreachable objects" section for supported values and how this
     is used.  All other keys are reserved. Readers must ignore
@@ -213,37 +213,36 @@ network byte order):
 - Zero or more NUL bytes. This can optionally be used to improve the
   alignment of the full object name table below.
 - Tables for the first object format:
-  - A sorted table of shortened object names.  These are prefixes of
+  * A sorted table of shortened object names.  These are prefixes of
     the names of all objects in this pack file, packed together
     without offset values to reduce the cache footprint of the binary
     search for a specific object name.
 
-  - A table of full object names in pack order. This allows resolving
+  * A table of full object names in pack order. This allows resolving
     a reference to "the nth object in the pack file" (from a
     reachability bitmap or from the next table of another object
     format) to its object name.
 
-  - A table of 4-byte values mapping object name order to pack order.
+  * A table of 4-byte values mapping object name order to pack order.
     For an object in the table of sorted shortened object names, the
     value at the corresponding index in this table is the index in the
     previous table for that same object.
-
     This can be used to look up the object in reachability bitmaps or
     to look up its name in another object format.
 
-  - A table of 4-byte CRC32 values of the packed object data, in the
+  * A table of 4-byte CRC32 values of the packed object data, in the
     order that the objects appear in the pack file. This is to allow
     compressed data to be copied directly from pack to pack during
     repacking without undetected data corruption.
 
-  - A table of 4-byte offset values. For an object in the table of
+  * A table of 4-byte offset values. For an object in the table of
     sorted shortened object names, the value at the corresponding
     index in this table indicates where that object can be found in
     the pack file. These are usually 31-bit pack file offsets, but
     large offsets are encoded as an index into the next table with the
     most significant bit set.
 
-  - A table of 8-byte offset entries (empty for pack files less than
+  * A table of 8-byte offset entries (empty for pack files less than
     2 GiB). Pack files are organized with heavily used objects toward
     the front, so most object references should not need to refer to
     this table.
@@ -252,10 +251,10 @@ network byte order):
   up to and not including the table of CRC32 values.
 - Zero or more NUL bytes.
 - The trailer consists of the following:
-  - A copy of the 20-byte SHA-256 checksum at the end of the
+  * A copy of the 20-byte SHA-256 checksum at the end of the
     corresponding packfile.
 
-  - 20-byte SHA-256 checksum of all of the above.
+  * 20-byte SHA-256 checksum of all of the above.
 
 Loose object index
 ~~~~~~~~~~~~~~~~~~
@@ -351,7 +350,7 @@ the following steps:
 3. convert to sha256: open a new (sha256) packfile. Read the topologically
    sorted list just generated. For each object, inflate its
    sha1-content, convert to sha256-content, and write it to the sha256
-   pack. Record the new sha1<->sha256 mapping entry for use in the idx.
+   pack. Record the new sha1<-->sha256 mapping entry for use in the idx.
 4. sort: reorder entries in the new pack to match the order of objects
    in the pack the server generated and include blobs. Write a sha256 idx
    file
@@ -391,6 +390,7 @@ existing "gpgsig" field. Its signed payload is the sha256-content of the
 commit object with any "gpgsig" and "gpgsig-sha256" fields removed.
 
 This means commits can be signed
+
 1. using SHA-1 only, as in existing signed commit objects
 2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
    fields.
@@ -408,6 +408,7 @@ sha256-content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
 SIGNATURE-----" delimited in-body signature removed.
 
 This means tags can be signed
+
 1. using SHA-1 only, as in existing signed tag objects
 2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
    signature.
@@ -598,7 +599,7 @@ The user can also explicitly specify which format to use for a
 particular revision specifier and for output, overriding the mode. For
 example:
 
-git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
+    git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
 
 Choice of Hash
 --------------
@@ -636,6 +637,7 @@ We choose SHA-256.
 Transition plan
 ---------------
 Some initial steps can be implemented independently of one another:
+
 - adding a hash function API (vtable)
 - teaching fsck to tolerate the gpgsig-sha256 field
 - excluding gpgsig-* from the fields copied by "git commit --amend"
@@ -647,9 +649,9 @@ Some initial steps can be implemented independently of one another:
 - introducing index v3
 - adding support for the PSRC field and safer object pruning
 
-
 The first user-visible change is the introduction of the objectFormat
 extension (without compatObjectFormat). This requires:
+
 - teaching fsck about this mode of operation
 - using the hash function API (vtable) when computing object names
 - signing objects and verifying signatures
@@ -657,6 +659,7 @@ extension (without compatObjectFormat). This requires:
   repository
 
 Next comes introduction of compatObjectFormat:
+
 - implementing the loose-object-idx
 - translating object names between object formats
 - translating object content between object formats
@@ -669,6 +672,7 @@ Next comes introduction of compatObjectFormat:
   "Object names on the command line" above)
 
 The next step is supporting fetches and pushes to SHA-1 repositories:
+
 - allow pushes to a repository using the compat format
 - generate a topologically sorted list of the SHA-1 names of fetched
   objects
@@ -734,6 +738,7 @@ Using hash functions in parallel
 Objects newly created would be addressed by the new hash, but inside
 such an object (e.g. commit) it is still possible to address objects
 using the old hash function.
+
 * You cannot trust its history (needed for bisectability) in the
   future without further work
 * Maintenance burden as the number of supported hash functions grows
@@ -749,6 +754,7 @@ sha1-content based signatures.
 
 In other words, a single signature was used to attest to the object
 content using both hash functions. This had some advantages:
+
 * Using one signature instead of two speeds up the signing process.
 * Having one signed payload with both hashes allows the signer to
   attest to the sha1-name and sha256-name referring to the same object.
@@ -756,6 +762,7 @@ content using both hash functions. This had some advantages:
   to be detected quickly using current versions of git.
 
 However, it also came with disadvantages:
+
 * Verifying a signed object requires access to the sha1-names of all
   objects it references, even after the transition is complete and
   translation table is no longer needed for anything else. To support
@@ -782,16 +789,17 @@ Document History
 bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
 sbeller@google.com
 
-Initial version sent to
-http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
+* Initial version sent to http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
 
 2017-03-03 jrnieder@gmail.com
 Incorporated suggestions from jonathantanmy and sbeller:
+
 * describe purpose of signed objects with each hash type
 * redefine signed object verification using object content under the
   first hash function
 
 2017-03-06 jrnieder@gmail.com
+
 * Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
 * Make sha3-based signatures a separate field, avoiding the need for
   "hash" and "nohash" fields (thanks to peff[3]).
@@ -805,6 +813,7 @@ Incorporated suggestions from jonathantanmy and sbeller:
   especially Junio).
 
 2017-09-27 jrnieder@gmail.com, sbeller@google.com
+
 * use placeholder NewHash instead of SHA3-256
 * describe criteria for picking a hash function.
 * include a transition plan (thanks especially to Brandon Williams
@@ -816,12 +825,14 @@ Incorporated suggestions from jonathantanmy and sbeller:
 
 Later history:
 
- See the history of this file in git.git for the history of subsequent
- edits. This document history is no longer being maintained as it
- would now be superfluous to the commit log
+* See the history of this file in git.git for the history of subsequent
+  edits. This document history is no longer being maintained as it
+  would now be superfluous to the commit log
+
+References:
 
-[1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
-[2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
-[3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
-[4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
-[5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
+ [1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
+ [2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
+ [3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
+ [4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
+ [5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently
  2021-02-05 18:22   ` [PATCH v3 " Thomas Ackermann via GitGitGadget
  2021-02-05 18:22     ` [PATCH v3 1/6] doc hash-function-transition: fix asciidoc output Thomas Ackermann via GitGitGadget
@ 2021-02-05 18:22     ` Thomas Ackermann via GitGitGadget
  2021-02-05 18:22     ` [PATCH v3 3/6] doc hash-function-transition: use upper case consistently Thomas Ackermann via GitGitGadget
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-05 18:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Use SHA-1 and SHA-256 instead of sha1 and sha256 when referring
to the hash type.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 .../technical/hash-function-transition.txt    | 126 +++++++++---------
 1 file changed, 63 insertions(+), 63 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index b23d23151a57..8c01608cbfa0 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -107,7 +107,7 @@ mapping to allow naming objects using either their SHA-1 and SHA-256 names
 interchangeably.
 
 "git cat-file" and "git hash-object" gain options to display an object
-in its sha1 form and write an object given its sha1 form. This
+in its SHA-1 form and write an object given its SHA-1 form. This
 requires all objects referenced by that object to be present in the
 object database so that they can be named using the appropriate name
 (using the bidirectional hash mapping).
@@ -115,7 +115,7 @@ object database so that they can be named using the appropriate name
 Fetches from a SHA-1 based server convert the fetched objects into
 SHA-256 form and record the mapping in the bidirectional mapping table
 (see below for details). Pushes to a SHA-1 based server convert the
-objects being pushed into sha1 form so the server does not have to be
+objects being pushed into SHA-1 form so the server does not have to be
 aware of the hash function the client is using.
 
 Detailed Design
@@ -151,38 +151,38 @@ repository extensions.
 
 Object names
 ~~~~~~~~~~~~
-Objects can be named by their 40 hexadecimal digit sha1-name or 64
-hexadecimal digit sha256-name, plus names derived from those (see
+Objects can be named by their 40 hexadecimal digit SHA-1 name or 64
+hexadecimal digit SHA-256 name, plus names derived from those (see
 gitrevisions(7)).
 
-The sha1-name of an object is the SHA-1 of the concatenation of its
-type, length, a nul byte, and the object's sha1-content. This is the
+The SHA-1 name of an object is the SHA-1 of the concatenation of its
+type, length, a nul byte, and the object's SHA-1 content. This is the
 traditional <sha1> used in Git to name objects.
 
-The sha256-name of an object is the SHA-256 of the concatenation of its
-type, length, a nul byte, and the object's sha256-content.
+The SHA-256 name of an object is the SHA-256 of the concatenation of its
+type, length, a nul byte, and the object's SHA-256 content.
 
 Object format
 ~~~~~~~~~~~~~
 The content as a byte sequence of a tag, commit, or tree object named
-by sha1 and sha256 differ because an object named by sha256-name refers to
-other objects by their sha256-names and an object named by sha1-name
-refers to other objects by their sha1-names.
+by SHA-1 and SHA-256 differ because an object named by SHA-256 name refers to
+other objects by their SHA-256 names and an object named by SHA-1 name
+refers to other objects by their SHA-1 names.
 
-The sha256-content of an object is the same as its sha1-content, except
-that objects referenced by the object are named using their sha256-names
-instead of sha1-names. Because a blob object does not refer to any
-other object, its sha1-content and sha256-content are the same.
+The SHA-256 content of an object is the same as its SHA-1 content, except
+that objects referenced by the object are named using their SHA-256 names
+instead of SHA-1 names. Because a blob object does not refer to any
+other object, its SHA-1 content and SHA-256 content are the same.
 
-The format allows round-trip conversion between sha256-content and
-sha1-content.
+The format allows round-trip conversion between SHA-256 content and
+SHA-1 content.
 
 Object storage
 ~~~~~~~~~~~~~~
 Loose objects use zlib compression and packed objects use the packed
 format described in Documentation/technical/pack-format.txt, just like
-today. The content that is compressed and stored uses sha256-content
-instead of sha1-content.
+today. The content that is compressed and stored uses SHA-256 content
+instead of SHA-1 content.
 
 Pack index
 ~~~~~~~~~~
@@ -287,18 +287,18 @@ To remove entries (e.g. in "git pack-refs" or "git-prune"):
 
 Translation table
 ~~~~~~~~~~~~~~~~~
-The index files support a bidirectional mapping between sha1-names
-and sha256-names. The lookup proceeds similarly to ordinary object
-lookups. For example, to convert a sha1-name to a sha256-name:
+The index files support a bidirectional mapping between SHA-1 names
+and SHA-256 names. The lookup proceeds similarly to ordinary object
+lookups. For example, to convert a SHA-1 name to a SHA-256 name:
 
  1. Look for the object in idx files. If a match is present in the
-    idx's sorted list of truncated sha1-names, then:
-    a. Read the corresponding entry in the sha1-name order to pack
+    idx's sorted list of truncated SHA-1 names, then:
+    a. Read the corresponding entry in the SHA-1 name order to pack
        name order mapping.
-    b. Read the corresponding entry in the full sha1-name table to
+    b. Read the corresponding entry in the full SHA-1 name table to
        verify we found the right object. If it is, then
-    c. Read the corresponding entry in the full sha256-name table.
-       That is the object's sha256-name.
+    c. Read the corresponding entry in the full SHA-256 name table.
+       That is the object's SHA-256 name.
  2. Check for a loose object. Read lines from loose-object-idx until
     we find a match.
 
@@ -312,10 +312,10 @@ Since all operations that make new objects (e.g., "git commit") add
 the new objects to the corresponding index, this mapping is possible
 for all objects in the object store.
 
-Reading an object's sha1-content
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The sha1-content of an object can be read by converting all sha256-names
-its sha256-content references to sha1-names using the translation table.
+Reading an object's SHA-1 content
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The SHA-1 content of an object can be read by converting all SHA-256 names
+its SHA-256 content references to SHA-1 names using the translation table.
 
 Fetch
 ~~~~~
@@ -338,7 +338,7 @@ the following steps:
 1. index-pack: inflate each object in the packfile and compute its
    SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against
    objects the client has locally. These objects can be looked up
-   using the translation table and their sha1-content read as
+   using the translation table and their SHA-1 content read as
    described above to resolve the deltas.
 2. topological sort: starting at the "want"s from the negotiation
    phase, walk through objects in the pack and emit a list of them,
@@ -347,12 +347,12 @@ the following steps:
    (This list only contains objects reachable from the "wants". If the
    pack from the server contained additional extraneous objects, then
    they will be discarded.)
-3. convert to sha256: open a new (sha256) packfile. Read the topologically
+3. convert to SHA-256: open a new SHA-256 packfile. Read the topologically
    sorted list just generated. For each object, inflate its
-   sha1-content, convert to sha256-content, and write it to the sha256
-   pack. Record the new sha1<-->sha256 mapping entry for use in the idx.
+   SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
+   pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
 4. sort: reorder entries in the new pack to match the order of objects
-   in the pack the server generated and include blobs. Write a sha256 idx
+   in the pack the server generated and include blobs. Write a SHA-256 idx
    file
 5. clean up: remove the SHA-1 based pack file, index, and
    topologically sorted list obtained from the server in steps 1
@@ -377,16 +377,16 @@ experimenting to get this to perform well.
 Push
 ~~~~
 Push is simpler than fetch because the objects referenced by the
-pushed objects are already in the translation table. The sha1-content
+pushed objects are already in the translation table. The SHA-1 content
 of each object being pushed can be read as described in the "Reading
-an object's sha1-content" section to generate the pack written by git
+an object's SHA-1 content" section to generate the pack written by git
 send-pack.
 
 Signed Commits
 ~~~~~~~~~~~~~~
 We add a new field "gpgsig-sha256" to the commit object format to allow
 signing commits without relying on SHA-1. It is similar to the
-existing "gpgsig" field. Its signed payload is the sha256-content of the
+existing "gpgsig" field. Its signed payload is the SHA-256 content of the
 commit object with any "gpgsig" and "gpgsig-sha256" fields removed.
 
 This means commits can be signed
@@ -404,7 +404,7 @@ Signed Tags
 ~~~~~~~~~~~
 We add a new field "gpgsig-sha256" to the tag object format to allow
 signing tags without relying on SHA-1. Its signed payload is the
-sha256-content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
+SHA-256 content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
 SIGNATURE-----" delimited in-body signature removed.
 
 This means tags can be signed
@@ -416,11 +416,11 @@ This means tags can be signed
 
 Mergetag embedding
 ~~~~~~~~~~~~~~~~~~
-The mergetag field in the sha1-content of a commit contains the
-sha1-content of a tag that was merged by that commit.
+The mergetag field in the SHA-1 content of a commit contains the
+SHA-1 content of a tag that was merged by that commit.
 
-The mergetag field in the sha256-content of the same commit contains the
-sha256-content of the same tag.
+The mergetag field in the SHA-256 content of the same commit contains the
+SHA-256 content of the same tag.
 
 Submodules
 ~~~~~~~~~~
@@ -495,7 +495,7 @@ Caveats
 -------
 Invalid objects
 ~~~~~~~~~~~~~~~
-The conversion from sha1-content to sha256-content retains any
+The conversion from SHA-1 content to SHA-256 content retains any
 brokenness in the original object (e.g., tree entry modes encoded with
 leading 0, tree objects whose paths are not sorted correctly, and
 commit objects without an author or committer). This is a deliberate
@@ -514,15 +514,15 @@ allow lifting this restriction.
 
 Alternates
 ~~~~~~~~~~
-For the same reason, a sha256 repository cannot borrow objects from a
-sha1 repository using objects/info/alternates or
+For the same reason, a SHA-256 repository cannot borrow objects from a
+SHA-1 repository using objects/info/alternates or
 $GIT_ALTERNATE_OBJECT_REPOSITORIES.
 
 git notes
 ~~~~~~~~~
-The "git notes" tool annotates objects using their sha1-name as key.
+The "git notes" tool annotates objects using their SHA-1 name as key.
 This design does not describe a way to migrate notes trees to use
-sha256-names. That migration is expected to happen separately (for
+SHA-256 names. That migration is expected to happen separately (for
 example using a file at the root of the notes tree to describe which
 hash it uses).
 
@@ -556,7 +556,7 @@ unclear:
 
 	Git 2.12
 
-Does this mean Git v2.12.0 is the commit with sha1-name
+Does this mean Git v2.12.0 is the commit with SHA-1 name
 e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with
 new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7?
 
@@ -676,7 +676,7 @@ The next step is supporting fetches and pushes to SHA-1 repositories:
 - allow pushes to a repository using the compat format
 - generate a topologically sorted list of the SHA-1 names of fetched
   objects
-- convert the fetched packfile to sha256 format and generate an idx
+- convert the fetched packfile to SHA-256 format and generate an idx
   file
 - re-sort to match the order of objects in the fetched packfile
 
@@ -748,38 +748,38 @@ using the old hash function.
 Signed objects with multiple hashes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Instead of introducing the gpgsig-sha256 field in commit and tag objects
-for sha256-content based signatures, an earlier version of this design
-added "hash sha256 <sha256-name>" fields to strengthen the existing
-sha1-content based signatures.
+for SHA-256 content based signatures, an earlier version of this design
+added "hash sha256 <SHA-256 name>" fields to strengthen the existing
+SHA-1 content based signatures.
 
 In other words, a single signature was used to attest to the object
 content using both hash functions. This had some advantages:
 
 * Using one signature instead of two speeds up the signing process.
 * Having one signed payload with both hashes allows the signer to
-  attest to the sha1-name and sha256-name referring to the same object.
+  attest to the SHA-1 name and SHA-256 name referring to the same object.
 * All users consume the same signature. Broken signatures are likely
   to be detected quickly using current versions of git.
 
 However, it also came with disadvantages:
 
-* Verifying a signed object requires access to the sha1-names of all
+* Verifying a signed object requires access to the SHA-1 names of all
   objects it references, even after the transition is complete and
   translation table is no longer needed for anything else. To support
-  this, the design added fields such as "hash sha1 tree <sha1-name>"
-  and "hash sha1 parent <sha1-name>" to the sha256-content of a signed
+  this, the design added fields such as "hash sha1 tree <SHA-1 name>"
+  and "hash sha1 parent <SHA-1 name>" to the SHA-256 content of a signed
   commit, complicating the conversion process.
-* Allowing signed objects without a sha1 (for after the transition is
+* Allowing signed objects without a SHA-1 (for after the transition is
   complete) complicated the design further, requiring a "nohash sha1"
-  field to suppress including "hash sha1" fields in the sha256-content
+  field to suppress including "hash sha1" fields in the SHA-256 content
   and signed payload.
 
 Lazily populated translation table
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Some of the work of building the translation table could be deferred to
 push time, but that would significantly complicate and slow down pushes.
-Calculating the sha1-name at object creation time at the same time it is
-being streamed to disk and having its sha256-name calculated should be
+Calculating the SHA-1 name at object creation time at the same time it is
+being streamed to disk and having its SHA-256 name calculated should be
 an acceptable cost.
 
 Document History
@@ -801,7 +801,7 @@ Incorporated suggestions from jonathantanmy and sbeller:
 2017-03-06 jrnieder@gmail.com
 
 * Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
-* Make sha3-based signatures a separate field, avoiding the need for
+* Make SHA3-based signatures a separate field, avoiding the need for
   "hash" and "nohash" fields (thanks to peff[3]).
 * Add a sorting phase to fetch (thanks to Junio for noticing the need
   for this).
-- 
gitgitgadget



* [PATCH v3 3/6] doc hash-function-transition: use upper case consistently
  2021-02-05 18:22   ` [PATCH v3 " Thomas Ackermann via GitGitGadget
  2021-02-05 18:22     ` [PATCH v3 1/6] doc hash-function-transition: fix asciidoc output Thomas Ackermann via GitGitGadget
  2021-02-05 18:22     ` [PATCH v3 2/6] doc hash-function-transition: use SHA-1 and SHA-256 consistently Thomas Ackermann via GitGitGadget
@ 2021-02-05 18:22     ` Thomas Ackermann via GitGitGadget
  2021-02-05 18:22     ` [PATCH v3 4/6] doc hash-function-transition: fix incomplete sentence Thomas Ackermann via GitGitGadget
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-05 18:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Use upper case consistently in Document History.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 .../technical/hash-function-transition.txt         | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 8c01608cbfa0..9e13919a0e5b 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -794,8 +794,8 @@ sbeller@google.com
 2017-03-03 jrnieder@gmail.com
 Incorporated suggestions from jonathantanmy and sbeller:
 
-* describe purpose of signed objects with each hash type
-* redefine signed object verification using object content under the
+* Describe purpose of signed objects with each hash type
+* Redefine signed object verification using object content under the
   first hash function
 
 2017-03-06 jrnieder@gmail.com
@@ -814,13 +814,13 @@ Incorporated suggestions from jonathantanmy and sbeller:
 
 2017-09-27 jrnieder@gmail.com, sbeller@google.com
 
-* use placeholder NewHash instead of SHA3-256
-* describe criteria for picking a hash function.
-* include a transition plan (thanks especially to Brandon Williams
+* Use placeholder NewHash instead of SHA3-256
+* Describe criteria for picking a hash function.
+* Include a transition plan (thanks especially to Brandon Williams
   for fleshing these ideas out)
-* define the translation table (thanks, Shawn Pearce[5], Jonathan
+* Define the translation table (thanks, Shawn Pearce[5], Jonathan
   Tan, and Masaya Suzuki)
-* avoid loose object overhead by packing more aggressively in
+* Avoid loose object overhead by packing more aggressively in
   "git gc --auto"
 
 Later history:
-- 
gitgitgadget



* [PATCH v3 4/6] doc hash-function-transition: fix incomplete sentence
  2021-02-05 18:22   ` [PATCH v3 " Thomas Ackermann via GitGitGadget
                       ` (2 preceding siblings ...)
  2021-02-05 18:22     ` [PATCH v3 3/6] doc hash-function-transition: use upper case consistently Thomas Ackermann via GitGitGadget
@ 2021-02-05 18:22     ` Thomas Ackermann via GitGitGadget
  2021-02-05 18:22     ` [PATCH v3 5/6] doc hash-function-transition: move rationale upwards Thomas Ackermann via GitGitGadget
  2021-02-05 18:22     ` [PATCH v3 6/6] doc: use https links Thomas Ackermann via GitGitGadget
  5 siblings, 0 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-05 18:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 Documentation/technical/hash-function-transition.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 9e13919a0e5b..5ff9ee027cff 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -315,7 +315,7 @@ for all objects in the object store.
 Reading an object's SHA-1 content
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 The SHA-1 content of an object can be read by converting all SHA-256 names
-its SHA-256 content references to SHA-1 names using the translation table.
+of its SHA-256 content references to SHA-1 names using the translation table.
 
 Fetch
 ~~~~~
-- 
gitgitgadget



* [PATCH v3 5/6] doc hash-function-transition: move rationale upwards
  2021-02-05 18:22   ` [PATCH v3 " Thomas Ackermann via GitGitGadget
                       ` (3 preceding siblings ...)
  2021-02-05 18:22     ` [PATCH v3 4/6] doc hash-function-transition: fix incomplete sentence Thomas Ackermann via GitGitGadget
@ 2021-02-05 18:22     ` Thomas Ackermann via GitGitGadget
  2021-02-05 20:48       ` Ævar Arnfjörð Bjarmason
  2021-02-05 18:22     ` [PATCH v3 6/6] doc: use https links Thomas Ackermann via GitGitGadget
  5 siblings, 1 reply; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-05 18:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Move rationale for new hash function to beginning of document
so that it appears before the concrete move to SHA-256 is described.

Remove some of the details about SHA-1 weaknesses and add references
to the details on how the new hash function was chosen instead.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 .../technical/hash-function-transition.txt    | 76 +++++++++----------
 1 file changed, 34 insertions(+), 42 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 5ff9ee027cff..0c4cb98cd4e9 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -33,16 +33,9 @@ researchers. On 23 February 2017 the SHAttered attack
 
 Git v2.13.0 and later subsequently moved to a hardened SHA-1
 implementation by default, which isn't vulnerable to the SHAttered
-attack.
+attack, but SHA-1 is still weak.
 
-Thus Git has in effect already migrated to a new hash that isn't SHA-1
-and doesn't share its vulnerabilities, its new hash function just
-happens to produce exactly the same output for all known inputs,
-except two PDFs published by the SHAttered researchers, and the new
-implementation (written by those researchers) claims to detect future
-cryptanalytic collision attacks.
-
-Regardless, it's considered prudent to move past any variant of SHA-1
+Thus it's considered prudent to move past any variant of SHA-1
 to a new hash. There's no guarantee that future attacks on SHA-1 won't
 be published in the future, and those attacks may not have viable
 mitigations.
@@ -57,6 +50,38 @@ SHA-1 still possesses the other properties such as fast object lookup
 and safe error checking, but other hash functions are equally suitable
 that are believed to be cryptographically secure.
 
+Choice of Hash
+--------------
+The hash to replace the hardened SHA-1 should be stronger than SHA-1
+was: we would like it to be trustworthy and useful in practice for at
+least 10 years.
+
+Some other relevant properties:
+
+1. A 256-bit hash (long enough to match common security practice; not
+   excessively long to hurt performance and disk usage).
+
+2. High quality implementations should be widely available (e.g., in
+   OpenSSL and Apple CommonCrypto).
+
+3. The hash function's properties should match Git's needs (e.g. Git
+   requires collision and 2nd preimage resistance and does not require
+   length extension resistance).
+
+4. As a tiebreaker, the hash should be fast to compute (fortunately
+   many contenders are faster than SHA-1).
+
+There were several contenders for a successor hash to SHA-1, including
+SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
+
+In late 2018 the project picked SHA-256 as its successor hash.
+
+See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
+NewHash, 2018-08-04) and numerous mailing list threads at the time,
+particularly the one starting at
+https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/
+for more information.
+
 Goals
 -----
 1. The transition to SHA-256 can be done one local repository at a time.
@@ -601,39 +626,6 @@ example:
 
     git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}
 
-Choice of Hash
---------------
-In early 2005, around the time that Git was written, Xiaoyun Wang,
-Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
-collisions in 2^69 operations. In August they published details.
-Luckily, no practical demonstrations of a collision in full SHA-1 were
-published until 10 years later, in 2017.
-
-Git v2.13.0 and later subsequently moved to a hardened SHA-1
-implementation by default that mitigates the SHAttered attack, but
-SHA-1 is still believed to be weak.
-
-The hash to replace this hardened SHA-1 should be stronger than SHA-1
-was: we would like it to be trustworthy and useful in practice for at
-least 10 years.
-
-Some other relevant properties:
-
-1. A 256-bit hash (long enough to match common security practice; not
-   excessively long to hurt performance and disk usage).
-
-2. High quality implementations should be widely available (e.g., in
-   OpenSSL and Apple CommonCrypto).
-
-3. The hash function's properties should match Git's needs (e.g. Git
-   requires collision and 2nd preimage resistance and does not require
-   length extension resistance).
-
-4. As a tiebreaker, the hash should be fast to compute (fortunately
-   many contenders are faster than SHA-1).
-
-We choose SHA-256.
-
 Transition plan
 ---------------
 Some initial steps can be implemented independently of one another:
-- 
gitgitgadget



* [PATCH v3 6/6] doc: use https links
  2021-02-05 18:22   ` [PATCH v3 " Thomas Ackermann via GitGitGadget
                       ` (4 preceding siblings ...)
  2021-02-05 18:22     ` [PATCH v3 5/6] doc hash-function-transition: move rationale upwards Thomas Ackermann via GitGitGadget
@ 2021-02-05 18:22     ` Thomas Ackermann via GitGitGadget
  5 siblings, 0 replies; 25+ messages in thread
From: Thomas Ackermann via GitGitGadget @ 2021-02-05 18:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason,
	brian m. carlson, Thomas Ackermann, Thomas Ackermann

From: Thomas Ackermann <th.acker@arcor.de>

Use only https links for lore.kernel.org.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
---
 Documentation/technical/hash-function-transition.txt | 10 +++++-----
 t/t0021-conversion.sh                                |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
index 0c4cb98cd4e9..7c1630bf8324 100644
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@@ -781,7 +781,7 @@ Document History
 bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
 sbeller@google.com
 
-* Initial version sent to http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
+* Initial version sent to https://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
 
 2017-03-03 jrnieder@gmail.com
 Incorporated suggestions from jonathantanmy and sbeller:
@@ -823,8 +823,8 @@ Later history:
 
 References:
 
- [1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
- [2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
- [3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
- [4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
+ [1] https://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
+ [2] https://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
+ [3] https://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
+ [4] https://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
  [5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index e4c4de5c7456..e828ee964c1b 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -34,7 +34,7 @@ filter_git () {
 # Compare two files and ensure that `clean` and `smudge` respectively are
 # called at least once if specified in the `expect` file. The actual
 # invocation count is not relevant because their number can vary.
-# c.f. http://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
+# c.f. https://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
 test_cmp_count () {
 	expect=$1
 	actual=$2
@@ -49,7 +49,7 @@ test_cmp_count () {
 
 # Compare two files but exclude all `clean` invocations because Git can
 # call `clean` zero or more times.
-# c.f. http://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
+# c.f. https://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
 test_cmp_exclude_clean () {
 	expect=$1
 	actual=$2
-- 
gitgitgadget


* Re: [PATCH v3 5/6] doc hash-function-transition: move rationale upwards
  2021-02-05 18:22     ` [PATCH v3 5/6] doc hash-function-transition: move rationale upwards Thomas Ackermann via GitGitGadget
@ 2021-02-05 20:48       ` Ævar Arnfjörð Bjarmason
  2021-02-05 21:49         ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-02-05 20:48 UTC (permalink / raw)
  To: Thomas Ackermann via GitGitGadget
  Cc: git, Junio C Hamano, brian m. carlson, Thomas Ackermann


On Fri, Feb 05 2021, Thomas Ackermann via GitGitGadget wrote:

> diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt
> index 5ff9ee027cff..0c4cb98cd4e9 100644
> --- a/Documentation/technical/hash-function-transition.txt
> +++ b/Documentation/technical/hash-function-transition.txt
> @@ -33,16 +33,9 @@ researchers. On 23 February 2017 the SHAttered attack
>  
>  Git v2.13.0 and later subsequently moved to a hardened SHA-1
>  implementation by default, which isn't vulnerable to the SHAttered
> -attack.
> +attack, but SHA-1 is still weak.
>  
> -Thus Git has in effect already migrated to a new hash that isn't SHA-1
> -and doesn't share its vulnerabilities, its new hash function just
> -happens to produce exactly the same output for all known inputs,
> -except two PDFs published by the SHAttered researchers, and the new
> -implementation (written by those researchers) claims to detect future
> -cryptanalytic collision attacks.
> -
> -Regardless, it's considered prudent to move past any variant of SHA-1
> +Thus it's considered prudent to move past any variant of SHA-1
>  to a new hash. There's no guarantee that future attacks on SHA-1 won't
>  be published in the future, and those attacks may not have viable
>  mitigations.
> @@ -57,6 +50,38 @@ SHA-1 still possesses the other properties such as fast object lookup
>  and safe error checking, but other hash functions are equally suitable
>  that are believed to be cryptographically secure.

I missed version 2 of this. I don't think it's an improvement to
completely remove the description of us using sha1collisiondetection by
default, i.e. effectively revert 5988eb631a3 (doc
hash-function-transition: clarify what SHAttered means, 2018-03-26)

I can see how my comment on v1 could have been read like that. FWIW I
didn't mean remove the whole thing, but that I don't think it adds much
value to our description of how we use SHA-1 to go into the level of
detail of mentioning several researchers by name, there's Wikipedia for
that.

I think what we should instead do is have some brief summary of the
vulnerabilities and how they're impacting git.

Maybe I'm barking up the wrong tree here, and what I'm describing should
be in a "man 5 gitsecurity" or something.

But anyway, I think it adds a lot of value to have somewhere not just
what amounts to "sha-1 sucks, see research papers", but also some
brief human-readable summary of what the practical impact is on users.

In 2018 it was true that sha1collisiondetection was mitigating the known
attack in practice, and that's also true about this new attack[1] (maybe
there's others I missed ...).

Then there's the fact that we don't *just* rely on SHA-1, but also on
e.g. the "don't re-write objects we have already" behavior. So as a
practical attack on someone using git ...

Oh, and the attacks currently all seem to require file formats like JPEG
or PDF for anything practical, i.e. being able to spew in lots of
arbitrary data into some data segment, as opposed to e.g. creating a
program that compiles.

None of this is meant as some overall defense of SHA-1, just that most
of our users aren't security researchers, and will be helped by a
summary of how using a system built on SHA-1, which they have read is
"broken" or "believed to be weak", translates to a threat to them in
practice.

1. https://eprint.iacr.org/2020/014.pdf


* Re: [PATCH v3 5/6] doc hash-function-transition: move rationale upwards
  2021-02-05 20:48       ` Ævar Arnfjörð Bjarmason
@ 2021-02-05 21:49         ` Junio C Hamano
  0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2021-02-05 21:49 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Thomas Ackermann via GitGitGadget, git, brian m. carlson,
	Thomas Ackermann

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I missed version 2 of this. I don't think it's an improvement to
> completely remove the description of us using sha1collisiondetection by
> default, i.e. effectively revert 5988eb631a3 (doc
> hash-function-transition: clarify what SHAttered means, 2018-03-26)
> ...
> I can see how my comment on v1 could have been read like that. FWIW I
> didn't mean remove the whole thing, but that I don't think it adds much
> value to our description of how we use SHA-1 to go into the level of
> detail of mentioning several researchers by name, there's Wikipedia for
> that.

True.

> I think what we should instead do is have some brief summary of the
> vulnerabilities and how they're impacting git.

I am not sure.

> Maybe I'm barking up the wrong tree here, and what I'm describing should
> be in a "man 5 gitsecurity" or something.

I agree with you that it belongs in some other document, but not
here, where the primary thing is to outline how the migration will
go, and the part we are seeing is merely giving a background story.
At this point in time, readers would not have to learn the details
from this document.  People already know that we are not happy with
SHA-1 and are on our way to migrating to SHA-256.

> But anyway, I think it adds a lot of value to have somewhere not just
> what amounts to "sha-1 sucks, see research papers", but also some
> brief human-readable summary of what the practical impact is on users.

Yeah.  I think what Thomas has in [v3 5/6] gives our readers about the
right level of detail.  If I were to change anything, I'd do "but
SHA-1 is {+considered+} still weak."

> In 2018 it was true that sha1collisiondetection was mitigating the known
> attack in practice, and that's also true about this new attack[1] (maybe
> there's others I missed ...).
>
> Then there's the fact that we don't *just* rely on SHA-1, but also on
> e.g. the "don't re-write objects we have already" behavior. So as a
> practical attack on someone using git ...
>
> Oh, and the attacks currently all seem to require file formats like JPEG
> or PDF for anything practical, i.e. being able to spew in lots of
> arbitrary data into some data segment, as opposed to e.g. creating a
> program that compiles.
>
> None of this is meant as some overall defense of SHA-1, just that most
> of our users aren't security researchers, and will be helped by a
> summary of how using a system built on SHA-1, which they have read is
> "broken" or "believed to be weak", translates to a threat to them in
> practice.

All of the above are good things for somebody to write about, but I
am not sure they fit well in the context of this document.  This is
primarily about how the migration should go, and the target audience
is those of us who are already committed to the plan.  The backstory
on how the plan came about makes a nice introductory reading but it
would not be productive to spend too many bits on it, given the purpose
of the document and its target audience, I would think.

Thanks.


