git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* SHA1dc on mac
@ 2020-02-12  8:56 Mike Hommey
  2020-02-12 16:46 ` Eric Sunshine
  0 siblings, 1 reply; 17+ messages in thread
From: Mike Hommey @ 2020-02-12  8:56 UTC (permalink / raw)
  To: git

Hi,

If I'm not mistaken in my reading of the various files involved, it
looks like for some reason, building git on mac leads to using Apple
Common Crypto for SHA1, rather than SHA1dc, which seems unfortunate.
Is that really expected? More generally, at this point, should anything
other than SHA1dc be supported as a build option at all?

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: SHA1dc on mac
  2020-02-12  8:56 SHA1dc on mac Mike Hommey
@ 2020-02-12 16:46 ` Eric Sunshine
  2020-02-12 22:31   ` Mike Hommey
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Sunshine @ 2020-02-12 16:46 UTC (permalink / raw)
  To: Mike Hommey; +Cc: Git List

On Wed, Feb 12, 2020 at 3:57 AM Mike Hommey <mh@glandium.org> wrote:
> If I'm not mistaken in my reading of the various files involved, it
> looks like for some reason, building git on mac leads to using Apple
> Common Crypto for SHA1, rather than SHA1dc, which seems unfortunate.
> Is that really expected?

There was a discussion on this topic a while back[1], and it does seem
that the behavior you describe is intentional[2].

> More generally, at this point, should anything
> other than SHA1dc be supported as a build option at all?

The conclusion [2,3] was that it likely would make sense to drop
support for Apple's CommonCrypto altogether, although nobody has yet
stepped up to do the work.

[1]: https://lore.kernel.org/git/CAMYxyaVQyVRQb-b0nVv412tMZ3rEnOfUPRakg2dEREg5_Ba5Ag@mail.gmail.com/T/
[2]: https://lore.kernel.org/git/20160102234923.GA14424@gmail.com/
[3]: https://lore.kernel.org/git/CAPig+cQ5kKAt2_RQnqT7Rn=uGmHV9VvxpQ+UgDPOj=D=pq6arg@mail.gmail.com/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: SHA1dc on mac
  2020-02-12 16:46 ` Eric Sunshine
@ 2020-02-12 22:31   ` Mike Hommey
  2020-02-12 22:40     ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Mike Hommey @ 2020-02-12 22:31 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List

On Wed, Feb 12, 2020 at 11:46:31AM -0500, Eric Sunshine wrote:
> On Wed, Feb 12, 2020 at 3:57 AM Mike Hommey <mh@glandium.org> wrote:
> > If I'm not mistaken in my reading of the various files involved, it
> > looks like for some reason, building git on mac leads to using Apple
> > Common Crypto for SHA1, rather than SHA1dc, which seems unfortunate.
> > Is that really expected?
> 
> There was a discussion on this topic a while back[1], and it does seem
> that the behavior you describe is intentional[2].

That discussion predates SHA1dc, though.

> > More generally, at this point, should anything
> > other than SHA1dc be supported as a build option at all?
> 
> The conclusion [2,3] was that it likely would make sense to drop
> support for Apple's CommonCrypto altogether, although nobody has yet
> stepped up to do the work.

I wasn't explicit in my question, but I meant more broadly than Apple
Common Crypto. There is still opt-in support for openssl sha1 and PPC
sha1.

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: SHA1dc on mac
  2020-02-12 22:31   ` Mike Hommey
@ 2020-02-12 22:40     ` Junio C Hamano
  2020-02-23 22:37       ` [PATCH] Remove non-SHA1dc sha1 implementations Mike Hommey
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2020-02-12 22:40 UTC (permalink / raw)
  To: Mike Hommey; +Cc: Eric Sunshine, Git List

Mike Hommey <mh@glandium.org> writes:

> On Wed, Feb 12, 2020 at 11:46:31AM -0500, Eric Sunshine wrote:
>> On Wed, Feb 12, 2020 at 3:57 AM Mike Hommey <mh@glandium.org> wrote:
>> > If I'm not mistaken in my reading of the various files involved, it
>> > looks like for some reason, building git on mac leads to using Apple
>> > Common Crypto for SHA1, rather than SHA1dc, which seems unfortunate.
>> > Is that really expected?
>> 
>> There was a discussion on this topic a while back[1], and it does seem
>> that the behavior you describe is intentional[2].
>
> That discussion predates SHA1dc, though.

Yes, but the essense is the same.  It was phrased as "is there a
good reason to prefer CommonCrypto over block-sha1?" but it really
was "is there a good reason to prefer CommonCrypto over the best we
offer?"  And the best we offer, which used to be block-sha1, is now
sha1dc.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] Remove non-SHA1dc sha1 implementations
  2020-02-12 22:40     ` Junio C Hamano
@ 2020-02-23 22:37       ` Mike Hommey
  2020-02-24  4:47         ` Jeff King
  2022-03-19  1:02         ` [PATCH] ppc: remove custom SHA-1 implementation Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 17+ messages in thread
From: Mike Hommey @ 2020-02-23 22:37 UTC (permalink / raw)
  To: git; +Cc: gitster, Mike Hommey

It is 2020, and with the weakening of SHA1 security-wise, there doesn't
seem to be a reason to support anything else than SHA1dc, with collision
detection.

Signed-off-by: Mike Hommey <mh@glandium.org>
---

Note: I only tested building on Linux.

 INSTALL           |   5 -
 Makefile          |  67 ++-----------
 block-sha1/sha1.c | 251 ----------------------------------------------
 block-sha1/sha1.h |  22 ----
 config.mak.uname  |   1 -
 configure.ac      |   3 -
 hash.h            |  24 -----
 ppc/sha1.c        |  72 -------------
 ppc/sha1.h        |  25 -----
 ppc/sha1ppc.S     | 224 -----------------------------------------
 10 files changed, 6 insertions(+), 688 deletions(-)
 delete mode 100644 block-sha1/sha1.c
 delete mode 100644 block-sha1/sha1.h
 delete mode 100644 ppc/sha1.c
 delete mode 100644 ppc/sha1.h
 delete mode 100644 ppc/sha1ppc.S

diff --git a/INSTALL b/INSTALL
index 22c364f34f..91d649f99e 100644
--- a/INSTALL
+++ b/INSTALL
@@ -133,11 +133,6 @@ Issues of note:
 	  you are using libcurl older than 7.34.0.  Otherwise you can use
 	  NO_OPENSSL without losing git-imap-send.
 
-	  By default, git uses OpenSSL for SHA1 but it will use its own
-	  library (inspired by Mozilla's) with either NO_OPENSSL or
-	  BLK_SHA1.  Also included is a version optimized for PowerPC
-	  (PPC_SHA1).
-
 	- "libcurl" library is used by git-http-fetch, git-fetch, and, if
 	  the curl version >= 7.34.0, for git-imap-send.  You might also
 	  want the "curl" executable for debugging purposes. If you do not
diff --git a/Makefile b/Makefile
index b7d7374dac..5b4307d332 100644
--- a/Makefile
+++ b/Makefile
@@ -149,37 +149,15 @@ all::
 # specify your own (or DarwinPort's) include directories and
 # library directories by defining CFLAGS and LDFLAGS appropriately.
 #
-# Define NO_APPLE_COMMON_CRYPTO if you are building on Darwin/Mac OS X
-# and do not want to use Apple's CommonCrypto library.  This allows you
-# to provide your own OpenSSL library, for example from MacPorts.
-#
-# Define BLK_SHA1 environment variable to make use of the bundled
-# optimized C SHA1 routine.
-#
-# Define PPC_SHA1 environment variable when running make to make use of
-# a bundled SHA1 routine optimized for PowerPC.
-#
-# Define DC_SHA1 to unconditionally enable the collision-detecting sha1
-# algorithm. This is slower, but may detect attempted collision attacks.
-# Takes priority over other *_SHA1 knobs.
-#
-# Define DC_SHA1_EXTERNAL in addition to DC_SHA1 if you want to build / link
-# git with the external SHA1 collision-detect library.
+# Define DC_SHA1_EXTERNAL if you want to build / link git with the
+# external SHA1 collision-detect library.
 # Without this option, i.e. the default behavior is to build git with its
 # own built-in code (or submodule).
 #
-# Define DC_SHA1_SUBMODULE in addition to DC_SHA1 to use the
-# sha1collisiondetection shipped as a submodule instead of the
-# non-submodule copy in sha1dc/. This is an experimental option used
-# by the git project to migrate to using sha1collisiondetection as a
-# submodule.
-#
-# Define OPENSSL_SHA1 environment variable when running make to link
-# with the SHA1 routine from openssl library.
-#
-# Define SHA1_MAX_BLOCK_SIZE to limit the amount of data that will be hashed
-# in one call to the platform's SHA1_Update(). e.g. APPLE_COMMON_CRYPTO
-# wants 'SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L' defined.
+# Define DC_SHA1_SUBMODULE to use the sha1collisiondetection shipped
+# as a submodule instead of the non-submodule copy in sha1dc/. This is
+# an experimental option used by the git project to migrate to using
+# sha1collisiondetection as a submodule.
 #
 # Define BLK_SHA256 to use the built-in SHA-256 routines.
 #
@@ -1296,11 +1274,6 @@ ifeq ($(uname_S),Darwin)
 			BASIC_LDFLAGS += -L/opt/local/lib
 		endif
 	endif
-	ifndef NO_APPLE_COMMON_CRYPTO
-		NO_OPENSSL = YesPlease
-		APPLE_COMMON_CRYPTO = YesPlease
-		COMPAT_CFLAGS += -DAPPLE_COMMON_CRYPTO
-	endif
 	NO_REGEX = YesPlease
 	PTHREAD_LIBS =
 endif
@@ -1430,9 +1403,6 @@ ifdef NEEDS_SSL_WITH_CRYPTO
 else
 	LIB_4_CRYPTO = $(OPENSSL_LINK) -lcrypto
 endif
-ifdef APPLE_COMMON_CRYPTO
-	LIB_4_CRYPTO += -framework Security -framework CoreFoundation
-endif
 endif
 ifndef NO_ICONV
 	ifdef NEEDS_LIBICONV
@@ -1647,27 +1617,6 @@ ifdef NO_POSIX_GOODIES
 	BASIC_CFLAGS += -DNO_POSIX_GOODIES
 endif
 
-ifdef APPLE_COMMON_CRYPTO
-	# Apple CommonCrypto requires chunking
-	SHA1_MAX_BLOCK_SIZE = 1024L*1024L*1024L
-endif
-
-ifdef OPENSSL_SHA1
-	EXTLIBS += $(LIB_4_CRYPTO)
-	BASIC_CFLAGS += -DSHA1_OPENSSL
-else
-ifdef BLK_SHA1
-	LIB_OBJS += block-sha1/sha1.o
-	BASIC_CFLAGS += -DSHA1_BLK
-else
-ifdef PPC_SHA1
-	LIB_OBJS += ppc/sha1.o ppc/sha1ppc.o
-	BASIC_CFLAGS += -DSHA1_PPC
-else
-ifdef APPLE_COMMON_CRYPTO
-	COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL
-	BASIC_CFLAGS += -DSHA1_APPLE
-else
 	DC_SHA1 := YesPlease
 	BASIC_CFLAGS += -DSHA1_DC
 	LIB_OBJS += sha1dc_git.o
@@ -1694,10 +1643,6 @@ endif
 		-DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" \
 		-DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\""
 endif
-endif
-endif
-endif
-endif
 
 ifdef OPENSSL_SHA256
 	EXTLIBS += $(LIB_4_CRYPTO)
diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
deleted file mode 100644
index 22b125cf8c..0000000000
--- a/block-sha1/sha1.c
+++ /dev/null
@@ -1,251 +0,0 @@
-/*
- * SHA1 routine optimized to do word accesses rather than byte accesses,
- * and to avoid unnecessary copies into the context array.
- *
- * This was initially based on the Mozilla SHA1 implementation, although
- * none of the original Mozilla code remains.
- */
-
-/* this is only to get definitions for memcpy(), ntohl() and htonl() */
-#include "../git-compat-util.h"
-
-#include "sha1.h"
-
-#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
-
-/*
- * Force usage of rol or ror by selecting the one with the smaller constant.
- * It _can_ generate slightly smaller code (a constant of 1 is special), but
- * perhaps more importantly it's possibly faster on any uarch that does a
- * rotate with a loop.
- */
-
-#define SHA_ASM(op, x, n) ({ unsigned int __res; __asm__(op " %1,%0":"=r" (__res):"i" (n), "0" (x)); __res; })
-#define SHA_ROL(x,n)	SHA_ASM("rol", x, n)
-#define SHA_ROR(x,n)	SHA_ASM("ror", x, n)
-
-#else
-
-#define SHA_ROT(X,l,r)	(((X) << (l)) | ((X) >> (r)))
-#define SHA_ROL(X,n)	SHA_ROT(X,n,32-(n))
-#define SHA_ROR(X,n)	SHA_ROT(X,32-(n),n)
-
-#endif
-
-/*
- * If you have 32 registers or more, the compiler can (and should)
- * try to change the array[] accesses into registers. However, on
- * machines with less than ~25 registers, that won't really work,
- * and at least gcc will make an unholy mess of it.
- *
- * So to avoid that mess which just slows things down, we force
- * the stores to memory to actually happen (we might be better off
- * with a 'W(t)=(val);asm("":"+m" (W(t))' there instead, as
- * suggested by Artur Skawina - that will also make gcc unable to
- * try to do the silly "optimize away loads" part because it won't
- * see what the value will be).
- *
- * Ben Herrenschmidt reports that on PPC, the C version comes close
- * to the optimized asm with this (ie on PPC you don't want that
- * 'volatile', since there are lots of registers).
- *
- * On ARM we get the best code generation by forcing a full memory barrier
- * between each SHA_ROUND, otherwise gcc happily get wild with spilling and
- * the stack frame size simply explode and performance goes down the drain.
- */
-
-#if defined(__i386__) || defined(__x86_64__)
-  #define setW(x, val) (*(volatile unsigned int *)&W(x) = (val))
-#elif defined(__GNUC__) && defined(__arm__)
-  #define setW(x, val) do { W(x) = (val); __asm__("":::"memory"); } while (0)
-#else
-  #define setW(x, val) (W(x) = (val))
-#endif
-
-/* This "rolls" over the 512-bit array */
-#define W(x) (array[(x)&15])
-
-/*
- * Where do we get the source from? The first 16 iterations get it from
- * the input data, the next mix it from the 512-bit array.
- */
-#define SHA_SRC(t) get_be32((unsigned char *) block + (t)*4)
-#define SHA_MIX(t) SHA_ROL(W((t)+13) ^ W((t)+8) ^ W((t)+2) ^ W(t), 1);
-
-#define SHA_ROUND(t, input, fn, constant, A, B, C, D, E) do { \
-	unsigned int TEMP = input(t); setW(t, TEMP); \
-	E += TEMP + SHA_ROL(A,5) + (fn) + (constant); \
-	B = SHA_ROR(B, 2); } while (0)
-
-#define T_0_15(t, A, B, C, D, E)  SHA_ROUND(t, SHA_SRC, (((C^D)&B)^D) , 0x5a827999, A, B, C, D, E )
-#define T_16_19(t, A, B, C, D, E) SHA_ROUND(t, SHA_MIX, (((C^D)&B)^D) , 0x5a827999, A, B, C, D, E )
-#define T_20_39(t, A, B, C, D, E) SHA_ROUND(t, SHA_MIX, (B^C^D) , 0x6ed9eba1, A, B, C, D, E )
-#define T_40_59(t, A, B, C, D, E) SHA_ROUND(t, SHA_MIX, ((B&C)+(D&(B^C))) , 0x8f1bbcdc, A, B, C, D, E )
-#define T_60_79(t, A, B, C, D, E) SHA_ROUND(t, SHA_MIX, (B^C^D) ,  0xca62c1d6, A, B, C, D, E )
-
-static void blk_SHA1_Block(blk_SHA_CTX *ctx, const void *block)
-{
-	unsigned int A,B,C,D,E;
-	unsigned int array[16];
-
-	A = ctx->H[0];
-	B = ctx->H[1];
-	C = ctx->H[2];
-	D = ctx->H[3];
-	E = ctx->H[4];
-
-	/* Round 1 - iterations 0-16 take their input from 'block' */
-	T_0_15( 0, A, B, C, D, E);
-	T_0_15( 1, E, A, B, C, D);
-	T_0_15( 2, D, E, A, B, C);
-	T_0_15( 3, C, D, E, A, B);
-	T_0_15( 4, B, C, D, E, A);
-	T_0_15( 5, A, B, C, D, E);
-	T_0_15( 6, E, A, B, C, D);
-	T_0_15( 7, D, E, A, B, C);
-	T_0_15( 8, C, D, E, A, B);
-	T_0_15( 9, B, C, D, E, A);
-	T_0_15(10, A, B, C, D, E);
-	T_0_15(11, E, A, B, C, D);
-	T_0_15(12, D, E, A, B, C);
-	T_0_15(13, C, D, E, A, B);
-	T_0_15(14, B, C, D, E, A);
-	T_0_15(15, A, B, C, D, E);
-
-	/* Round 1 - tail. Input from 512-bit mixing array */
-	T_16_19(16, E, A, B, C, D);
-	T_16_19(17, D, E, A, B, C);
-	T_16_19(18, C, D, E, A, B);
-	T_16_19(19, B, C, D, E, A);
-
-	/* Round 2 */
-	T_20_39(20, A, B, C, D, E);
-	T_20_39(21, E, A, B, C, D);
-	T_20_39(22, D, E, A, B, C);
-	T_20_39(23, C, D, E, A, B);
-	T_20_39(24, B, C, D, E, A);
-	T_20_39(25, A, B, C, D, E);
-	T_20_39(26, E, A, B, C, D);
-	T_20_39(27, D, E, A, B, C);
-	T_20_39(28, C, D, E, A, B);
-	T_20_39(29, B, C, D, E, A);
-	T_20_39(30, A, B, C, D, E);
-	T_20_39(31, E, A, B, C, D);
-	T_20_39(32, D, E, A, B, C);
-	T_20_39(33, C, D, E, A, B);
-	T_20_39(34, B, C, D, E, A);
-	T_20_39(35, A, B, C, D, E);
-	T_20_39(36, E, A, B, C, D);
-	T_20_39(37, D, E, A, B, C);
-	T_20_39(38, C, D, E, A, B);
-	T_20_39(39, B, C, D, E, A);
-
-	/* Round 3 */
-	T_40_59(40, A, B, C, D, E);
-	T_40_59(41, E, A, B, C, D);
-	T_40_59(42, D, E, A, B, C);
-	T_40_59(43, C, D, E, A, B);
-	T_40_59(44, B, C, D, E, A);
-	T_40_59(45, A, B, C, D, E);
-	T_40_59(46, E, A, B, C, D);
-	T_40_59(47, D, E, A, B, C);
-	T_40_59(48, C, D, E, A, B);
-	T_40_59(49, B, C, D, E, A);
-	T_40_59(50, A, B, C, D, E);
-	T_40_59(51, E, A, B, C, D);
-	T_40_59(52, D, E, A, B, C);
-	T_40_59(53, C, D, E, A, B);
-	T_40_59(54, B, C, D, E, A);
-	T_40_59(55, A, B, C, D, E);
-	T_40_59(56, E, A, B, C, D);
-	T_40_59(57, D, E, A, B, C);
-	T_40_59(58, C, D, E, A, B);
-	T_40_59(59, B, C, D, E, A);
-
-	/* Round 4 */
-	T_60_79(60, A, B, C, D, E);
-	T_60_79(61, E, A, B, C, D);
-	T_60_79(62, D, E, A, B, C);
-	T_60_79(63, C, D, E, A, B);
-	T_60_79(64, B, C, D, E, A);
-	T_60_79(65, A, B, C, D, E);
-	T_60_79(66, E, A, B, C, D);
-	T_60_79(67, D, E, A, B, C);
-	T_60_79(68, C, D, E, A, B);
-	T_60_79(69, B, C, D, E, A);
-	T_60_79(70, A, B, C, D, E);
-	T_60_79(71, E, A, B, C, D);
-	T_60_79(72, D, E, A, B, C);
-	T_60_79(73, C, D, E, A, B);
-	T_60_79(74, B, C, D, E, A);
-	T_60_79(75, A, B, C, D, E);
-	T_60_79(76, E, A, B, C, D);
-	T_60_79(77, D, E, A, B, C);
-	T_60_79(78, C, D, E, A, B);
-	T_60_79(79, B, C, D, E, A);
-
-	ctx->H[0] += A;
-	ctx->H[1] += B;
-	ctx->H[2] += C;
-	ctx->H[3] += D;
-	ctx->H[4] += E;
-}
-
-void blk_SHA1_Init(blk_SHA_CTX *ctx)
-{
-	ctx->size = 0;
-
-	/* Initialize H with the magic constants (see FIPS180 for constants) */
-	ctx->H[0] = 0x67452301;
-	ctx->H[1] = 0xefcdab89;
-	ctx->H[2] = 0x98badcfe;
-	ctx->H[3] = 0x10325476;
-	ctx->H[4] = 0xc3d2e1f0;
-}
-
-void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *data, unsigned long len)
-{
-	unsigned int lenW = ctx->size & 63;
-
-	ctx->size += len;
-
-	/* Read the data into W and process blocks as they get full */
-	if (lenW) {
-		unsigned int left = 64 - lenW;
-		if (len < left)
-			left = len;
-		memcpy(lenW + (char *)ctx->W, data, left);
-		lenW = (lenW + left) & 63;
-		len -= left;
-		data = ((const char *)data + left);
-		if (lenW)
-			return;
-		blk_SHA1_Block(ctx, ctx->W);
-	}
-	while (len >= 64) {
-		blk_SHA1_Block(ctx, data);
-		data = ((const char *)data + 64);
-		len -= 64;
-	}
-	if (len)
-		memcpy(ctx->W, data, len);
-}
-
-void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx)
-{
-	static const unsigned char pad[64] = { 0x80 };
-	unsigned int padlen[2];
-	int i;
-
-	/* Pad with a binary 1 (ie 0x80), then zeroes, then length */
-	padlen[0] = htonl((uint32_t)(ctx->size >> 29));
-	padlen[1] = htonl((uint32_t)(ctx->size << 3));
-
-	i = ctx->size & 63;
-	blk_SHA1_Update(ctx, pad, 1 + (63 & (55 - i)));
-	blk_SHA1_Update(ctx, padlen, 8);
-
-	/* Output hash */
-	for (i = 0; i < 5; i++)
-		put_be32(hashout + i * 4, ctx->H[i]);
-}
diff --git a/block-sha1/sha1.h b/block-sha1/sha1.h
deleted file mode 100644
index 4df6747752..0000000000
--- a/block-sha1/sha1.h
+++ /dev/null
@@ -1,22 +0,0 @@
-/*
- * SHA1 routine optimized to do word accesses rather than byte accesses,
- * and to avoid unnecessary copies into the context array.
- *
- * This was initially based on the Mozilla SHA1 implementation, although
- * none of the original Mozilla code remains.
- */
-
-typedef struct {
-	unsigned long long size;
-	unsigned int H[5];
-	unsigned int W[16];
-} blk_SHA_CTX;
-
-void blk_SHA1_Init(blk_SHA_CTX *ctx);
-void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *dataIn, unsigned long len);
-void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx);
-
-#define platform_SHA_CTX	blk_SHA_CTX
-#define platform_SHA1_Init	blk_SHA1_Init
-#define platform_SHA1_Update	blk_SHA1_Update
-#define platform_SHA1_Final	blk_SHA1_Final
diff --git a/config.mak.uname b/config.mak.uname
index cc8efd95b1..785b9265c3 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -116,7 +116,6 @@ ifeq ($(uname_S),Darwin)
 	# i.e. "begins with [15678] and a dot" means "10.4.* or older".
 	ifeq ($(shell expr "$(uname_R)" : '[15678]\.'),2)
 		OLD_ICONV = UnfortunatelyYes
-		NO_APPLE_COMMON_CRYPTO = YesPlease
 	endif
 	ifeq ($(shell expr "$(uname_R)" : '[15]\.'),2)
 		NO_STRLCPY = YesPlease
diff --git a/configure.ac b/configure.ac
index 66aedb9288..dd39b7ecdb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -237,9 +237,6 @@ AC_MSG_NOTICE([CHECKS for site configuration])
 # tests.  These tests take up a significant amount of the total test time
 # but are not needed unless you plan to talk to SVN repos.
 #
-# Define PPC_SHA1 environment variable when running make to make use of
-# a bundled SHA1 routine optimized for PowerPC.
-#
 # Define NO_OPENSSL environment variable if you do not have OpenSSL.
 #
 # Define OPENSSLDIR=/foo/bar if your openssl header and library files are in
diff --git a/hash.h b/hash.h
index 52a4f1a3f4..e1a3d00b13 100644
--- a/hash.h
+++ b/hash.h
@@ -3,17 +3,7 @@
 
 #include "git-compat-util.h"
 
-#if defined(SHA1_PPC)
-#include "ppc/sha1.h"
-#elif defined(SHA1_APPLE)
-#include <CommonCrypto/CommonDigest.h>
-#elif defined(SHA1_OPENSSL)
-#include <openssl/sha.h>
-#elif defined(SHA1_DC)
 #include "sha1dc_git.h"
-#else /* SHA1_BLK */
-#include "block-sha1/sha1.h"
-#endif
 
 #if defined(SHA256_GCRYPT)
 #include "sha256/gcrypt.h"
@@ -23,20 +13,6 @@
 #include "sha256/block/sha256.h"
 #endif
 
-#ifndef platform_SHA_CTX
-/*
- * platform's underlying implementation of SHA-1; could be OpenSSL,
- * blk_SHA, Apple CommonCrypto, etc...  Note that the relevant
- * SHA-1 header may have already defined platform_SHA_CTX for our
- * own implementations like block-sha1 and ppc-sha1, so we list
- * the default for OpenSSL compatible SHA-1 implementations here.
- */
-#define platform_SHA_CTX	SHA_CTX
-#define platform_SHA1_Init	SHA1_Init
-#define platform_SHA1_Update	SHA1_Update
-#define platform_SHA1_Final    	SHA1_Final
-#endif
-
 #define git_SHA_CTX		platform_SHA_CTX
 #define git_SHA1_Init		platform_SHA1_Init
 #define git_SHA1_Update		platform_SHA1_Update
diff --git a/ppc/sha1.c b/ppc/sha1.c
deleted file mode 100644
index 1b705cee1f..0000000000
--- a/ppc/sha1.c
+++ /dev/null
@@ -1,72 +0,0 @@
-/*
- * SHA-1 implementation.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- *
- * This version assumes we are running on a big-endian machine.
- * It calls an external sha1_core() to process blocks of 64 bytes.
- */
-#include <stdio.h>
-#include <string.h>
-#include "sha1.h"
-
-void ppc_sha1_core(uint32_t *hash, const unsigned char *p,
-		   unsigned int nblocks);
-
-int ppc_SHA1_Init(ppc_SHA_CTX *c)
-{
-	c->hash[0] = 0x67452301;
-	c->hash[1] = 0xEFCDAB89;
-	c->hash[2] = 0x98BADCFE;
-	c->hash[3] = 0x10325476;
-	c->hash[4] = 0xC3D2E1F0;
-	c->len = 0;
-	c->cnt = 0;
-	return 0;
-}
-
-int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *ptr, unsigned long n)
-{
-	unsigned long nb;
-	const unsigned char *p = ptr;
-
-	c->len += (uint64_t) n << 3;
-	while (n != 0) {
-		if (c->cnt || n < 64) {
-			nb = 64 - c->cnt;
-			if (nb > n)
-				nb = n;
-			memcpy(&c->buf.b[c->cnt], p, nb);
-			if ((c->cnt += nb) == 64) {
-				ppc_sha1_core(c->hash, c->buf.b, 1);
-				c->cnt = 0;
-			}
-		} else {
-			nb = n >> 6;
-			ppc_sha1_core(c->hash, p, nb);
-			nb <<= 6;
-		}
-		n -= nb;
-		p += nb;
-	}
-	return 0;
-}
-
-int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c)
-{
-	unsigned int cnt = c->cnt;
-
-	c->buf.b[cnt++] = 0x80;
-	if (cnt > 56) {
-		if (cnt < 64)
-			memset(&c->buf.b[cnt], 0, 64 - cnt);
-		ppc_sha1_core(c->hash, c->buf.b, 1);
-		cnt = 0;
-	}
-	if (cnt < 56)
-		memset(&c->buf.b[cnt], 0, 56 - cnt);
-	c->buf.l[7] = c->len;
-	ppc_sha1_core(c->hash, c->buf.b, 1);
-	memcpy(hash, c->hash, 20);
-	return 0;
-}
diff --git a/ppc/sha1.h b/ppc/sha1.h
deleted file mode 100644
index 9b24b32615..0000000000
--- a/ppc/sha1.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*
- * SHA-1 implementation.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- */
-#include <stdint.h>
-
-typedef struct {
-	uint32_t hash[5];
-	uint32_t cnt;
-	uint64_t len;
-	union {
-		unsigned char b[64];
-		uint64_t l[8];
-	} buf;
-} ppc_SHA_CTX;
-
-int ppc_SHA1_Init(ppc_SHA_CTX *c);
-int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *p, unsigned long n);
-int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c);
-
-#define platform_SHA_CTX	ppc_SHA_CTX
-#define platform_SHA1_Init	ppc_SHA1_Init
-#define platform_SHA1_Update	ppc_SHA1_Update
-#define platform_SHA1_Final	ppc_SHA1_Final
diff --git a/ppc/sha1ppc.S b/ppc/sha1ppc.S
deleted file mode 100644
index 1711eef6e7..0000000000
--- a/ppc/sha1ppc.S
+++ /dev/null
@@ -1,224 +0,0 @@
-/*
- * SHA-1 implementation for PowerPC.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- */
-
-/*
- * PowerPC calling convention:
- * %r0 - volatile temp
- * %r1 - stack pointer.
- * %r2 - reserved
- * %r3-%r12 - Incoming arguments & return values; volatile.
- * %r13-%r31 - Callee-save registers
- * %lr - Return address, volatile
- * %ctr - volatile
- *
- * Register usage in this routine:
- * %r0 - temp
- * %r3 - argument (pointer to 5 words of SHA state)
- * %r4 - argument (pointer to data to hash)
- * %r5 - Constant K in SHA round (initially number of blocks to hash)
- * %r6-%r10 - Working copies of SHA variables A..E (actually E..A order)
- * %r11-%r26 - Data being hashed W[].
- * %r27-%r31 - Previous copies of A..E, for final add back.
- * %ctr - loop count
- */
-
-
-/*
- * We roll the registers for A, B, C, D, E around on each
- * iteration; E on iteration t is D on iteration t+1, and so on.
- * We use registers 6 - 10 for this.  (Registers 27 - 31 hold
- * the previous values.)
- */
-#define RA(t)	(((t)+4)%5+6)
-#define RB(t)	(((t)+3)%5+6)
-#define RC(t)	(((t)+2)%5+6)
-#define RD(t)	(((t)+1)%5+6)
-#define RE(t)	(((t)+0)%5+6)
-
-/* We use registers 11 - 26 for the W values */
-#define W(t)	((t)%16+11)
-
-/* Register 5 is used for the constant k */
-
-/*
- * The basic SHA-1 round function is:
- * E += ROTL(A,5) + F(B,C,D) + W[i] + K;  B = ROTL(B,30)
- * Then the variables are renamed: (A,B,C,D,E) = (E,A,B,C,D).
- *
- * Every 20 rounds, the function F() and the constant K changes:
- * - 20 rounds of f0(b,c,d) = "bit wise b ? c : d" =  (^b & d) + (b & c)
- * - 20 rounds of f1(b,c,d) = b^c^d = (b^d)^c
- * - 20 rounds of f2(b,c,d) = majority(b,c,d) = (b&d) + ((b^d)&c)
- * - 20 more rounds of f1(b,c,d)
- *
- * These are all scheduled for near-optimal performance on a G4.
- * The G4 is a 3-issue out-of-order machine with 3 ALUs, but it can only
- * *consider* starting the oldest 3 instructions per cycle.  So to get
- * maximum performance out of it, you have to treat it as an in-order
- * machine.  Which means interleaving the computation round t with the
- * computation of W[t+4].
- *
- * The first 16 rounds use W values loaded directly from memory, while the
- * remaining 64 use values computed from those first 16.  We preload
- * 4 values before starting, so there are three kinds of rounds:
- * - The first 12 (all f0) also load the W values from memory.
- * - The next 64 compute W(i+4) in parallel. 8*f0, 20*f1, 20*f2, 16*f1.
- * - The last 4 (all f1) do not do anything with W.
- *
- * Therefore, we have 6 different round functions:
- * STEPD0_LOAD(t,s) - Perform round t and load W(s).  s < 16
- * STEPD0_UPDATE(t,s) - Perform round t and compute W(s).  s >= 16.
- * STEPD1_UPDATE(t,s)
- * STEPD2_UPDATE(t,s)
- * STEPD1(t) - Perform round t with no load or update.
- *
- * The G5 is more fully out-of-order, and can find the parallelism
- * by itself.  The big limit is that it has a 2-cycle ALU latency, so
- * even though it's 2-way, the code has to be scheduled as if it's
- * 4-way, which can be a limit.  To help it, we try to schedule the
- * read of RA(t) as late as possible so it doesn't stall waiting for
- * the previous round's RE(t-1), and we try to rotate RB(t) as early
- * as possible while reading RC(t) (= RB(t-1)) as late as possible.
- */
-
-/* the initial loads. */
-#define LOADW(s) \
-	lwz	W(s),(s)*4(%r4)
-
-/*
- * Perform a step with F0, and load W(s).  Uses W(s) as a temporary
- * before loading it.
- * This is actually 10 instructions, which is an awkward fit.
- * It can execute grouped as listed, or delayed one instruction.
- * (If delayed two instructions, there is a stall before the start of the
- * second line.)  Thus, two iterations take 7 cycles, 3.5 cycles per round.
- */
-#define STEPD0_LOAD(t,s) \
-add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t);  and    W(s),RC(t),RB(t); \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;      rotlwi RB(t),RB(t),30;   \
-add RE(t),RE(t),W(s); add    %r0,%r0,%r5;      lwz    W(s),(s)*4(%r4);  \
-add RE(t),RE(t),%r0
-
-/*
- * This is likewise awkward, 13 instructions.  However, it can also
- * execute starting with 2 out of 3 possible moduli, so it does 2 rounds
- * in 9 cycles, 4.5 cycles/round.
- */
-#define STEPD0_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r0;  and    %r0,RC(t),RB(t); xor    W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r5;  loadk; rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1;     \
-add RE(t),RE(t),%r0
-
-/* Nicely optimal.  Conveniently, also the most common. */
-#define STEPD1_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r5;  loadk; xor %r0,%r0,RC(t);  xor W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1
-
-/*
- * The naked version, no UPDATE, for the last 4 rounds.  3 cycles per.
- * We could use W(s) as a temp register, but we don't need it.
- */
-#define STEPD1(t) \
-                        add   RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); \
-rotlwi RB(t),RB(t),30;  add   RE(t),RE(t),%r5;  xor    %r0,%r0,RC(t);   \
-add    RE(t),RE(t),%r0; rotlwi %r0,RA(t),5;     /* spare slot */        \
-add    RE(t),RE(t),%r0
-
-/*
- * 14 instructions, 5 cycles per.  The majority function is a bit
- * awkward to compute.  This can execute with a 1-instruction delay,
- * but it causes a 2-instruction delay, which triggers a stall.
- */
-#define STEPD2_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); and    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r0;  xor    %r0,RD(t),RB(t); xor    W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r5;  loadk; and %r0,%r0,RC(t);  xor W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     rotlwi W(s),W(s),1;             \
-add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30
-
-#define STEP0_LOAD4(t,s)		\
-	STEPD0_LOAD(t,s);		\
-	STEPD0_LOAD((t+1),(s)+1);	\
-	STEPD0_LOAD((t)+2,(s)+2);	\
-	STEPD0_LOAD((t)+3,(s)+3)
-
-#define STEPUP4(fn, t, s, loadk...)		\
-	STEP##fn##_UPDATE(t,s,);		\
-	STEP##fn##_UPDATE((t)+1,(s)+1,);	\
-	STEP##fn##_UPDATE((t)+2,(s)+2,);	\
-	STEP##fn##_UPDATE((t)+3,(s)+3,loadk)
-
-#define STEPUP20(fn, t, s, loadk...)	\
-	STEPUP4(fn, t, s,);		\
-	STEPUP4(fn, (t)+4, (s)+4,);	\
-	STEPUP4(fn, (t)+8, (s)+8,);	\
-	STEPUP4(fn, (t)+12, (s)+12,);	\
-	STEPUP4(fn, (t)+16, (s)+16, loadk)
-
-	.globl	ppc_sha1_core
-ppc_sha1_core:
-	stwu	%r1,-80(%r1)
-	stmw	%r13,4(%r1)
-
-	/* Load up A - E */
-	lmw	%r27,0(%r3)
-
-	mtctr	%r5
-
-1:
-	LOADW(0)
-	lis	%r5,0x5a82
-	mr	RE(0),%r31
-	LOADW(1)
-	mr	RD(0),%r30
-	mr	RC(0),%r29
-	LOADW(2)
-	ori	%r5,%r5,0x7999	/* K0-19 */
-	mr	RB(0),%r28
-	LOADW(3)
-	mr	RA(0),%r27
-
-	STEP0_LOAD4(0, 4)
-	STEP0_LOAD4(4, 8)
-	STEP0_LOAD4(8, 12)
-	STEPUP4(D0, 12, 16,)
-	STEPUP4(D0, 16, 20, lis %r5,0x6ed9)
-
-	ori	%r5,%r5,0xeba1	/* K20-39 */
-	STEPUP20(D1, 20, 24, lis %r5,0x8f1b)
-
-	ori	%r5,%r5,0xbcdc	/* K40-59 */
-	STEPUP20(D2, 40, 44, lis %r5,0xca62)
-
-	ori	%r5,%r5,0xc1d6	/* K60-79 */
-	STEPUP4(D1, 60, 64,)
-	STEPUP4(D1, 64, 68,)
-	STEPUP4(D1, 68, 72,)
-	STEPUP4(D1, 72, 76,)
-	addi	%r4,%r4,64
-	STEPD1(76)
-	STEPD1(77)
-	STEPD1(78)
-	STEPD1(79)
-
-	/* Add results to original values */
-	add	%r31,%r31,RE(0)
-	add	%r30,%r30,RD(0)
-	add	%r29,%r29,RC(0)
-	add	%r28,%r28,RB(0)
-	add	%r27,%r27,RA(0)
-
-	bdnz	1b
-
-	/* Save final hash, restore registers, and return */
-	stmw	%r27,0(%r3)
-	lmw	%r13,4(%r1)
-	addi	%r1,%r1,80
-	blr
-- 
2.24.0.424.g559c6fc317.dirty


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] Remove non-SHA1dc sha1 implementations
  2020-02-23 22:37       ` [PATCH] Remove non-SHA1dc sha1 implementations Mike Hommey
@ 2020-02-24  4:47         ` Jeff King
  2020-02-24  4:52           ` Jeff King
  2022-03-19  1:02         ` [PATCH] ppc: remove custom SHA-1 implementation Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2020-02-24  4:47 UTC (permalink / raw)
  To: Mike Hommey; +Cc: git, gitster

On Mon, Feb 24, 2020 at 07:37:58AM +0900, Mike Hommey wrote:

> It is 2020, and with the weakening of SHA1 security-wise, there doesn't
> seem to be a reason to support anything else than SHA1dc, with collision
> detection.

One possible reason is that they're way faster than sha1dc (block-sha1
maybe only a little, but openssl's sha1 is over twice as fast).

To be clear, I think the slowdown is worth the extra safety, but:

 - do we still want to care about people who prefer to make the tradeoff
   differently?

 - when we first switched the default to sha1dc, the idea was raised of
   continuing to use a faster implementation for non-security checksums
   (e.g., the checksums at the end of packfiles, index files, etc). I
   don't think anybody ever implemented that, but it's not a terrible
   idea. OTOH, if nobody noticed the bottleneck enough to care, maybe
   it's not worth worrying about.

I'm not convinced the answer to those questions is "yes", but I think
it's worth at least raising them (and arguing against them in the commit
message).

One thing that compels me is the recent report that we still build with
common crypto by default on macOS, which was definitely _not_ intended.
That's a bug that can be fixed, but it wouldn't have happened in the
first place if we only supported sha1dc.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Remove non-SHA1dc sha1 implementations
  2020-02-24  4:47         ` Jeff King
@ 2020-02-24  4:52           ` Jeff King
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff King @ 2020-02-24  4:52 UTC (permalink / raw)
  To: Mike Hommey; +Cc: git, gitster

On Sun, Feb 23, 2020 at 11:47:32PM -0500, Jeff King wrote:

> One thing that compels me is the recent report that we still build with
> common crypto by default on macOS, which was definitely _not_ intended.
> That's a bug that can be fixed, but it wouldn't have happened in the
> first place if we only supported sha1dc.

I just noticed you were the original reporter there, too. So I guess it
compelled you, too. ;)

If we do want to keep the other implementations around, another thing
that might be worth doing is to teach t0013 to complain when the
collision-detecting sha1 is not in use (i.e., rather than auto-skipping
when built without DC_SHA1, require the user to set a special
NO_REALLY_I_CHOOSE_NOT_TO_USE_DC_SHA1_AND_AM_AWARE_OF_THE_IMPLICATIONS
variable). That would provide a cross-check on the build flags.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] ppc: remove custom SHA-1 implementation
  2020-02-23 22:37       ` [PATCH] Remove non-SHA1dc sha1 implementations Mike Hommey
  2020-02-24  4:47         ` Jeff King
@ 2022-03-19  1:02         ` Ævar Arnfjörð Bjarmason
  2022-03-21 16:39           ` Junio C Hamano
  2022-03-21 17:06           ` [PATCH v2] " Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-19  1:02 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Mike Hommey, brian m . carlson,
	Carlo Marcelo Arenas Belón, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Remove the PPC_SHA1 implementation added in a6ef3518f9a ([PATCH] PPC
assembly implementation of SHA1, 2005-04-22). When this was added
Apple consumer hardware used the PPC architecture, and the
implementation was intended to improve SHA-1 speed there.

Since it was added we've moved to DC_SHA1 by default, and anyone
wanting hard-rolled non-DC SHA-1 implementation can use OpenSSL's via
the OPENSSL_SHA1 knob.

I'm unsure if this was ever supposed to work on 64-bit PPC. It clearly
originally targeted 32 bit PPC, but there's some mailing list
references to this being tried on G5 (PPC 970). I can't get it to do
anything but segfault on the BE POWER8 machine in the GCC compile
farm. Anyone caring about speed on PPC these days is likely to be
using IBM's POWER, not PPC 970.

There have been proposals to entirely remove non-DC_SHA1
implementations from the tree[1]. I think per [2] that would be a bit
overzealous. I.e. there are various set-ups git's speed is going to be
more important than the relatively implausible SHA-1 collision attack,
or where such attacks are entirely mitigated by other means (e.g. by
incoming objects being checked with DC_SHA1).

The main reason for doing so at this point is to simplify follow-up
Makefile change. Since PPC_SHA1 included the only in-tree *.S assembly
file we needed to keep around special support for building objects
from it. By getting rid of it we know we'll always build *.o from *.c
files, which makes the build process simpler.

As an aside the code being removed here was also throwing warnings
with the "-pedantic" flag, but let's remove it instead of fixing it,
as 544d93bc3b4 (block-sha1: remove use of obsolete x86 assembly,
2022-03-10) did for block-sha1/*.

1. https://lore.kernel.org/git/20200223223758.120941-1-mh@glandium.org/
2. https://lore.kernel.org/git/20200224044732.GK1018190@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 INSTALL       |   3 +-
 Makefile      |  35 ++++----
 configure.ac  |   3 -
 hash.h        |   6 +-
 ppc/sha1.c    |  72 ----------------
 ppc/sha1.h    |  25 ------
 ppc/sha1ppc.S | 224 --------------------------------------------------
 7 files changed, 17 insertions(+), 351 deletions(-)
 delete mode 100644 ppc/sha1.c
 delete mode 100644 ppc/sha1.h
 delete mode 100644 ppc/sha1ppc.S

diff --git a/INSTALL b/INSTALL
index 4140a3f5c8b..89b15d71df5 100644
--- a/INSTALL
+++ b/INSTALL
@@ -135,8 +135,7 @@ Issues of note:
 
 	  By default, git uses OpenSSL for SHA1 but it will use its own
 	  library (inspired by Mozilla's) with either NO_OPENSSL or
-	  BLK_SHA1.  Also included is a version optimized for PowerPC
-	  (PPC_SHA1).
+	  BLK_SHA1.
 
 	- "libcurl" library is used for fetching and pushing
 	  repositories over http:// or https://, as well as by
diff --git a/Makefile b/Makefile
index 70f0a004e75..965e51f773e 100644
--- a/Makefile
+++ b/Makefile
@@ -155,9 +155,6 @@ include shared.mak
 # Define BLK_SHA1 environment variable to make use of the bundled
 # optimized C SHA1 routine.
 #
-# Define PPC_SHA1 environment variable when running make to make use of
-# a bundled SHA1 routine optimized for PowerPC.
-#
 # Define DC_SHA1 to unconditionally enable the collision-detecting sha1
 # algorithm. This is slower, but may detect attempted collision attacks.
 # Takes priority over other *_SHA1 knobs.
@@ -1770,14 +1767,14 @@ ifdef OPENSSL_SHA1
 	EXTLIBS += $(LIB_4_CRYPTO)
 	BASIC_CFLAGS += -DSHA1_OPENSSL
 else
+ifdef PPC_SHA1
+$(error PPC_SHA1 has been removed! You should almost definitely remove that \
+knob and use the DC_SHA1 default! See INSTALL for more information)
+endif
 ifdef BLK_SHA1
 	LIB_OBJS += block-sha1/sha1.o
 	BASIC_CFLAGS += -DSHA1_BLK
 else
-ifdef PPC_SHA1
-	LIB_OBJS += ppc/sha1.o ppc/sha1ppc.o
-	BASIC_CFLAGS += -DSHA1_PPC
-else
 ifdef APPLE_COMMON_CRYPTO
 	COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL
 	BASIC_CFLAGS += -DSHA1_APPLE
@@ -1811,7 +1808,6 @@ endif
 endif
 endif
 endif
-endif
 
 ifdef OPENSSL_SHA256
 	EXTLIBS += $(LIB_4_CRYPTO)
@@ -2509,6 +2505,11 @@ OBJECTS += $(SCALAR_OBJECTS)
 .PHONY: objects
 objects: $(OBJECTS)
 
+# Derived from $(OBJECTS)
+OBJECTS_C = $(OBJECTS:%.o=%.c)
+OBJECTS_S = $(OBJECTS:%.o=%.s)
+OBJECTS_SP = $(OBJECTS:%.o=%.sp)
+
 dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
 dep_dirs := $(addsuffix .depend,$(sort $(dir $(OBJECTS))))
 
@@ -2540,13 +2541,7 @@ missing_compdb_dir =
 compdb_args =
 endif
 
-ASM_SRC := $(wildcard $(OBJECTS:o=S))
-ASM_OBJ := $(ASM_SRC:S=o)
-C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))
-
-$(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
-	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
-$(ASM_OBJ): %.o: %.S GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
+$(OBJECTS): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
 	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
 
 %.s: %.c GIT-CFLAGS FORCE
@@ -2692,7 +2687,7 @@ XGETTEXT_FLAGS_SH = $(XGETTEXT_FLAGS) --language=Shell \
 	--keyword=gettextln --keyword=eval_gettextln
 XGETTEXT_FLAGS_PERL = $(XGETTEXT_FLAGS) --language=Perl \
 	--keyword=__ --keyword=N__ --keyword="__n:1,2"
-LOCALIZED_C = $(C_OBJ:o=c) $(LIB_H) $(GENERATED_H)
+LOCALIZED_C = $(OBJECTS_C) $(LIB_H) $(GENERATED_H)
 LOCALIZED_SH = $(SCRIPT_SH)
 LOCALIZED_SH += git-sh-setup.sh
 LOCALIZED_PERL = $(SCRIPT_PERL)
@@ -2939,16 +2934,14 @@ t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
 check-sha1:: t/helper/test-tool$X
 	t/helper/test-sha1.sh
 
-SP_OBJ = $(patsubst %.o,%.sp,$(C_OBJ))
-
-$(SP_OBJ): %.sp: %.c %.o
+$(OBJECTS_SP): %.sp: %.c %.o
 	$(QUIET_SP)cgcc -no-compile $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) \
 		-Wsparse-error \
 		$(SPARSE_FLAGS) $(SP_EXTRA_FLAGS) $< && \
 	>$@
 
 .PHONY: sparse
-sparse: $(SP_OBJ)
+sparse: $(OBJECTS_SP)
 
 EXCEPT_HDRS := $(GENERATED_H) unicode-width.h compat/% xdiff/%
 ifndef GCRYPT_SHA256
@@ -3272,7 +3265,7 @@ clean: profile-clean coverage-clean cocciclean
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS)
 	$(RM) $(FUZZ_PROGRAMS)
-	$(RM) $(SP_OBJ)
+	$(RM) $(OBJECTS_SP)
 	$(RM) $(HCC)
 	$(RM) -r bin-wrappers $(dep_dirs) $(compdb_dir) compile_commands.json
 	$(RM) -r po/build/
diff --git a/configure.ac b/configure.ac
index 5ee25ec95c8..9c75b00d3eb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -237,9 +237,6 @@ AC_MSG_NOTICE([CHECKS for site configuration])
 # tests.  These tests take up a significant amount of the total test time
 # but are not needed unless you plan to talk to SVN repos.
 #
-# Define PPC_SHA1 environment variable when running make to make use of
-# a bundled SHA1 routine optimized for PowerPC.
-#
 # Define NO_OPENSSL environment variable if you do not have OpenSSL.
 #
 # Define OPENSSLDIR=/foo/bar if your openssl header and library files are in
diff --git a/hash.h b/hash.h
index 5d40368f18a..efc14c5f56d 100644
--- a/hash.h
+++ b/hash.h
@@ -4,9 +4,7 @@
 #include "git-compat-util.h"
 #include "repository.h"
 
-#if defined(SHA1_PPC)
-#include "ppc/sha1.h"
-#elif defined(SHA1_APPLE)
+#if defined(SHA1_APPLE)
 #include <CommonCrypto/CommonDigest.h>
 #elif defined(SHA1_OPENSSL)
 #include <openssl/sha.h>
@@ -30,7 +28,7 @@
  * platform's underlying implementation of SHA-1; could be OpenSSL,
  * blk_SHA, Apple CommonCrypto, etc...  Note that the relevant
  * SHA-1 header may have already defined platform_SHA_CTX for our
- * own implementations like block-sha1 and ppc-sha1, so we list
+ * own implementations like block-sha1, so we list
  * the default for OpenSSL compatible SHA-1 implementations here.
  */
 #define platform_SHA_CTX	SHA_CTX
diff --git a/ppc/sha1.c b/ppc/sha1.c
deleted file mode 100644
index 1b705cee1fe..00000000000
--- a/ppc/sha1.c
+++ /dev/null
@@ -1,72 +0,0 @@
-/*
- * SHA-1 implementation.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- *
- * This version assumes we are running on a big-endian machine.
- * It calls an external sha1_core() to process blocks of 64 bytes.
- */
-#include <stdio.h>
-#include <string.h>
-#include "sha1.h"
-
-void ppc_sha1_core(uint32_t *hash, const unsigned char *p,
-		   unsigned int nblocks);
-
-int ppc_SHA1_Init(ppc_SHA_CTX *c)
-{
-	c->hash[0] = 0x67452301;
-	c->hash[1] = 0xEFCDAB89;
-	c->hash[2] = 0x98BADCFE;
-	c->hash[3] = 0x10325476;
-	c->hash[4] = 0xC3D2E1F0;
-	c->len = 0;
-	c->cnt = 0;
-	return 0;
-}
-
-int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *ptr, unsigned long n)
-{
-	unsigned long nb;
-	const unsigned char *p = ptr;
-
-	c->len += (uint64_t) n << 3;
-	while (n != 0) {
-		if (c->cnt || n < 64) {
-			nb = 64 - c->cnt;
-			if (nb > n)
-				nb = n;
-			memcpy(&c->buf.b[c->cnt], p, nb);
-			if ((c->cnt += nb) == 64) {
-				ppc_sha1_core(c->hash, c->buf.b, 1);
-				c->cnt = 0;
-			}
-		} else {
-			nb = n >> 6;
-			ppc_sha1_core(c->hash, p, nb);
-			nb <<= 6;
-		}
-		n -= nb;
-		p += nb;
-	}
-	return 0;
-}
-
-int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c)
-{
-	unsigned int cnt = c->cnt;
-
-	c->buf.b[cnt++] = 0x80;
-	if (cnt > 56) {
-		if (cnt < 64)
-			memset(&c->buf.b[cnt], 0, 64 - cnt);
-		ppc_sha1_core(c->hash, c->buf.b, 1);
-		cnt = 0;
-	}
-	if (cnt < 56)
-		memset(&c->buf.b[cnt], 0, 56 - cnt);
-	c->buf.l[7] = c->len;
-	ppc_sha1_core(c->hash, c->buf.b, 1);
-	memcpy(hash, c->hash, 20);
-	return 0;
-}
diff --git a/ppc/sha1.h b/ppc/sha1.h
deleted file mode 100644
index 9b24b326159..00000000000
--- a/ppc/sha1.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*
- * SHA-1 implementation.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- */
-#include <stdint.h>
-
-typedef struct {
-	uint32_t hash[5];
-	uint32_t cnt;
-	uint64_t len;
-	union {
-		unsigned char b[64];
-		uint64_t l[8];
-	} buf;
-} ppc_SHA_CTX;
-
-int ppc_SHA1_Init(ppc_SHA_CTX *c);
-int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *p, unsigned long n);
-int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c);
-
-#define platform_SHA_CTX	ppc_SHA_CTX
-#define platform_SHA1_Init	ppc_SHA1_Init
-#define platform_SHA1_Update	ppc_SHA1_Update
-#define platform_SHA1_Final	ppc_SHA1_Final
diff --git a/ppc/sha1ppc.S b/ppc/sha1ppc.S
deleted file mode 100644
index 1711eef6e71..00000000000
--- a/ppc/sha1ppc.S
+++ /dev/null
@@ -1,224 +0,0 @@
-/*
- * SHA-1 implementation for PowerPC.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- */
-
-/*
- * PowerPC calling convention:
- * %r0 - volatile temp
- * %r1 - stack pointer.
- * %r2 - reserved
- * %r3-%r12 - Incoming arguments & return values; volatile.
- * %r13-%r31 - Callee-save registers
- * %lr - Return address, volatile
- * %ctr - volatile
- *
- * Register usage in this routine:
- * %r0 - temp
- * %r3 - argument (pointer to 5 words of SHA state)
- * %r4 - argument (pointer to data to hash)
- * %r5 - Constant K in SHA round (initially number of blocks to hash)
- * %r6-%r10 - Working copies of SHA variables A..E (actually E..A order)
- * %r11-%r26 - Data being hashed W[].
- * %r27-%r31 - Previous copies of A..E, for final add back.
- * %ctr - loop count
- */
-
-
-/*
- * We roll the registers for A, B, C, D, E around on each
- * iteration; E on iteration t is D on iteration t+1, and so on.
- * We use registers 6 - 10 for this.  (Registers 27 - 31 hold
- * the previous values.)
- */
-#define RA(t)	(((t)+4)%5+6)
-#define RB(t)	(((t)+3)%5+6)
-#define RC(t)	(((t)+2)%5+6)
-#define RD(t)	(((t)+1)%5+6)
-#define RE(t)	(((t)+0)%5+6)
-
-/* We use registers 11 - 26 for the W values */
-#define W(t)	((t)%16+11)
-
-/* Register 5 is used for the constant k */
-
-/*
- * The basic SHA-1 round function is:
- * E += ROTL(A,5) + F(B,C,D) + W[i] + K;  B = ROTL(B,30)
- * Then the variables are renamed: (A,B,C,D,E) = (E,A,B,C,D).
- *
- * Every 20 rounds, the function F() and the constant K changes:
- * - 20 rounds of f0(b,c,d) = "bit wise b ? c : d" =  (^b & d) + (b & c)
- * - 20 rounds of f1(b,c,d) = b^c^d = (b^d)^c
- * - 20 rounds of f2(b,c,d) = majority(b,c,d) = (b&d) + ((b^d)&c)
- * - 20 more rounds of f1(b,c,d)
- *
- * These are all scheduled for near-optimal performance on a G4.
- * The G4 is a 3-issue out-of-order machine with 3 ALUs, but it can only
- * *consider* starting the oldest 3 instructions per cycle.  So to get
- * maximum performance out of it, you have to treat it as an in-order
- * machine.  Which means interleaving the computation round t with the
- * computation of W[t+4].
- *
- * The first 16 rounds use W values loaded directly from memory, while the
- * remaining 64 use values computed from those first 16.  We preload
- * 4 values before starting, so there are three kinds of rounds:
- * - The first 12 (all f0) also load the W values from memory.
- * - The next 64 compute W(i+4) in parallel. 8*f0, 20*f1, 20*f2, 16*f1.
- * - The last 4 (all f1) do not do anything with W.
- *
- * Therefore, we have 6 different round functions:
- * STEPD0_LOAD(t,s) - Perform round t and load W(s).  s < 16
- * STEPD0_UPDATE(t,s) - Perform round t and compute W(s).  s >= 16.
- * STEPD1_UPDATE(t,s)
- * STEPD2_UPDATE(t,s)
- * STEPD1(t) - Perform round t with no load or update.
- *
- * The G5 is more fully out-of-order, and can find the parallelism
- * by itself.  The big limit is that it has a 2-cycle ALU latency, so
- * even though it's 2-way, the code has to be scheduled as if it's
- * 4-way, which can be a limit.  To help it, we try to schedule the
- * read of RA(t) as late as possible so it doesn't stall waiting for
- * the previous round's RE(t-1), and we try to rotate RB(t) as early
- * as possible while reading RC(t) (= RB(t-1)) as late as possible.
- */
-
-/* the initial loads. */
-#define LOADW(s) \
-	lwz	W(s),(s)*4(%r4)
-
-/*
- * Perform a step with F0, and load W(s).  Uses W(s) as a temporary
- * before loading it.
- * This is actually 10 instructions, which is an awkward fit.
- * It can execute grouped as listed, or delayed one instruction.
- * (If delayed two instructions, there is a stall before the start of the
- * second line.)  Thus, two iterations take 7 cycles, 3.5 cycles per round.
- */
-#define STEPD0_LOAD(t,s) \
-add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t);  and    W(s),RC(t),RB(t); \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;      rotlwi RB(t),RB(t),30;   \
-add RE(t),RE(t),W(s); add    %r0,%r0,%r5;      lwz    W(s),(s)*4(%r4);  \
-add RE(t),RE(t),%r0
-
-/*
- * This is likewise awkward, 13 instructions.  However, it can also
- * execute starting with 2 out of 3 possible moduli, so it does 2 rounds
- * in 9 cycles, 4.5 cycles/round.
- */
-#define STEPD0_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r0;  and    %r0,RC(t),RB(t); xor    W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r5;  loadk; rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1;     \
-add RE(t),RE(t),%r0
-
-/* Nicely optimal.  Conveniently, also the most common. */
-#define STEPD1_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r5;  loadk; xor %r0,%r0,RC(t);  xor W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1
-
-/*
- * The naked version, no UPDATE, for the last 4 rounds.  3 cycles per.
- * We could use W(s) as a temp register, but we don't need it.
- */
-#define STEPD1(t) \
-                        add   RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); \
-rotlwi RB(t),RB(t),30;  add   RE(t),RE(t),%r5;  xor    %r0,%r0,RC(t);   \
-add    RE(t),RE(t),%r0; rotlwi %r0,RA(t),5;     /* spare slot */        \
-add    RE(t),RE(t),%r0
-
-/*
- * 14 instructions, 5 cycles per.  The majority function is a bit
- * awkward to compute.  This can execute with a 1-instruction delay,
- * but it causes a 2-instruction delay, which triggers a stall.
- */
-#define STEPD2_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); and    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r0;  xor    %r0,RD(t),RB(t); xor    W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r5;  loadk; and %r0,%r0,RC(t);  xor W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     rotlwi W(s),W(s),1;             \
-add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30
-
-#define STEP0_LOAD4(t,s)		\
-	STEPD0_LOAD(t,s);		\
-	STEPD0_LOAD((t+1),(s)+1);	\
-	STEPD0_LOAD((t)+2,(s)+2);	\
-	STEPD0_LOAD((t)+3,(s)+3)
-
-#define STEPUP4(fn, t, s, loadk...)		\
-	STEP##fn##_UPDATE(t,s,);		\
-	STEP##fn##_UPDATE((t)+1,(s)+1,);	\
-	STEP##fn##_UPDATE((t)+2,(s)+2,);	\
-	STEP##fn##_UPDATE((t)+3,(s)+3,loadk)
-
-#define STEPUP20(fn, t, s, loadk...)	\
-	STEPUP4(fn, t, s,);		\
-	STEPUP4(fn, (t)+4, (s)+4,);	\
-	STEPUP4(fn, (t)+8, (s)+8,);	\
-	STEPUP4(fn, (t)+12, (s)+12,);	\
-	STEPUP4(fn, (t)+16, (s)+16, loadk)
-
-	.globl	ppc_sha1_core
-ppc_sha1_core:
-	stwu	%r1,-80(%r1)
-	stmw	%r13,4(%r1)
-
-	/* Load up A - E */
-	lmw	%r27,0(%r3)
-
-	mtctr	%r5
-
-1:
-	LOADW(0)
-	lis	%r5,0x5a82
-	mr	RE(0),%r31
-	LOADW(1)
-	mr	RD(0),%r30
-	mr	RC(0),%r29
-	LOADW(2)
-	ori	%r5,%r5,0x7999	/* K0-19 */
-	mr	RB(0),%r28
-	LOADW(3)
-	mr	RA(0),%r27
-
-	STEP0_LOAD4(0, 4)
-	STEP0_LOAD4(4, 8)
-	STEP0_LOAD4(8, 12)
-	STEPUP4(D0, 12, 16,)
-	STEPUP4(D0, 16, 20, lis %r5,0x6ed9)
-
-	ori	%r5,%r5,0xeba1	/* K20-39 */
-	STEPUP20(D1, 20, 24, lis %r5,0x8f1b)
-
-	ori	%r5,%r5,0xbcdc	/* K40-59 */
-	STEPUP20(D2, 40, 44, lis %r5,0xca62)
-
-	ori	%r5,%r5,0xc1d6	/* K60-79 */
-	STEPUP4(D1, 60, 64,)
-	STEPUP4(D1, 64, 68,)
-	STEPUP4(D1, 68, 72,)
-	STEPUP4(D1, 72, 76,)
-	addi	%r4,%r4,64
-	STEPD1(76)
-	STEPD1(77)
-	STEPD1(78)
-	STEPD1(79)
-
-	/* Add results to original values */
-	add	%r31,%r31,RE(0)
-	add	%r30,%r30,RD(0)
-	add	%r29,%r29,RC(0)
-	add	%r28,%r28,RB(0)
-	add	%r27,%r27,RA(0)
-
-	bdnz	1b
-
-	/* Save final hash, restore registers, and return */
-	stmw	%r27,0(%r3)
-	lmw	%r13,4(%r1)
-	addi	%r1,%r1,80
-	blr
-- 
2.35.1.1438.g8874c8eeb35


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] ppc: remove custom SHA-1 implementation
  2022-03-19  1:02         ` [PATCH] ppc: remove custom SHA-1 implementation Ævar Arnfjörð Bjarmason
@ 2022-03-21 16:39           ` Junio C Hamano
  2022-03-21 17:06           ` [PATCH v2] " Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2022-03-21 16:39 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Mike Hommey, brian m . carlson,
	Carlo Marcelo Arenas Belón, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> The main reason for doing so at this point is to simplify follow-up
> Makefile change. Since PPC_SHA1 included the only in-tree *.S assembly
> file we needed to keep around special support for building objects
> from it. By getting rid of it we know we'll always build *.o from *.c
> files, which makes the build process simpler.

Yuck.

> diff --git a/Makefile b/Makefile
> index 70f0a004e75..965e51f773e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -155,9 +155,6 @@ include shared.mak
>  # Define BLK_SHA1 environment variable to make use of the bundled
>  # optimized C SHA1 routine.
>  #
> -# Define PPC_SHA1 environment variable when running make to make use of
> -# a bundled SHA1 routine optimized for PowerPC.
> -#
>  # Define DC_SHA1 to unconditionally enable the collision-detecting sha1
>  # algorithm. This is slower, but may detect attempted collision attacks.
>  # Takes priority over other *_SHA1 knobs.
> @@ -1770,14 +1767,14 @@ ifdef OPENSSL_SHA1
>  	EXTLIBS += $(LIB_4_CRYPTO)
>  	BASIC_CFLAGS += -DSHA1_OPENSSL
>  else
> +ifdef PPC_SHA1
> +$(error PPC_SHA1 has been removed! You should almost definitely remove that \
> +knob and use the DC_SHA1 default! See INSTALL for more information)
> +endif

"use the DC_SHA1 default"?  

	PPC_SHA1 support is being removed.  Use DC_SHA1 instead, 
	which is the default.


I am wondering if we can make only these four lines the first step
of remova, without doing anything else.  It would give us a good
feel on how many users we may be inconveniencing (not necessarily
hurting, as switching to DC_SHA1 would be a good move) by the
removal.

> @@ -2509,6 +2505,11 @@ OBJECTS += $(SCALAR_OBJECTS)
>  .PHONY: objects
>  objects: $(OBJECTS)
>  
> +# Derived from $(OBJECTS)
> +OBJECTS_C = $(OBJECTS:%.o=%.c)
> +OBJECTS_S = $(OBJECTS:%.o=%.s)
> +OBJECTS_SP = $(OBJECTS:%.o=%.sp)

Usually we build objects from sources, so "derived from" is a
puzzling way to call them.  I understand we are deriving a list from
another list, but it still feels confusing (see below for ASM_OBJ).

This seems to have nothing to do with "we no longer have *.S files"
and looks more like "now we have excuse to touch Makefile, make
random changes that look subjectively good to me, burying them in
the noise so that we can sneak them in without justifying them much
in the proposed log message".

>  dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
>  dep_dirs := $(addsuffix .depend,$(sort $(dir $(OBJECTS))))
>  
> @@ -2540,13 +2541,7 @@ missing_compdb_dir =
>  compdb_args =
>  endif
>  
> -ASM_SRC := $(wildcard $(OBJECTS:o=S))
> -ASM_OBJ := $(ASM_SRC:S=o)
> -C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))

I tend to agree with this patch that these three lines are ugly in
multiple ways.

It's a confusing construct.  The list of OBJECTS is used as the
single source of truth and others are derived by filtering the list
and futzing the suffix of the resulting subset of elements; it makes
me wonder if it should be the other way around (i.e. we have a list
of source files in various languages, all get turned into objects,
rather than we have a list of object files, and we see if
corresponding source file in possible languages, if any, exists on
disk).  It is pleasing that we can see them go.

Unfortunately the new "Derived from $(OBJECTS) lists are in the same
spirit as used by ASM_SRC we are removing here, aren't they?

> -$(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
> -	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
> -$(ASM_OBJ): %.o: %.S GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
> +$(OBJECTS): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
>  	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<

Now we deal only with *.c sources and all objects come from them,
and the action is the same as the old C_OBJ rule, naturally.

>  %.s: %.c GIT-CFLAGS FORCE

This is the only remaining rule regarding assembly and it is the
target to generate for debugging, not even used as a source to
create object files.  Do we need OBJECTS_S defined above?  I somehow
doubt it.  FWIW, I do not see the need for OBJECTS_SP or OBJECTS_C,
either.  E.g.

> @@ -2692,7 +2687,7 @@ XGETTEXT_FLAGS_SH = $(XGETTEXT_FLAGS) --language=Shell \
>  	--keyword=gettextln --keyword=eval_gettextln
>  XGETTEXT_FLAGS_PERL = $(XGETTEXT_FLAGS) --language=Perl \
>  	--keyword=__ --keyword=N__ --keyword="__n:1,2"
> -LOCALIZED_C = $(C_OBJ:o=c) $(LIB_H) $(GENERATED_H)
> +LOCALIZED_C = $(OBJECTS_C) $(LIB_H) $(GENERATED_H)

Shouldn't it be sufficient to use %(OBJECTS:o=c) here, i.e. equating
the old C_OBJ with OBJECTS, now we know all objects come from C?
The same question for SP_OBJ below, but I won't repeat.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2] ppc: remove custom SHA-1 implementation
  2022-03-19  1:02         ` [PATCH] ppc: remove custom SHA-1 implementation Ævar Arnfjörð Bjarmason
  2022-03-21 16:39           ` Junio C Hamano
@ 2022-03-21 17:06           ` Ævar Arnfjörð Bjarmason
  2022-03-21 21:19             ` brian m. carlson
  2022-08-31  9:18             ` [PATCH v3 0/2] Makefile + hash.h: remove PPC_SHA1 implementation Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-21 17:06 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Mike Hommey, brian m . carlson,
	Carlo Marcelo Arenas Belón, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Remove the PPC_SHA1 implementation added in a6ef3518f9a ([PATCH] PPC
assembly implementation of SHA1, 2005-04-22). When this was added
Apple consumer hardware used the PPC architecture, and the
implementation was intended to improve SHA-1 speed there.

Since it was added we've moved to DC_SHA1 by default, and anyone
wanting hard-rolled non-DC SHA-1 implementation can use OpenSSL's via
the OPENSSL_SHA1 knob.

I'm unsure if this was ever supposed to work on 64-bit PPC. It clearly
originally targeted 32 bit PPC, but there's some mailing list
references to this being tried on G5 (PPC 970). I can't get it to do
anything but segfault on the BE POWER8 machine in the GCC compile
farm. Anyone caring about speed on PPC these days is likely to be
using IBM's POWER, not PPC 970.

There have been proposals to entirely remove non-DC_SHA1
implementations from the tree[1]. I think per [2] that would be a bit
overzealous. I.e. there are various set-ups git's speed is going to be
more important than the relatively implausible SHA-1 collision attack,
or where such attacks are entirely mitigated by other means (e.g. by
incoming objects being checked with DC_SHA1).

The main reason for doing so at this point is to simplify follow-up
Makefile change. Since PPC_SHA1 included the only in-tree *.S assembly
file we needed to keep around special support for building objects
from it. By getting rid of it we know we'll always build *.o from *.c
files, which makes the build process simpler.

As an aside the code being removed here was also throwing warnings
with the "-pedantic" flag, but let's remove it instead of fixing it,
as 544d93bc3b4 (block-sha1: remove use of obsolete x86 assembly,
2022-03-10) did for block-sha1/*.

1. https://lore.kernel.org/git/20200223223758.120941-1-mh@glandium.org/
2. https://lore.kernel.org/git/20200224044732.GK1018190@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

A more minimal change to the Makefile than that in v1. In v1 I thought
it was a bit odd to leave in-place the C_OBJ variable that was only
declared for now-removed code. But sure, we can keep it and keep
defining SP_OBJ etc. as being derived from it.

Range-diff against v1:
1:  05dcdca3877 ! 1:  e77fd23a824 ppc: remove custom SHA-1 implementation
    @@ Makefile: ifdef OPENSSL_SHA1
      	BASIC_CFLAGS += -DSHA1_OPENSSL
      else
     +ifdef PPC_SHA1
    -+$(error PPC_SHA1 has been removed! You should almost definitely remove that \
    -+knob and use the DC_SHA1 default! See INSTALL for more information)
    ++$(error PPC_SHA1 has been removed! Use DC_SHA1 instead, which is the default)
     +endif
      ifdef BLK_SHA1
      	LIB_OBJS += block-sha1/sha1.o
    @@ Makefile: endif
      
      ifdef OPENSSL_SHA256
      	EXTLIBS += $(LIB_4_CRYPTO)
    -@@ Makefile: OBJECTS += $(SCALAR_OBJECTS)
    - .PHONY: objects
    - objects: $(OBJECTS)
    - 
    -+# Derived from $(OBJECTS)
    -+OBJECTS_C = $(OBJECTS:%.o=%.c)
    -+OBJECTS_S = $(OBJECTS:%.o=%.s)
    -+OBJECTS_SP = $(OBJECTS:%.o=%.sp)
    -+
    - dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
    - dep_dirs := $(addsuffix .depend,$(sort $(dir $(OBJECTS))))
    - 
     @@ Makefile: missing_compdb_dir =
      compdb_args =
      endif
    @@ Makefile: missing_compdb_dir =
     -ASM_SRC := $(wildcard $(OBJECTS:o=S))
     -ASM_OBJ := $(ASM_SRC:S=o)
     -C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))
    --
    --$(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
    --	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
    --$(ASM_OBJ): %.o: %.S GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
    -+$(OBJECTS): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
    ++C_OBJ = $(OBJECTS)
    + 
    + $(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
      	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
    +-$(ASM_OBJ): %.o: %.S GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
    +-	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
      
      %.s: %.c GIT-CFLAGS FORCE
    -@@ Makefile: XGETTEXT_FLAGS_SH = $(XGETTEXT_FLAGS) --language=Shell \
    - 	--keyword=gettextln --keyword=eval_gettextln
    - XGETTEXT_FLAGS_PERL = $(XGETTEXT_FLAGS) --language=Perl \
    - 	--keyword=__ --keyword=N__ --keyword="__n:1,2"
    --LOCALIZED_C = $(C_OBJ:o=c) $(LIB_H) $(GENERATED_H)
    -+LOCALIZED_C = $(OBJECTS_C) $(LIB_H) $(GENERATED_H)
    - LOCALIZED_SH = $(SCRIPT_SH)
    - LOCALIZED_SH += git-sh-setup.sh
    - LOCALIZED_PERL = $(SCRIPT_PERL)
    -@@ Makefile: t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
    - check-sha1:: t/helper/test-tool$X
    - 	t/helper/test-sha1.sh
    - 
    --SP_OBJ = $(patsubst %.o,%.sp,$(C_OBJ))
    --
    --$(SP_OBJ): %.sp: %.c %.o
    -+$(OBJECTS_SP): %.sp: %.c %.o
    - 	$(QUIET_SP)cgcc -no-compile $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) \
    - 		-Wsparse-error \
    - 		$(SPARSE_FLAGS) $(SP_EXTRA_FLAGS) $< && \
    - 	>$@
    - 
    - .PHONY: sparse
    --sparse: $(SP_OBJ)
    -+sparse: $(OBJECTS_SP)
    - 
    - EXCEPT_HDRS := $(GENERATED_H) unicode-width.h compat/% xdiff/%
    - ifndef GCRYPT_SHA256
    -@@ Makefile: clean: profile-clean coverage-clean cocciclean
    - 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
    - 	$(RM) $(TEST_PROGRAMS)
    - 	$(RM) $(FUZZ_PROGRAMS)
    --	$(RM) $(SP_OBJ)
    -+	$(RM) $(OBJECTS_SP)
    - 	$(RM) $(HCC)
    - 	$(RM) -r bin-wrappers $(dep_dirs) $(compdb_dir) compile_commands.json
    - 	$(RM) -r po/build/
    + 	$(QUIET_CC)$(CC) -o $@ -S $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
     
      ## configure.ac ##
     @@ configure.ac: AC_MSG_NOTICE([CHECKS for site configuration])

 INSTALL       |   3 +-
 Makefile      |  17 +---
 configure.ac  |   3 -
 hash.h        |   6 +-
 ppc/sha1.c    |  72 ----------------
 ppc/sha1.h    |  25 ------
 ppc/sha1ppc.S | 224 --------------------------------------------------
 7 files changed, 7 insertions(+), 343 deletions(-)
 delete mode 100644 ppc/sha1.c
 delete mode 100644 ppc/sha1.h
 delete mode 100644 ppc/sha1ppc.S

diff --git a/INSTALL b/INSTALL
index 4140a3f5c8b..89b15d71df5 100644
--- a/INSTALL
+++ b/INSTALL
@@ -135,8 +135,7 @@ Issues of note:
 
 	  By default, git uses OpenSSL for SHA1 but it will use its own
 	  library (inspired by Mozilla's) with either NO_OPENSSL or
-	  BLK_SHA1.  Also included is a version optimized for PowerPC
-	  (PPC_SHA1).
+	  BLK_SHA1.
 
 	- "libcurl" library is used for fetching and pushing
 	  repositories over http:// or https://, as well as by
diff --git a/Makefile b/Makefile
index 70f0a004e75..33c6db5e6c9 100644
--- a/Makefile
+++ b/Makefile
@@ -155,9 +155,6 @@ include shared.mak
 # Define BLK_SHA1 environment variable to make use of the bundled
 # optimized C SHA1 routine.
 #
-# Define PPC_SHA1 environment variable when running make to make use of
-# a bundled SHA1 routine optimized for PowerPC.
-#
 # Define DC_SHA1 to unconditionally enable the collision-detecting sha1
 # algorithm. This is slower, but may detect attempted collision attacks.
 # Takes priority over other *_SHA1 knobs.
@@ -1770,14 +1767,13 @@ ifdef OPENSSL_SHA1
 	EXTLIBS += $(LIB_4_CRYPTO)
 	BASIC_CFLAGS += -DSHA1_OPENSSL
 else
+ifdef PPC_SHA1
+$(error PPC_SHA1 has been removed! Use DC_SHA1 instead, which is the default)
+endif
 ifdef BLK_SHA1
 	LIB_OBJS += block-sha1/sha1.o
 	BASIC_CFLAGS += -DSHA1_BLK
 else
-ifdef PPC_SHA1
-	LIB_OBJS += ppc/sha1.o ppc/sha1ppc.o
-	BASIC_CFLAGS += -DSHA1_PPC
-else
 ifdef APPLE_COMMON_CRYPTO
 	COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL
 	BASIC_CFLAGS += -DSHA1_APPLE
@@ -1811,7 +1807,6 @@ endif
 endif
 endif
 endif
-endif
 
 ifdef OPENSSL_SHA256
 	EXTLIBS += $(LIB_4_CRYPTO)
@@ -2540,14 +2535,10 @@ missing_compdb_dir =
 compdb_args =
 endif
 
-ASM_SRC := $(wildcard $(OBJECTS:o=S))
-ASM_OBJ := $(ASM_SRC:S=o)
-C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))
+C_OBJ = $(OBJECTS)
 
 $(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
 	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
-$(ASM_OBJ): %.o: %.S GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
-	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
 
 %.s: %.c GIT-CFLAGS FORCE
 	$(QUIET_CC)$(CC) -o $@ -S $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
diff --git a/configure.ac b/configure.ac
index 5ee25ec95c8..9c75b00d3eb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -237,9 +237,6 @@ AC_MSG_NOTICE([CHECKS for site configuration])
 # tests.  These tests take up a significant amount of the total test time
 # but are not needed unless you plan to talk to SVN repos.
 #
-# Define PPC_SHA1 environment variable when running make to make use of
-# a bundled SHA1 routine optimized for PowerPC.
-#
 # Define NO_OPENSSL environment variable if you do not have OpenSSL.
 #
 # Define OPENSSLDIR=/foo/bar if your openssl header and library files are in
diff --git a/hash.h b/hash.h
index 5d40368f18a..efc14c5f56d 100644
--- a/hash.h
+++ b/hash.h
@@ -4,9 +4,7 @@
 #include "git-compat-util.h"
 #include "repository.h"
 
-#if defined(SHA1_PPC)
-#include "ppc/sha1.h"
-#elif defined(SHA1_APPLE)
+#if defined(SHA1_APPLE)
 #include <CommonCrypto/CommonDigest.h>
 #elif defined(SHA1_OPENSSL)
 #include <openssl/sha.h>
@@ -30,7 +28,7 @@
  * platform's underlying implementation of SHA-1; could be OpenSSL,
  * blk_SHA, Apple CommonCrypto, etc...  Note that the relevant
  * SHA-1 header may have already defined platform_SHA_CTX for our
- * own implementations like block-sha1 and ppc-sha1, so we list
+ * own implementations like block-sha1, so we list
  * the default for OpenSSL compatible SHA-1 implementations here.
  */
 #define platform_SHA_CTX	SHA_CTX
diff --git a/ppc/sha1.c b/ppc/sha1.c
deleted file mode 100644
index 1b705cee1fe..00000000000
--- a/ppc/sha1.c
+++ /dev/null
@@ -1,72 +0,0 @@
-/*
- * SHA-1 implementation.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- *
- * This version assumes we are running on a big-endian machine.
- * It calls an external sha1_core() to process blocks of 64 bytes.
- */
-#include <stdio.h>
-#include <string.h>
-#include "sha1.h"
-
-void ppc_sha1_core(uint32_t *hash, const unsigned char *p,
-		   unsigned int nblocks);
-
-int ppc_SHA1_Init(ppc_SHA_CTX *c)
-{
-	c->hash[0] = 0x67452301;
-	c->hash[1] = 0xEFCDAB89;
-	c->hash[2] = 0x98BADCFE;
-	c->hash[3] = 0x10325476;
-	c->hash[4] = 0xC3D2E1F0;
-	c->len = 0;
-	c->cnt = 0;
-	return 0;
-}
-
-int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *ptr, unsigned long n)
-{
-	unsigned long nb;
-	const unsigned char *p = ptr;
-
-	c->len += (uint64_t) n << 3;
-	while (n != 0) {
-		if (c->cnt || n < 64) {
-			nb = 64 - c->cnt;
-			if (nb > n)
-				nb = n;
-			memcpy(&c->buf.b[c->cnt], p, nb);
-			if ((c->cnt += nb) == 64) {
-				ppc_sha1_core(c->hash, c->buf.b, 1);
-				c->cnt = 0;
-			}
-		} else {
-			nb = n >> 6;
-			ppc_sha1_core(c->hash, p, nb);
-			nb <<= 6;
-		}
-		n -= nb;
-		p += nb;
-	}
-	return 0;
-}
-
-int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c)
-{
-	unsigned int cnt = c->cnt;
-
-	c->buf.b[cnt++] = 0x80;
-	if (cnt > 56) {
-		if (cnt < 64)
-			memset(&c->buf.b[cnt], 0, 64 - cnt);
-		ppc_sha1_core(c->hash, c->buf.b, 1);
-		cnt = 0;
-	}
-	if (cnt < 56)
-		memset(&c->buf.b[cnt], 0, 56 - cnt);
-	c->buf.l[7] = c->len;
-	ppc_sha1_core(c->hash, c->buf.b, 1);
-	memcpy(hash, c->hash, 20);
-	return 0;
-}
diff --git a/ppc/sha1.h b/ppc/sha1.h
deleted file mode 100644
index 9b24b326159..00000000000
--- a/ppc/sha1.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*
- * SHA-1 implementation.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- */
-#include <stdint.h>
-
-typedef struct {
-	uint32_t hash[5];
-	uint32_t cnt;
-	uint64_t len;
-	union {
-		unsigned char b[64];
-		uint64_t l[8];
-	} buf;
-} ppc_SHA_CTX;
-
-int ppc_SHA1_Init(ppc_SHA_CTX *c);
-int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *p, unsigned long n);
-int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c);
-
-#define platform_SHA_CTX	ppc_SHA_CTX
-#define platform_SHA1_Init	ppc_SHA1_Init
-#define platform_SHA1_Update	ppc_SHA1_Update
-#define platform_SHA1_Final	ppc_SHA1_Final
diff --git a/ppc/sha1ppc.S b/ppc/sha1ppc.S
deleted file mode 100644
index 1711eef6e71..00000000000
--- a/ppc/sha1ppc.S
+++ /dev/null
@@ -1,224 +0,0 @@
-/*
- * SHA-1 implementation for PowerPC.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- */
-
-/*
- * PowerPC calling convention:
- * %r0 - volatile temp
- * %r1 - stack pointer.
- * %r2 - reserved
- * %r3-%r12 - Incoming arguments & return values; volatile.
- * %r13-%r31 - Callee-save registers
- * %lr - Return address, volatile
- * %ctr - volatile
- *
- * Register usage in this routine:
- * %r0 - temp
- * %r3 - argument (pointer to 5 words of SHA state)
- * %r4 - argument (pointer to data to hash)
- * %r5 - Constant K in SHA round (initially number of blocks to hash)
- * %r6-%r10 - Working copies of SHA variables A..E (actually E..A order)
- * %r11-%r26 - Data being hashed W[].
- * %r27-%r31 - Previous copies of A..E, for final add back.
- * %ctr - loop count
- */
-
-
-/*
- * We roll the registers for A, B, C, D, E around on each
- * iteration; E on iteration t is D on iteration t+1, and so on.
- * We use registers 6 - 10 for this.  (Registers 27 - 31 hold
- * the previous values.)
- */
-#define RA(t)	(((t)+4)%5+6)
-#define RB(t)	(((t)+3)%5+6)
-#define RC(t)	(((t)+2)%5+6)
-#define RD(t)	(((t)+1)%5+6)
-#define RE(t)	(((t)+0)%5+6)
-
-/* We use registers 11 - 26 for the W values */
-#define W(t)	((t)%16+11)
-
-/* Register 5 is used for the constant k */
-
-/*
- * The basic SHA-1 round function is:
- * E += ROTL(A,5) + F(B,C,D) + W[i] + K;  B = ROTL(B,30)
- * Then the variables are renamed: (A,B,C,D,E) = (E,A,B,C,D).
- *
- * Every 20 rounds, the function F() and the constant K changes:
- * - 20 rounds of f0(b,c,d) = "bit wise b ? c : d" =  (^b & d) + (b & c)
- * - 20 rounds of f1(b,c,d) = b^c^d = (b^d)^c
- * - 20 rounds of f2(b,c,d) = majority(b,c,d) = (b&d) + ((b^d)&c)
- * - 20 more rounds of f1(b,c,d)
- *
- * These are all scheduled for near-optimal performance on a G4.
- * The G4 is a 3-issue out-of-order machine with 3 ALUs, but it can only
- * *consider* starting the oldest 3 instructions per cycle.  So to get
- * maximum performance out of it, you have to treat it as an in-order
- * machine.  Which means interleaving the computation round t with the
- * computation of W[t+4].
- *
- * The first 16 rounds use W values loaded directly from memory, while the
- * remaining 64 use values computed from those first 16.  We preload
- * 4 values before starting, so there are three kinds of rounds:
- * - The first 12 (all f0) also load the W values from memory.
- * - The next 64 compute W(i+4) in parallel. 8*f0, 20*f1, 20*f2, 16*f1.
- * - The last 4 (all f1) do not do anything with W.
- *
- * Therefore, we have 6 different round functions:
- * STEPD0_LOAD(t,s) - Perform round t and load W(s).  s < 16
- * STEPD0_UPDATE(t,s) - Perform round t and compute W(s).  s >= 16.
- * STEPD1_UPDATE(t,s)
- * STEPD2_UPDATE(t,s)
- * STEPD1(t) - Perform round t with no load or update.
- *
- * The G5 is more fully out-of-order, and can find the parallelism
- * by itself.  The big limit is that it has a 2-cycle ALU latency, so
- * even though it's 2-way, the code has to be scheduled as if it's
- * 4-way, which can be a limit.  To help it, we try to schedule the
- * read of RA(t) as late as possible so it doesn't stall waiting for
- * the previous round's RE(t-1), and we try to rotate RB(t) as early
- * as possible while reading RC(t) (= RB(t-1)) as late as possible.
- */
-
-/* the initial loads. */
-#define LOADW(s) \
-	lwz	W(s),(s)*4(%r4)
-
-/*
- * Perform a step with F0, and load W(s).  Uses W(s) as a temporary
- * before loading it.
- * This is actually 10 instructions, which is an awkward fit.
- * It can execute grouped as listed, or delayed one instruction.
- * (If delayed two instructions, there is a stall before the start of the
- * second line.)  Thus, two iterations take 7 cycles, 3.5 cycles per round.
- */
-#define STEPD0_LOAD(t,s) \
-add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t);  and    W(s),RC(t),RB(t); \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;      rotlwi RB(t),RB(t),30;   \
-add RE(t),RE(t),W(s); add    %r0,%r0,%r5;      lwz    W(s),(s)*4(%r4);  \
-add RE(t),RE(t),%r0
-
-/*
- * This is likewise awkward, 13 instructions.  However, it can also
- * execute starting with 2 out of 3 possible moduli, so it does 2 rounds
- * in 9 cycles, 4.5 cycles/round.
- */
-#define STEPD0_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r0;  and    %r0,RC(t),RB(t); xor    W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r5;  loadk; rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1;     \
-add RE(t),RE(t),%r0
-
-/* Nicely optimal.  Conveniently, also the most common. */
-#define STEPD1_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r5;  loadk; xor %r0,%r0,RC(t);  xor W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1
-
-/*
- * The naked version, no UPDATE, for the last 4 rounds.  3 cycles per.
- * We could use W(s) as a temp register, but we don't need it.
- */
-#define STEPD1(t) \
-                        add   RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); \
-rotlwi RB(t),RB(t),30;  add   RE(t),RE(t),%r5;  xor    %r0,%r0,RC(t);   \
-add    RE(t),RE(t),%r0; rotlwi %r0,RA(t),5;     /* spare slot */        \
-add    RE(t),RE(t),%r0
-
-/*
- * 14 instructions, 5 cycles per.  The majority function is a bit
- * awkward to compute.  This can execute with a 1-instruction delay,
- * but it causes a 2-instruction delay, which triggers a stall.
- */
-#define STEPD2_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); and    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r0;  xor    %r0,RD(t),RB(t); xor    W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r5;  loadk; and %r0,%r0,RC(t);  xor W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     rotlwi W(s),W(s),1;             \
-add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30
-
-#define STEP0_LOAD4(t,s)		\
-	STEPD0_LOAD(t,s);		\
-	STEPD0_LOAD((t+1),(s)+1);	\
-	STEPD0_LOAD((t)+2,(s)+2);	\
-	STEPD0_LOAD((t)+3,(s)+3)
-
-#define STEPUP4(fn, t, s, loadk...)		\
-	STEP##fn##_UPDATE(t,s,);		\
-	STEP##fn##_UPDATE((t)+1,(s)+1,);	\
-	STEP##fn##_UPDATE((t)+2,(s)+2,);	\
-	STEP##fn##_UPDATE((t)+3,(s)+3,loadk)
-
-#define STEPUP20(fn, t, s, loadk...)	\
-	STEPUP4(fn, t, s,);		\
-	STEPUP4(fn, (t)+4, (s)+4,);	\
-	STEPUP4(fn, (t)+8, (s)+8,);	\
-	STEPUP4(fn, (t)+12, (s)+12,);	\
-	STEPUP4(fn, (t)+16, (s)+16, loadk)
-
-	.globl	ppc_sha1_core
-ppc_sha1_core:
-	stwu	%r1,-80(%r1)
-	stmw	%r13,4(%r1)
-
-	/* Load up A - E */
-	lmw	%r27,0(%r3)
-
-	mtctr	%r5
-
-1:
-	LOADW(0)
-	lis	%r5,0x5a82
-	mr	RE(0),%r31
-	LOADW(1)
-	mr	RD(0),%r30
-	mr	RC(0),%r29
-	LOADW(2)
-	ori	%r5,%r5,0x7999	/* K0-19 */
-	mr	RB(0),%r28
-	LOADW(3)
-	mr	RA(0),%r27
-
-	STEP0_LOAD4(0, 4)
-	STEP0_LOAD4(4, 8)
-	STEP0_LOAD4(8, 12)
-	STEPUP4(D0, 12, 16,)
-	STEPUP4(D0, 16, 20, lis %r5,0x6ed9)
-
-	ori	%r5,%r5,0xeba1	/* K20-39 */
-	STEPUP20(D1, 20, 24, lis %r5,0x8f1b)
-
-	ori	%r5,%r5,0xbcdc	/* K40-59 */
-	STEPUP20(D2, 40, 44, lis %r5,0xca62)
-
-	ori	%r5,%r5,0xc1d6	/* K60-79 */
-	STEPUP4(D1, 60, 64,)
-	STEPUP4(D1, 64, 68,)
-	STEPUP4(D1, 68, 72,)
-	STEPUP4(D1, 72, 76,)
-	addi	%r4,%r4,64
-	STEPD1(76)
-	STEPD1(77)
-	STEPD1(78)
-	STEPD1(79)
-
-	/* Add results to original values */
-	add	%r31,%r31,RE(0)
-	add	%r30,%r30,RD(0)
-	add	%r29,%r29,RC(0)
-	add	%r28,%r28,RB(0)
-	add	%r27,%r27,RA(0)
-
-	bdnz	1b
-
-	/* Save final hash, restore registers, and return */
-	stmw	%r27,0(%r3)
-	lmw	%r13,4(%r1)
-	addi	%r1,%r1,80
-	blr
-- 
2.35.1.1441.g83331fcb493


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] ppc: remove custom SHA-1 implementation
  2022-03-21 17:06           ` [PATCH v2] " Ævar Arnfjörð Bjarmason
@ 2022-03-21 21:19             ` brian m. carlson
  2022-08-31  9:18             ` [PATCH v3 0/2] Makefile + hash.h: remove PPC_SHA1 implementation Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 17+ messages in thread
From: brian m. carlson @ 2022-03-21 21:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Mike Hommey,
	Carlo Marcelo Arenas Belón, Taylor Blau

[-- Attachment #1: Type: text/plain, Size: 2270 bytes --]

On 2022-03-21 at 17:06:12, Ævar Arnfjörð Bjarmason wrote:
> Remove the PPC_SHA1 implementation added in a6ef3518f9a ([PATCH] PPC
> assembly implementation of SHA1, 2005-04-22). When this was added
> Apple consumer hardware used the PPC architecture, and the
> implementation was intended to improve SHA-1 speed there.
> 
> Since it was added we've moved to DC_SHA1 by default, and anyone
> wanting hard-rolled non-DC SHA-1 implementation can use OpenSSL's via
> the OPENSSL_SHA1 knob.
> 
> I'm unsure if this was ever supposed to work on 64-bit PPC. It clearly
> originally targeted 32 bit PPC, but there's some mailing list
> references to this being tried on G5 (PPC 970). I can't get it to do
> anything but segfault on the BE POWER8 machine in the GCC compile
> farm. Anyone caring about speed on PPC these days is likely to be
> using IBM's POWER, not PPC 970.
> 
> There have been proposals to entirely remove non-DC_SHA1
> implementations from the tree[1]. I think per [2] that would be a bit
> overzealous. I.e. there are various set-ups git's speed is going to be
> more important than the relatively implausible SHA-1 collision attack,
> or where such attacks are entirely mitigated by other means (e.g. by
> incoming objects being checked with DC_SHA1).
> 
> The main reason for doing so at this point is to simplify follow-up
> Makefile change. Since PPC_SHA1 included the only in-tree *.S assembly
> file we needed to keep around special support for building objects
> from it. By getting rid of it we know we'll always build *.o from *.c
> files, which makes the build process simpler.
> 
> As an aside the code being removed here was also throwing warnings
> with the "-pedantic" flag, but let's remove it instead of fixing it,
> as 544d93bc3b4 (block-sha1: remove use of obsolete x86 assembly,
> 2022-03-10) did for block-sha1/*.

While I don't agree that we shouldn't remove the other non-DC SHA-1
implementations, I do agree that we should remove this one.  Given the
testing you've done and the fact that almost everyone desiring speed is
using a 64-bit machine these days, I think it's unlikely that anyone is
using this in the real world.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 0/2] Makefile + hash.h: remove PPC_SHA1 implementation
  2022-03-21 17:06           ` [PATCH v2] " Ævar Arnfjörð Bjarmason
  2022-03-21 21:19             ` brian m. carlson
@ 2022-08-31  9:18             ` Ævar Arnfjörð Bjarmason
  2022-08-31  9:18               ` [PATCH v3 1/2] " Ævar Arnfjörð Bjarmason
  2022-08-31  9:18               ` [PATCH v3 2/2] Makefile: use $(OBJECTS) instead of $(C_OBJ) Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-31  9:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Mike Hommey, brian m . carlson,
	Carlo Marcelo Arenas Belón, Taylor Blau,
	Ævar Arnfjörð Bjarmason

This is a re-roll of [1] sent back in March. It removes the PPC_SHA1
implementation, which as noted in in 1/2 is likely completely unused
at this point, if anyone's using it we have other better tested (and
just as fast) SHA-1 implementations.

I then included this change in a larger series sent in late April[2],
which stalled for lack of feedback from the OSX crowd[3].

But this part should be uncontroversial, and is an obvious win in
terms of the diffstat, and in reducing our complexity surface in this
area.

Changes since v2:

 * Much rewritten commit message (see range-diff)
 * Rebased on upstream changes
 * More PPC_SHA1 removal of comments/docs that I missed the first time
   around.
 * The s/C_OBJS/OBJECTS/ Makefile change is now split into a new 2/2.
 * Added brian's Acked-by, per [4] (not explicitly an "Acked-by", but
   I thought adding amounted to a fair paraphrasing of the feedback
   there)

1. https://lore.kernel.org/git/patch-v2-1.1-e77fd23a824-20220321T170412Z-avarab@gmail.com/
2. https://lore.kernel.org/git/cover-0.5-00000000000-20220422T094624Z-avarab@gmail.com/
3. https://lore.kernel.org/git/xmqqv8v17xrl.fsf@gitster.g/
4. https://lore.kernel.org/git/Yjjr9fkybVmB53M7@camp.crustytoothpaste.net/

Ævar Arnfjörð Bjarmason (2):
  Makefile + hash.h: remove PPC_SHA1 implementation
  Makefile: use $(OBJECTS) instead of $(C_OBJ)

 INSTALL           |   3 +-
 Makefile          |  22 ++---
 block-sha1/sha1.c |   4 -
 configure.ac      |   3 -
 hash.h            |   6 +-
 ppc/sha1.c        |  72 ---------------
 ppc/sha1.h        |  25 ------
 ppc/sha1ppc.S     | 224 ----------------------------------------------
 8 files changed, 9 insertions(+), 350 deletions(-)
 delete mode 100644 ppc/sha1.c
 delete mode 100644 ppc/sha1.h
 delete mode 100644 ppc/sha1ppc.S

Range-diff against v2:
1:  e77fd23a824 ! 1:  87a204b8937 ppc: remove custom SHA-1 implementation
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    ppc: remove custom SHA-1 implementation
    +    Makefile + hash.h: remove PPC_SHA1 implementation
     
         Remove the PPC_SHA1 implementation added in a6ef3518f9a ([PATCH] PPC
         assembly implementation of SHA1, 2005-04-22). When this was added
         Apple consumer hardware used the PPC architecture, and the
         implementation was intended to improve SHA-1 speed there.
     
    -    Since it was added we've moved to DC_SHA1 by default, and anyone
    -    wanting hard-rolled non-DC SHA-1 implementation can use OpenSSL's via
    -    the OPENSSL_SHA1 knob.
    +    Since it was added we've moved to using sha1collisiondetection by
    +    default, and anyone wanting hard-rolled non-DC SHA-1 implementation
    +    can use OpenSSL's via the OPENSSL_SHA1 knob.
     
    -    I'm unsure if this was ever supposed to work on 64-bit PPC. It clearly
    -    originally targeted 32 bit PPC, but there's some mailing list
    -    references to this being tried on G5 (PPC 970). I can't get it to do
    -    anything but segfault on the BE POWER8 machine in the GCC compile
    -    farm. Anyone caring about speed on PPC these days is likely to be
    -    using IBM's POWER, not PPC 970.
    +    The PPC_SHA1 originally originally targeted 32 bit PPC, and later the
    +    64 bit PPC 970 (a.k.a. Apple PowerPC G5). See 926172c5e48 (block-sha1:
    +    improve code on large-register-set machines, 2009-08-10) for a
    +    reference about the performance on G5 (a comment in block-sha1/sha1.c
    +    being removed here).
     
    -    There have been proposals to entirely remove non-DC_SHA1
    -    implementations from the tree[1]. I think per [2] that would be a bit
    +    I can't get it to do anything but segfault on both the BE and LE POWER
    +    machines in the GCC compile farm[1]. Anyone who's concerned about
    +    performance on PPC these days is likely to be using the IBM POWER
    +    processors.
    +
    +    There have been proposals to entirely remove non-sha1collisiondetection
    +    implementations from the tree[2]. I think per [3] that would be a bit
         overzealous. I.e. there are various set-ups git's speed is going to be
         more important than the relatively implausible SHA-1 collision attack,
         or where such attacks are entirely mitigated by other means (e.g. by
         incoming objects being checked with DC_SHA1).
     
    -    The main reason for doing so at this point is to simplify follow-up
    -    Makefile change. Since PPC_SHA1 included the only in-tree *.S assembly
    -    file we needed to keep around special support for building objects
    -    from it. By getting rid of it we know we'll always build *.o from *.c
    -    files, which makes the build process simpler.
    +    But that really doesn't apply to PPC_SHA1 in particular, which seems
    +    to have outlived its usefulness.
    +
    +    As this gets rid of the only in-tree *.S assembly file we can remove
    +    the small bits of logic from the Makefile needed to build objects
    +    from *.S (as opposed to *.c)
     
    -    As an aside the code being removed here was also throwing warnings
    -    with the "-pedantic" flag, but let's remove it instead of fixing it,
    -    as 544d93bc3b4 (block-sha1: remove use of obsolete x86 assembly,
    -    2022-03-10) did for block-sha1/*.
    +    The code being removed here was also throwing warnings with the
    +    "-pedantic" flag, it could have been fixed as 544d93bc3b4 (block-sha1:
    +    remove use of obsolete x86 assembly, 2022-03-10) did for block-sha1/*,
    +    but as noted above let's remove it instead.
     
    -    1. https://lore.kernel.org/git/20200223223758.120941-1-mh@glandium.org/
    -    2. https://lore.kernel.org/git/20200224044732.GK1018190@coredump.intra.peff.net/
    +    1. https://cfarm.tetaneutral.net/machines/list/
    +       Tested on gcc{110,112,135,203}, a mixture of POWER [789] ppc64 and
    +       ppc64le. All segfault in anything needing object
    +       hashing (e.g. t/t1007-hash-object.sh) when compiled with
    +       PPC_SHA1=Y.
    +    2. https://lore.kernel.org/git/20200223223758.120941-1-mh@glandium.org/
    +    3. https://lore.kernel.org/git/20200224044732.GK1018190@coredump.intra.peff.net/
     
    +    Acked-by: brian m. carlson" <sandals@crustytoothpaste.net>
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## INSTALL ##
    @@ Makefile: include shared.mak
      # Define DC_SHA1 to unconditionally enable the collision-detecting sha1
      # algorithm. This is slower, but may detect attempted collision attacks.
      # Takes priority over other *_SHA1 knobs.
    -@@ Makefile: ifdef OPENSSL_SHA1
    - 	EXTLIBS += $(LIB_4_CRYPTO)
    - 	BASIC_CFLAGS += -DSHA1_OPENSSL
    - else
    +@@ Makefile: ifdef APPLE_COMMON_CRYPTO
    + 	SHA1_MAX_BLOCK_SIZE = 1024L*1024L*1024L
    + endif
    + 
     +ifdef PPC_SHA1
    -+$(error PPC_SHA1 has been removed! Use DC_SHA1 instead, which is the default)
    ++$(error the PPC_SHA1 flag has been removed along with the PowerPC-specific SHA-1 implementation.)
     +endif
    - ifdef BLK_SHA1
    ++
    + ifdef OPENSSL_SHA1
    + 	EXTLIBS += $(LIB_4_CRYPTO)
    + 	BASIC_CFLAGS += -DSHA1_OPENSSL
    +@@ Makefile: ifdef BLK_SHA1
      	LIB_OBJS += block-sha1/sha1.o
      	BASIC_CFLAGS += -DSHA1_BLK
      else
    @@ Makefile: missing_compdb_dir =
     -ASM_SRC := $(wildcard $(OBJECTS:o=S))
     -ASM_OBJ := $(ASM_SRC:S=o)
     -C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))
    -+C_OBJ = $(OBJECTS)
    ++C_OBJ := $(OBJECTS)
      
      $(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
      	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
    @@ Makefile: missing_compdb_dir =
      %.s: %.c GIT-CFLAGS FORCE
      	$(QUIET_CC)$(CC) -o $@ -S $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
     
    + ## block-sha1/sha1.c ##
    +@@
    +  * try to do the silly "optimize away loads" part because it won't
    +  * see what the value will be).
    +  *
    +- * Ben Herrenschmidt reports that on PPC, the C version comes close
    +- * to the optimized asm with this (ie on PPC you don't want that
    +- * 'volatile', since there are lots of registers).
    +- *
    +  * On ARM we get the best code generation by forcing a full memory barrier
    +  * between each SHA_ROUND, otherwise gcc happily get wild with spilling and
    +  * the stack frame size simply explode and performance goes down the drain.
    +
      ## configure.ac ##
     @@ configure.ac: AC_MSG_NOTICE([CHECKS for site configuration])
      # tests.  These tests take up a significant amount of the total test time
-:  ----------- > 2:  cb3bc8b5029 Makefile: use $(OBJECTS) instead of $(C_OBJ)
-- 
2.37.3.1406.g184357183a6


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 1/2] Makefile + hash.h: remove PPC_SHA1 implementation
  2022-08-31  9:18             ` [PATCH v3 0/2] Makefile + hash.h: remove PPC_SHA1 implementation Ævar Arnfjörð Bjarmason
@ 2022-08-31  9:18               ` Ævar Arnfjörð Bjarmason
  2022-08-31  9:18               ` [PATCH v3 2/2] Makefile: use $(OBJECTS) instead of $(C_OBJ) Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-31  9:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Mike Hommey, brian m . carlson,
	Carlo Marcelo Arenas Belón, Taylor Blau,
	Ævar Arnfjörð Bjarmason

Remove the PPC_SHA1 implementation added in a6ef3518f9a ([PATCH] PPC
assembly implementation of SHA1, 2005-04-22). When this was added
Apple consumer hardware used the PPC architecture, and the
implementation was intended to improve SHA-1 speed there.

Since it was added we've moved to using sha1collisiondetection by
default, and anyone wanting hard-rolled non-DC SHA-1 implementation
can use OpenSSL's via the OPENSSL_SHA1 knob.

The PPC_SHA1 originally originally targeted 32 bit PPC, and later the
64 bit PPC 970 (a.k.a. Apple PowerPC G5). See 926172c5e48 (block-sha1:
improve code on large-register-set machines, 2009-08-10) for a
reference about the performance on G5 (a comment in block-sha1/sha1.c
being removed here).

I can't get it to do anything but segfault on both the BE and LE POWER
machines in the GCC compile farm[1]. Anyone who's concerned about
performance on PPC these days is likely to be using the IBM POWER
processors.

There have been proposals to entirely remove non-sha1collisiondetection
implementations from the tree[2]. I think per [3] that would be a bit
overzealous. I.e. there are various set-ups git's speed is going to be
more important than the relatively implausible SHA-1 collision attack,
or where such attacks are entirely mitigated by other means (e.g. by
incoming objects being checked with DC_SHA1).

But that really doesn't apply to PPC_SHA1 in particular, which seems
to have outlived its usefulness.

As this gets rid of the only in-tree *.S assembly file we can remove
the small bits of logic from the Makefile needed to build objects
from *.S (as opposed to *.c)

The code being removed here was also throwing warnings with the
"-pedantic" flag, it could have been fixed as 544d93bc3b4 (block-sha1:
remove use of obsolete x86 assembly, 2022-03-10) did for block-sha1/*,
but as noted above let's remove it instead.

1. https://cfarm.tetaneutral.net/machines/list/
   Tested on gcc{110,112,135,203}, a mixture of POWER [789] ppc64 and
   ppc64le. All segfault in anything needing object
   hashing (e.g. t/t1007-hash-object.sh) when compiled with
   PPC_SHA1=Y.
2. https://lore.kernel.org/git/20200223223758.120941-1-mh@glandium.org/
3. https://lore.kernel.org/git/20200224044732.GK1018190@coredump.intra.peff.net/

Acked-by: brian m. carlson" <sandals@crustytoothpaste.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 INSTALL           |   3 +-
 Makefile          |  18 ++--
 block-sha1/sha1.c |   4 -
 configure.ac      |   3 -
 hash.h            |   6 +-
 ppc/sha1.c        |  72 ---------------
 ppc/sha1.h        |  25 ------
 ppc/sha1ppc.S     | 224 ----------------------------------------------
 8 files changed, 8 insertions(+), 347 deletions(-)
 delete mode 100644 ppc/sha1.c
 delete mode 100644 ppc/sha1.h
 delete mode 100644 ppc/sha1ppc.S

diff --git a/INSTALL b/INSTALL
index 4140a3f5c8b..89b15d71df5 100644
--- a/INSTALL
+++ b/INSTALL
@@ -135,8 +135,7 @@ Issues of note:
 
 	  By default, git uses OpenSSL for SHA1 but it will use its own
 	  library (inspired by Mozilla's) with either NO_OPENSSL or
-	  BLK_SHA1.  Also included is a version optimized for PowerPC
-	  (PPC_SHA1).
+	  BLK_SHA1.
 
 	- "libcurl" library is used for fetching and pushing
 	  repositories over http:// or https://, as well as by
diff --git a/Makefile b/Makefile
index eac30126e29..7feda7e79be 100644
--- a/Makefile
+++ b/Makefile
@@ -155,9 +155,6 @@ include shared.mak
 # Define BLK_SHA1 environment variable to make use of the bundled
 # optimized C SHA1 routine.
 #
-# Define PPC_SHA1 environment variable when running make to make use of
-# a bundled SHA1 routine optimized for PowerPC.
-#
 # Define DC_SHA1 to unconditionally enable the collision-detecting sha1
 # algorithm. This is slower, but may detect attempted collision attacks.
 # Takes priority over other *_SHA1 knobs.
@@ -1802,6 +1799,10 @@ ifdef APPLE_COMMON_CRYPTO
 	SHA1_MAX_BLOCK_SIZE = 1024L*1024L*1024L
 endif
 
+ifdef PPC_SHA1
+$(error the PPC_SHA1 flag has been removed along with the PowerPC-specific SHA-1 implementation.)
+endif
+
 ifdef OPENSSL_SHA1
 	EXTLIBS += $(LIB_4_CRYPTO)
 	BASIC_CFLAGS += -DSHA1_OPENSSL
@@ -1810,10 +1811,6 @@ ifdef BLK_SHA1
 	LIB_OBJS += block-sha1/sha1.o
 	BASIC_CFLAGS += -DSHA1_BLK
 else
-ifdef PPC_SHA1
-	LIB_OBJS += ppc/sha1.o ppc/sha1ppc.o
-	BASIC_CFLAGS += -DSHA1_PPC
-else
 ifdef APPLE_COMMON_CRYPTO
 	COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL
 	BASIC_CFLAGS += -DSHA1_APPLE
@@ -1847,7 +1844,6 @@ endif
 endif
 endif
 endif
-endif
 
 ifdef OPENSSL_SHA256
 	EXTLIBS += $(LIB_4_CRYPTO)
@@ -2594,14 +2590,10 @@ missing_compdb_dir =
 compdb_args =
 endif
 
-ASM_SRC := $(wildcard $(OBJECTS:o=S))
-ASM_OBJ := $(ASM_SRC:S=o)
-C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))
+C_OBJ := $(OBJECTS)
 
 $(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
 	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
-$(ASM_OBJ): %.o: %.S GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
-	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
 
 %.s: %.c GIT-CFLAGS FORCE
 	$(QUIET_CC)$(CC) -o $@ -S $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index 5974cd7dd3c..80cebd27564 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -28,10 +28,6 @@
  * try to do the silly "optimize away loads" part because it won't
  * see what the value will be).
  *
- * Ben Herrenschmidt reports that on PPC, the C version comes close
- * to the optimized asm with this (ie on PPC you don't want that
- * 'volatile', since there are lots of registers).
- *
  * On ARM we get the best code generation by forcing a full memory barrier
  * between each SHA_ROUND, otherwise gcc happily get wild with spilling and
  * the stack frame size simply explode and performance goes down the drain.
diff --git a/configure.ac b/configure.ac
index 7dcd0482042..38ff86678a0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -237,9 +237,6 @@ AC_MSG_NOTICE([CHECKS for site configuration])
 # tests.  These tests take up a significant amount of the total test time
 # but are not needed unless you plan to talk to SVN repos.
 #
-# Define PPC_SHA1 environment variable when running make to make use of
-# a bundled SHA1 routine optimized for PowerPC.
-#
 # Define NO_OPENSSL environment variable if you do not have OpenSSL.
 #
 # Define OPENSSLDIR=/foo/bar if your openssl header and library files are in
diff --git a/hash.h b/hash.h
index ea87ae9d92f..36b64165fc9 100644
--- a/hash.h
+++ b/hash.h
@@ -4,9 +4,7 @@
 #include "git-compat-util.h"
 #include "repository.h"
 
-#if defined(SHA1_PPC)
-#include "ppc/sha1.h"
-#elif defined(SHA1_APPLE)
+#if defined(SHA1_APPLE)
 #include <CommonCrypto/CommonDigest.h>
 #elif defined(SHA1_OPENSSL)
 #include <openssl/sha.h>
@@ -32,7 +30,7 @@
  * platform's underlying implementation of SHA-1; could be OpenSSL,
  * blk_SHA, Apple CommonCrypto, etc...  Note that the relevant
  * SHA-1 header may have already defined platform_SHA_CTX for our
- * own implementations like block-sha1 and ppc-sha1, so we list
+ * own implementations like block-sha1, so we list
  * the default for OpenSSL compatible SHA-1 implementations here.
  */
 #define platform_SHA_CTX	SHA_CTX
diff --git a/ppc/sha1.c b/ppc/sha1.c
deleted file mode 100644
index 1b705cee1fe..00000000000
--- a/ppc/sha1.c
+++ /dev/null
@@ -1,72 +0,0 @@
-/*
- * SHA-1 implementation.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- *
- * This version assumes we are running on a big-endian machine.
- * It calls an external sha1_core() to process blocks of 64 bytes.
- */
-#include <stdio.h>
-#include <string.h>
-#include "sha1.h"
-
-void ppc_sha1_core(uint32_t *hash, const unsigned char *p,
-		   unsigned int nblocks);
-
-int ppc_SHA1_Init(ppc_SHA_CTX *c)
-{
-	c->hash[0] = 0x67452301;
-	c->hash[1] = 0xEFCDAB89;
-	c->hash[2] = 0x98BADCFE;
-	c->hash[3] = 0x10325476;
-	c->hash[4] = 0xC3D2E1F0;
-	c->len = 0;
-	c->cnt = 0;
-	return 0;
-}
-
-int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *ptr, unsigned long n)
-{
-	unsigned long nb;
-	const unsigned char *p = ptr;
-
-	c->len += (uint64_t) n << 3;
-	while (n != 0) {
-		if (c->cnt || n < 64) {
-			nb = 64 - c->cnt;
-			if (nb > n)
-				nb = n;
-			memcpy(&c->buf.b[c->cnt], p, nb);
-			if ((c->cnt += nb) == 64) {
-				ppc_sha1_core(c->hash, c->buf.b, 1);
-				c->cnt = 0;
-			}
-		} else {
-			nb = n >> 6;
-			ppc_sha1_core(c->hash, p, nb);
-			nb <<= 6;
-		}
-		n -= nb;
-		p += nb;
-	}
-	return 0;
-}
-
-int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c)
-{
-	unsigned int cnt = c->cnt;
-
-	c->buf.b[cnt++] = 0x80;
-	if (cnt > 56) {
-		if (cnt < 64)
-			memset(&c->buf.b[cnt], 0, 64 - cnt);
-		ppc_sha1_core(c->hash, c->buf.b, 1);
-		cnt = 0;
-	}
-	if (cnt < 56)
-		memset(&c->buf.b[cnt], 0, 56 - cnt);
-	c->buf.l[7] = c->len;
-	ppc_sha1_core(c->hash, c->buf.b, 1);
-	memcpy(hash, c->hash, 20);
-	return 0;
-}
diff --git a/ppc/sha1.h b/ppc/sha1.h
deleted file mode 100644
index 9b24b326159..00000000000
--- a/ppc/sha1.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*
- * SHA-1 implementation.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- */
-#include <stdint.h>
-
-typedef struct {
-	uint32_t hash[5];
-	uint32_t cnt;
-	uint64_t len;
-	union {
-		unsigned char b[64];
-		uint64_t l[8];
-	} buf;
-} ppc_SHA_CTX;
-
-int ppc_SHA1_Init(ppc_SHA_CTX *c);
-int ppc_SHA1_Update(ppc_SHA_CTX *c, const void *p, unsigned long n);
-int ppc_SHA1_Final(unsigned char *hash, ppc_SHA_CTX *c);
-
-#define platform_SHA_CTX	ppc_SHA_CTX
-#define platform_SHA1_Init	ppc_SHA1_Init
-#define platform_SHA1_Update	ppc_SHA1_Update
-#define platform_SHA1_Final	ppc_SHA1_Final
diff --git a/ppc/sha1ppc.S b/ppc/sha1ppc.S
deleted file mode 100644
index 1711eef6e71..00000000000
--- a/ppc/sha1ppc.S
+++ /dev/null
@@ -1,224 +0,0 @@
-/*
- * SHA-1 implementation for PowerPC.
- *
- * Copyright (C) 2005 Paul Mackerras <paulus@samba.org>
- */
-
-/*
- * PowerPC calling convention:
- * %r0 - volatile temp
- * %r1 - stack pointer.
- * %r2 - reserved
- * %r3-%r12 - Incoming arguments & return values; volatile.
- * %r13-%r31 - Callee-save registers
- * %lr - Return address, volatile
- * %ctr - volatile
- *
- * Register usage in this routine:
- * %r0 - temp
- * %r3 - argument (pointer to 5 words of SHA state)
- * %r4 - argument (pointer to data to hash)
- * %r5 - Constant K in SHA round (initially number of blocks to hash)
- * %r6-%r10 - Working copies of SHA variables A..E (actually E..A order)
- * %r11-%r26 - Data being hashed W[].
- * %r27-%r31 - Previous copies of A..E, for final add back.
- * %ctr - loop count
- */
-
-
-/*
- * We roll the registers for A, B, C, D, E around on each
- * iteration; E on iteration t is D on iteration t+1, and so on.
- * We use registers 6 - 10 for this.  (Registers 27 - 31 hold
- * the previous values.)
- */
-#define RA(t)	(((t)+4)%5+6)
-#define RB(t)	(((t)+3)%5+6)
-#define RC(t)	(((t)+2)%5+6)
-#define RD(t)	(((t)+1)%5+6)
-#define RE(t)	(((t)+0)%5+6)
-
-/* We use registers 11 - 26 for the W values */
-#define W(t)	((t)%16+11)
-
-/* Register 5 is used for the constant k */
-
-/*
- * The basic SHA-1 round function is:
- * E += ROTL(A,5) + F(B,C,D) + W[i] + K;  B = ROTL(B,30)
- * Then the variables are renamed: (A,B,C,D,E) = (E,A,B,C,D).
- *
- * Every 20 rounds, the function F() and the constant K changes:
- * - 20 rounds of f0(b,c,d) = "bit wise b ? c : d" =  (^b & d) + (b & c)
- * - 20 rounds of f1(b,c,d) = b^c^d = (b^d)^c
- * - 20 rounds of f2(b,c,d) = majority(b,c,d) = (b&d) + ((b^d)&c)
- * - 20 more rounds of f1(b,c,d)
- *
- * These are all scheduled for near-optimal performance on a G4.
- * The G4 is a 3-issue out-of-order machine with 3 ALUs, but it can only
- * *consider* starting the oldest 3 instructions per cycle.  So to get
- * maximum performance out of it, you have to treat it as an in-order
- * machine.  Which means interleaving the computation round t with the
- * computation of W[t+4].
- *
- * The first 16 rounds use W values loaded directly from memory, while the
- * remaining 64 use values computed from those first 16.  We preload
- * 4 values before starting, so there are three kinds of rounds:
- * - The first 12 (all f0) also load the W values from memory.
- * - The next 64 compute W(i+4) in parallel. 8*f0, 20*f1, 20*f2, 16*f1.
- * - The last 4 (all f1) do not do anything with W.
- *
- * Therefore, we have 6 different round functions:
- * STEPD0_LOAD(t,s) - Perform round t and load W(s).  s < 16
- * STEPD0_UPDATE(t,s) - Perform round t and compute W(s).  s >= 16.
- * STEPD1_UPDATE(t,s)
- * STEPD2_UPDATE(t,s)
- * STEPD1(t) - Perform round t with no load or update.
- *
- * The G5 is more fully out-of-order, and can find the parallelism
- * by itself.  The big limit is that it has a 2-cycle ALU latency, so
- * even though it's 2-way, the code has to be scheduled as if it's
- * 4-way, which can be a limit.  To help it, we try to schedule the
- * read of RA(t) as late as possible so it doesn't stall waiting for
- * the previous round's RE(t-1), and we try to rotate RB(t) as early
- * as possible while reading RC(t) (= RB(t-1)) as late as possible.
- */
-
-/* the initial loads. */
-#define LOADW(s) \
-	lwz	W(s),(s)*4(%r4)
-
-/*
- * Perform a step with F0, and load W(s).  Uses W(s) as a temporary
- * before loading it.
- * This is actually 10 instructions, which is an awkward fit.
- * It can execute grouped as listed, or delayed one instruction.
- * (If delayed two instructions, there is a stall before the start of the
- * second line.)  Thus, two iterations take 7 cycles, 3.5 cycles per round.
- */
-#define STEPD0_LOAD(t,s) \
-add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t);  and    W(s),RC(t),RB(t); \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;      rotlwi RB(t),RB(t),30;   \
-add RE(t),RE(t),W(s); add    %r0,%r0,%r5;      lwz    W(s),(s)*4(%r4);  \
-add RE(t),RE(t),%r0
-
-/*
- * This is likewise awkward, 13 instructions.  However, it can also
- * execute starting with 2 out of 3 possible moduli, so it does 2 rounds
- * in 9 cycles, 4.5 cycles/round.
- */
-#define STEPD0_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); andc   %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r0;  and    %r0,RC(t),RB(t); xor    W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r5;  loadk; rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1;     \
-add RE(t),RE(t),%r0
-
-/* Nicely optimal.  Conveniently, also the most common. */
-#define STEPD1_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r5;  loadk; xor %r0,%r0,RC(t);  xor W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     xor    W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30;  rotlwi W(s),W(s),1
-
-/*
- * The naked version, no UPDATE, for the last 4 rounds.  3 cycles per.
- * We could use W(s) as a temp register, but we don't need it.
- */
-#define STEPD1(t) \
-                        add   RE(t),RE(t),W(t); xor    %r0,RD(t),RB(t); \
-rotlwi RB(t),RB(t),30;  add   RE(t),RE(t),%r5;  xor    %r0,%r0,RC(t);   \
-add    RE(t),RE(t),%r0; rotlwi %r0,RA(t),5;     /* spare slot */        \
-add    RE(t),RE(t),%r0
-
-/*
- * 14 instructions, 5 cycles per.  The majority function is a bit
- * awkward to compute.  This can execute with a 1-instruction delay,
- * but it causes a 2-instruction delay, which triggers a stall.
- */
-#define STEPD2_UPDATE(t,s,loadk...) \
-add RE(t),RE(t),W(t); and    %r0,RD(t),RB(t); xor    W(s),W((s)-16),W((s)-3); \
-add RE(t),RE(t),%r0;  xor    %r0,RD(t),RB(t); xor    W(s),W(s),W((s)-8);      \
-add RE(t),RE(t),%r5;  loadk; and %r0,%r0,RC(t);  xor W(s),W(s),W((s)-14);     \
-add RE(t),RE(t),%r0;  rotlwi %r0,RA(t),5;     rotlwi W(s),W(s),1;             \
-add RE(t),RE(t),%r0;  rotlwi RB(t),RB(t),30
-
-#define STEP0_LOAD4(t,s)		\
-	STEPD0_LOAD(t,s);		\
-	STEPD0_LOAD((t+1),(s)+1);	\
-	STEPD0_LOAD((t)+2,(s)+2);	\
-	STEPD0_LOAD((t)+3,(s)+3)
-
-#define STEPUP4(fn, t, s, loadk...)		\
-	STEP##fn##_UPDATE(t,s,);		\
-	STEP##fn##_UPDATE((t)+1,(s)+1,);	\
-	STEP##fn##_UPDATE((t)+2,(s)+2,);	\
-	STEP##fn##_UPDATE((t)+3,(s)+3,loadk)
-
-#define STEPUP20(fn, t, s, loadk...)	\
-	STEPUP4(fn, t, s,);		\
-	STEPUP4(fn, (t)+4, (s)+4,);	\
-	STEPUP4(fn, (t)+8, (s)+8,);	\
-	STEPUP4(fn, (t)+12, (s)+12,);	\
-	STEPUP4(fn, (t)+16, (s)+16, loadk)
-
-	.globl	ppc_sha1_core
-ppc_sha1_core:
-	stwu	%r1,-80(%r1)
-	stmw	%r13,4(%r1)
-
-	/* Load up A - E */
-	lmw	%r27,0(%r3)
-
-	mtctr	%r5
-
-1:
-	LOADW(0)
-	lis	%r5,0x5a82
-	mr	RE(0),%r31
-	LOADW(1)
-	mr	RD(0),%r30
-	mr	RC(0),%r29
-	LOADW(2)
-	ori	%r5,%r5,0x7999	/* K0-19 */
-	mr	RB(0),%r28
-	LOADW(3)
-	mr	RA(0),%r27
-
-	STEP0_LOAD4(0, 4)
-	STEP0_LOAD4(4, 8)
-	STEP0_LOAD4(8, 12)
-	STEPUP4(D0, 12, 16,)
-	STEPUP4(D0, 16, 20, lis %r5,0x6ed9)
-
-	ori	%r5,%r5,0xeba1	/* K20-39 */
-	STEPUP20(D1, 20, 24, lis %r5,0x8f1b)
-
-	ori	%r5,%r5,0xbcdc	/* K40-59 */
-	STEPUP20(D2, 40, 44, lis %r5,0xca62)
-
-	ori	%r5,%r5,0xc1d6	/* K60-79 */
-	STEPUP4(D1, 60, 64,)
-	STEPUP4(D1, 64, 68,)
-	STEPUP4(D1, 68, 72,)
-	STEPUP4(D1, 72, 76,)
-	addi	%r4,%r4,64
-	STEPD1(76)
-	STEPD1(77)
-	STEPD1(78)
-	STEPD1(79)
-
-	/* Add results to original values */
-	add	%r31,%r31,RE(0)
-	add	%r30,%r30,RD(0)
-	add	%r29,%r29,RC(0)
-	add	%r28,%r28,RB(0)
-	add	%r27,%r27,RA(0)
-
-	bdnz	1b
-
-	/* Save final hash, restore registers, and return */
-	stmw	%r27,0(%r3)
-	lmw	%r13,4(%r1)
-	addi	%r1,%r1,80
-	blr
-- 
2.37.3.1406.g184357183a6


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 2/2] Makefile: use $(OBJECTS) instead of $(C_OBJ)
  2022-08-31  9:18             ` [PATCH v3 0/2] Makefile + hash.h: remove PPC_SHA1 implementation Ævar Arnfjörð Bjarmason
  2022-08-31  9:18               ` [PATCH v3 1/2] " Ævar Arnfjörð Bjarmason
@ 2022-08-31  9:18               ` Ævar Arnfjörð Bjarmason
  2022-08-31 21:44                 ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-31  9:18 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Mike Hommey, brian m . carlson,
	Carlo Marcelo Arenas Belón, Taylor Blau,
	Ævar Arnfjörð Bjarmason

In the preceding commit $(C_OBJ) added in c373991375a (Makefile: list
generated object files in OBJECTS, 2010-01-26) became synonymous with
$(OBJECTS). Let's avoid the indirection and use the $(OBJECTS)
variable directly instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 7feda7e79be..8956cace8eb 100644
--- a/Makefile
+++ b/Makefile
@@ -2590,9 +2590,7 @@ missing_compdb_dir =
 compdb_args =
 endif
 
-C_OBJ := $(OBJECTS)
-
-$(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
+$(OBJECTS): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
 	$(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
 
 %.s: %.c GIT-CFLAGS FORCE
@@ -3084,7 +3082,7 @@ t/helper/test-%$X: t/helper/test-%.o GIT-LDFLAGS $(GITLIBS) $(REFTABLE_TEST_LIB)
 check-sha1:: t/helper/test-tool$X
 	t/helper/test-sha1.sh
 
-SP_OBJ = $(patsubst %.o,%.sp,$(C_OBJ))
+SP_OBJ = $(patsubst %.o,%.sp,$(OBJECTS))
 
 $(SP_OBJ): %.sp: %.c %.o
 	$(QUIET_SP)cgcc -no-compile $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) \
-- 
2.37.3.1406.g184357183a6


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/2] Makefile: use $(OBJECTS) instead of $(C_OBJ)
  2022-08-31  9:18               ` [PATCH v3 2/2] Makefile: use $(OBJECTS) instead of $(C_OBJ) Ævar Arnfjörð Bjarmason
@ 2022-08-31 21:44                 ` Junio C Hamano
  2022-09-01 14:52                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2022-08-31 21:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Mike Hommey, brian m . carlson,
	Carlo Marcelo Arenas Belón, Taylor Blau

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> In the preceding commit $(C_OBJ) added in c373991375a (Makefile: list
> generated object files in OBJECTS, 2010-01-26) became synonymous with
> $(OBJECTS). Let's avoid the indirection and use the $(OBJECTS)
> variable directly instead.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  Makefile | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)

This is a declaration that we would never ever build .o files out of
sources other than .c files.  While it does make sense to have it
outside the scope of [PATCH 1/2], I am not sure if it even belongs
to the same series.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/2] Makefile: use $(OBJECTS) instead of $(C_OBJ)
  2022-08-31 21:44                 ` Junio C Hamano
@ 2022-09-01 14:52                   ` Ævar Arnfjörð Bjarmason
  2022-09-01 15:48                     ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-01 14:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Mike Hommey, brian m . carlson,
	Carlo Marcelo Arenas Belón, Taylor Blau


On Wed, Aug 31 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> In the preceding commit $(C_OBJ) added in c373991375a (Makefile: list
>> generated object files in OBJECTS, 2010-01-26) became synonymous with
>> $(OBJECTS). Let's avoid the indirection and use the $(OBJECTS)
>> variable directly instead.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  Makefile | 6 ++----
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> This is a declaration that we would never ever build .o files out of
> sources other than .c files.  While it does make sense to have it
> outside the scope of [PATCH 1/2], I am not sure if it even belongs
> to the same series.

I think it does. Before this the C_OBJ would be:

	C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))

but after 1/2 it's the same as $(OBJECTS). An earlier iteration of this
did this cleanup "while we're at it" (which I do think makes sense as an
atomic change), but I got the feedback that the cleanup wasn't strictly
necessary.

But as 1/2 has removed the ability to build those $(ASM_OBJ), as we had
only one of those, I don't think keeping this particular bit of
indirection makes sense.

Of course it doesn't really matter at all, the real change is the
removal of $(ASM_OBJ).

If we do start building *.o files out of *.S files (or other non-*.c)
again we'll need new rules anyway. I think we should just add any such
variables back then, and not keep this small bit of dead husk around.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/2] Makefile: use $(OBJECTS) instead of $(C_OBJ)
  2022-09-01 14:52                   ` Ævar Arnfjörð Bjarmason
@ 2022-09-01 15:48                     ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2022-09-01 15:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Mike Hommey, brian m . carlson,
	Carlo Marcelo Arenas Belón, Taylor Blau

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> This is a declaration that we would never ever build .o files out of
>> sources other than .c files.  While it does make sense to have it
>> outside the scope of [PATCH 1/2], I am not sure if it even belongs
>> to the same series.
>
> I think it does. Before this the C_OBJ would be:
>
> 	C_OBJ := $(filter-out $(ASM_OBJ),$(OBJECTS))
>
> but after 1/2 it's the same as $(OBJECTS). An earlier iteration of this
> did this cleanup "while we're at it" (which I do think makes sense as an
> atomic change), but I got the feedback that the cleanup wasn't strictly
> necessary.
>
> But as 1/2 has removed the ability to build those $(ASM_OBJ), as we had
> only one of those, I don't think keeping this particular bit of
> indirection makes sense.

You are not thinking for longer term to help project maintenance.

This change removes distinction between C_OBJ and OBJECTS, only
because the sources to the objects we HAPPEN TO have are only C
files.  It is premature and short sighted to declare that it has to
stay that way forever.  And such a declaration is not something we
would casually make "while at it" in a topic like this.

When we add a source written in another language, say xyzzy, to be
compiled into an object file, we'd add $(XYZZY_OBJ), and they will
become part of $(OBJECTS), but the current rule to create $(C_OBJ)
will not apply to $(XYZZY_OBJ).  But you do this:

    -$(C_OBJ): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
    +$(OBJECTS): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
            $(QUIET_CC)$(CC) -o $*.o -c ... $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<

Right now, we know where this patch affected the build procedure,
because the patch highlights what is being changed.  But when future
developers need to produce some files that belong to $(OBJECTS) out
of source files that are not .c, they first need to locate the above
hunk and revert it.  I do not see the benefit of being hostile to
future developers with this patch.  Not before we know that it is
not likely that we would add any non-C sources in the future, by
running with 1/2 alone for a year or two.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-09-01 15:48 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-12  8:56 SHA1dc on mac Mike Hommey
2020-02-12 16:46 ` Eric Sunshine
2020-02-12 22:31   ` Mike Hommey
2020-02-12 22:40     ` Junio C Hamano
2020-02-23 22:37       ` [PATCH] Remove non-SHA1dc sha1 implementations Mike Hommey
2020-02-24  4:47         ` Jeff King
2020-02-24  4:52           ` Jeff King
2022-03-19  1:02         ` [PATCH] ppc: remove custom SHA-1 implementation Ævar Arnfjörð Bjarmason
2022-03-21 16:39           ` Junio C Hamano
2022-03-21 17:06           ` [PATCH v2] " Ævar Arnfjörð Bjarmason
2022-03-21 21:19             ` brian m. carlson
2022-08-31  9:18             ` [PATCH v3 0/2] Makefile + hash.h: remove PPC_SHA1 implementation Ævar Arnfjörð Bjarmason
2022-08-31  9:18               ` [PATCH v3 1/2] " Ævar Arnfjörð Bjarmason
2022-08-31  9:18               ` [PATCH v3 2/2] Makefile: use $(OBJECTS) instead of $(C_OBJ) Ævar Arnfjörð Bjarmason
2022-08-31 21:44                 ` Junio C Hamano
2022-09-01 14:52                   ` Ævar Arnfjörð Bjarmason
2022-09-01 15:48                     ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).