All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] sha1dc: safeguard against outside definitions of BIGENDIAN
@ 2017-03-24 22:52 Johannes Schindelin
  2017-03-25  6:03 ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Schindelin @ 2017-03-24 22:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

In sha1dc/sha1.c, we #define BIGENDIAN under certain circumstances, and
obviously leave the door open for scenarios where our conditions do not
catch and that constant is #defined elsewhere.

However, we did not expect that anybody would possibly #define BIGENDIAN
to 0, indicating that the current platform is *not* big endian.

This is not just a theoretical consideration: On Windows, the winsock2.h
header file (which is used to allow Git to communicate via network) does
indeed do this.

Let's test for that circumstance, too, and byte-swap as intended in that
case.

This fixes a massive breakage on Windows where current `pu` (having
switched on DC_SHA1 by default) breaks pretty much every single test
case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Published-As: https://github.com/dscho/git/releases/tag/sha1dc-bigendian=0-v1
Fetch-It-Via: git fetch https://github.com/dscho/git sha1dc-bigendian=0-v1

	Obviously, this patch is based on `next`.

 sha1dc/sha1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
index 6dd0da36084..d99db4f2e1b 100644
--- a/sha1dc/sha1.c
+++ b/sha1dc/sha1.c
@@ -35,7 +35,7 @@
 
 #define sha1_mix(W, t)  (rotate_left(W[t - 3] ^ W[t - 8] ^ W[t - 14] ^ W[t - 16], 1))
 
-#if defined(BIGENDIAN)
+#if defined(BIGENDIAN) && BIGENDIAN != 0
 	#define sha1_load(m, t, temp)  { temp = m[t]; }
 #else
 	#define sha1_load(m, t, temp)  { temp = m[t]; sha1_bswap32(temp); }

base-commit: c21884356fab0bc6bc5fa6abcadbda27a112a76c
-- 
2.12.1.windows.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] sha1dc: safeguard against outside definitions of BIGENDIAN
  2017-03-24 22:52 [PATCH] sha1dc: safeguard against outside definitions of BIGENDIAN Johannes Schindelin
@ 2017-03-25  6:03 ` Junio C Hamano
  2017-03-25 17:22   ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2017-03-25  6:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
> index 6dd0da36084..d99db4f2e1b 100644
> --- a/sha1dc/sha1.c
> +++ b/sha1dc/sha1.c
> @@ -35,7 +35,7 @@
>  
>  #define sha1_mix(W, t)  (rotate_left(W[t - 3] ^ W[t - 8] ^ W[t - 14] ^ W[t - 16], 1))
>  
> -#if defined(BIGENDIAN)
> +#if defined(BIGENDIAN) && BIGENDIAN != 0
>  	#define sha1_load(m, t, temp)  { temp = m[t]; }
>  #else
>  	#define sha1_load(m, t, temp)  { temp = m[t]; sha1_bswap32(temp); }
>
> base-commit: c21884356fab0bc6bc5fa6abcadbda27a112a76c

Is this change _only_ for Windows where BIGENDIAN happens to be
defined to be 0?

If we only assume that BIGENDIAN can be either (0) Not set at all by
platform headers, (1) Not set or set to "0" on little-endian
systems, or (2) Set to any value other than "0" on big-endian
systems, then the patch looks OK.

But it feels a bit too brittle [*1*].

Looking at sha1dc/sha1.c, I notice that it begins like so:

    /*
       Because Little-Endian architectures are most common,
       we only set BIGENDIAN if one of these conditions is met.
       Note that all MSFT platforms are little endian,
       so none of these will be defined under the MSC compiler.
       If you are compiling on a big endian platform and your
       compiler does not define one of these, you will have to add
       whatever macros your tool chain defines to indicate
       Big-Endianness.
     */
    #if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || \
        (defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __BIG_ENDIAN__)) || \
        defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
        defined(_MIPSEB) || defined(__MIPSEB) || defined(__MIPSEB__)

    #define BIGENDIAN	(1)

    #endif /*ENDIANNESS SELECTION*/

Notice:

 - It does not consider BIGENDIAN can be set from the outside (i.e.
   it does not say "#ifndef BIGENDIAN" around the above auto
   selection).

 - It considers the actual value of BIGENDIAN given from the outside
   is expendable (i.e. if somehow the caller sets it to 2, it is
   reset to 1 if __BYTE_ORDER and friends tells us we are on a
   big-endian box.

 - It really wants to tell little endian machines not to set
   BIGENDIAN (i.e. lack of "#else" that does "#undef BIGENDIAN").

Which leads me to wonder if a more robust solution that is in line
with the original design of sha1dc/sha1.c code may be to do an
unconditional "#undef BIGENDIAN" before the above block, so that no
matter what the calling environment sets BIGENDIAN to (including
"0"), it gets ignored and we always use the auto-selection.

Anyway, thanks for noticing and raising the issue.


[Footnote]

*1* The "brittle" comment is because the assumption feels a bit too
    narrow, and if we want to solve this without assuming too much
    (i.e. stop thinking about "I just want to fix Windows; I do not
    care about a more general and robust solution"), I think we
    could solve it for something that completely violates your
    assumption.  For example, a platform where something like this
    is used in the header to switch behaviour depending on the
    endianness:

	/* Symbolic constants */
	#define BIGENDIAN 0
	#define LITTLEENDIAN 1
	...

	/* Say which one MY build is */
	#define MYENDIAN BIGENDIAN
	...

	#if MYENDIAN == LITTLEENDIAN
	... code for little endian systems ...
	#else /* MYENDIAN == BIGENDIAN */
	... code for big endian systems ...
	#endif

    would break your patch.  Such a big-endian box will end up using
    the wrong definition of sha1_load().  The original happens to
    choose the right definition by accident, as the header's design
    is that MYENDIAN describes the box you are on, though.

    But as my analysis of how BIGENDIAN is set in the sha1.c file
    itself shows, I think it would be a more robust solution if we
    made sure that we do not pay attention to BIGENDIAN coming from
    the outside world at all, i.e. "#undef" upfront.

    Another possibility would be to rename BIGENDIAN used in
    sha1dc/sha1.c file to something a lot less generic,
    e.g. SHA1DC_BIGENDIAN.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sha1dc: safeguard against outside definitions of BIGENDIAN
  2017-03-25  6:03 ` Junio C Hamano
@ 2017-03-25 17:22   ` Junio C Hamano
  2017-03-27 15:39     ` Johannes Schindelin
  0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2017-03-25 17:22 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Which leads me to wonder if a more robust solution that is in line
> with the original design of sha1dc/sha1.c code may be to do an
> unconditional "#undef BIGENDIAN" before the above block, so that no
> matter what the calling environment sets BIGENDIAN to (including
> "0"), it gets ignored and we always use the auto-selection.

So here is what I came up with as a replacement (this time as a
proper patch not a comment on a patch).

Dscho, could you see if this fixes your build?

Also, could people on big-endian boxes see if this breaks your
setting (relative to the state before this patch)?

Thanks.

-- >8 --
From: Junio C Hamano <gitster@pobox.com>
Date: Sat, 25 Mar 2017 10:05:13 -0700
Subject: [PATCH] sha1dc: avoid CPP macro collisions

In an early part of sha1dc/sha1.c, the code checks the endianness of
the target platform by inspecting common CPP macros defined on
big-endian boxes, and sets BIGENDIAN macro to 1.  If these common
CPP macros are not defined, the code declares that the target
platform is little endian and does nothing (most notably, it does
not #undef its BIGENDIAN macro).

The code that does so even has this comment

    Note that all MSFT platforms are little endian,
    so none of these will be defined under the MSC compiler.

and later, the defined-ness of the BIGENDIAN macro is used to switch
the implementation of sha1_load() macro.

One thing the code did not anticipate is that somebody might define
BIGENDIAN macro in some header it includes to 0 on a little-endian
target platform.  Because the auto-detection based on common macros
do not touch BIGENDIAN macro when it detects a little-endian target,
such a definition is still valid and then defined-ness test will say
"Ah, BIGENDIAN is defined" and takes the wrong sha1_load().

As this auto-detection logic pretends as if it owns the BIGENDIAN
macro by ignoring the setting that may come from the outside and by
not explicitly unsetting when it decides that it is working for a
little-endian target, solve this problem without breaking that
assumption.  Namely, we can rename BIGENDIAN this code uses to
something much less generic, i.e. SHA1DC_BIGENDIAN.  For extra
protection, undef the macro on a little-endian target.

It is possible to work it around by instead #undef BIGENDIAN in
the auto-detection code, but a macro (or include) that happens later
in the code can be implemented in terms of BIGENDIAN on Windows and
it is possible that the implementation gets upset when it sees the
CPP macro undef'ed (instead of set to 0).  Renaming the private macro
intended to be used only in this file to a less generic name relieves
us from having to worry about that kind of breakage.

Noticed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sha1dc/sha1.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
index 6dd0da3608..35e9dd5bf4 100644
--- a/sha1dc/sha1.c
+++ b/sha1dc/sha1.c
@@ -12,7 +12,7 @@
 
 /*
    Because Little-Endian architectures are most common,
-   we only set BIGENDIAN if one of these conditions is met.
+   we only set SHA1DC_BIGENDIAN if one of these conditions is met.
    Note that all MSFT platforms are little endian,
    so none of these will be defined under the MSC compiler.
    If you are compiling on a big endian platform and your compiler does not define one of these,
@@ -23,8 +23,9 @@
     defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
     defined(_MIPSEB) || defined(__MIPSEB) || defined(__MIPSEB__)
 
-#define BIGENDIAN	(1)
-
+#define SHA1DC_BIGENDIAN	1
+#else
+#undef SHA1DC_BIGENDIAN
 #endif /*ENDIANNESS SELECTION*/
 
 #define rotate_right(x,n) (((x)>>(n))|((x)<<(32-(n))))
@@ -35,11 +36,11 @@
 
 #define sha1_mix(W, t)  (rotate_left(W[t - 3] ^ W[t - 8] ^ W[t - 14] ^ W[t - 16], 1))
 
-#if defined(BIGENDIAN)
+#if defined(SHA1DC_BIGENDIAN)
 	#define sha1_load(m, t, temp)  { temp = m[t]; }
 #else
 	#define sha1_load(m, t, temp)  { temp = m[t]; sha1_bswap32(temp); }
-#endif /*define(BIGENDIAN)*/
+#endif /* !defined(SHA1DC_BIGENDIAN) */
 
 #define sha1_store(W, t, x)	*(volatile uint32_t *)&W[t] = x
 
-- 
2.12.2-399-g034667a458


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] sha1dc: safeguard against outside definitions of BIGENDIAN
  2017-03-25 17:22   ` Junio C Hamano
@ 2017-03-27 15:39     ` Johannes Schindelin
  2017-03-27 17:05       ` Junio C Hamano
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Schindelin @ 2017-03-27 15:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi Junio,

On Sat, 25 Mar 2017, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Which leads me to wonder if a more robust solution that is in line
> > with the original design of sha1dc/sha1.c code may be to do an
> > unconditional "#undef BIGENDIAN" before the above block, so that no
> > matter what the calling environment sets BIGENDIAN to (including
> > "0"), it gets ignored and we always use the auto-selection.
> 
> So here is what I came up with as a replacement (this time as a
> proper patch not a comment on a patch).
> 
> Dscho, could you see if this fixes your build?

The Continuous Testing is back to normal, thanks.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sha1dc: safeguard against outside definitions of BIGENDIAN
  2017-03-27 15:39     ` Johannes Schindelin
@ 2017-03-27 17:05       ` Junio C Hamano
  0 siblings, 0 replies; 5+ messages in thread
From: Junio C Hamano @ 2017-03-27 17:05 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> So here is what I came up with as a replacement (this time as a
>> proper patch not a comment on a patch).
>> 
>> Dscho, could you see if this fixes your build?
>
> The Continuous Testing is back to normal, thanks.

Thanks for a quick response.

I wished if we can reuse the endianness detection we already have in
"compat/bswap.h", which is included by git-compat-util.h hence also
here.  It would have been very nice if we could say

    #if GIT_BYTE_ORDER == GIT_BIG_ENDIAN
        ... do big endian thing ...
    #else
        ... do little endian thing ...
    #endif

when necessary, without having to reinvent the cheks on __BYTE_ORDER
and friends.  It however needs a bit of rework in "compat/bswap.h",
which is way beyond the scope of making a quick fix-up to unblock us.






^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-03-27 17:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-24 22:52 [PATCH] sha1dc: safeguard against outside definitions of BIGENDIAN Johannes Schindelin
2017-03-25  6:03 ` Junio C Hamano
2017-03-25 17:22   ` Junio C Hamano
2017-03-27 15:39     ` Johannes Schindelin
2017-03-27 17:05       ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.