All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
@ 2009-08-18  7:15 Johannes Sixt
  2009-08-18 10:45 ` Sebastian Schuberth
  0 siblings, 1 reply; 34+ messages in thread
From: Johannes Sixt @ 2009-08-18  7:15 UTC (permalink / raw)
  To: msysGit; +Cc: Junio C Hamano, Git Mailing List

From: Johannes Sixt <j6t@kdbg.org>

This is a minimal fix to compile block-sha1 on Windows. I did not do any
benchmarks whether the implementation of ntohl() is actually faster than
bytewise access and shifts.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 I would appreciate if our Windows experts could tell whether the
 implementation of ntohl/htonl is worth its money or whether we should
 go with the generic byte access plus shifts.

 the function call over
 block-sha1/sha1.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index a1228cf..67c1ee8 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -7,7 +7,11 @@
  */

 #include <string.h>
+#ifndef _WIN32
 #include <arpa/inet.h>
+#else
+#include <winsock2.h>
+#endif

 #include "sha1.h"

-- 
1.6.4.1179.g9a91.dirty

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18  7:15 [PATCH] block-sha1: Windows declares ntohl() in winsock2.h Johannes Sixt
@ 2009-08-18 10:45 ` Sebastian Schuberth
  2009-08-18 11:07   ` Junio C Hamano
  2009-08-18 12:56   ` Artur Skawina
  0 siblings, 2 replies; 34+ messages in thread
From: Sebastian Schuberth @ 2009-08-18 10:45 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: msysGit, Junio C Hamano, Git Mailing List

> This is a minimal fix to compile block-sha1 on Windows. I did not do any
> benchmarks whether the implementation of ntohl() is actually faster than
> bytewise access and shifts.

As ntohl()/htonl() are function calls (that internally do shifts), I 
doubt they're faster than the shift macros, though I haven't measured 
it. However, I do not suggest to go for the macros on Windows/Intel, but 
to apply the following patch on top of your patch:

 From 34402f0e5691ca46cd63c930eef8fc9cadf80493 Mon Sep 17 00:00:00 2001
From: Sebastian Schuberth <sschuberth@gmail.com>
Date: Tue, 18 Aug 2009 12:33:35 +0200
Subject: [PATCH] block-sha1: On Intel, use bswap built-in in favor of 
ntohl()/htonl()

On Windows/Intel, ntohl()/htonl() are function calls that do shifts to 
swap the
byte order. Using the native bswap instruction boths gets rid of the 
shifts and
the function call overhead to gain some performance.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
---
  block-sha1/sha1.c |   15 ++++++++++-----
  1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index f2830c0..07f2937 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -66,15 +66,20 @@

  /*
   * Performance might be improved if the CPU architecture is OK with
- * unaligned 32-bit loads and a fast ntohl() is available.
+ * unaligned 32-bit loads and a fast ntohl() is available. On Intel,
+ * use the bswap built-in to get rid of the function call overhead.
   * Otherwise fall back to byte loads and shifts which is portable,
   * and is faster on architectures with memory alignment issues.
   */

-#if defined(__i386__) || defined(__x86_64__) || \
-    defined(__ppc__) || defined(__ppc64__) || \
-    defined(__powerpc__) || defined(__powerpc64__) || \
-    defined(__s390__) || defined(__s390x__)
+#if defined(__i386__) || defined(__x86_64__)
+
+#define get_be32(p)	__builtin_bswap32(*(unsigned int *)(p))
+#define put_be32(p, v)	do { *(unsigned int *)(p) = 
__builtin_bswap32(v); } while (0)
+
+#elif defined(__ppc__) || defined(__ppc64__) || \
+      defined(__powerpc__) || defined(__powerpc64__) || \
+      defined(__s390__) || defined(__s390x__)

  #define get_be32(p)	ntohl(*(unsigned int *)(p))
  #define put_be32(p, v)	do { *(unsigned int *)(p) = htonl(v); } while (0)
-- 
1.6.4.169.g64d5.dirty

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 10:45 ` Sebastian Schuberth
@ 2009-08-18 11:07   ` Junio C Hamano
  2009-08-18 11:28     ` Sebastian Schuberth
  2009-08-18 12:56   ` Artur Skawina
  1 sibling, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2009-08-18 11:07 UTC (permalink / raw)
  To: Sebastian Schuberth
  Cc: Johannes Sixt, msysGit, Linus Torvalds, Nicolas Pitre, Git Mailing List

Sebastian Schuberth <sschuberth@gmail.com> writes:

> As ntohl()/htonl() are function calls (that internally do shifts), I
> doubt they're faster than the shift macros, though I haven't measured
> it. However, I do not suggest to go for the macros on Windows/Intel,
> but to apply the following patch on top of your patch:

Your proposed commit log message makes it sound as if this change is
limited to Windows, but it is not protected with "#ifdef WIN32"; the
change should be applicable to non-windows but the message is misleading.

It should help any i386/amd64 platform whose ntohl()/htonl() is crappy, as
long as __builtin_bswap32() is supported by the compiler.  And it should
not harm other platforms, nor i386/amd64 whose ntohl()/htonl() are sane.
As i386/amd64 part of block-sha1/sha1.c has gcc dependency already, I
think it would be safe to assume __builtin_bswap32() is available.

But I'd want an Ack/Nack from the original authors (Cc'ed).

It seems that your patch is linewrapped, so please be careful _if_ it
needs to be modified and resent (if this version gets trivially acked I
can fix it up when applying and in such a case there is no need to
resend).

> From: Sebastian Schuberth <sschuberth@gmail.com>
> Date: Tue, 18 Aug 2009 12:33:35 +0200
> Subject: [PATCH] block-sha1: On Intel, use bswap built-in in favor of
> ntohl()/htonl()
>
> On Windows/Intel, ntohl()/htonl() are function calls that do shifts to
> swap the
> byte order. Using the native bswap instruction boths gets rid of the
> shifts and
> the function call overhead to gain some performance.
>
> Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
> ---
>  block-sha1/sha1.c |   15 ++++++++++-----
>  1 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
> index f2830c0..07f2937 100644
> --- a/block-sha1/sha1.c
> +++ b/block-sha1/sha1.c
> @@ -66,15 +66,20 @@
>
>  /*
>   * Performance might be improved if the CPU architecture is OK with
> - * unaligned 32-bit loads and a fast ntohl() is available.
> + * unaligned 32-bit loads and a fast ntohl() is available. On Intel,
> + * use the bswap built-in to get rid of the function call overhead.
>   * Otherwise fall back to byte loads and shifts which is portable,
>   * and is faster on architectures with memory alignment issues.
>   */
>
> -#if defined(__i386__) || defined(__x86_64__) || \
> -    defined(__ppc__) || defined(__ppc64__) || \
> -    defined(__powerpc__) || defined(__powerpc64__) || \
> -    defined(__s390__) || defined(__s390x__)
> +#if defined(__i386__) || defined(__x86_64__)
> +
> +#define get_be32(p)	__builtin_bswap32(*(unsigned int *)(p))
> +#define put_be32(p, v)	do { *(unsigned int *)(p) =
> __builtin_bswap32(v); } while (0)
> +
> +#elif defined(__ppc__) || defined(__ppc64__) || \
> +      defined(__powerpc__) || defined(__powerpc64__) || \
> +      defined(__s390__) || defined(__s390x__)
>
>  #define get_be32(p)	ntohl(*(unsigned int *)(p))
>  #define put_be32(p, v)	do { *(unsigned int *)(p) = htonl(v); } while (0)
> -- 
> 1.6.4.169.g64d5.dirty

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 11:07   ` Junio C Hamano
@ 2009-08-18 11:28     ` Sebastian Schuberth
  0 siblings, 0 replies; 34+ messages in thread
From: Sebastian Schuberth @ 2009-08-18 11:28 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Sixt, msysGit, Linus Torvalds, Nicolas Pitre, Git Mailing List

On 18.08.2009 13:07, Junio C Hamano wrote:

> Your proposed commit log message makes it sound as if this change is
> limited to Windows, but it is not protected with "#ifdef WIN32"; the
> change should be applicable to non-windows but the message is misleading.

True, the change should apply for x86, no matter what the OS. I wanted to express that *on Windows* I know ntohl()/htonl() are function that do shifts internally. Maybe on some other OS  these have better implementations (for x86).

> It seems that your patch is linewrapped, so please be careful _if_ it
> needs to be modified and resent (if this version gets trivially acked I
> can fix it up when applying and in such a case there is no need to
> resend).

Here's an updated version of the patch that both improves the commit message and fixes the line wrap.

 From b4f40f73e8bf410c975b3f29d10ca779343e3095 Mon Sep 17 00:00:00 2001
From: Sebastian Schuberth <sschuberth@gmail.com>
Date: Tue, 18 Aug 2009 12:33:35 +0200
Subject: [PATCH] block-sha1: On x86, use bswap built-in in favor of ntohl()/htonl()

On x86 and compatible, use the native bswap instruction in favor of ntohl()/
htonl(). In the best case, this gets rid of function calls to crappy
implementations, in the worst case, it should be no slower than sane (inlined)
implementations. The current code depends on GCC already, so relying on
__builtin_bswap32() to be available should be safe.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
---
  block-sha1/sha1.c |   15 ++++++++++-----
  1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index f2830c0..07f2937 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -66,15 +66,20 @@
  
  /*
   * Performance might be improved if the CPU architecture is OK with
- * unaligned 32-bit loads and a fast ntohl() is available.
+ * unaligned 32-bit loads and a fast ntohl() is available. On Intel,
+ * use the bswap built-in to get rid of the function call overhead.
   * Otherwise fall back to byte loads and shifts which is portable,
   * and is faster on architectures with memory alignment issues.
   */
  
-#if defined(__i386__) || defined(__x86_64__) || \
-    defined(__ppc__) || defined(__ppc64__) || \
-    defined(__powerpc__) || defined(__powerpc64__) || \
-    defined(__s390__) || defined(__s390x__)
+#if defined(__i386__) || defined(__x86_64__)
+
+#define get_be32(p)	__builtin_bswap32(*(unsigned int *)(p))
+#define put_be32(p, v)	do { *(unsigned int *)(p) = __builtin_bswap32(v); } while (0)
+
+#elif defined(__ppc__) || defined(__ppc64__) || \
+      defined(__powerpc__) || defined(__powerpc64__) || \
+      defined(__s390__) || defined(__s390x__)
  
  #define get_be32(p)	ntohl(*(unsigned int *)(p))
  #define put_be32(p, v)	do { *(unsigned int *)(p) = htonl(v); } while (0)
-- 
1.6.4.169.g64d5.dirty

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 10:45 ` Sebastian Schuberth
  2009-08-18 11:07   ` Junio C Hamano
@ 2009-08-18 12:56   ` Artur Skawina
  2009-08-18 13:17     ` Sebastian Schuberth
  1 sibling, 1 reply; 34+ messages in thread
From: Artur Skawina @ 2009-08-18 12:56 UTC (permalink / raw)
  To: Sebastian Schuberth
  Cc: Johannes Sixt, msysGit, Junio C Hamano, Git Mailing List

Sebastian Schuberth wrote:
> As ntohl()/htonl() are function calls (that internally do shifts), I
> doubt they're faster than the shift macros, though I haven't measured
> it. However, I do not suggest to go for the macros on Windows/Intel, but
> to apply the following patch on top of your patch:

> On Windows/Intel, ntohl()/htonl() are function calls that do shifts to
> swap the
> byte order. Using the native bswap instruction boths gets rid of the
> shifts and
> the function call overhead to gain some performance.

Umm, nothing like this should be needed on linux; the compiler/glibc
will choose bswap itself. (see endian.h and bits/byteswap.h).
I did try using __builtin_bswap32 directly and the result was a few
(3 or 4, iirc) differently scheduled instructions, that's all, no
performance difference.

>   * Performance might be improved if the CPU architecture is OK with
> - * unaligned 32-bit loads and a fast ntohl() is available.
> + * unaligned 32-bit loads and a fast ntohl() is available. On Intel,
> + * use the bswap built-in to get rid of the function call overhead.
>   * Otherwise fall back to byte loads and shifts which is portable,
>   * and is faster on architectures with memory alignment issues.
>   */
> 
> -#if defined(__i386__) || defined(__x86_64__) || \
> -    defined(__ppc__) || defined(__ppc64__) || \
> -    defined(__powerpc__) || defined(__powerpc64__) || \
> -    defined(__s390__) || defined(__s390x__)
> +#if defined(__i386__) || defined(__x86_64__)
>
> +#define get_be32(p)    __builtin_bswap32(*(unsigned int *)(p))
> +#define put_be32(p, v)    do { *(unsigned int *)(p) = __builtin_bswap32(v); } while (0)
> +
> +#elif defined(__ppc__) || defined(__ppc64__) || \
> +      defined(__powerpc__) || defined(__powerpc64__) || \
> +      defined(__s390__) || defined(__s390x__)

I'd limit it to windows and any other ia32 platform that doesn't pick the
bswaps itself; as is, it just adds an unnecessary hidden gcc dependency.

Hmm, it's actually a gcc-4.3+ dependency, so it won't even build w/ gcc 4.2;
something like this would be required: "(__GNUC__>=4 && __GNUC_MINOR__>=3)" .

artur

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 12:56   ` Artur Skawina
@ 2009-08-18 13:17     ` Sebastian Schuberth
  2009-08-18 13:39       ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Sebastian Schuberth @ 2009-08-18 13:17 UTC (permalink / raw)
  To: Artur Skawina; +Cc: Johannes Sixt, msysGit, Junio C Hamano, Git Mailing List

On Tue, Aug 18, 2009 at 14:56, Artur Skawina<art.08.09@gmail.com> wrote:

> I did try using __builtin_bswap32 directly and the result was a few
> (3 or 4, iirc) differently scheduled instructions, that's all, no
> performance difference.

[...]

> I'd limit it to windows and any other ia32 platform that doesn't pick the
> bswaps itself; as is, it just adds an unnecessary hidden gcc dependency.
>
> Hmm, it's actually a gcc-4.3+ dependency, so it won't even build w/ gcc 4.2;
> something like this would be required: "(__GNUC__>=4 && __GNUC_MINOR__>=3)" .

So, as you say the code makes no difference under Linux, would you be
OK with just testing for GCC 4.3+, and not for Windows? That would get
rid of the "hidden" GCC dependency and not make the preprocessor
checks overly complex. Moreover, limiting my patch to any "platform
that doesn't pick the bswaps itself" could possibly require
maintenance on compiler / CRT updates.

-- 
Sebastian Schuberth

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 13:17     ` Sebastian Schuberth
@ 2009-08-18 13:39       ` Junio C Hamano
  2009-08-18 15:40         ` Linus Torvalds
  2009-08-18 16:30         ` Nicolas Pitre
  0 siblings, 2 replies; 34+ messages in thread
From: Junio C Hamano @ 2009-08-18 13:39 UTC (permalink / raw)
  To: Sebastian Schuberth
  Cc: Artur Skawina, Johannes Sixt, msysGit, Junio C Hamano,
	Linus Torvalds, Nicolas Pitre, Git Mailing List

Sebastian Schuberth <sschuberth@gmail.com> writes:

> On Tue, Aug 18, 2009 at 14:56, Artur Skawina<art.08.09@gmail.com> wrote:
> ...
>> I'd limit it to windows and any other ia32 platform that doesn't pick the
>> bswaps itself; as is, it just adds an unnecessary hidden gcc dependency.
>>
>> Hmm, it's actually a gcc-4.3+ dependency, so it won't even build w/ gcc 4.2;
>> something like this would be required: "(__GNUC__>=4 && __GNUC_MINOR__>=3)" .
>
> So, as you say the code makes no difference under Linux, would you be
> OK with just testing for GCC 4.3+, and not for Windows? That would get
> rid of the "hidden" GCC dependency and not make the preprocessor
> checks overly complex. Moreover, limiting my patch to any "platform
> that doesn't pick the bswaps itself" could possibly require
> maintenance on compiler / CRT updates.

I would say that should be fine, but I'd let Linus and Nico to overrule me
on this if they have any input.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 13:39       ` Junio C Hamano
@ 2009-08-18 15:40         ` Linus Torvalds
  2009-08-18 16:08           ` Linus Torvalds
  2009-08-18 16:30         ` Nicolas Pitre
  1 sibling, 1 reply; 34+ messages in thread
From: Linus Torvalds @ 2009-08-18 15:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Nicolas Pitre, Git Mailing List



On Tue, 18 Aug 2009, Junio C Hamano wrote:

> Sebastian Schuberth <sschuberth@gmail.com> writes:
> 
> > On Tue, Aug 18, 2009 at 14:56, Artur Skawina<art.08.09@gmail.com> wrote:
> > ...
> >> I'd limit it to windows and any other ia32 platform that doesn't pick the
> >> bswaps itself; as is, it just adds an unnecessary hidden gcc dependency.
> >>
> >> Hmm, it's actually a gcc-4.3+ dependency, so it won't even build w/ gcc 4.2;
> >> something like this would be required: "(__GNUC__>=4 && __GNUC_MINOR__>=3)" .
> >
> > So, as you say the code makes no difference under Linux, would you be
> > OK with just testing for GCC 4.3+, and not for Windows? That would get
> > rid of the "hidden" GCC dependency and not make the preprocessor
> > checks overly complex. Moreover, limiting my patch to any "platform
> > that doesn't pick the bswaps itself" could possibly require
> > maintenance on compiler / CRT updates.
> 
> I would say that should be fine, but I'd let Linus and Nico to overrule me
> on this if they have any input.

I'd suggest not using a gcc builtin, since if you're using gcc you might 
as well just use inline asm that has been around forever (unlike the 
builtin).

Just do:

	#if defined(__GNUC__)
	#define htonl(x) ({ unsigned int __res; \
		__asm__("bswap %0":"=r" (__res):"0" (x)); \
		__res; })
	#define ntohl(x) htonl(x)
	#endif

or similar in the x86 section that does the rol/ror thing.

			Linus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 15:40         ` Linus Torvalds
@ 2009-08-18 16:08           ` Linus Torvalds
  2009-08-18 16:23             ` Sebastian Schuberth
  0 siblings, 1 reply; 34+ messages in thread
From: Linus Torvalds @ 2009-08-18 16:08 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Nicolas Pitre, Git Mailing List



On Tue, 18 Aug 2009, Linus Torvalds wrote:
> 
> I'd suggest not using a gcc builtin, since if you're using gcc you might 
> as well just use inline asm that has been around forever (unlike the 
> builtin).

That seems to be what glibc does too.

Here's a patch.

		Linus
---
 block-sha1/sha1.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index 464cb25..e6e7170 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -22,6 +22,11 @@
 #define SHA_ROL(x,n)	SHA_ASM("rol", x, n)
 #define SHA_ROR(x,n)	SHA_ASM("ror", x, n)
 
+#undef htonl
+#undef ntohl
+#define htonl(x) ({ unsigned int __res; __asm__("bswap %0":"=r" (__res):"0" (x)); __res; })
+#define ntohl(x) htonl(x)
+
 #else
 
 #define SHA_ROT(X,l,r)	(((X) << (l)) | ((X) >> (r)))

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 16:08           ` Linus Torvalds
@ 2009-08-18 16:23             ` Sebastian Schuberth
  2009-08-18 16:43               ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Sebastian Schuberth @ 2009-08-18 16:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Artur Skawina, Johannes Sixt, msysGit,
	Nicolas Pitre, Git Mailing List

On Tue, Aug 18, 2009 at 18:08, Linus
Torvalds<torvalds@linux-foundation.org> wrote:

>> I'd suggest not using a gcc builtin, since if you're using gcc you might
>> as well just use inline asm that has been around forever (unlike the
>> builtin).
>
> That seems to be what glibc does too.
>
> Here's a patch.

Looks good to me, compiles & runs fine on Windows (with Hannes' patch
also applied).

-- 
Sebastian Schuberth

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 13:39       ` Junio C Hamano
  2009-08-18 15:40         ` Linus Torvalds
@ 2009-08-18 16:30         ` Nicolas Pitre
  2009-08-18 16:43           ` Nicolas Pitre
  1 sibling, 1 reply; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 16:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

On Tue, 18 Aug 2009, Junio C Hamano wrote:

> Sebastian Schuberth <sschuberth@gmail.com> writes:
> 
> > On Tue, Aug 18, 2009 at 14:56, Artur Skawina<art.08.09@gmail.com> wrote:
> > ...
> >> I'd limit it to windows and any other ia32 platform that doesn't pick the
> >> bswaps itself; as is, it just adds an unnecessary hidden gcc dependency.
> >>
> >> Hmm, it's actually a gcc-4.3+ dependency, so it won't even build w/ gcc 4.2;
> >> something like this would be required: "(__GNUC__>=4 && __GNUC_MINOR__>=3)" .
> >
> > So, as you say the code makes no difference under Linux, would you be
> > OK with just testing for GCC 4.3+, and not for Windows? That would get
> > rid of the "hidden" GCC dependency and not make the preprocessor
> > checks overly complex. Moreover, limiting my patch to any "platform
> > that doesn't pick the bswaps itself" could possibly require
> > maintenance on compiler / CRT updates.
> 
> I would say that should be fine, but I'd let Linus and Nico to overrule me
> on this if they have any input.

Well...  Given that git already uses ntohl/htonl quite extensively in 
its core already, I'd suggest making this more globally available 
instead.


Nicolas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 16:30         ` Nicolas Pitre
@ 2009-08-18 16:43           ` Nicolas Pitre
  2009-08-18 16:50             ` Junio C Hamano
  2009-08-18 16:59             ` Sebastian Schuberth
  0 siblings, 2 replies; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 16:43 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

On Tue, 18 Aug 2009, Nicolas Pitre wrote:

> Well...  Given that git already uses ntohl/htonl quite extensively in 
> its core already, I'd suggest making this more globally available 
> instead.

What about something like this?

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index 464cb25..51a27c1 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -4,9 +4,7 @@
  * and to avoid unnecessary copies into the context array.
  */
 
-#include <string.h>
-#include <arpa/inet.h>
-
+#include "../git-compat-util.h"
 #include "sha1.h"
 
 #if defined(__i386__) || defined(__x86_64__)
diff --git a/compat/bswap.h b/compat/bswap.h
new file mode 100644
index 0000000..b436360
--- /dev/null
+++ b/compat/bswap.h
@@ -0,0 +1,19 @@
+/*
+ * Let's make sure we always have a sane definition for ntohl()/htonl().
+ * Some libraries define those as a function call, just to perform byte
+ * shifting, bringing significant overhead to what should be a sinple
+ * operation.
+ */
+
+#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
+
+#define bswap32(x) ({ \
+	unsigned int __res; \
+	__asm__("bswap %0" : "=r" (__res) : "0" (x)); \
+	__res; })
+#undef ntohl
+#undef htonl
+#define ntohl(x) bswap32(x)
+#define htonl(x) bswap32(x)
+
+#endif
diff --git a/git-compat-util.h b/git-compat-util.h
index 9f941e4..000859e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -176,6 +176,8 @@ extern char *gitbasename(char *);
 #endif
 #endif
 
+#include "compat/bswap.h"
+
 /* General helper functions */
 extern void usage(const char *err) NORETURN;
 extern void die(const char *err, ...) NORETURN __attribute__((format (printf, 1, 2)));

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 16:23             ` Sebastian Schuberth
@ 2009-08-18 16:43               ` Junio C Hamano
  0 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2009-08-18 16:43 UTC (permalink / raw)
  To: Sebastian Schuberth
  Cc: Linus Torvalds, Junio C Hamano, Artur Skawina, Johannes Sixt,
	msysGit, Nicolas Pitre, Git Mailing List

Sebastian Schuberth <sschuberth@gmail.com> writes:

> On Tue, Aug 18, 2009 at 18:08, Linus
> Torvalds<torvalds@linux-foundation.org> wrote:
>
>>> I'd suggest not using a gcc builtin, since if you're using gcc you might
>>> as well just use inline asm that has been around forever (unlike the
>>> builtin).
>>
>> That seems to be what glibc does too.
>>
>> Here's a patch.
>
> Looks good to me, compiles & runs fine on Windows (with Hannes' patch
> also applied).

But the Windows part that avoids arpa/inet.h and includes winsock2.h, only
to undef the two macros immediately after doing so, now looks quite silly.
Are there non i386/amd64 Windows we care about?

Squashing Linus's and Hannes's patch here is what I came up with.

-- >8 --
block-sha1: avoid potentially inefficient ntohl/htonl on i386/x86-64

Johannes Sixt reports that on Windows ntohl()/htonl() are not found in
<arpa/inet.h>, and minimally we need to include <winsock2.h> instead.
Sebastian Schuberth points out that they are implemented as out-of-line
functions on Windows, which defeats the use of these byteorder "macros"
for performance.

Use bswap instruction through gcc inline asm instead on i386/x86-64
as a generic solution to this.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>,
---
diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index a1228cf..fa909a3 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -7,8 +7,6 @@
  */
 
 #include <string.h>
-#include <arpa/inet.h>
-
 #include "sha1.h"
 
 #if defined(__i386__) || defined(__x86_64__)
@@ -24,8 +22,15 @@
 #define SHA_ROL(x,n)	SHA_ASM("rol", x, n)
 #define SHA_ROR(x,n)	SHA_ASM("ror", x, n)
 
+#undef htonl
+#undef ntohl
+#define htonl(x) ({ unsigned int __res; __asm__("bswap %0":"=r" (__res):"0" (x)); __res; })
+#define ntohl(x) htonl(x)
+
 #else
 
+#include <arpa/inet.h>
+
 #define SHA_ROT(X,l,r)	(((X) << (l)) | ((X) >> (r)))
 #define SHA_ROL(X,n)	SHA_ROT(X,n,32-(n))
 #define SHA_ROR(X,n)	SHA_ROT(X,32-(n),n)

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 16:43           ` Nicolas Pitre
@ 2009-08-18 16:50             ` Junio C Hamano
  2009-08-18 18:01               ` Nicolas Pitre
  2009-08-18 16:59             ` Sebastian Schuberth
  1 sibling, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2009-08-18 16:50 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

Nicolas Pitre <nico@cam.org> writes:

> On Tue, 18 Aug 2009, Nicolas Pitre wrote:
>
>> Well...  Given that git already uses ntohl/htonl quite extensively in 
>> its core already, I'd suggest making this more globally available 
>> instead.
>
> What about something like this?

For git's own use, I would be much happier with this change.

But given that there are some people wanting to snarf block-sha1/*.[ch]
out to use them standalone, I have a slight hesitation against introducing
the dependency to git-compat-util.h, making it unclear to them that all
this file wants from outside are ntohl, htonl and memcpy.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 16:43           ` Nicolas Pitre
  2009-08-18 16:50             ` Junio C Hamano
@ 2009-08-18 16:59             ` Sebastian Schuberth
  2009-08-18 17:05               ` Junio C Hamano
  2009-08-18 18:10               ` Nicolas Pitre
  1 sibling, 2 replies; 34+ messages in thread
From: Sebastian Schuberth @ 2009-08-18 16:59 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Junio C Hamano, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

On Tue, Aug 18, 2009 at 18:43, Nicolas Pitre<nico@cam.org> wrote:

>> Well...  Given that git already uses ntohl/htonl quite extensively in
>> its core already, I'd suggest making this more globally available
>> instead.
>
> What about something like this?

I like the idea of making bswap available more globally, but I'm not
sure if it's worth to introduce a new file for only that purpose.
Isn't there already a central header for such things?

Moreover, including compat/bswap.h would only give you ntohl()/htonl()
on one platform. For consistency, I'd expect to get those for any
platform if I include compat/bswap.h, but maybe I'm not aware of some
Git source code rules.

Finally, there's a typo in your comment saying "sinple" instead of "simple".

-- 
Sebastian Schuberth

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 16:59             ` Sebastian Schuberth
@ 2009-08-18 17:05               ` Junio C Hamano
  2009-08-18 18:16                 ` Nicolas Pitre
  2009-08-18 18:10               ` Nicolas Pitre
  1 sibling, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2009-08-18 17:05 UTC (permalink / raw)
  To: Sebastian Schuberth
  Cc: Nicolas Pitre, Junio C Hamano, Artur Skawina, Johannes Sixt,
	msysGit, Linus Torvalds, Git Mailing List

Sebastian Schuberth <sschuberth@gmail.com> writes:

> I like the idea of making bswap available more globally, but I'm not
> sure if it's worth to introduce a new file for only that purpose.
> Isn't there already a central header for such things?
>
> Moreover, including compat/bswap.h would only give you ntohl()/htonl()
> on one platform.

We do not include compat/ directly from the source; git-compat-util.h is
supposed to be the first thing included (as some platforms have peculiar
requirements on the order in which system header files are included, and
one of the reasons git-compat-util.h is there).  Hence by including it,
you get ntohl/htonl everywhere.

To reduce confusion, you may want to rename compat/bswap.h to something
like compat/ntohl-htonl-fix.h ;-)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 16:50             ` Junio C Hamano
@ 2009-08-18 18:01               ` Nicolas Pitre
  2009-08-18 19:00                 ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 18:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

On Tue, 18 Aug 2009, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > On Tue, 18 Aug 2009, Nicolas Pitre wrote:
> >
> >> Well...  Given that git already uses ntohl/htonl quite extensively in 
> >> its core already, I'd suggest making this more globally available 
> >> instead.
> >
> > What about something like this?
> 
> For git's own use, I would be much happier with this change.
> 
> But given that there are some people wanting to snarf block-sha1/*.[ch]
> out to use them standalone, I have a slight hesitation against introducing
> the dependency to git-compat-util.h, making it unclear to them that all
> this file wants from outside are ntohl, htonl and memcpy.

Should we really care to keep our code suboptimal just to make it 
readily reusable by other projects?  That seems a bit backward to me.

We might simply add a comment telling people what git-compat-util.h is 
actually for, namely memcpy(), ntohl() and htonl().  That should be 
trivial for them to use the appropriate substitute.


Nicolas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 16:59             ` Sebastian Schuberth
  2009-08-18 17:05               ` Junio C Hamano
@ 2009-08-18 18:10               ` Nicolas Pitre
  1 sibling, 0 replies; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 18:10 UTC (permalink / raw)
  To: Sebastian Schuberth
  Cc: Junio C Hamano, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1542 bytes --]

On Tue, 18 Aug 2009, Sebastian Schuberth wrote:

> On Tue, Aug 18, 2009 at 18:43, Nicolas Pitre<nico@cam.org> wrote:
> 
> >> Well...  Given that git already uses ntohl/htonl quite extensively in
> >> its core already, I'd suggest making this more globally available
> >> instead.
> >
> > What about something like this?
> 
> I like the idea of making bswap available more globally, but I'm not
> sure if it's worth to introduce a new file for only that purpose.
> Isn't there already a central header for such things?

That central header is already quite crowded.  A bit of isolation might 
not hurt.

Furthermore, other platforms might wish to add their own (re)definitions 
for those byte swap operations, so it has the potential to grow.  I for 
example have a better implementation for ARM than what is provided by 
glibc.  (Yeah yeah, maybe glibc should be fixed instead, but that 
reasoning goes for all those other libraries too).

> Moreover, including compat/bswap.h would only give you ntohl()/htonl()
> on one platform. For consistency, I'd expect to get those for any
> platform if I include compat/bswap.h, but maybe I'm not aware of some
> Git source code rules.

You get it by default for all platforms already by including 
git-compat-util.h.  The compat/bswap.h is not meant to be included by 
random c files.  If compat/bswap.h happens to contain a better version 
for your architecture then it'll override the default one.

> Finally, there's a typo in your comment saying "sinple" instead of "simple".

Thanks


Nicolas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 17:05               ` Junio C Hamano
@ 2009-08-18 18:16                 ` Nicolas Pitre
  2009-08-18 20:17                   ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 18:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

On Tue, 18 Aug 2009, Junio C Hamano wrote:

> To reduce confusion, you may want to rename compat/bswap.h to something
> like compat/ntohl-htonl-fix.h ;-)

Bah.  If you wish, you can edit the patch directly for this, unless you 
really prefer me to repost.  Maybe we might want to add a 8-byte 
versions of those as well eventually, which is why I chose a more 
generic name.


Nicolas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 18:01               ` Nicolas Pitre
@ 2009-08-18 19:00                 ` Junio C Hamano
  2009-08-18 19:22                   ` Nicolas Pitre
                                     ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Junio C Hamano @ 2009-08-18 19:00 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

Nicolas Pitre <nico@cam.org> writes:

> On Tue, 18 Aug 2009, Junio C Hamano wrote:
>
>> For git's own use, I would be much happier with this change.
>> 
>> But given that there are some people wanting to snarf block-sha1/*.[ch]
>> out to use them standalone, I have a slight hesitation against introducing
>> the dependency to git-compat-util.h, making it unclear to them that all
>> this file wants from outside are ntohl, htonl and memcpy.
>
> Should we really care to keep our code suboptimal just to make it 
> readily reusable by other projects?  That seems a bit backward to me.

You are right; and I should give a bit more credit to their intelligence.
The source (block-sha1/sha1.c) is short enough that they can figure this
out for themselves even without any additional comments.

Another issue, especially with your "openssl sha1 removal" patch, is if we
can assume gcc everywhere.  As far as I can tell, block-sha1/sha1.c will
be the first unconditional use of inline asm or statement expression on
i386/amd64.  Are folks on Solaris and other platforms Ok with this?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 19:00                 ` Junio C Hamano
@ 2009-08-18 19:22                   ` Nicolas Pitre
  2009-08-18 19:26                     ` [PATCH] make sure byte swapping is optimal for git Nicolas Pitre
  2009-08-18 19:37                     ` [PATCH] block-sha1: guard gcc extensions with __GNUC__ Nicolas Pitre
  2009-08-18 19:40                   ` [PATCH] block-sha1: Windows declares ntohl() in winsock2.h Junio C Hamano
  2009-08-18 19:56                   ` Brandon Casey
  2 siblings, 2 replies; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 19:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

On Tue, 18 Aug 2009, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > On Tue, 18 Aug 2009, Junio C Hamano wrote:
> >
> >> For git's own use, I would be much happier with this change.
> >> 
> >> But given that there are some people wanting to snarf block-sha1/*.[ch]
> >> out to use them standalone, I have a slight hesitation against introducing
> >> the dependency to git-compat-util.h, making it unclear to them that all
> >> this file wants from outside are ntohl, htonl and memcpy.
> >
> > Should we really care to keep our code suboptimal just to make it 
> > readily reusable by other projects?  That seems a bit backward to me.
> 
> You are right; and I should give a bit more credit to their intelligence.
> The source (block-sha1/sha1.c) is short enough that they can figure this
> out for themselves even without any additional comments.

Well, I gave in and added a comment to the patch anyway, with more 
improvements in the case of constant values.  Patch follows.

> Another issue, especially with your "openssl sha1 removal" patch, is if we
> can assume gcc everywhere.  As far as I can tell, block-sha1/sha1.c will
> be the first unconditional use of inline asm or statement expression on
> i386/amd64.  Are folks on Solaris and other platforms Ok with this?

I guess we can guard the first with 
ifdef(__GNUC__) which should help 
people with MSVC.  That should take care of x86 at least.

Nicolas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] make sure byte swapping is optimal for git
  2009-08-18 19:22                   ` Nicolas Pitre
@ 2009-08-18 19:26                     ` Nicolas Pitre
  2009-08-18 19:37                     ` [PATCH] block-sha1: guard gcc extensions with __GNUC__ Nicolas Pitre
  1 sibling, 0 replies; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 19:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

We rely on ntohl() and htonl() to perform byte swapping in many places.
However, some platforms have libraries providing really poor
implementations of those which might cause significant performance
issues, especially with the block-sha1 code.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---

On Tue, 18 Aug 2009, Nicolas Pitre wrote:

> Well, I gave in and added a comment to the patch anyway, with more 
> improvements in the case of constant values.  Patch follows.

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index 464cb25..d31f2e3 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -4,8 +4,8 @@
  * and to avoid unnecessary copies into the context array.
  */
 
-#include <string.h>
-#include <arpa/inet.h>
+/* this is only to get definitions for memcpy(), ntohl() and htonl() */
+#include "../git-compat-util.h"
 
 #include "sha1.h"
 
diff --git a/compat/bswap.h b/compat/bswap.h
new file mode 100644
index 0000000..7246a12
--- /dev/null
+++ b/compat/bswap.h
@@ -0,0 +1,36 @@
+/*
+ * Let's make sure we always have a sane definition for ntohl()/htonl().
+ * Some libraries define those as a function call, just to perform byte
+ * shifting, bringing significant overhead to what should be a simple
+ * operation.
+ */
+
+/*
+ * Default version that the compiler ought to optimize properly with
+ * constant values.
+ */
+static inline unsigned int default_swab32(unsigned int val)
+{
+	return (((val & 0xff000000) >> 24) |
+		((val & 0x00ff0000) >>  8) |
+		((val & 0x0000ff00) <<  8) |
+		((val & 0x000000ff) << 24));
+}
+
+#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
+
+#define bswap32(x) ({ \
+	unsigned int __res; \
+	if (__builtin_constant_p(x)) { \
+		__res = default_swab32(x); \
+	} else { \
+		__asm__("bswap %0" : "=r" (__res) : "0" (x)); \
+	} \
+	__res; })
+
+#undef ntohl
+#undef htonl
+#define ntohl(x) bswap32(x)
+#define htonl(x) bswap32(x)
+
+#endif
diff --git a/git-compat-util.h b/git-compat-util.h
index 9f941e4..000859e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -176,6 +176,8 @@ extern char *gitbasename(char *);
 #endif
 #endif
 
+#include "compat/bswap.h"
+
 /* General helper functions */
 extern void usage(const char *err) NORETURN;
 extern void die(const char *err, ...) NORETURN __attribute__((format (printf, 1, 2)));

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH] block-sha1: guard gcc extensions with __GNUC__
  2009-08-18 19:22                   ` Nicolas Pitre
  2009-08-18 19:26                     ` [PATCH] make sure byte swapping is optimal for git Nicolas Pitre
@ 2009-08-18 19:37                     ` Nicolas Pitre
  1 sibling, 0 replies; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 19:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

With this, the code should now be portable to any C compiler.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---

On Tue, 18 Aug 2009, Nicolas Pitre wrote:

> On Tue, 18 Aug 2009, Junio C Hamano wrote:
> 
> > Another issue, especially with your "openssl sha1 removal" patch, is if we
> > can assume gcc everywhere.  As far as I can tell, block-sha1/sha1.c will
> > be the first unconditional use of inline asm or statement expression on
> > i386/amd64.  Are folks on Solaris and other platforms Ok with this?
> 
> I guess we can guard the first with ifdef(__GNUC__) which should help 
> people with MSVC.  That should take care of x86 at least.

Here it is.

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index d31f2e3..92d9121 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -9,7 +9,7 @@
 
 #include "sha1.h"
 
-#if defined(__i386__) || defined(__x86_64__)
+#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
 
 /*
  * Force usage of rol or ror by selecting the one with the smaller constant.
@@ -54,7 +54,7 @@
 
 #if defined(__i386__) || defined(__x86_64__)
   #define setW(x, val) (*(volatile unsigned int *)&W(x) = (val))
-#elif defined(__arm__)
+#elif defined(__GNUC__) && defined(__arm__)
   #define setW(x, val) do { W(x) = (val); __asm__("":::"memory"); } while (0)
 #else
   #define setW(x, val) (W(x) = (val))

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 19:00                 ` Junio C Hamano
  2009-08-18 19:22                   ` Nicolas Pitre
@ 2009-08-18 19:40                   ` Junio C Hamano
  2009-08-18 19:56                   ` Brandon Casey
  2 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2009-08-18 19:40 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

Junio C Hamano <gitster@pobox.com> writes:

> Another issue, especially with your "openssl sha1 removal" patch, is if we
> can assume gcc everywhere.  As far as I can tell, block-sha1/sha1.c will
> be the first unconditional use of inline asm or statement expression on
> i386/amd64.  Are folks on Solaris and other platforms Ok with this?

Ahhh, your bswap.h is a no-op unless you use gcc, so my worry was
unfounded.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 19:00                 ` Junio C Hamano
  2009-08-18 19:22                   ` Nicolas Pitre
  2009-08-18 19:40                   ` [PATCH] block-sha1: Windows declares ntohl() in winsock2.h Junio C Hamano
@ 2009-08-18 19:56                   ` Brandon Casey
  2009-08-18 20:10                     ` Nicolas Pitre
  2009-08-20  2:26                     ` Nicolas Pitre
  2 siblings, 2 replies; 34+ messages in thread
From: Brandon Casey @ 2009-08-18 19:56 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nicolas Pitre, Sebastian Schuberth, Artur Skawina, Johannes Sixt,
	msysGit, Linus Torvalds, Git Mailing List

Junio C Hamano wrote:

> Another issue, especially with your "openssl sha1 removal" patch, is if we
> can assume gcc everywhere.  As far as I can tell, block-sha1/sha1.c will
> be the first unconditional use of inline asm or statement expression on
> i386/amd64.  Are folks on Solaris and other platforms Ok with this?

The SUNWspro compiler doesn't set __i386__.  Instead it sets __i386, and
I think __x86_64 and __amd64 where appropriate.  So, compilation with
the SUNWspro compiler on x86 is currently unaffected by these changes and
falls back to the generic routines.

It seems that v5.10 of the compiler can grok both the __asm__ statements
and the ({...}) naked block notation and passes all of the tests when the
block_sha1 code is modified to add defined(__i386) to each of the macro
statements.

The 5.8 version cannot grok the naked block, and requires spelling __asm__
as __asm for inline assembly.  Even then it appears that there is a bug in
the assembly that is produced (a google search told me so), so the assembly
code does not successfully compile.

I haven't had much time to think about how or whether to address this.

Adding something like the following would get ugly real quick:

   (defined(__i386) && defined(__SUNPRO_C) && (__SUNPRO_C >= 0x5100))

For now, the code compiles fine using the SUNWspro compiler on x86 even if
it is suboptimal compared to gcc.  It is still an improvement over the
mozilla code.

-brandon

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 19:56                   ` Brandon Casey
@ 2009-08-18 20:10                     ` Nicolas Pitre
  2009-08-18 20:25                       ` Linus Torvalds
  2009-08-18 20:29                       ` Brandon Casey
  2009-08-20  2:26                     ` Nicolas Pitre
  1 sibling, 2 replies; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 20:10 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, Sebastian Schuberth, Artur Skawina,
	Johannes Sixt, msysGit, Linus Torvalds, Git Mailing List

On Tue, 18 Aug 2009, Brandon Casey wrote:

> The SUNWspro compiler doesn't set __i386__.  Instead it sets __i386, and
> I think __x86_64 and __amd64 where appropriate.  So, compilation with
> the SUNWspro compiler on x86 is currently unaffected by these changes and
> falls back to the generic routines.
> 
> It seems that v5.10 of the compiler can grok both the __asm__ statements
> and the ({...}) naked block notation and passes all of the tests when the
> block_sha1 code is modified to add defined(__i386) to each of the macro
> statements.

Does it really implement the gcc inline assembly syntax?


Nicolas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 18:16                 ` Nicolas Pitre
@ 2009-08-18 20:17                   ` Junio C Hamano
  2009-08-18 20:30                     ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2009-08-18 20:17 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Junio C Hamano, Sebastian Schuberth, Artur Skawina,
	Johannes Sixt, msysGit, Linus Torvalds, Git Mailing List

Nicolas Pitre <nico@cam.org> writes:

> On Tue, 18 Aug 2009, Junio C Hamano wrote:
>
>> To reduce confusion, you may want to rename compat/bswap.h to something
>> like compat/ntohl-htonl-fix.h ;-)
>
> Bah.  If you wish, you can edit the patch directly for this, unless you 
> really prefer me to repost.  Maybe we might want to add a 8-byte 
> versions of those as well eventually, which is why I chose a more 
> generic name.

Ok, here is what I came up with after many squashing...

-- >8 --
From: Nicolas Pitre <nico@cam.org>
Date: Tue, 18 Aug 2009 12:43:08 -0400
Subject: [PATCH] bswap: avoid potentially inefficient ntohl/htonl on i386/x86-64

Johannes Sixt reports that on Windows ntohl()/htonl() are not found in
<arpa/inet.h>, and as a minimal fix we need to include <winsock2.h>
instead.  Sebastian Schuberth points out that they are implemented as
out-of-line functions on Windows, which defeats these byteorder "macros"
used for performance.

Use bswap instruction through gcc inline asm instead on i386/x86-64
as a generic solution to this.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 block-sha1/sha1.c |    4 +---
 compat/bswap.h    |   19 +++++++++++++++++++
 git-compat-util.h |    2 ++
 3 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 compat/bswap.h

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index 464cb25..51a27c1 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -4,9 +4,7 @@
  * and to avoid unnecessary copies into the context array.
  */
 
-#include <string.h>
-#include <arpa/inet.h>
-
+#include "../git-compat-util.h"
 #include "sha1.h"
 
 #if defined(__i386__) || defined(__x86_64__)
diff --git a/compat/bswap.h b/compat/bswap.h
new file mode 100644
index 0000000..78fd2df
--- /dev/null
+++ b/compat/bswap.h
@@ -0,0 +1,19 @@
+/*
+ * Let's make sure we always have a sane definition for ntohl()/htonl().
+ * Some libraries define those as a function call, just to perform byte
+ * swapping, bringing significant overhead to what should be a simple
+ * operation.
+ */
+
+#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
+
+#define bswap32(x) ({ \
+	unsigned int __res; \
+	__asm__("bswap %0" : "=r" (__res) : "0" (x)); \
+	__res; })
+#undef ntohl
+#undef htonl
+#define ntohl(x) bswap32(x)
+#define htonl(x) bswap32(x)
+
+#endif
diff --git a/git-compat-util.h b/git-compat-util.h
index 9f941e4..000859e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -176,6 +176,8 @@ extern char *gitbasename(char *);
 #endif
 #endif
 
+#include "compat/bswap.h"
+
 /* General helper functions */
 extern void usage(const char *err) NORETURN;
 extern void die(const char *err, ...) NORETURN __attribute__((format (printf, 1, 2)));
-- 
1.6.4.245.g50659

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 20:10                     ` Nicolas Pitre
@ 2009-08-18 20:25                       ` Linus Torvalds
  2009-08-18 20:29                       ` Brandon Casey
  1 sibling, 0 replies; 34+ messages in thread
From: Linus Torvalds @ 2009-08-18 20:25 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Brandon Casey, Junio C Hamano, Sebastian Schuberth,
	Artur Skawina, Johannes Sixt, msysGit, Git Mailing List



On Tue, 18 Aug 2009, Nicolas Pitre wrote:
> 
> Does it really implement the gcc inline assembly syntax?

A fair number of compilers do. The Intel compiler does too.

The reason? The gcc inline asm is as close to a standard there is, and is 
fairly expressive without being the mess that is MSC. So if you do inline 
asm at all (and considering the target market for a C compiler, most do), 
gcc asm is a good thing to aim for.

			Linus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 20:10                     ` Nicolas Pitre
  2009-08-18 20:25                       ` Linus Torvalds
@ 2009-08-18 20:29                       ` Brandon Casey
  1 sibling, 0 replies; 34+ messages in thread
From: Brandon Casey @ 2009-08-18 20:29 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Junio C Hamano, Sebastian Schuberth, Artur Skawina,
	Johannes Sixt, msysGit, Linus Torvalds, Git Mailing List

Nicolas Pitre wrote:
> On Tue, 18 Aug 2009, Brandon Casey wrote:
> 
>> The SUNWspro compiler doesn't set __i386__.  Instead it sets __i386, and
>> I think __x86_64 and __amd64 where appropriate.  So, compilation with
>> the SUNWspro compiler on x86 is currently unaffected by these changes and
>> falls back to the generic routines.
>>
>> It seems that v5.10 of the compiler can grok both the __asm__ statements
>> and the ({...}) naked block notation and passes all of the tests when the
>> block_sha1 code is modified to add defined(__i386) to each of the macro
>> statements.
> 
> Does it really implement the gcc inline assembly syntax?

Yes, I think it does.  I actually extracted the assembly from sha1.c and tested
in a separate test program.  The v5.10 sunwspro compiler compiles and executes
it correctly.

-brandon

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 20:17                   ` Junio C Hamano
@ 2009-08-18 20:30                     ` Junio C Hamano
  2009-08-18 20:49                       ` Nicolas Pitre
  0 siblings, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2009-08-18 20:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nicolas Pitre, Sebastian Schuberth, Artur Skawina, Johannes Sixt,
	msysGit, Linus Torvalds, Git Mailing List

Junio C Hamano <gitster@pobox.com> writes:

> Nicolas Pitre <nico@cam.org> writes:
>
>> On Tue, 18 Aug 2009, Junio C Hamano wrote:
>>
>>> To reduce confusion, you may want to rename compat/bswap.h to something
>>> like compat/ntohl-htonl-fix.h ;-)
>>
>> Bah.  If you wish, you can edit the patch directly for this, unless you 
>> really prefer me to repost.  Maybe we might want to add a 8-byte 
>> versions of those as well eventually, which is why I chose a more 
>> generic name.
>
> Ok, here is what I came up with after many squashing...

Meh, our mails crossed.  I'll chuck this one and use your

    [PATCH] make sure byte swapping is optimal for git

patch.  Do you want default_swab32 be mmoved inside the

    #if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

block?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 20:30                     ` Junio C Hamano
@ 2009-08-18 20:49                       ` Nicolas Pitre
  0 siblings, 0 replies; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-18 20:49 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Sebastian Schuberth, Artur Skawina, Johannes Sixt, msysGit,
	Linus Torvalds, Git Mailing List

On Tue, 18 Aug 2009, Junio C Hamano wrote:

> Meh, our mails crossed.  I'll chuck this one and use your
> 
>     [PATCH] make sure byte swapping is optimal for git
> 
> patch.  Do you want default_swab32 be mmoved inside the
> 
>     #if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
> 
> block?

Not necessarily.  It is generic code that other compilers/architectures 
might use as well.


Nicolas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-18 19:56                   ` Brandon Casey
  2009-08-18 20:10                     ` Nicolas Pitre
@ 2009-08-20  2:26                     ` Nicolas Pitre
  2009-08-20  2:30                       ` Linus Torvalds
  2009-08-20  2:45                       ` Brandon Casey
  1 sibling, 2 replies; 34+ messages in thread
From: Nicolas Pitre @ 2009-08-20  2:26 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, Sebastian Schuberth, Artur Skawina,
	Johannes Sixt, msysGit, Linus Torvalds, Git Mailing List

On Tue, 18 Aug 2009, Brandon Casey wrote:

> The SUNWspro compiler doesn't set __i386__.  Instead it sets __i386, and
> I think __x86_64 and __amd64 where appropriate.  So, compilation with
> the SUNWspro compiler on x86 is currently unaffected by these changes and
> falls back to the generic routines.
> 
> It seems that v5.10 of the compiler can grok both the __asm__ statements
> and the ({...}) naked block notation and passes all of the tests when the
> block_sha1 code is modified to add defined(__i386) to each of the macro
> statements.
> 
> The 5.8 version cannot grok the naked block, and requires spelling __asm__
> as __asm for inline assembly.  Even then it appears that there is a bug in
> the assembly that is produced (a google search told me so), so the assembly
> code does not successfully compile.
> 
> I haven't had much time to think about how or whether to address this.
> 
> Adding something like the following would get ugly real quick:
> 
>    (defined(__i386) && defined(__SUNPRO_C) && (__SUNPRO_C >= 0x5100))

I think the best solution in this case might simply be to add something 
like this somewhere at the top of git-compat-util.h after the system 
includes:

/*
 * The SUNWspro compiler uses different symbols than gcc.
 * Let's standardize on the gcc flavor.
 */
#if defined(__i386) && !defined(__i386__)
#define __i386__
#endif
#if (defined(__x86_64) || defined(__amd64)) && !defined(__x86_64__)
#define __x86_64__
#endif
/*
 * SUNWspro from version 5.10 supports gcc extensions such as gcc's 
 * statement expressions and extended inline asm, so let's pretend...
 */
#if defined(__SUNPRO_C) && (__SUNPRO_C >= 0x5100))
#define __GNUC__
#endif


Nicolas

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-20  2:26                     ` Nicolas Pitre
@ 2009-08-20  2:30                       ` Linus Torvalds
  2009-08-20  2:45                       ` Brandon Casey
  1 sibling, 0 replies; 34+ messages in thread
From: Linus Torvalds @ 2009-08-20  2:30 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Brandon Casey, Junio C Hamano, Sebastian Schuberth,
	Artur Skawina, Johannes Sixt, msysGit, Git Mailing List



On Wed, 19 Aug 2009, Nicolas Pitre wrote:
> 
> I think the best solution in this case might simply be to add something 
> like this somewhere at the top of git-compat-util.h after the system 
> includes:

I think we can just test __i386.

Gcc defines that one too (in fact, in general, gcc always defines both the 
__x and the __x__ versions, although I'm sure there are exceptions)

		Linus

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] block-sha1: Windows declares ntohl() in winsock2.h
  2009-08-20  2:26                     ` Nicolas Pitre
  2009-08-20  2:30                       ` Linus Torvalds
@ 2009-08-20  2:45                       ` Brandon Casey
  1 sibling, 0 replies; 34+ messages in thread
From: Brandon Casey @ 2009-08-20  2:45 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Junio C Hamano, Sebastian Schuberth, Artur Skawina,
	Johannes Sixt, msysGit, Linus Torvalds, Git Mailing List

Nicolas Pitre wrote:
> On Tue, 18 Aug 2009, Brandon Casey wrote:
> 
>> The SUNWspro compiler doesn't set __i386__.  Instead it sets __i386, and
>> I think __x86_64 and __amd64 where appropriate.  So, compilation with
>> the SUNWspro compiler on x86 is currently unaffected by these changes and
>> falls back to the generic routines.
>>
>> It seems that v5.10 of the compiler can grok both the __asm__ statements
>> and the ({...}) naked block notation and passes all of the tests when the
>> block_sha1 code is modified to add defined(__i386) to each of the macro
>> statements.
>>
>> The 5.8 version cannot grok the naked block, and requires spelling __asm__
>> as __asm for inline assembly.  Even then it appears that there is a bug in
>> the assembly that is produced (a google search told me so), so the assembly
>> code does not successfully compile.
>>
>> I haven't had much time to think about how or whether to address this.
>>
>> Adding something like the following would get ugly real quick:
>>
>>    (defined(__i386) && defined(__SUNPRO_C) && (__SUNPRO_C >= 0x5100))
> 
> I think the best solution in this case might simply be to add something 
> like this somewhere at the top of git-compat-util.h after the system 
> includes:
> 
> /*
>  * The SUNWspro compiler uses different symbols than gcc.
>  * Let's standardize on the gcc flavor.
>  */
> #if defined(__i386) && !defined(__i386__)
> #define __i386__
> #endif
> #if (defined(__x86_64) || defined(__amd64)) && !defined(__x86_64__)
> #define __x86_64__
> #endif

Yes, I had this idea too.

> /*
>  * SUNWspro from version 5.10 supports gcc extensions such as gcc's 
>  * statement expressions and extended inline asm, so let's pretend...
>  */
> #if defined(__SUNPRO_C) && (__SUNPRO_C >= 0x5100))
> #define __GNUC__
> #endif

I hadn't thought of this.  I'll test to make sure the other statements
that are protected by ifdef __GNUC__ work correctly with SUNWspro v5.10.

-brandon

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2009-08-20  2:46 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-18  7:15 [PATCH] block-sha1: Windows declares ntohl() in winsock2.h Johannes Sixt
2009-08-18 10:45 ` Sebastian Schuberth
2009-08-18 11:07   ` Junio C Hamano
2009-08-18 11:28     ` Sebastian Schuberth
2009-08-18 12:56   ` Artur Skawina
2009-08-18 13:17     ` Sebastian Schuberth
2009-08-18 13:39       ` Junio C Hamano
2009-08-18 15:40         ` Linus Torvalds
2009-08-18 16:08           ` Linus Torvalds
2009-08-18 16:23             ` Sebastian Schuberth
2009-08-18 16:43               ` Junio C Hamano
2009-08-18 16:30         ` Nicolas Pitre
2009-08-18 16:43           ` Nicolas Pitre
2009-08-18 16:50             ` Junio C Hamano
2009-08-18 18:01               ` Nicolas Pitre
2009-08-18 19:00                 ` Junio C Hamano
2009-08-18 19:22                   ` Nicolas Pitre
2009-08-18 19:26                     ` [PATCH] make sure byte swapping is optimal for git Nicolas Pitre
2009-08-18 19:37                     ` [PATCH] block-sha1: guard gcc extensions with __GNUC__ Nicolas Pitre
2009-08-18 19:40                   ` [PATCH] block-sha1: Windows declares ntohl() in winsock2.h Junio C Hamano
2009-08-18 19:56                   ` Brandon Casey
2009-08-18 20:10                     ` Nicolas Pitre
2009-08-18 20:25                       ` Linus Torvalds
2009-08-18 20:29                       ` Brandon Casey
2009-08-20  2:26                     ` Nicolas Pitre
2009-08-20  2:30                       ` Linus Torvalds
2009-08-20  2:45                       ` Brandon Casey
2009-08-18 16:59             ` Sebastian Schuberth
2009-08-18 17:05               ` Junio C Hamano
2009-08-18 18:16                 ` Nicolas Pitre
2009-08-18 20:17                   ` Junio C Hamano
2009-08-18 20:30                     ` Junio C Hamano
2009-08-18 20:49                       ` Nicolas Pitre
2009-08-18 18:10               ` Nicolas Pitre

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.