* [PATCH net-next] fast_hash: clobber registers correctly for inline function use
@ 2014-11-14 14:06 Hannes Frederic Sowa
  2014-11-14 14:40 ` [PATCH net-next v2] " Hannes Frederic Sowa
  2014-11-14 14:50 ` [PATCH net-next] " Eric Dumazet
  0 siblings, 2 replies; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 14:06 UTC (permalink / raw)
  To: netdev; +Cc: ogerlitz, pshelar, jesse, jay.vosburgh, discuss

In case the arch_fast_hash call gets inlined we need to tell gcc which
registers are clobbered. Most callers were fine, as rhashtable
used arch_fast_hash via a function pointer and thus the compiler took
care of that. In case of openvswitch the call got inlined and
arch_fast_hash touched registers which gcc didn't know about.

Also don't use conditional compilation inside arguments, as this confuses
sparse.

Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Jesse Gross <jesse@nicira.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 arch/x86/include/asm/hash.h | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/hash.h b/arch/x86/include/asm/hash.h
index a881d78..771cee0 100644
--- a/arch/x86/include/asm/hash.h
+++ b/arch/x86/include/asm/hash.h
@@ -23,11 +23,14 @@ static inline u32 arch_fast_hash(const void *data, u32 len, u32 seed)
 {
 	u32 hash;
 
-	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
 #ifdef CONFIG_X86_64
-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
+	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
+			 : "rcx", "r8", "r9", "r10", "r11", "cc", "memory");
 #else
-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
+	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
+			 : "cc", "memory");
 #endif
 	return hash;
 }
@@ -36,11 +39,14 @@ static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
 {
 	u32 hash;
 
-	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
 #ifdef CONFIG_X86_64
-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
+	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
+			 : "rcx", "r8", "r9", "r10", "r11", "cc", "memory");
 #else
-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
+	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
+			 : "cc", "memory");
 #endif
 	return hash;
 }
-- 
1.9.3


* [PATCH net-next v2] fast_hash: clobber registers correctly for inline function use
  2014-11-14 14:06 [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
@ 2014-11-14 14:40 ` Hannes Frederic Sowa
  2014-11-14 14:50 ` [PATCH net-next] " Eric Dumazet
  1 sibling, 0 replies; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 14:40 UTC (permalink / raw)
  To: netdev; +Cc: ogerlitz, pshelar, jesse, jay.vosburgh, discuss

In case the arch_fast_hash call gets inlined we need to tell gcc which
registers are clobbered. rhashtable was fine, because it used
arch_fast_hash via a function pointer and thus the compiler took care of
that. In case of openvswitch the call got inlined and arch_fast_hash
touched registers which gcc didn't know about.

Also don't use conditional compilation inside arguments, as this confuses
sparse.

Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Jesse Gross <jesse@nicira.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---

v2)
After studying the gcc documentation again, it occurred to me that I need
to specify all input operands in the clobber section, too. Otherwise gcc
may assume that the inline assembler section won't modify the inputs,
which is not true.

Bye,
Hannes


 arch/x86/include/asm/hash.h | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/hash.h b/arch/x86/include/asm/hash.h
index a881d78..a25c45a 100644
--- a/arch/x86/include/asm/hash.h
+++ b/arch/x86/include/asm/hash.h
@@ -23,11 +23,15 @@ static inline u32 arch_fast_hash(const void *data, u32 len, u32 seed)
 {
 	u32 hash;
 
-	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
 #ifdef CONFIG_X86_64
-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
+	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
+			 : "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11",
+			   "cc", "memory");
 #else
-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
+	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
+			 : "edx", "ecx", "cc", "memory");
 #endif
 	return hash;
 }
@@ -36,11 +40,15 @@ static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
 {
 	u32 hash;
 
-	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
 #ifdef CONFIG_X86_64
-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
+	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
+			 : "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11",
+			   "cc", "memory");
 #else
-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
+	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
+			 : "edx", "ecx", "cc", "memory");
 #endif
 	return hash;
 }
-- 
1.9.3


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 14:06 [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
  2014-11-14 14:40 ` [PATCH net-next v2] " Hannes Frederic Sowa
@ 2014-11-14 14:50 ` Eric Dumazet
  2014-11-14 15:13   ` Hannes Frederic Sowa
  2014-11-14 15:17   ` [PATCH net-next v3] " Hannes Frederic Sowa
  1 sibling, 2 replies; 21+ messages in thread
From: Eric Dumazet @ 2014-11-14 14:50 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: netdev, ogerlitz, pshelar, jesse, jay.vosburgh, discuss

On Fri, 2014-11-14 at 15:06 +0100, Hannes Frederic Sowa wrote:
> In case the arch_fast_hash call gets inlined we need to tell gcc which
> registers are clobbered. Most callers were fine, as rhashtable
> used arch_fast_hash via a function pointer and thus the compiler took
> care of that. In case of openvswitch the call got inlined and
> arch_fast_hash touched registers which gcc didn't know about.
> 
> Also don't use conditional compilation inside arguments, as this confuses
> sparse.
> 

Please add a 
Fixes: 12-sha1 ("patch title")

> Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
> Cc: Pravin Shelar <pshelar@nicira.com>
> Cc: Jesse Gross <jesse@nicira.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
>  arch/x86/include/asm/hash.h | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/hash.h b/arch/x86/include/asm/hash.h
> index a881d78..771cee0 100644
> --- a/arch/x86/include/asm/hash.h
> +++ b/arch/x86/include/asm/hash.h
> @@ -23,11 +23,14 @@ static inline u32 arch_fast_hash(const void *data, u32 len, u32 seed)
>  {
>  	u32 hash;
>  
> -	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
>  #ifdef CONFIG_X86_64
> -			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
> +	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
> +			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
> +			 : "rcx", "r8", "r9", "r10", "r11", "cc", "memory");
>  #else




> -			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
> +	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
> +			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
> +			 : "cc", "memory");
>  #endif
>  	return hash;
>  }
> @@ -36,11 +39,14 @@ static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
>  {
>  	u32 hash;
>  
> -	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
>  #ifdef CONFIG_X86_64
> -			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
> +	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
> +			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
> +			 : "rcx", "r8", "r9", "r10", "r11", "cc", "memory");


That's a lot of clobbers.

Alternative would be to use an assembly trampoline to save/restore them
before calling __jhash2

__intel_crc4_2_hash2 can probably be written in assembly, it is quite
simple.


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 14:50 ` [PATCH net-next] " Eric Dumazet
@ 2014-11-14 15:13   ` Hannes Frederic Sowa
  2014-11-14 15:33     ` Eric Dumazet
  2014-11-14 15:17   ` [PATCH net-next v3] " Hannes Frederic Sowa
  1 sibling, 1 reply; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 15:13 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, ogerlitz, pshelar, jesse, jay.vosburgh, discuss

On Fr, 2014-11-14 at 06:50 -0800, Eric Dumazet wrote:
> On Fri, 2014-11-14 at 15:06 +0100, Hannes Frederic Sowa wrote:
> > In case the arch_fast_hash call gets inlined we need to tell gcc which
> > registers are clobbered. Most callers were fine, as rhashtable
> > used arch_fast_hash via a function pointer and thus the compiler took
> > care of that. In case of openvswitch the call got inlined and
> > arch_fast_hash touched registers which gcc didn't know about.
> > 
> > Also don't use conditional compilation inside arguments, as this confuses
> > sparse.
> > 
> 
> Please add a 
> Fixes: 12-sha1 ("patch title")

I forgot, will send new version with tag added.

> 
> > Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
> > Cc: Pravin Shelar <pshelar@nicira.com>
> > Cc: Jesse Gross <jesse@nicira.com>
> > Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > ---
> >  arch/x86/include/asm/hash.h | 18 ++++++++++++------
> >  1 file changed, 12 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/hash.h b/arch/x86/include/asm/hash.h
> > index a881d78..771cee0 100644
> > --- a/arch/x86/include/asm/hash.h
> > +++ b/arch/x86/include/asm/hash.h
> > @@ -23,11 +23,14 @@ static inline u32 arch_fast_hash(const void *data, u32 len, u32 seed)
> >  {
> >  	u32 hash;
> >  
> > -	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
> >  #ifdef CONFIG_X86_64
> > -			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
> > +	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
> > +			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
> > +			 : "rcx", "r8", "r9", "r10", "r11", "cc", "memory");
> >  #else
> 
> 
> 
> 
> > -			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
> > +	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
> > +			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
> > +			 : "cc", "memory");
> >  #endif
> >  	return hash;
> >  }
> > @@ -36,11 +39,14 @@ static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
> >  {
> >  	u32 hash;
> >  
> > -	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
> >  #ifdef CONFIG_X86_64
> > -			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
> > +	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
> > +			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
> > +			 : "rcx", "r8", "r9", "r10", "r11", "cc", "memory");
> 
> 
> Thats a lot of clobbers.

Yes, those are basically all callee-clobbered registers for the
particular architecture. I didn't look at the generated code for jhash
and crc_hash because I want this code to always be safe, independent of
the version and optimization levels of gcc.

> Alternative would be to use an assembly trampoline to save/restore them
> before calling __jhash2

This version provides the best hints on how to allocate registers to the
optimizers. E.g. it could avoid using callee-clobbered registers but use
callee-saved ones. If we build a trampoline, we need to save and reload
all registers all the time. This version just lets gcc decide how to do
that.

> __intel_crc4_2_hash2 can probably be written in assembly, it is quite
> simple.

Sure, but all the pre- and postconditions must hold for both jhash and
intel_crc4_2_hash, and I don't want to rewrite jhash in assembler.

Thanks,
Hannes


* [PATCH net-next v3] fast_hash: clobber registers correctly for inline function use
  2014-11-14 14:50 ` [PATCH net-next] " Eric Dumazet
  2014-11-14 15:13   ` Hannes Frederic Sowa
@ 2014-11-14 15:17   ` Hannes Frederic Sowa
  2014-11-14 17:57     ` Jay Vosburgh
  1 sibling, 1 reply; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 15:17 UTC (permalink / raw)
  To: netdev; +Cc: ogerlitz, pshelar, jesse, jay.vosburgh, discuss

In case the arch_fast_hash call gets inlined we need to tell gcc which
registers are clobbered. rhashtable was fine, because it used
arch_fast_hash via a function pointer and thus the compiler took care of
that. In case of openvswitch the call got inlined and arch_fast_hash
touched registers which gcc didn't know about.

Also don't use conditional compilation inside arguments, as this confuses
sparse.

Fixes: e5a2c899957659c ("fast_hash: avoid indirect function calls")
Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Jesse Gross <jesse@nicira.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---


v2)
After studying the gcc documentation again, it occurred to me that I need
to specify all input operands in the clobber section, too. Otherwise gcc
may assume that the inline assembler section won't modify the inputs,
which is not true.

v3)
added Fixes tag

Bye,
Hannes

 arch/x86/include/asm/hash.h | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/hash.h b/arch/x86/include/asm/hash.h
index a881d78..a25c45a 100644
--- a/arch/x86/include/asm/hash.h
+++ b/arch/x86/include/asm/hash.h
@@ -23,11 +23,15 @@ static inline u32 arch_fast_hash(const void *data, u32 len, u32 seed)
 {
 	u32 hash;
 
-	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
 #ifdef CONFIG_X86_64
-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
+	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
+			 : "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11",
+			   "cc", "memory");
 #else
-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
+	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
+			 : "edx", "ecx", "cc", "memory");
 #endif
 	return hash;
 }
@@ -36,11 +40,15 @@ static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
 {
 	u32 hash;
 
-	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
 #ifdef CONFIG_X86_64
-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
+	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
+			 : "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11",
+			   "cc", "memory");
 #else
-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
+	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
+			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
+			 : "edx", "ecx", "cc", "memory");
 #endif
 	return hash;
 }
-- 
1.9.3


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 15:13   ` Hannes Frederic Sowa
@ 2014-11-14 15:33     ` Eric Dumazet
  2014-11-14 15:46       ` Hannes Frederic Sowa
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Dumazet @ 2014-11-14 15:33 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: netdev, ogerlitz, pshelar, jesse, jay.vosburgh, discuss

On Fri, 2014-11-14 at 16:13 +0100, Hannes Frederic Sowa wrote:
> > 
> > 
> > That's a lot of clobbers.
> 
> Yes, those are basically all callee-clobbered registers for the
> particular architecture. I didn't look at the generated code for jhash
> and crc_hash because I want this code to always be safe, independent of
> the version and optimization levels of gcc.
> 
> > Alternative would be to use an assembly trampoline to save/restore them
> > before calling __jhash2
> 
> This version provides the best hints on how to allocate registers to the
> optimizers. E.g. it could avoid using callee-clobbered registers but use
> callee-saved ones. If we build a trampoline, we need to save and reload
> all registers all the time. This version just lets gcc decide how to do
> that.
> 
> > __intel_crc4_2_hash2 can probably be written in assembly, it is quite
> > simple.
> 
> Sure, but all the pre- and postconditions must hold for both jhash and
> intel_crc4_2_hash, and I don't want to rewrite jhash in assembler.

We write optimized code for current cpus.

With current generation of cpus, we have crc32 support.

The fallback having to save/restore few registers, we don't care, as the
fallback has huge cost anyway.

You don't have to write jhash() in assembler, you misunderstood me.

We only have to provide a trampoline in assembler, with maybe 10
instructions.

Then gcc will know that we do not clobber registers for the optimized
case.


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 15:33     ` Eric Dumazet
@ 2014-11-14 15:46       ` Hannes Frederic Sowa
  2014-11-14 18:38         ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 15:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, ogerlitz, pshelar, jesse, jay.vosburgh, discuss

On Fr, 2014-11-14 at 07:33 -0800, Eric Dumazet wrote:
> On Fri, 2014-11-14 at 16:13 +0100, Hannes Frederic Sowa wrote:
> > > 
> > > 
> > > That's a lot of clobbers.
> > 
> > Yes, those are basically all callee-clobbered registers for the
> > particular architecture. I didn't look at the generated code for jhash
> > and crc_hash because I want this code to always be safe, independent of
> > the version and optimization levels of gcc.
> > 
> > > Alternative would be to use an assembly trampoline to save/restore them
> > > before calling __jhash2
> > 
> > This version provides the best hints on how to allocate registers to the
> > optimizers. E.g. it could avoid using callee-clobbered registers but use
> > callee-saved ones. If we build a trampoline, we need to save and reload
> > all registers all the time. This version just lets gcc decide how to do
> > that.
> > 
> > > __intel_crc4_2_hash2 can probably be written in assembly, it is quite
> > > simple.
> > 
> > Sure, but all the pre- and postconditions must hold for both jhash and
> > intel_crc4_2_hash, and I don't want to rewrite jhash in assembler.
> 
> We write optimized code for current cpus.
> 
> With current generation of cpus, we have crc32 support.

__intel_crc4_2_hash(2) already makes use of the crc32 instruction. I'll
have a closer look at what gcc generates.

> The fallback having to save/restore few registers, we don't care, as the
> fallback has huge cost anyway.
> 
> You don't have to write jhash() in assembler, you misunderstood me.

Ok, understood: we only clobber the registers needed in the crc32_hash
implementation, and only if we branch to jhash do we save all the other
ones in a trampoline directly before jhash.

> We only have to provide a trampoline in assembler, with maybe 10
> instructions.
> 
> Then gcc will know that we do not clobber registers for the optimized
> case.

Yes, makes sense.

I would still like to see the current proposed fix getting applied and
we can do this on top. The inline call after this patch resembles a
direct function call, so besides the long list of clobbers, it should
still be pretty fast.

Thanks,
Hannes


* Re: [PATCH net-next v3] fast_hash: clobber registers correctly for inline function use
  2014-11-14 15:17   ` [PATCH net-next v3] " Hannes Frederic Sowa
@ 2014-11-14 17:57     ` Jay Vosburgh
  0 siblings, 0 replies; 21+ messages in thread
From: Jay Vosburgh @ 2014-11-14 17:57 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, ogerlitz, pshelar, jesse, discuss

Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:

>In case the arch_fast_hash call gets inlined we need to tell gcc which
>registers are clobbered. rhashtable was fine, because it used
>arch_fast_hash via a function pointer and thus the compiler took care of
>that. In case of openvswitch the call got inlined and arch_fast_hash
>touched registers which gcc didn't know about.
>
>Also don't use conditional compilation inside arguments, as this confuses
>sparse.
>
>Fixes: e5a2c899957659c ("fast_hash: avoid indirect function calls")
>Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>Cc: Pravin Shelar <pshelar@nicira.com>
>Cc: Jesse Gross <jesse@nicira.com>
>Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>---
>
>
>v2)
>After studying the gcc documentation again, it occurred to me that I need
>to specify all input operands in the clobber section, too. Otherwise gcc
>may assume that the inline assembler section won't modify the inputs,
>which is not true.
>
>v3)
>added Fixes tag

	This patch does not compile for me when applied to today's
net-next:

  CC [M]  fs/nfsd/nfs4state.o
In file included from ./arch/x86/include/asm/bitops.h:16:0,
                 from include/linux/bitops.h:33,
                 from include/linux/kernel.h:10,
                 from include/linux/list.h:8,
                 from include/linux/wait.h:6,
                 from include/linux/fs.h:6,
                 from fs/nfsd/nfs4state.c:36:
fs/nfsd/nfs4state.c: In function ‘nfsd4_cb_recall_prepare’:
./arch/x86/include/asm/alternative.h:185:2: error: ‘asm’ operand has impossible constraints
  asm volatile (ALTERNATIVE("call %P[old]", "call %P[new]", feature) \
  ^
./arch/x86/include/asm/hash.h:27:2: note: in expansion of macro ‘alternative_call’
  alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
  ^
make[2]: *** [fs/nfsd/nfs4state.o] Error 1
make[1]: *** [fs/nfsd] Error 2
make: *** [fs] Error 2

	-J


>Bye,
>Hannes
>
> arch/x86/include/asm/hash.h | 20 ++++++++++++++------
> 1 file changed, 14 insertions(+), 6 deletions(-)
>
>diff --git a/arch/x86/include/asm/hash.h b/arch/x86/include/asm/hash.h
>index a881d78..a25c45a 100644
>--- a/arch/x86/include/asm/hash.h
>+++ b/arch/x86/include/asm/hash.h
>@@ -23,11 +23,15 @@ static inline u32 arch_fast_hash(const void *data, u32 len, u32 seed)
> {
> 	u32 hash;
> 
>-	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
> #ifdef CONFIG_X86_64
>-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
>+	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
>+			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
>+			 : "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11",
>+			   "cc", "memory");
> #else
>-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
>+	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
>+			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
>+			 : "edx", "ecx", "cc", "memory");
> #endif
> 	return hash;
> }
>@@ -36,11 +40,15 @@ static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
> {
> 	u32 hash;
> 
>-	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
> #ifdef CONFIG_X86_64
>-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
>+	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
>+			 "=a" (hash), "D" (data), "S" (len), "d" (seed)
>+			 : "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11",
>+			   "cc", "memory");
> #else
>-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
>+	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
>+			 "=a" (hash), "a" (data), "d" (len), "c" (seed)
>+			 : "edx", "ecx", "cc", "memory");
> #endif
> 	return hash;
> }
>-- 
>1.9.3

---
	-Jay Vosburgh, jay.vosburgh@canonical.com


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 15:46       ` Hannes Frederic Sowa
@ 2014-11-14 18:38         ` David Miller
  2014-11-14 19:02           ` Cong Wang
                             ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: David Miller @ 2014-11-14 18:38 UTC (permalink / raw)
  To: hannes
  Cc: eric.dumazet, netdev, ogerlitz, pshelar, jesse, jay.vosburgh, discuss

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Fri, 14 Nov 2014 16:46:18 +0100

> I would still like to see the current proposed fix getting applied and
> we can do this on top. The inline call after this patch resembles a
> direct function call, so besides the long list of clobbers, it should
> still be pretty fast.

I would rather revert the change entirely until it is implemented
properly.

Also, I am strongly of the opinion that this is a mis-use of the
alternative call interface.  It was never intended to be used for
things that can make real function calls.

You can add a million clobbers, or a trampoline, it's still using a
facility in a manner for which it was not designed.

This means a new interface with a new name and with capabilities
explicitly supporting this case are in order.

Thanks.


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 18:38         ` David Miller
@ 2014-11-14 19:02           ` Cong Wang
  2014-11-14 20:42             ` Hannes Frederic Sowa
  2014-11-14 19:05           ` [PATCH net-next] Revert "fast_hash: avoid indirect function calls" Jay Vosburgh
  2014-11-14 20:04           ` [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
  2 siblings, 1 reply; 21+ messages in thread
From: Cong Wang @ 2014-11-14 19:02 UTC (permalink / raw)
  To: David Miller
  Cc: Hannes Frederic Sowa, Eric Dumazet, netdev, Or Gerlitz,
	Pravin B Shelar, Jesse Gross, jay.vosburgh, discuss

On Fri, Nov 14, 2014 at 10:38 AM, David Miller <davem@davemloft.net> wrote:
>
> Also, I am strongly of the opinion that this is a mis-use of the
> alternative call interface.  It was never intended to be used for
> things that can make real function calls.
>
> You can add a million clobbers, or a trampoline, it's still using a
> facility in a manner for which it was not designed.
>
> This means a new interface with a new name and with capabilities
> explicitly supporting this case are in order.
>

I am wondering, compared with the alternative call, how much slower is
just testing cpu_has_xmm4_2?


* [PATCH net-next] Revert "fast_hash: avoid indirect function calls"
  2014-11-14 18:38         ` David Miller
  2014-11-14 19:02           ` Cong Wang
@ 2014-11-14 19:05           ` Jay Vosburgh
  2014-11-14 21:36             ` David Miller
  2014-11-14 21:43             ` Hannes Frederic Sowa
  2014-11-14 20:04           ` [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
  2 siblings, 2 replies; 21+ messages in thread
From: Jay Vosburgh @ 2014-11-14 19:05 UTC (permalink / raw)
  To: David Miller
  Cc: hannes, eric.dumazet, netdev, ogerlitz, pshelar, jesse, discuss


This reverts commit e5a2c899957659cd1a9f789bc462f9c0b35f5150.

	Commit e5a2c899 introduced an alternative_call, arch_fast_hash2,
that selects between __jhash2 and __intel_crc4_2_hash based on the
X86_FEATURE_XMM4_2.

	Unfortunately, the alternative_call system does not appear to be
suitable for use with C functions, as register usage is not handled
properly for the called functions.  The __jhash2 function in particular
clobbers registers that are not preserved when called via
alternative_call, resulting in a panic for direct callers of
arch_fast_hash2 on older CPUs lacking sse4_2.  It is possible that
__intel_crc4_2_hash works merely by chance because it uses fewer
registers.

	This commit was suggested as the source of the problem by Jesse
Gross <jesse@nicira.com>.

Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>

---
 arch/x86/include/asm/hash.h | 51 +++++----------------------------------------
 arch/x86/lib/hash.c         | 29 +++++++++++---------------
 include/asm-generic/hash.h  | 36 ++------------------------------
 include/linux/hash.h        | 34 ++++++++++++++++++++++++++++++
 lib/Makefile                |  2 +-
 lib/hash.c                  | 39 ++++++++++++++++++++++++++++++++++
 6 files changed, 93 insertions(+), 98 deletions(-)
 create mode 100644 lib/hash.c

diff --git a/arch/x86/include/asm/hash.h b/arch/x86/include/asm/hash.h
index a881d78..e8c58f8 100644
--- a/arch/x86/include/asm/hash.h
+++ b/arch/x86/include/asm/hash.h
@@ -1,48 +1,7 @@
-#ifndef __ASM_X86_HASH_H
-#define __ASM_X86_HASH_H
+#ifndef _ASM_X86_HASH_H
+#define _ASM_X86_HASH_H
 
-#include <linux/cpufeature.h>
-#include <asm/alternative.h>
+struct fast_hash_ops;
+extern void setup_arch_fast_hash(struct fast_hash_ops *ops);
 
-u32 __intel_crc4_2_hash(const void *data, u32 len, u32 seed);
-u32 __intel_crc4_2_hash2(const u32 *data, u32 len, u32 seed);
-
-/*
- * non-inline versions of jhash so gcc does not need to generate
- * duplicate code in every object file
- */
-u32 __jhash(const void *data, u32 len, u32 seed);
-u32 __jhash2(const u32 *data, u32 len, u32 seed);
-
-/*
- * for documentation of these functions please look into
- * <include/asm-generic/hash.h>
- */
-
-static inline u32 arch_fast_hash(const void *data, u32 len, u32 seed)
-{
-	u32 hash;
-
-	alternative_call(__jhash, __intel_crc4_2_hash, X86_FEATURE_XMM4_2,
-#ifdef CONFIG_X86_64
-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
-#else
-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
-#endif
-	return hash;
-}
-
-static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
-{
-	u32 hash;
-
-	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
-#ifdef CONFIG_X86_64
-			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
-#else
-			 "=a" (hash), "a" (data), "d" (len), "c" (seed));
-#endif
-	return hash;
-}
-
-#endif /* __ASM_X86_HASH_H */
+#endif /* _ASM_X86_HASH_H */
diff --git a/arch/x86/lib/hash.c b/arch/x86/lib/hash.c
index e143271..ff4fa51 100644
--- a/arch/x86/lib/hash.c
+++ b/arch/x86/lib/hash.c
@@ -31,13 +31,13 @@
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include <linux/hash.h>
+#include <linux/init.h>
+
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
 #include <asm/hash.h>
 
-#include <linux/hash.h>
-#include <linux/jhash.h>
-
 static inline u32 crc32_u32(u32 crc, u32 val)
 {
 #ifdef CONFIG_AS_CRC32
@@ -48,7 +48,7 @@ static inline u32 crc32_u32(u32 crc, u32 val)
 	return crc;
 }
 
-u32 __intel_crc4_2_hash(const void *data, u32 len, u32 seed)
+static u32 intel_crc4_2_hash(const void *data, u32 len, u32 seed)
 {
 	const u32 *p32 = (const u32 *) data;
 	u32 i, tmp = 0;
@@ -71,27 +71,22 @@ u32 __intel_crc4_2_hash(const void *data, u32 len, u32 seed)
 
 	return seed;
 }
-EXPORT_SYMBOL(__intel_crc4_2_hash);
 
-u32 __intel_crc4_2_hash2(const u32 *data, u32 len, u32 seed)
+static u32 intel_crc4_2_hash2(const u32 *data, u32 len, u32 seed)
 {
+	const u32 *p32 = (const u32 *) data;
 	u32 i;
 
 	for (i = 0; i < len; i++)
-		seed = crc32_u32(seed, *data++);
+		seed = crc32_u32(seed, *p32++);
 
 	return seed;
 }
-EXPORT_SYMBOL(__intel_crc4_2_hash2);
 
-u32 __jhash(const void *data, u32 len, u32 seed)
+void __init setup_arch_fast_hash(struct fast_hash_ops *ops)
 {
-	return jhash(data, len, seed);
-}
-EXPORT_SYMBOL(__jhash);
-
-u32 __jhash2(const u32 *data, u32 len, u32 seed)
-{
-	return jhash2(data, len, seed);
+	if (cpu_has_xmm4_2) {
+		ops->hash  = intel_crc4_2_hash;
+		ops->hash2 = intel_crc4_2_hash2;
+	}
 }
-EXPORT_SYMBOL(__jhash2);
diff --git a/include/asm-generic/hash.h b/include/asm-generic/hash.h
index 3c82760..b631284 100644
--- a/include/asm-generic/hash.h
+++ b/include/asm-generic/hash.h
@@ -1,41 +1,9 @@
 #ifndef __ASM_GENERIC_HASH_H
 #define __ASM_GENERIC_HASH_H
 
-#include <linux/jhash.h>
-
-/**
- *	arch_fast_hash - Caclulates a hash over a given buffer that can have
- *			 arbitrary size. This function will eventually use an
- *			 architecture-optimized hashing implementation if
- *			 available, and trades off distribution for speed.
- *
- *	@data: buffer to hash
- *	@len: length of buffer in bytes
- *	@seed: start seed
- *
- *	Returns 32bit hash.
- */
-static inline u32 arch_fast_hash(const void *data, u32 len, u32 seed)
-{
-	return jhash(data, len, seed);
-}
-
-/**
- *	arch_fast_hash2 - Caclulates a hash over a given buffer that has a
- *			  size that is of a multiple of 32bit words. This
- *			  function will eventually use an architecture-
- *			  optimized hashing implementation if available,
- *			  and trades off distribution for speed.
- *
- *	@data: buffer to hash (must be 32bit padded)
- *	@len: number of 32bit words
- *	@seed: start seed
- *
- *	Returns 32bit hash.
- */
-static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
+struct fast_hash_ops;
+static inline void setup_arch_fast_hash(struct fast_hash_ops *ops)
 {
-	return jhash2(data, len, seed);
 }
 
 #endif /* __ASM_GENERIC_HASH_H */
diff --git a/include/linux/hash.h b/include/linux/hash.h
index 6e8fb02..d0494c3 100644
--- a/include/linux/hash.h
+++ b/include/linux/hash.h
@@ -84,4 +84,38 @@ static inline u32 hash32_ptr(const void *ptr)
 	return (u32)val;
 }
 
+struct fast_hash_ops {
+	u32 (*hash)(const void *data, u32 len, u32 seed);
+	u32 (*hash2)(const u32 *data, u32 len, u32 seed);
+};
+
+/**
+ *	arch_fast_hash - Caclulates a hash over a given buffer that can have
+ *			 arbitrary size. This function will eventually use an
+ *			 architecture-optimized hashing implementation if
+ *			 available, and trades off distribution for speed.
+ *
+ *	@data: buffer to hash
+ *	@len: length of buffer in bytes
+ *	@seed: start seed
+ *
+ *	Returns 32bit hash.
+ */
+extern u32 arch_fast_hash(const void *data, u32 len, u32 seed);
+
+/**
+ *	arch_fast_hash2 - Caclulates a hash over a given buffer that has a
+ *			  size that is of a multiple of 32bit words. This
+ *			  function will eventually use an architecture-
+ *			  optimized hashing implementation if available,
+ *			  and trades off distribution for speed.
+ *
+ *	@data: buffer to hash (must be 32bit padded)
+ *	@len: number of 32bit words
+ *	@seed: start seed
+ *
+ *	Returns 32bit hash.
+ */
+extern u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed);
+
 #endif /* _LINUX_HASH_H */
diff --git a/lib/Makefile b/lib/Makefile
index 04e53dd..7512dc9 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -26,7 +26,7 @@ obj-y += bcd.o div64.o sort.o parser.o halfmd4.o debug_locks.o random32.o \
 	 bust_spinlocks.o hexdump.o kasprintf.o bitmap.o scatterlist.o \
 	 gcd.o lcm.o list_sort.o uuid.o flex_array.o iovec.o clz_ctz.o \
 	 bsearch.o find_last_bit.o find_next_bit.o llist.o memweight.o kfifo.o \
-	 percpu-refcount.o percpu_ida.o rhashtable.o
+	 percpu-refcount.o percpu_ida.o hash.o rhashtable.o
 obj-y += string_helpers.o
 obj-$(CONFIG_TEST_STRING_HELPERS) += test-string_helpers.o
 obj-y += kstrtox.o
diff --git a/lib/hash.c b/lib/hash.c
new file mode 100644
index 0000000..fea973f
--- /dev/null
+++ b/lib/hash.c
@@ -0,0 +1,39 @@
+/* General purpose hashing library
+ *
+ * That's a start of a kernel hashing library, which can be extended
+ * with further algorithms in future. arch_fast_hash{2,}() will
+ * eventually resolve to an architecture optimized implementation.
+ *
+ * Copyright 2013 Francesco Fusco <ffusco@redhat.com>
+ * Copyright 2013 Daniel Borkmann <dborkman@redhat.com>
+ * Copyright 2013 Thomas Graf <tgraf@redhat.com>
+ * Licensed under the GNU General Public License, version 2.0 (GPLv2)
+ */
+
+#include <linux/jhash.h>
+#include <linux/hash.h>
+#include <linux/cache.h>
+
+static struct fast_hash_ops arch_hash_ops __read_mostly = {
+	.hash  = jhash,
+	.hash2 = jhash2,
+};
+
+u32 arch_fast_hash(const void *data, u32 len, u32 seed)
+{
+	return arch_hash_ops.hash(data, len, seed);
+}
+EXPORT_SYMBOL_GPL(arch_fast_hash);
+
+u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
+{
+	return arch_hash_ops.hash2(data, len, seed);
+}
+EXPORT_SYMBOL_GPL(arch_fast_hash2);
+
+static int __init hashlib_init(void)
+{
+	setup_arch_fast_hash(&arch_hash_ops);
+	return 0;
+}
+early_initcall(hashlib_init);
-- 
1.9.1


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 18:38         ` David Miller
  2014-11-14 19:02           ` Cong Wang
  2014-11-14 19:05           ` [PATCH net-next] Revert "fast_hash: avoid indirect function calls" Jay Vosburgh
@ 2014-11-14 20:04           ` Hannes Frederic Sowa
  2014-11-14 20:10             ` Hannes Frederic Sowa
  2014-11-14 20:15             ` Jay Vosburgh
  2 siblings, 2 replies; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 20:04 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, ogerlitz, pshelar, jesse, jay.vosburgh, discuss

On Fr, 2014-11-14 at 13:38 -0500, David Miller wrote:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Fri, 14 Nov 2014 16:46:18 +0100
> 
> > I would still like to see the current proposed fix getting applied and
> > we can do this on top. The inline call after this patch resembles a
> > direct function call, so besides the long list of clobbers, it should
> > still be pretty fast.
> 
> I would rather revert the change entirely until it is implemented
> properly.
> 
> Also, I am strongly of the opinion that this is a mis-use of the
> alternative call interface.  It was never intended to be used for
> things that can make real function calls.

I tend to disagree. Grepping e.g. shows

        alternative_call_2(copy_user_generic_unrolled,
                         copy_user_generic_string,
                         X86_FEATURE_REP_GOOD,
                         copy_user_enhanced_fast_string,
                         X86_FEATURE_ERMS,
                         ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from),
                                     "=d" (len)),
                         "1" (to), "2" (from), "3" (len)
                         : "memory", "rcx", "r8", "r9", "r10", "r11");


(it has a few less clobbers because it has more output operands)

I just tried to come up with some macros which let you abstract away
the clobber list, but in the end it somehow has to look exactly like
that. The double-colon syntax also makes it difficult to come up with
something that lets us use varargs for that.

> You can add a million clobbers, or a trampoline, it's still using a
> facility in a manner for which it was not designed.

The full clobber list for a function call that may clobber registers
just like a normal non-inlined call would look like this:

#define FUNC_CLOBBER_LIST "memory", "cc", "rax", "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11"

(reference in arch/x86/include/asm/calling.h).

> This means a new interface with a new name and with capabilities
> explicitly supporting this case are in order.

It would try to implicitly embed the clobber list; would something like
that be ok?
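
Something along these lines, maybe; a completely untested, 64-bit-only
sketch with made-up names, just to illustrate what "implicitly embed the
clobber list" could mean: the argument registers become in/out operands,
so gcc knows they may change, and only the remaining caller-saved
registers end up in the hidden clobber list, similar to the copy_user
alternative quoted above.

/* hypothetical helper, names made up; 64-bit only, untested sketch */
#define alternative_ccall3(oldfn, newfn, feature, ret, a1, a2, a3)	\
do {									\
	register void *__a1 asm("rdi") = (void *)(a1);			\
	register unsigned long __a2 asm("rsi") = (unsigned long)(a2);	\
	register unsigned long __a3 asm("rdx") = (unsigned long)(a3);	\
	asm volatile (ALTERNATIVE("call %P[old]", "call %P[new]",	\
				  feature)				\
		      : "=a" (ret), "+r" (__a1), "+r" (__a2), "+r" (__a3) \
		      : [old] "i" (oldfn), [new] "i" (newfn)		\
		      : "rcx", "r8", "r9", "r10", "r11", "cc", "memory"); \
} while (0)

arch_fast_hash2() would then be just
alternative_ccall3(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
hash, data, len, seed).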

Thanks,
Hannes


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 20:04           ` [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
@ 2014-11-14 20:10             ` Hannes Frederic Sowa
  2014-11-14 20:15             ` Jay Vosburgh
  1 sibling, 0 replies; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 20:10 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, netdev, ogerlitz, pshelar, jesse, jay.vosburgh, discuss

On Fr, 2014-11-14 at 21:04 +0100, Hannes Frederic Sowa wrote:
> On Fr, 2014-11-14 at 13:38 -0500, David Miller wrote:
> > From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > Date: Fri, 14 Nov 2014 16:46:18 +0100
> > 
> > > I would still like to see the current proposed fix getting applied and
> > > we can do this on top. The inline call after this patch resembles a
> > > direct function call, so besides the long list of clobbers, it should
> > > still be pretty fast.
> > 
> > I would rather revert the change entirely until it is implemented
> > properly.
> > 
> > Also, I am strongly of the opinion that this is a mis-use of the
> > alternative call interface.  It was never intended to be used for
> > things that can make real function calls.
> 
> I tend to disagree. Grepping e.g. shows
> 
>         alternative_call_2(copy_user_generic_unrolled,
>                          copy_user_generic_string,
>                          X86_FEATURE_REP_GOOD,
>                          copy_user_enhanced_fast_string,
>                          X86_FEATURE_ERMS,
>                          ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from),
>                                      "=d" (len)),
>                          "1" (to), "2" (from), "3" (len)
>                          : "memory", "rcx", "r8", "r9", "r10", "r11");

Oh, I need to take this back; these are actually assembler functions.


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 20:04           ` [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
  2014-11-14 20:10             ` Hannes Frederic Sowa
@ 2014-11-14 20:15             ` Jay Vosburgh
  2014-11-14 20:35               ` Hannes Frederic Sowa
  1 sibling, 1 reply; 21+ messages in thread
From: Jay Vosburgh @ 2014-11-14 20:15 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Miller, eric.dumazet, netdev, ogerlitz, pshelar, jesse, discuss

Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:

>On Fr, 2014-11-14 at 13:38 -0500, David Miller wrote:
>> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Date: Fri, 14 Nov 2014 16:46:18 +0100
>> 
>> > I would still like to see the current proposed fix getting applied and
>> > we can do this on top. The inline call after this patch resembles a
>> > direct function call, so besides the long list of clobbers, it should
>> > still be pretty fast.
>> 
>> I would rather revert the change entirely until it is implemented
>> properly.
>> 
>> Also, I am strongly of the opinion that this is a mis-use of the
>> alternative call interface.  It was never intended to be used for
>> things that can make real function calls.
>
>I tend to disagree. Grepping e.g. shows
>
>        alternative_call_2(copy_user_generic_unrolled,
>                         copy_user_generic_string,
>                         X86_FEATURE_REP_GOOD,
>                         copy_user_enhanced_fast_string,
>                         X86_FEATURE_ERMS,
>                         ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from),
>                                     "=d" (len)),
>                         "1" (to), "2" (from), "3" (len)
>                         : "memory", "rcx", "r8", "r9", "r10", "r11");
>
>
>(it has a few less clobbers because it has more output operands)

	As those functions (copy_user_generic_unrolled, et al) are all
in assembly language, presumably the list of clobbered registers can be
had via inspection.

	For the arch_fast_hash2 case, the functions (__intel_crc4_2_hash
and __jhash2) are both written in C, so how would the clobber list be
created?

	-J

>I just tried to come up with some macros which let you abstract away
>the clobber list, but in the end it somehow has to look exactly like
>that. The double-colon syntax also makes it difficult to come up with
>something that lets us use varargs for that.
>
>> You can add a million clobbers, or a trampoline, it's still using a
>> facility in a manner for which it was not designed.
>
>The full clobber list for a function call that may clobber registers
>just like a normal non-inlined call would look like this:
>
>#define FUNC_CLOBBER_LIST "memory", "cc", "rax", "rdi", "rsi", "rdx", "rcx", "r8", "r9", "r10", "r11"
>
>(reference in arch/x86/include/asm/calling.h).
>
>> This means a new interface with a new name and with capabilities
>> explicitly supporting this case are in order.
>
>It would try to implicitly embed the clobber list; would something like
>that be ok?
>
>Thanks,
>Hannes

---
	-Jay Vosburgh, jay.vosburgh@canonical.com


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 20:15             ` Jay Vosburgh
@ 2014-11-14 20:35               ` Hannes Frederic Sowa
  2014-11-14 22:10                 ` Jay Vosburgh
  0 siblings, 1 reply; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 20:35 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: David Miller, eric.dumazet, netdev, ogerlitz, pshelar, jesse, discuss

On Fr, 2014-11-14 at 12:15 -0800, Jay Vosburgh wrote:
> Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
> 
> >On Fr, 2014-11-14 at 13:38 -0500, David Miller wrote:
> >> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> >> Date: Fri, 14 Nov 2014 16:46:18 +0100
> >> 
> >> > I would still like to see the current proposed fix getting applied and
> >> > we can do this on top. The inline call after this patch resembles a
> >> > direct function call, so besides the long list of clobbers, it should
> >> > still be pretty fast.
> >> 
> >> I would rather revert the change entirely until it is implemented
> >> properly.
> >> 
> >> Also, I am strongly of the opinion that this is a mis-use of the
> >> alternative call interface.  It was never intended to be used for
> >> things that can make real function calls.
> >
> >I tend to disagree. Grepping e.g. shows
> >
> >        alternative_call_2(copy_user_generic_unrolled,
> >                         copy_user_generic_string,
> >                         X86_FEATURE_REP_GOOD,
> >                         copy_user_enhanced_fast_string,
> >                         X86_FEATURE_ERMS,
> >                         ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from),
> >                                     "=d" (len)),
> >                         "1" (to), "2" (from), "3" (len)
> >                         : "memory", "rcx", "r8", "r9", "r10", "r11");
> >
> >
> >(it has a few less clobbers because it has more output operands)
> 
> 	As those functions (copy_user_generic_unrolled, et al) are all
> in assembly language, presumably the list of clobbered registers can be
> had via inspection.
> 
> 	For the arch_fast_hash2 case, the functions (__intel_crc4_2_hash
> and __jhash2) are both written in C, so how would the clobber list be
> created?

I created it via the function calling convention documented in
arch/x86/include/asm/calling.h, so I specified each register which a
function is allowed to clobber.

I currently cannot see how I can resolve the invalid constraints error
easily. :(
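
For reference, the error boils down to an operand/clobber conflict; a
minimal, purely hypothetical sketch (not from any posted patch) that runs
into the same diagnostic: the "D" constraint pins the input to %rdi, but
%rdi is also named in the clobber list, so no register allocation can
satisfy both.

/* hypothetical reproducer, not from any of the posted patches */
static inline u32 clobber_conflict(const void *data)
{
	u32 hash;

	/* input wants %rdi, but the clobber list takes %rdi away */
	asm ("" : "=a" (hash) : "D" (data) : "rdi", "memory");
	return hash;
}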

So either go with my first patch, which puts the alternative_call
switch point into its own function without ever inlining it, or the
patch needs to be reverted. :/
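
Roughly, that first variant kept the alternative_call switch point out of
line, so the normal C calling convention already covers all the clobbers.
A sketch of the idea (illustrative only, 64-bit constraints shown; a
modular user like openvswitch would additionally need an EXPORT_SYMBOL):

/* sketch only: out-of-line variant in some arch/x86 .c file */
noinline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
{
	u32 hash;

	alternative_call(__jhash2, __intel_crc4_2_hash2, X86_FEATURE_XMM4_2,
			 "=a" (hash), "D" (data), "S" (len), "d" (seed));
	return hash;
}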

Bye,
Hannes


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 19:02           ` Cong Wang
@ 2014-11-14 20:42             ` Hannes Frederic Sowa
  2014-11-14 21:35               ` David Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 20:42 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, Eric Dumazet, netdev, Or Gerlitz, Pravin B Shelar,
	Jesse Gross, jay.vosburgh, discuss

On Fr, 2014-11-14 at 11:02 -0800, Cong Wang wrote:
> 
> I am wondering, compared with the alternative call, how much slower is
> just testing cpu_has_xmm4_2?

I can test, but cpu_has_xmm4_2 expands into quite some code. I don't
know if indirect function call or a test of this flag is faster. I'll
test this.
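
For reference, the flag-test variant boils down to something like this
sketch (reusing the out-of-line __jhash2/__intel_crc4_2_hash2 helpers
that the original commit already exports):

/* sketch: plain feature-flag test instead of alternative_call */
static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
{
	if (cpu_has_xmm4_2)
		return __intel_crc4_2_hash2(data, len, seed);
	return __jhash2(data, len, seed);
}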

Bye,
Hannes


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 20:42             ` Hannes Frederic Sowa
@ 2014-11-14 21:35               ` David Miller
  0 siblings, 0 replies; 21+ messages in thread
From: David Miller @ 2014-11-14 21:35 UTC (permalink / raw)
  To: hannes
  Cc: cwang, eric.dumazet, netdev, ogerlitz, pshelar, jesse,
	jay.vosburgh, discuss

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Fri, 14 Nov 2014 21:42:13 +0100

> On Fr, 2014-11-14 at 11:02 -0800, Cong Wang wrote:
>> 
>> I am wondering, compared with the alternative call, how much slower is
>> just testing cpu_has_xmm4_2?
> 
> I can test, but cpu_has_xmm4_2 expands into quite some code. I don't
> know if indirect function call or a test of this flag is faster. I'll
> test this.

I'm going to apply Jay's revert, and then you guys can take your time
sorting this out.


* Re: [PATCH net-next] Revert "fast_hash: avoid indirect function calls"
  2014-11-14 19:05           ` [PATCH net-next] Revert "fast_hash: avoid indirect function calls" Jay Vosburgh
@ 2014-11-14 21:36             ` David Miller
  2014-11-14 21:43             ` Hannes Frederic Sowa
  1 sibling, 0 replies; 21+ messages in thread
From: David Miller @ 2014-11-14 21:36 UTC (permalink / raw)
  To: jay.vosburgh
  Cc: hannes, eric.dumazet, netdev, ogerlitz, pshelar, jesse, discuss

From: Jay Vosburgh <jay.vosburgh@canonical.com>
Date: Fri, 14 Nov 2014 11:05:06 -0800

> 
> This reverts commit e5a2c899957659cd1a9f789bc462f9c0b35f5150.
> 
> 	Commit e5a2c899 introduced an alternative_call, arch_fast_hash2,
> that selects between __jhash2 and __intel_crc4_2_hash based on the
> X86_FEATURE_XMM4_2.
> 
> 	Unfortunately, the alternative_call system does not appear to be
> suitable for use with C functions, as register usage is not handled
> properly for the called functions.  The __jhash2 function in particular
> clobbers registers that are not preserved when called via
> alternative_call, resulting in a panic for direct callers of
> arch_fast_hash2 on older CPUs lacking sse4_2.  It is possible that
> __intel_crc4_2_hash works merely by chance because it uses fewer
> registers.
> 
> 	This commit was suggested as the source of the problem by Jesse
> Gross <jesse@nicira.com>.
> 
> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>

Applied, thanks Jay.


* Re: [PATCH net-next] Revert "fast_hash: avoid indirect function calls"
  2014-11-14 19:05           ` [PATCH net-next] Revert "fast_hash: avoid indirect function calls" Jay Vosburgh
  2014-11-14 21:36             ` David Miller
@ 2014-11-14 21:43             ` Hannes Frederic Sowa
  1 sibling, 0 replies; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 21:43 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: David Miller, eric.dumazet, netdev, ogerlitz, pshelar, jesse, discuss

On Fr, 2014-11-14 at 11:05 -0800, Jay Vosburgh wrote:
> This reverts commit e5a2c899957659cd1a9f789bc462f9c0b35f5150.
> 
> 	Commit e5a2c899 introduced an alternative_call, arch_fast_hash2,
> that selects between __jhash2 and __intel_crc4_2_hash based on the
> X86_FEATURE_XMM4_2.
> 
> 	Unfortunately, the alternative_call system does not appear to be
> suitable for use with C functions, as register usage is not handled
> properly for the called functions.  The __jhash2 function in particular
> clobbers registers that are not preserved when called via
> alternative_call, resulting in a panic for direct callers of
> arch_fast_hash2 on older CPUs lacking sse4_2.  It is possible that
> __intel_crc4_2_hash works merely by chance because it uses fewer
> registers.
> 
> 	This commit was suggested as the source of the problem by Jesse
> Gross <jesse@nicira.com>.
> 
> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>

I am totally fine to revert this and try to come up with a better
solution.

Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Bye,
Hannes


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 20:35               ` Hannes Frederic Sowa
@ 2014-11-14 22:10                 ` Jay Vosburgh
  2014-11-14 22:37                   ` Hannes Frederic Sowa
  0 siblings, 1 reply; 21+ messages in thread
From: Jay Vosburgh @ 2014-11-14 22:10 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: David Miller, eric.dumazet, netdev, ogerlitz, pshelar, jesse, discuss

Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
[...]
>I created it via the function calling convention documented in
>arch/x86/include/asm/calling.h, so I specified each register which a
>function is allowed to clobber.
>
>I currently cannot see how I can resolve the invalid constraints error
>easily. :(
>
>So either go with my first patch, which puts the alternative_call
>switch point into its own function without ever inlining it, or the
>patch needs to be reverted. :/

	As a data point, I tested the first patch as well, and the
system does not panic with it in place.  Inspection shows that it's
using %r14 in place of %r8 in the prior (crashing) implementation.

	Disassembly of the call site (on the non-sse4_1 system) in
ovs_flow_tbl_insert with the first patch applied looks like this:

0xffffffffa00b6bb9 <ovs_flow_tbl_insert+0xb9>:	mov    %r15,0x348(%r14)
0xffffffffa00b6bc0 <ovs_flow_tbl_insert+0xc0>:	movzwl 0x28(%r15),%ecx
0xffffffffa00b6bc5 <ovs_flow_tbl_insert+0xc5>:	movzwl 0x2a(%r15),%esi
0xffffffffa00b6bca <ovs_flow_tbl_insert+0xca>:	movzwl %cx,%eax
0xffffffffa00b6bcd <ovs_flow_tbl_insert+0xcd>:	sub    %ecx,%esi
0xffffffffa00b6bcf <ovs_flow_tbl_insert+0xcf>:	lea    0x38(%r14,%rax,1),%rdi
0xffffffffa00b6bd4 <ovs_flow_tbl_insert+0xd4>:	sar    $0x2,%esi
0xffffffffa00b6bd7 <ovs_flow_tbl_insert+0xd7>:	callq  0xffffffff813a7810 <__jhash2>
0xffffffffa00b6bdc <ovs_flow_tbl_insert+0xdc>:	mov    %eax,0x30(%r14)
0xffffffffa00b6be0 <ovs_flow_tbl_insert+0xe0>:	mov    (%rbx),%r13
0xffffffffa00b6be3 <ovs_flow_tbl_insert+0xe3>:	mov    %r14,%rsi
0xffffffffa00b6be6 <ovs_flow_tbl_insert+0xe6>:	mov    %r13,%rdi
0xffffffffa00b6be9 <ovs_flow_tbl_insert+0xe9>:	callq  0xffffffffa00b61a0 <table_instance_insert>

	Compared to the panicking version's function:

0xffffffffa01a55c9 <ovs_flow_tbl_insert+0xb9>:  mov    %r15,0x348(%r8)
0xffffffffa01a55d0 <ovs_flow_tbl_insert+0xc0>:  movzwl 0x28(%r15),%ecx
0xffffffffa01a55d5 <ovs_flow_tbl_insert+0xc5>:  movzwl 0x2a(%r15),%esi
0xffffffffa01a55da <ovs_flow_tbl_insert+0xca>:  movzwl %cx,%eax
0xffffffffa01a55dd <ovs_flow_tbl_insert+0xcd>:  sub    %ecx,%esi
0xffffffffa01a55df <ovs_flow_tbl_insert+0xcf>:  lea    0x38(%r8,%rax,1),%rdi
0xffffffffa01a55e4 <ovs_flow_tbl_insert+0xd4>:  sar    $0x2,%esi
0xffffffffa01a55e7 <ovs_flow_tbl_insert+0xd7>:  callq  0xffffffff813a75c0 <__jhash2>
0xffffffffa01a55ec <ovs_flow_tbl_insert+0xdc>:  mov    %eax,0x30(%r8)
0xffffffffa01a55f0 <ovs_flow_tbl_insert+0xe0>:  mov    (%rbx),%r13
0xffffffffa01a55f3 <ovs_flow_tbl_insert+0xe3>:  mov    %r8,%rsi
0xffffffffa01a55f6 <ovs_flow_tbl_insert+0xe6>:  mov    %r13,%rdi
0xffffffffa01a55f9 <ovs_flow_tbl_insert+0xe9>:  callq  0xffffffffa01a4ba0 <table_instance_insert>

	It appears to generate the same instructions, but allocates
registers differently (using %r14 instead of %r8).

	The __jhash2 disassembly appears to be unchanged between the two
versions.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com


* Re: [PATCH net-next] fast_hash: clobber registers correctly for inline function use
  2014-11-14 22:10                 ` Jay Vosburgh
@ 2014-11-14 22:37                   ` Hannes Frederic Sowa
  0 siblings, 0 replies; 21+ messages in thread
From: Hannes Frederic Sowa @ 2014-11-14 22:37 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: David Miller, eric.dumazet, netdev, ogerlitz, pshelar, jesse, discuss

Hi Jay,

On Fr, 2014-11-14 at 14:10 -0800, Jay Vosburgh wrote:
> Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
> [...]
> >I created it via the function calling convention documented in
> >arch/x86/include/asm/calling.h, so I specified each register which a
> >function is allowed to clobber.
> >
> >I currently cannot see how I can resolve the invalid constraints error
> >easily. :(
> >
> >So either go with my first patch, which puts the alternative_call
> >switch point into its own function without ever inlining it, or the
> >patch needs to be reverted. :/
> 
> 	As a data point, I tested the first patch as well, and the
> system does not panic with it in place.  Inspection shows that it's
> using %r14 in place of %r8 in the prior (crashing) implementation.

Yes, I could also reproduce your oops, and the first unofficial patch and
the first official one both fixed it. After that, I thought that just
adding more clobbers cannot introduce bugs, so I only did compile testing
until I hit a point where gcc got mad about the excessive use of
clobbered registers, but I haven't tested the inline call sites that much
(sorry). :(

> 	Disassembly of the call site (on the non-sse4_1 system) in
> ovs_flow_tbl_insert with the first patch applied looks like this:
> 
> 0xffffffffa00b6bb9 <ovs_flow_tbl_insert+0xb9>:	mov    %r15,0x348(%r14)
> 0xffffffffa00b6bc0 <ovs_flow_tbl_insert+0xc0>:	movzwl 0x28(%r15),%ecx
> 0xffffffffa00b6bc5 <ovs_flow_tbl_insert+0xc5>:	movzwl 0x2a(%r15),%esi
> 0xffffffffa00b6bca <ovs_flow_tbl_insert+0xca>:	movzwl %cx,%eax
> 0xffffffffa00b6bcd <ovs_flow_tbl_insert+0xcd>:	sub    %ecx,%esi
> 0xffffffffa00b6bcf <ovs_flow_tbl_insert+0xcf>:	lea    0x38(%r14,%rax,1),%rdi
> 0xffffffffa00b6bd4 <ovs_flow_tbl_insert+0xd4>:	sar    $0x2,%esi
> 0xffffffffa00b6bd7 <ovs_flow_tbl_insert+0xd7>:	callq  0xffffffff813a7810 <__jhash2>
> 0xffffffffa00b6bdc <ovs_flow_tbl_insert+0xdc>:	mov    %eax,0x30(%r14)
> 0xffffffffa00b6be0 <ovs_flow_tbl_insert+0xe0>:	mov    (%rbx),%r13
> 0xffffffffa00b6be3 <ovs_flow_tbl_insert+0xe3>:	mov    %r14,%rsi
> 0xffffffffa00b6be6 <ovs_flow_tbl_insert+0xe6>:	mov    %r13,%rdi
> 0xffffffffa00b6be9 <ovs_flow_tbl_insert+0xe9>:	callq  0xffffffffa00b61a0 <table_instance_insert>
> 
> 	Compared to the panicking version's function:
> 
> 0xffffffffa01a55c9 <ovs_flow_tbl_insert+0xb9>:  mov    %r15,0x348(%r8)
> 0xffffffffa01a55d0 <ovs_flow_tbl_insert+0xc0>:  movzwl 0x28(%r15),%ecx
> 0xffffffffa01a55d5 <ovs_flow_tbl_insert+0xc5>:  movzwl 0x2a(%r15),%esi
> 0xffffffffa01a55da <ovs_flow_tbl_insert+0xca>:  movzwl %cx,%eax
> 0xffffffffa01a55dd <ovs_flow_tbl_insert+0xcd>:  sub    %ecx,%esi
> 0xffffffffa01a55df <ovs_flow_tbl_insert+0xcf>:  lea    0x38(%r8,%rax,1),%rdi
> 0xffffffffa01a55e4 <ovs_flow_tbl_insert+0xd4>:  sar    $0x2,%esi
> 0xffffffffa01a55e7 <ovs_flow_tbl_insert+0xd7>:  callq  0xffffffff813a75c0 <__jhash2>
> 0xffffffffa01a55ec <ovs_flow_tbl_insert+0xdc>:  mov    %eax,0x30(%r8)
> 0xffffffffa01a55f0 <ovs_flow_tbl_insert+0xe0>:  mov    (%rbx),%r13
> 0xffffffffa01a55f3 <ovs_flow_tbl_insert+0xe3>:  mov    %r8,%rsi
> 0xffffffffa01a55f6 <ovs_flow_tbl_insert+0xe6>:  mov    %r13,%rdi
> 0xffffffffa01a55f9 <ovs_flow_tbl_insert+0xe9>:  callq  0xffffffffa01a4ba0 <table_instance_insert>
> 
> 	It appears to generate the same instructions, but allocates
> registers differently (using %r14 instead of %r8).

Exactly, and that makes sense. While %r8 must be available for the callee
to clobber, %r14 must be saved by the callee and restored before
returning. So you pass the responsibility down to the other functions,
where gcc tries not to touch %r14 because it knows it would have to
generate code to save and restore it.

That's the reason why I actually like the static inline clobbering
approach so much: it gives gcc the freedom to move the save/restore
cycles around and to decide for itself which registers to use.

Also, the first version does work flawlessly (I didn't send that one as a
patch but only as a diff in the mail). Here gcc synthesizes a full
function call, which has the same effect as the long clobber list, only
it chains two calls right behind each other.

> 	The __jhash2 disassembly appears to be unchanged between the two
> versions.

Thanks for looking into this!

It is actually pretty hairy to come up with a good solution for this,
because with the alternative interface you are only allowed to alter one
instruction. jump_tables also don't work because I currently have the
opinion that they do the switch way too late. I absolutely don't want to
have inserts into a hashtable with different hashing functions depending
on how early during boot they took place.
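
To spell out the "too late" concern, a jump-label/static-key variant
would look roughly like the hypothetical sketch below: until something
flips the key during boot every caller hashes with jhash2, afterwards
with the crc32 variant, so entries inserted early would be hashed
differently from later lookups.

/* hypothetical static-key variant, not something proposed in this thread */
static struct static_key hash_use_crc32 = STATIC_KEY_INIT_FALSE;

static inline u32 arch_fast_hash2(const u32 *data, u32 len, u32 seed)
{
	if (static_key_false(&hash_use_crc32))
		return __intel_crc4_2_hash2(data, len, seed);
	return __jhash2(data, len, seed);
}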

Bye,
Hannes



Thread overview: 21+ messages
2014-11-14 14:06 [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
2014-11-14 14:40 ` [PATCH net-next v2] " Hannes Frederic Sowa
2014-11-14 14:50 ` [PATCH net-next] " Eric Dumazet
2014-11-14 15:13   ` Hannes Frederic Sowa
2014-11-14 15:33     ` Eric Dumazet
2014-11-14 15:46       ` Hannes Frederic Sowa
2014-11-14 18:38         ` David Miller
2014-11-14 19:02           ` Cong Wang
2014-11-14 20:42             ` Hannes Frederic Sowa
2014-11-14 21:35               ` David Miller
2014-11-14 19:05           ` [PATCH net-next] Revert "fast_hash: avoid indirect function calls" Jay Vosburgh
2014-11-14 21:36             ` David Miller
2014-11-14 21:43             ` Hannes Frederic Sowa
2014-11-14 20:04           ` [PATCH net-next] fast_hash: clobber registers correctly for inline function use Hannes Frederic Sowa
2014-11-14 20:10             ` Hannes Frederic Sowa
2014-11-14 20:15             ` Jay Vosburgh
2014-11-14 20:35               ` Hannes Frederic Sowa
2014-11-14 22:10                 ` Jay Vosburgh
2014-11-14 22:37                   ` Hannes Frederic Sowa
2014-11-14 15:17   ` [PATCH net-next v3] " Hannes Frederic Sowa
2014-11-14 17:57     ` Jay Vosburgh
