netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC] reciprocal_divide: correction/update of the algorithm
@ 2014-01-13 21:42 Hannes Frederic Sowa
  2014-01-14 18:02 ` Randy Dunlap
                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-13 21:42 UTC (permalink / raw)
  To: netdev; +Cc: dborkman, eric.dumazet, linux-kernel, darkjames-ws

This patch is a RFC and part of a series Daniel Borkmann and me want to
do when introducing prandom_u32_range{,_ro} and prandom_u32_max{,_ro}
helpers later this week.

At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
were not correct:
http://www.wireshark.org/~darkjames/reciprocal-buggy.c

He could also show this with BPF:
http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c

Also: reciprocal_value and reciprocal_divide always return 0 for
divisions by 1.  This is a bit worrisome as those functions also get
used in mm/slab.c and lib/flex_array.c. Bonding already seems to check
for the 1-divisor case and handles that correctly. We don't know about
other problems, yet.

I propose an correction/update of the algorithm
based on the paper "T. Granlund and P. L. Montgomery:
Division by Invariant Integers Using Multiplication"
<http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.2556>

The assembler implementation from Agner Fog, found here
<http://www.agner.org/optimize/asmlib.zip>, helped a lot while
implementing.

I would like to have feedback if people see problems with this patch or
have concerns about performance. I did some testing on x86-64 and found
no problems so far but did no performance evaluation, yet.

The current code does break the call-sides of reciprocal_divide. The necessary
changes will be part of the full series, then.

Thanks!
---
 include/linux/reciprocal_div.h | 12 +++++++++---
 lib/reciprocal_div.c           | 22 ++++++++++++++++++----
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/include/linux/reciprocal_div.h b/include/linux/reciprocal_div.h
index f9c90b3..6f17a87 100644
--- a/include/linux/reciprocal_div.h
+++ b/include/linux/reciprocal_div.h
@@ -22,11 +22,17 @@
  * Should not be called before each reciprocal_divide(),
  * or else the performance is slower than a normal divide.
  */
-extern u32 reciprocal_value(u32 B);
 
+struct reciprocal_value {
+	u32 m;
+	u8 sh1, sh2;
+};
 
-static inline u32 reciprocal_divide(u32 A, u32 R)
+struct reciprocal_value reciprocal_value(u32 d);
+
+static inline u32 reciprocal_divide(u32 a, struct reciprocal_value R)
 {
-	return (u32)(((u64)A * R) >> 32);
+	u32 t = (u32)(((u64)a * R.m) >> 32);
+	return (t + ((a - t) >> R.sh1)) >> R.sh2;
 }
 #endif
diff --git a/lib/reciprocal_div.c b/lib/reciprocal_div.c
index 75510e9..b741b30 100644
--- a/lib/reciprocal_div.c
+++ b/lib/reciprocal_div.c
@@ -1,11 +1,25 @@
+#include <linux/kernel.h>
 #include <asm/div64.h>
 #include <linux/reciprocal_div.h>
 #include <linux/export.h>
 
-u32 reciprocal_value(u32 k)
+/* For a description of the algorithmus please look at
+ * linux/reciprocal_div.h
+ */
+
+struct reciprocal_value reciprocal_value(u32 d)
 {
-	u64 val = (1LL << 32) + (k - 1);
-	do_div(val, k);
-	return (u32)val;
+	struct reciprocal_value R;
+	u64 m;
+	int l;
+
+	l = fls(d - 1);
+	m = ((1ULL << 32) * ((1ULL << l) - d));
+	do_div(m, d);
+	++m;
+	R.m = (u32)m;
+	R.sh1 = min(l, 1);
+	R.sh2 = max(l-1, 0);
+	return R;
 }
 EXPORT_SYMBOL(reciprocal_value);
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-13 21:42 [PATCH RFC] reciprocal_divide: correction/update of the algorithm Hannes Frederic Sowa
@ 2014-01-14 18:02 ` Randy Dunlap
  2014-01-15 15:02   ` Hannes Frederic Sowa
  2014-01-14 18:07 ` Eric Dumazet
  2014-01-15  7:02 ` [PATCH net] bpf: do not use reciprocal divide Eric Dumazet
  2 siblings, 1 reply; 32+ messages in thread
From: Randy Dunlap @ 2014-01-14 18:02 UTC (permalink / raw)
  To: netdev, dborkman, eric.dumazet, linux-kernel, darkjames-ws

On 01/13/2014 01:42 PM, Hannes Frederic Sowa wrote:
> 
> I propose an correction/update of the algorithm
> based on the paper "T. Granlund and P. L. Montgomery:
> Division by Invariant Integers Using Multiplication"
> <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.2556>
> 
> The assembler implementation from Agner Fog, found here
> <http://www.agner.org/optimize/asmlib.zip>, helped a lot while
> implementing.
> 
> I would like to have feedback if people see problems with this patch or
> have concerns about performance. I did some testing on x86-64 and found
> no problems so far but did no performance evaluation, yet.
> 
> The current code does break the call-sides of reciprocal_divide. The necessary
> changes will be part of the full series, then.
> 
> Thanks!
> ---
>  include/linux/reciprocal_div.h | 12 +++++++++---
>  lib/reciprocal_div.c           | 22 ++++++++++++++++++----
>  2 files changed, 27 insertions(+), 7 deletions(-)

Just trivia (coding style and spelling):

> diff --git a/lib/reciprocal_div.c b/lib/reciprocal_div.c
> index 75510e9..b741b30 100644
> --- a/lib/reciprocal_div.c
> +++ b/lib/reciprocal_div.c
> @@ -1,11 +1,25 @@
> +#include <linux/kernel.h>
>  #include <asm/div64.h>
>  #include <linux/reciprocal_div.h>
>  #include <linux/export.h>
>  
> -u32 reciprocal_value(u32 k)
> +/* For a description of the algorithmus please look at

                               algorithms

> + * linux/reciprocal_div.h
> + */

and kernel coding style for multi-line comments is like so:

/*
 * For a description of the algorithms, please look at
 * linux/reciprocal_div.h
 */

> +
> +struct reciprocal_value reciprocal_value(u32 d)
>  {



-- 
~Randy

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-13 21:42 [PATCH RFC] reciprocal_divide: correction/update of the algorithm Hannes Frederic Sowa
  2014-01-14 18:02 ` Randy Dunlap
@ 2014-01-14 18:07 ` Eric Dumazet
  2014-01-14 19:22   ` Austin S Hemmelgarn
  2014-01-14 22:39   ` Hannes Frederic Sowa
  2014-01-15  7:02 ` [PATCH net] bpf: do not use reciprocal divide Eric Dumazet
  2 siblings, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-14 18:07 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, dborkman, linux-kernel, darkjames-ws

On Mon, 2014-01-13 at 22:42 +0100, Hannes Frederic Sowa wrote:
> This patch is a RFC and part of a series Daniel Borkmann and me want to
> do when introducing prandom_u32_range{,_ro} and prandom_u32_max{,_ro}
> helpers later this week.

> -static inline u32 reciprocal_divide(u32 A, u32 R)
> +struct reciprocal_value reciprocal_value(u32 d);
> +
> +static inline u32 reciprocal_divide(u32 a, struct reciprocal_value R)
>  {
> -	return (u32)(((u64)A * R) >> 32);
> +	u32 t = (u32)(((u64)a * R.m) >> 32);
> +	return (t + ((a - t) >> R.sh1)) >> R.sh2;
>  }

I would rather introduce new helpers and convert users that really need
them.

For instance, just use a divide in BPF, because doing this on JIT might
be too complex for the gains. Strangely, libpcap doesn't seem to
optimize any divide, like divides by a power of two...

Reciprocal were added 7 years ago, for very specific uses, but current
cpus have reasonably fast dividers.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-14 18:07 ` Eric Dumazet
@ 2014-01-14 19:22   ` Austin S Hemmelgarn
  2014-01-14 19:50     ` Eric Dumazet
  2014-01-14 22:39   ` Hannes Frederic Sowa
  1 sibling, 1 reply; 32+ messages in thread
From: Austin S Hemmelgarn @ 2014-01-14 19:22 UTC (permalink / raw)
  To: Eric Dumazet, Hannes Frederic Sowa
  Cc: netdev, dborkman, linux-kernel, darkjames-ws

On 2014-01-14 13:07, Eric Dumazet wrote:
> On Mon, 2014-01-13 at 22:42 +0100, Hannes Frederic Sowa wrote:
>> This patch is a RFC and part of a series Daniel Borkmann and me want to
>> do when introducing prandom_u32_range{,_ro} and prandom_u32_max{,_ro}
>> helpers later this week.
> 
>> -static inline u32 reciprocal_divide(u32 A, u32 R)
>> +struct reciprocal_value reciprocal_value(u32 d);
>> +
>> +static inline u32 reciprocal_divide(u32 a, struct reciprocal_value R)
>>  {
>> -	return (u32)(((u64)A * R) >> 32);
>> +	u32 t = (u32)(((u64)a * R.m) >> 32);
>> +	return (t + ((a - t) >> R.sh1)) >> R.sh2;
>>  }
> 
> I would rather introduce new helpers and convert users that really need
> them.
> 
> For instance, just use a divide in BPF, because doing this on JIT might
> be too complex for the gains. Strangely, libpcap doesn't seem to
> optimize any divide, like divides by a power of two...
> 
> Reciprocal were added 7 years ago, for very specific uses, but current
> cpus have reasonably fast dividers.

I disagree with the statement that current CPU's have reasonably fast
dividers.  A lot of embedded processors and many low-end x86 CPU's do
not in-fact have any hardware divider, and usually provide it using
microcode based emulation if they provide it at all.  The AMD Jaguar
micro-architecture in particular comes to mind, it uses an iterative
division algorithm provided by the microcode that only produces 2 bits
of quotient per cycle, even in the best case (2 8-bit integers and an
integral 8-bit quotient) this still takes 4 cycles, which is twice as
slow as any other math operation on the same processor.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-14 19:22   ` Austin S Hemmelgarn
@ 2014-01-14 19:50     ` Eric Dumazet
  2014-01-14 20:10       ` Hannes Frederic Sowa
  2014-01-14 20:53       ` Austin S Hemmelgarn
  0 siblings, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-14 19:50 UTC (permalink / raw)
  To: Austin S Hemmelgarn
  Cc: Hannes Frederic Sowa, netdev, dborkman, linux-kernel, darkjames-ws

On Tue, 2014-01-14 at 14:22 -0500, Austin S Hemmelgarn wrote:

> I disagree with the statement that current CPU's have reasonably fast
> dividers.  A lot of embedded processors and many low-end x86 CPU's do
> not in-fact have any hardware divider, and usually provide it using
> microcode based emulation if they provide it at all.  The AMD Jaguar
> micro-architecture in particular comes to mind, it uses an iterative
> division algorithm provided by the microcode that only produces 2 bits
> of quotient per cycle, even in the best case (2 8-bit integers and an
> integral 8-bit quotient) this still takes 4 cycles, which is twice as
> slow as any other math operation on the same processor.

I doubt you run any BPF filter with a divide instruction in it on these
platform.

Get real, do not over optimize things where it does not matter.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-14 19:50     ` Eric Dumazet
@ 2014-01-14 20:10       ` Hannes Frederic Sowa
  2014-01-14 20:53       ` Austin S Hemmelgarn
  1 sibling, 0 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-14 20:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Austin S Hemmelgarn, netdev, dborkman, linux-kernel, darkjames-ws

On Tue, Jan 14, 2014 at 11:50:32AM -0800, Eric Dumazet wrote:
> On Tue, 2014-01-14 at 14:22 -0500, Austin S Hemmelgarn wrote:
> 
> > I disagree with the statement that current CPU's have reasonably fast
> > dividers.  A lot of embedded processors and many low-end x86 CPU's do
> > not in-fact have any hardware divider, and usually provide it using
> > microcode based emulation if they provide it at all.  The AMD Jaguar
> > micro-architecture in particular comes to mind, it uses an iterative
> > division algorithm provided by the microcode that only produces 2 bits
> > of quotient per cycle, even in the best case (2 8-bit integers and an
> > integral 8-bit quotient) this still takes 4 cycles, which is twice as
> > slow as any other math operation on the same processor.
> 
> I doubt you run any BPF filter with a divide instruction in it on these
> platform.
> 
> Get real, do not over optimize things where it does not matter.

If I read the instruction tables correctly, we could half the latency with
reciprocal divide even on haswell.

What a pitty that the Intel Architecture Code Analyzer does not support imul
nor div instruction. :(

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-14 19:50     ` Eric Dumazet
  2014-01-14 20:10       ` Hannes Frederic Sowa
@ 2014-01-14 20:53       ` Austin S Hemmelgarn
  2014-01-14 22:45         ` Eric Dumazet
  1 sibling, 1 reply; 32+ messages in thread
From: Austin S Hemmelgarn @ 2014-01-14 20:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, netdev, dborkman, linux-kernel, darkjames-ws

On 2014-01-14 14:50, Eric Dumazet wrote:
> On Tue, 2014-01-14 at 14:22 -0500, Austin S Hemmelgarn wrote:
> 
>> I disagree with the statement that current CPU's have reasonably fast
>> dividers.  A lot of embedded processors and many low-end x86 CPU's do
>> not in-fact have any hardware divider, and usually provide it using
>> microcode based emulation if they provide it at all.  The AMD Jaguar
>> micro-architecture in particular comes to mind, it uses an iterative
>> division algorithm provided by the microcode that only produces 2 bits
>> of quotient per cycle, even in the best case (2 8-bit integers and an
>> integral 8-bit quotient) this still takes 4 cycles, which is twice as
>> slow as any other math operation on the same processor.
> 
> I doubt you run any BPF filter with a divide instruction in it on these
> platform.
> 
> Get real, do not over optimize things where it does not matter.
> 
Actually, I have three Jaguar based routers, and use BPF regularly as
part of their iptables rules to log certain packet types.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-14 18:07 ` Eric Dumazet
  2014-01-14 19:22   ` Austin S Hemmelgarn
@ 2014-01-14 22:39   ` Hannes Frederic Sowa
  1 sibling, 0 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-14 22:39 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, dborkman, linux-kernel, darkjames-ws

On Tue, Jan 14, 2014 at 10:07:05AM -0800, Eric Dumazet wrote:
> On Mon, 2014-01-13 at 22:42 +0100, Hannes Frederic Sowa wrote:
> > This patch is a RFC and part of a series Daniel Borkmann and me want to
> > do when introducing prandom_u32_range{,_ro} and prandom_u32_max{,_ro}
> > helpers later this week.
> 
> > -static inline u32 reciprocal_divide(u32 A, u32 R)
> > +struct reciprocal_value reciprocal_value(u32 d);
> > +
> > +static inline u32 reciprocal_divide(u32 a, struct reciprocal_value R)
> >  {
> > -	return (u32)(((u64)A * R) >> 32);
> > +	u32 t = (u32)(((u64)a * R.m) >> 32);
> > +	return (t + ((a - t) >> R.sh1)) >> R.sh2;
> >  }
> 
> I would rather introduce new helpers and convert users that really need
> them.
>
> For instance, just use a divide in BPF, because doing this on JIT might
> be too complex for the gains. Strangely, libpcap doesn't seem to
> optimize any divide, like divides by a power of two...
> 
> Reciprocal were added 7 years ago, for very specific uses, but current
> cpus have reasonably fast dividers.

Agreed. The new algorithm would need to change the size of struct
sock_filter, which is exported to user space. We will leave BPF as-is
for the time being and check that later.

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-14 20:53       ` Austin S Hemmelgarn
@ 2014-01-14 22:45         ` Eric Dumazet
  2014-01-14 23:25           ` Borislav Petkov
  0 siblings, 1 reply; 32+ messages in thread
From: Eric Dumazet @ 2014-01-14 22:45 UTC (permalink / raw)
  To: Austin S Hemmelgarn
  Cc: Hannes Frederic Sowa, netdev, dborkman, linux-kernel, darkjames-ws

On Tue, 2014-01-14 at 15:53 -0500, Austin S Hemmelgarn wrote:
> On 2014-01-14 14:50, Eric Dumazet wrote:
> > On Tue, 2014-01-14 at 14:22 -0500, Austin S Hemmelgarn wrote:
> > 
> >> I disagree with the statement that current CPU's have reasonably fast
> >> dividers.  A lot of embedded processors and many low-end x86 CPU's do
> >> not in-fact have any hardware divider, and usually provide it using
> >> microcode based emulation if they provide it at all.  The AMD Jaguar
> >> micro-architecture in particular comes to mind, it uses an iterative
> >> division algorithm provided by the microcode that only produces 2 bits
> >> of quotient per cycle, even in the best case (2 8-bit integers and an
> >> integral 8-bit quotient) this still takes 4 cycles, which is twice as
> >> slow as any other math operation on the same processor.
> > 
> > I doubt you run any BPF filter with a divide instruction in it on these
> > platform.
> > 
> > Get real, do not over optimize things where it does not matter.
> > 
> Actually, I have three Jaguar based routers, and use BPF regularly as
> part of their iptables rules to log certain packet types.



1) Have you enabled BPF JIT

2) Do you have divide instructions in a BPF filter, 
   if yes, I would like to have an example.
   (current code works well for small divisors anyway)

3) How much time is spent in BPF compared to the rest of the stack,
   especially if you run iptables...

Spending 2 or 3 days of work to save ~7 cycles for a divide that
probably can be replaced by a shift anyway, while spending 5000 cycles
per packet is what I call not a wise optimization, especially
if dealing with old hardware.

Even on a Jaguar, the proposed alternative 

+       u32 t = (u32)(((u64)a * R.m) >> 32);
+       return (t + ((a - t) >> R.sh1)) >> R.sh2;

will have a similar cost.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-14 22:45         ` Eric Dumazet
@ 2014-01-14 23:25           ` Borislav Petkov
  2014-01-15  2:51             ` Austin S. Hemmelgarn
  0 siblings, 1 reply; 32+ messages in thread
From: Borislav Petkov @ 2014-01-14 23:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Austin S Hemmelgarn, Hannes Frederic Sowa, netdev, dborkman,
	linux-kernel, darkjames-ws

On Tue, Jan 14, 2014 at 02:45:37PM -0800, Eric Dumazet wrote:
> Even on a Jaguar, the proposed alternative

I don't know what Jaguar you guys are talking about but the Jaguar
I know - Fam16h - has an int hardware divider:

http://developer.amd.com/wordpress/media/2012/10/SOG_16h_52128_PUB_Rev1_1.pdf

So all that talk about microcode is plain wrong. The hardware divider
comes from Llano (F12h) so it must be some other Jaguar, maybe Bobcat.

:-)

If it is Bobcat, then it has a 1-bit per cycle ucode int divider.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-14 23:25           ` Borislav Petkov
@ 2014-01-15  2:51             ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 32+ messages in thread
From: Austin S. Hemmelgarn @ 2014-01-15  2:51 UTC (permalink / raw)
  To: Borislav Petkov, Eric Dumazet
  Cc: Hannes Frederic Sowa, netdev, dborkman, linux-kernel, darkjames-ws


On 01/14/2014 06:25 PM, Borislav Petkov wrote:
> On Tue, Jan 14, 2014 at 02:45:37PM -0800, Eric Dumazet wrote:
>> Even on a Jaguar, the proposed alternative
> I don't know what Jaguar you guys are talking about but the Jaguar
> I know - Fam16h - has an int hardware divider:
>
> http://developer.amd.com/wordpress/media/2012/10/SOG_16h_52128_PUB_Rev1_1.pdf
>
> So all that talk about microcode is plain wrong. The hardware divider
> comes from Llano (F12h) so it must be some other Jaguar, maybe Bobcat.
>
> :-)
>
> If it is Bobcat, then it has a 1-bit per cycle ucode int divider.
>
Apologies, it does have a hardware divider, however it still only gets 2
bits per cycle.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH net] bpf: do not use reciprocal divide
  2014-01-13 21:42 [PATCH RFC] reciprocal_divide: correction/update of the algorithm Hannes Frederic Sowa
  2014-01-14 18:02 ` Randy Dunlap
  2014-01-14 18:07 ` Eric Dumazet
@ 2014-01-15  7:02 ` Eric Dumazet
  2014-01-15  7:28   ` David Miller
  2014-01-15  8:00   ` Heiko Carstens
  2 siblings, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-15  7:02 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: netdev, dborkman, darkjames-ws, Mircea Gherzan, Russell King,
	Matt Evans, Martin Schwidefsky, Heiko Carstens

From: Eric Dumazet <edumazet@google.com>

At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
were not correct. (off by one in some cases)
http://www.wireshark.org/~darkjames/reciprocal-buggy.c

He could also show this with BPF:
http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c

The reciprocal divide in linux kernel is not generic enough,
lets remove its use in BPF, as it is not worth the pain with
current cpus.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Daniel Borkmann <dxchgb@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
---

Please review arch code to make sure I made no mistake, thanks !

 arch/arm/net/bpf_jit_32.c       |    4 +---
 arch/powerpc/net/bpf_jit_comp.c |    5 ++---
 arch/s390/net/bpf_jit_comp.c    |   10 +++++-----
 arch/sparc/net/bpf_jit_comp.c   |    3 +--
 arch/x86/net/bpf_jit_comp.c     |    8 ++++----
 net/core/filter.c               |   30 ++----------------------------
 6 files changed, 15 insertions(+), 45 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 9ed155ad0f97..81f2b6d37b06 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -641,10 +641,8 @@ load_ind:
 			emit(ARM_MUL(r_A, r_A, r_X), ctx);
 			break;
 		case BPF_S_ALU_DIV_K:
-			/* current k == reciprocal_value(userspace k) */
 			emit_mov_i(r_scratch, k, ctx);
-			/* A = top 32 bits of the product */
-			emit(ARM_UMULL(r_scratch, r_A, r_A, r_scratch), ctx);
+			emit_udiv(r_A, r_A, r_scratch, ctx);
 			break;
 		case BPF_S_ALU_DIV_X:
 			update_on_xread(ctx);
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index ac3c2a10dafd..8044ddc6b320 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -223,10 +223,9 @@ static int bpf_jit_build_body(struct sk_filter *fp, u32 *image,
 			}
 			PPC_DIVWU(r_A, r_A, r_X);
 			break;
-		case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K); */
+		case BPF_S_ALU_DIV_K: /* A /= K */
 			PPC_LI32(r_scratch1, K);
-			/* Top 32 bits of 64bit result -> A */
-			PPC_MULHWU(r_A, r_A, r_scratch1);
+			PPC_DIVWU(r_A, r_A, r_scratch1);
 			break;
 		case BPF_S_ALU_AND_X:
 			ctx->seen |= SEEN_XREG;
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 16871da37371..e349dc7d0992 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -371,11 +371,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
 		/* dr %r4,%r12 */
 		EMIT2(0x1d4c);
 		break;
-	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
-		/* m %r4,<d(K)>(%r13) */
-		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
-		/* lr %r5,%r4 */
-		EMIT2(0x1854);
+	case BPF_S_ALU_DIV_K: /* A /= K */
+		/* lhi %r4,0 */
+		EMIT4(0xa7480000);
+		/* d %r4,<d(K)>(%r13) */
+		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
 		break;
 	case BPF_S_ALU_MOD_X: /* A %= X */
 		jit->seen |= SEEN_XREG | SEEN_RET0;
diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c
index 218b6b23c378..125045063b91 100644
--- a/arch/sparc/net/bpf_jit_comp.c
+++ b/arch/sparc/net/bpf_jit_comp.c
@@ -498,8 +498,7 @@ void bpf_jit_compile(struct sk_filter *fp)
 				emit_alu_K(MUL, K);
 				break;
 			case BPF_S_ALU_DIV_K:	/* A /= K */
-				emit_alu_K(MUL, K);
-				emit_read_y(r_A);
+				emit_alu_K(DIV, K);
 				break;
 			case BPF_S_ALU_DIV_X:	/* A /= X; */
 				emit_cmpi(r_X, 0);
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 26328e800869..f3d1f54de012 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -364,10 +364,10 @@ void bpf_jit_compile(struct sk_filter *fp)
 				EMIT2(0xf7, 0xf1);	/* div %ecx */
 				EMIT2(0x89, 0xd0);	/* mov %edx,%eax */
 				break;
-			case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K); */
-				EMIT3(0x48, 0x69, 0xc0); /* imul imm32,%rax,%rax */
-				EMIT(K, 4);
-				EMIT4(0x48, 0xc1, 0xe8, 0x20); /* shr $0x20,%rax */
+			case BPF_S_ALU_DIV_K: /* A /= K */
+				EMIT2(0x31, 0xd2);	/* xor %edx,%edx */
+				EMIT1(0xb9);EMIT(K, 4);	/* mov imm32,%ecx */
+				EMIT2(0xf7, 0xf1);	/* div %ecx */
 				break;
 			case BPF_S_ALU_AND_X:
 				seen |= SEEN_XREG;
diff --git a/net/core/filter.c b/net/core/filter.c
index 01b780856db2..ad30d626a5bd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -36,7 +36,6 @@
 #include <asm/uaccess.h>
 #include <asm/unaligned.h>
 #include <linux/filter.h>
-#include <linux/reciprocal_div.h>
 #include <linux/ratelimit.h>
 #include <linux/seccomp.h>
 #include <linux/if_vlan.h>
@@ -166,7 +165,7 @@ unsigned int sk_run_filter(const struct sk_buff *skb,
 			A /= X;
 			continue;
 		case BPF_S_ALU_DIV_K:
-			A = reciprocal_divide(A, K);
+			A /= K;
 			continue;
 		case BPF_S_ALU_MOD_X:
 			if (X == 0)
@@ -553,11 +552,6 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
 		/* Some instructions need special checks */
 		switch (code) {
 		case BPF_S_ALU_DIV_K:
-			/* check for division by zero */
-			if (ftest->k == 0)
-				return -EINVAL;
-			ftest->k = reciprocal_value(ftest->k);
-			break;
 		case BPF_S_ALU_MOD_K:
 			/* check for division by zero */
 			if (ftest->k == 0)
@@ -853,27 +847,7 @@ void sk_decode_filter(struct sock_filter *filt, struct sock_filter *to)
 	to->code = decodes[code];
 	to->jt = filt->jt;
 	to->jf = filt->jf;
-
-	if (code == BPF_S_ALU_DIV_K) {
-		/*
-		 * When loaded this rule user gave us X, which was
-		 * translated into R = r(X). Now we calculate the
-		 * RR = r(R) and report it back. If next time this
-		 * value is loaded and RRR = r(RR) is calculated
-		 * then the R == RRR will be true.
-		 *
-		 * One exception. X == 1 translates into R == 0 and
-		 * we can't calculate RR out of it with r().
-		 */
-
-		if (filt->k == 0)
-			to->k = 1;
-		else
-			to->k = reciprocal_value(filt->k);
-
-		BUG_ON(reciprocal_value(to->k) != filt->k);
-	} else
-		to->k = filt->k;
+	to->k = filt->k;
 }
 
 int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf, unsigned int len)

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15  7:02 ` [PATCH net] bpf: do not use reciprocal divide Eric Dumazet
@ 2014-01-15  7:28   ` David Miller
  2014-01-15  7:39     ` Eric Dumazet
  2014-01-15  8:00   ` Heiko Carstens
  1 sibling, 1 reply; 32+ messages in thread
From: David Miller @ 2014-01-15  7:28 UTC (permalink / raw)
  To: eric.dumazet
  Cc: hannes, netdev, dborkman, darkjames-ws, mgherzan, rmk+kernel,
	matt, schwidefsky, heiko.carstens

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 14 Jan 2014 23:02:41 -0800

> diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c
> index 218b6b23c378..125045063b91 100644
> --- a/arch/sparc/net/bpf_jit_comp.c
> +++ b/arch/sparc/net/bpf_jit_comp.c
> @@ -498,8 +498,7 @@ void bpf_jit_compile(struct sk_filter *fp)
>  				emit_alu_K(MUL, K);
>  				break;
>  			case BPF_S_ALU_DIV_K:	/* A /= K */
> -				emit_alu_K(MUL, K);
> -				emit_read_y(r_A);
> +				emit_alu_K(DIV, K);
>  				break;
>  			case BPF_S_ALU_DIV_X:	/* A /= X; */
>  				emit_cmpi(r_X, 0);

You have to clear the Y register before a divide, as it provides
the top 32-bits of the 64-bit numerator.

You can just cut and paste the sequence used for BPF_S_ALU_DIV_X:

				emit_write_y(G0);
#ifdef CONFIG_SPARC32
				/* The Sparc v8 architecture requires
				 * three instructions between a %y
				 * register write and the first use.
				 */
				emit_nop();
				emit_nop();
				emit_nop();
#endif
				emit_alu_X(DIV);

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15  7:28   ` David Miller
@ 2014-01-15  7:39     ` Eric Dumazet
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-15  7:39 UTC (permalink / raw)
  To: David Miller
  Cc: hannes, netdev, dborkman, darkjames-ws, mgherzan, rmk+kernel,
	matt, schwidefsky, heiko.carstens

On Tue, 2014-01-14 at 23:28 -0800, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Tue, 14 Jan 2014 23:02:41 -0800
> 
> > diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c
> > index 218b6b23c378..125045063b91 100644
> > --- a/arch/sparc/net/bpf_jit_comp.c
> > +++ b/arch/sparc/net/bpf_jit_comp.c
> > @@ -498,8 +498,7 @@ void bpf_jit_compile(struct sk_filter *fp)
> >  				emit_alu_K(MUL, K);
> >  				break;
> >  			case BPF_S_ALU_DIV_K:	/* A /= K */
> > -				emit_alu_K(MUL, K);
> > -				emit_read_y(r_A);
> > +				emit_alu_K(DIV, K);
> >  				break;
> >  			case BPF_S_ALU_DIV_X:	/* A /= X; */
> >  				emit_cmpi(r_X, 0);
> 
> You have to clear the Y register before a divide, as it provides
> the top 32-bits of the 64-bit numerator.
> 
> You can just cut and paste the sequence used for BPF_S_ALU_DIV_X:
> 
> 				emit_write_y(G0);
> #ifdef CONFIG_SPARC32
> 				/* The Sparc v8 architecture requires
> 				 * three instructions between a %y
> 				 * register write and the first use.
> 				 */
> 				emit_nop();
> 				emit_nop();
> 				emit_nop();
> #endif
> 				emit_alu_X(DIV);

Thanks David, I will submit a v2 tomorrow morning, when/if I get more
feedback for other arches, and after some rest ;)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15  7:02 ` [PATCH net] bpf: do not use reciprocal divide Eric Dumazet
  2014-01-15  7:28   ` David Miller
@ 2014-01-15  8:00   ` Heiko Carstens
  2014-01-15  8:13     ` Martin Schwidefsky
  2014-01-15 14:16     ` Eric Dumazet
  1 sibling, 2 replies; 32+ messages in thread
From: Heiko Carstens @ 2014-01-15  8:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, netdev, dborkman, darkjames-ws,
	Mircea Gherzan, Russell King, Matt Evans, Martin Schwidefsky

On Tue, Jan 14, 2014 at 11:02:41PM -0800, Eric Dumazet wrote:
> diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> index 16871da37371..e349dc7d0992 100644
> --- a/arch/s390/net/bpf_jit_comp.c
> +++ b/arch/s390/net/bpf_jit_comp.c
> @@ -371,11 +371,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
>  		/* dr %r4,%r12 */
>  		EMIT2(0x1d4c);
>  		break;
> -	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
> -		/* m %r4,<d(K)>(%r13) */
> -		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
> -		/* lr %r5,%r4 */
> -		EMIT2(0x1854);
> +	case BPF_S_ALU_DIV_K: /* A /= K */
> +		/* lhi %r4,0 */
> +		EMIT4(0xa7480000);
> +		/* d %r4,<d(K)>(%r13) */
> +		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
>  		break;

The s390 part looks good.

> diff --git a/net/core/filter.c b/net/core/filter.c
> index 01b780856db2..ad30d626a5bd 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -166,7 +165,7 @@ unsigned int sk_run_filter(const struct sk_buff *skb,
>  			A /= X;
>  			continue;
>  		case BPF_S_ALU_DIV_K:
> -			A = reciprocal_divide(A, K);
> +			A /= K;
>  			continue;
>  		case BPF_S_ALU_MOD_X:
>  			if (X == 0)
> @@ -553,11 +552,6 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
>  		/* Some instructions need special checks */
>  		switch (code) {
>  		case BPF_S_ALU_DIV_K:
> -			/* check for division by zero */
> -			if (ftest->k == 0)
> -				return -EINVAL;
> -			ftest->k = reciprocal_value(ftest->k);
> -			break;

Are you sure you want to remove the k == 0 check? Is there something
else that would prevent a division by zero?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15  8:00   ` Heiko Carstens
@ 2014-01-15  8:13     ` Martin Schwidefsky
  2014-01-15 10:51       ` Heiko Carstens
  2014-01-15 14:16     ` Eric Dumazet
  1 sibling, 1 reply; 32+ messages in thread
From: Martin Schwidefsky @ 2014-01-15  8:13 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Eric Dumazet, Hannes Frederic Sowa, netdev, dborkman,
	darkjames-ws, Mircea Gherzan, Russell King, Matt Evans

On Wed, 15 Jan 2014 09:00:07 +0100
Heiko Carstens <heiko.carstens@de.ibm.com> wrote:

> On Tue, Jan 14, 2014 at 11:02:41PM -0800, Eric Dumazet wrote:
> > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> > index 16871da37371..e349dc7d0992 100644
> > --- a/arch/s390/net/bpf_jit_comp.c
> > +++ b/arch/s390/net/bpf_jit_comp.c
> > @@ -371,11 +371,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
> >  		/* dr %r4,%r12 */
> >  		EMIT2(0x1d4c);
> >  		break;
> > -	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
> > -		/* m %r4,<d(K)>(%r13) */
> > -		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
> > -		/* lr %r5,%r4 */
> > -		EMIT2(0x1854);
> > +	case BPF_S_ALU_DIV_K: /* A /= K */
> > +		/* lhi %r4,0 */
> > +		EMIT4(0xa7480000);
> > +		/* d %r4,<d(K)>(%r13) */
> > +		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
> >  		break;
> 
> The s390 part looks good.

Does it? The divide instruction is signed, for the special
case of K==1 this can now cause an exception if the quotient
gets too large. We should add a check for K==1 and do nothing
in this case. With a divisor of at least 2 the result will
stay in the limit.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15  8:13     ` Martin Schwidefsky
@ 2014-01-15 10:51       ` Heiko Carstens
  2014-01-15 14:21         ` Eric Dumazet
  0 siblings, 1 reply; 32+ messages in thread
From: Heiko Carstens @ 2014-01-15 10:51 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Eric Dumazet, Hannes Frederic Sowa, netdev, dborkman,
	darkjames-ws, Mircea Gherzan, Russell King, Matt Evans

On Wed, Jan 15, 2014 at 09:13:22AM +0100, Martin Schwidefsky wrote:
> On Wed, 15 Jan 2014 09:00:07 +0100
> Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> 
> > On Tue, Jan 14, 2014 at 11:02:41PM -0800, Eric Dumazet wrote:
> > > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> > > index 16871da37371..e349dc7d0992 100644
> > > --- a/arch/s390/net/bpf_jit_comp.c
> > > +++ b/arch/s390/net/bpf_jit_comp.c
> > > @@ -371,11 +371,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
> > >  		/* dr %r4,%r12 */
> > >  		EMIT2(0x1d4c);
> > >  		break;
> > > -	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
> > > -		/* m %r4,<d(K)>(%r13) */
> > > -		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
> > > -		/* lr %r5,%r4 */
> > > -		EMIT2(0x1854);
> > > +	case BPF_S_ALU_DIV_K: /* A /= K */
> > > +		/* lhi %r4,0 */
> > > +		EMIT4(0xa7480000);
> > > +		/* d %r4,<d(K)>(%r13) */
> > > +		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
> > >  		break;
> > 
> > The s390 part looks good.
> 
> Does it? The divide instruction is signed, for the special
> case of K==1 this can now cause an exception if the quotient
> gets too large. We should add a check for K==1 and do nothing
> in this case. With a divisor of at least 2 the result will
> stay in the limit.

Indeed. That's quite subtle.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15  8:00   ` Heiko Carstens
  2014-01-15  8:13     ` Martin Schwidefsky
@ 2014-01-15 14:16     ` Eric Dumazet
  2014-01-15 15:10       ` Heiko Carstens
  1 sibling, 1 reply; 32+ messages in thread
From: Eric Dumazet @ 2014-01-15 14:16 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Hannes Frederic Sowa, netdev, dborkman, darkjames-ws,
	Mircea Gherzan, Russell King, Matt Evans, Martin Schwidefsky

On Wed, 2014-01-15 at 09:00 +0100, Heiko Carstens wrote:
> On Tue, Jan 14, 2014 at 11:02:41PM -0800, Eric Dumazet wrote:
> > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> > index 16871da37371..e349dc7d0992 100644
> > --- a/arch/s390/net/bpf_jit_comp.c
> > +++ b/arch/s390/net/bpf_jit_comp.c
> > @@ -371,11 +371,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
> >  		/* dr %r4,%r12 */
> >  		EMIT2(0x1d4c);
> >  		break;
> > -	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
> > -		/* m %r4,<d(K)>(%r13) */
> > -		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
> > -		/* lr %r5,%r4 */
> > -		EMIT2(0x1854);
> > +	case BPF_S_ALU_DIV_K: /* A /= K */
> > +		/* lhi %r4,0 */
> > +		EMIT4(0xa7480000);
> > +		/* d %r4,<d(K)>(%r13) */
> > +		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
> >  		break;
> 
> The s390 part looks good.
> 
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 01b780856db2..ad30d626a5bd 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -166,7 +165,7 @@ unsigned int sk_run_filter(const struct sk_buff *skb,
> >  			A /= X;
> >  			continue;
> >  		case BPF_S_ALU_DIV_K:
> > -			A = reciprocal_divide(A, K);
> > +			A /= K;
> >  			continue;
> >  		case BPF_S_ALU_MOD_X:
> >  			if (X == 0)
> > @@ -553,11 +552,6 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
> >  		/* Some instructions need special checks */
> >  		switch (code) {
> >  		case BPF_S_ALU_DIV_K:
> > -			/* check for division by zero */
> > -			if (ftest->k == 0)
> > -				return -EINVAL;
> > -			ftest->k = reciprocal_value(ftest->k);
> > -			break;
> 
> Are you sure you want to remove the k == 0 check? Is there something
> else that would prevent a division by zero?

This is done by factoring the two cases, modulo and divide :

 vi +553 net/core/filter.c 

                /* Some instructions need special checks */
                switch (code) {
                case BPF_S_ALU_DIV_K:
                case BPF_S_ALU_MOD_K:
                        /* check for division by zero */
                        if (ftest->k == 0)
                                return -EINVAL;
                        break;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15 10:51       ` Heiko Carstens
@ 2014-01-15 14:21         ` Eric Dumazet
  2014-01-15 14:25           ` Eric Dumazet
  2014-01-15 15:26           ` Martin Schwidefsky
  0 siblings, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-15 14:21 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Martin Schwidefsky, Hannes Frederic Sowa, netdev, dborkman,
	darkjames-ws, Mircea Gherzan, Russell King, Matt Evans

On Wed, 2014-01-15 at 11:51 +0100, Heiko Carstens wrote:
> On Wed, Jan 15, 2014 at 09:13:22AM +0100, Martin Schwidefsky wrote:
> > On Wed, 15 Jan 2014 09:00:07 +0100
> > Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> > 
> > > On Tue, Jan 14, 2014 at 11:02:41PM -0800, Eric Dumazet wrote:
> > > > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> > > > index 16871da37371..e349dc7d0992 100644
> > > > --- a/arch/s390/net/bpf_jit_comp.c
> > > > +++ b/arch/s390/net/bpf_jit_comp.c
> > > > @@ -371,11 +371,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
> > > >  		/* dr %r4,%r12 */
> > > >  		EMIT2(0x1d4c);
> > > >  		break;
> > > > -	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
> > > > -		/* m %r4,<d(K)>(%r13) */
> > > > -		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
> > > > -		/* lr %r5,%r4 */
> > > > -		EMIT2(0x1854);
> > > > +	case BPF_S_ALU_DIV_K: /* A /= K */
> > > > +		/* lhi %r4,0 */
> > > > +		EMIT4(0xa7480000);
> > > > +		/* d %r4,<d(K)>(%r13) */
> > > > +		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
> > > >  		break;
> > > 
> > > The s390 part looks good.
> > 
> > Does it? The divide instruction is signed, for the special
> > case of K==1 this can now cause an exception if the quotient
> > gets too large. We should add a check for K==1 and do nothing
> > in this case. With a divisor of at least 2 the result will
> > stay in the limit.
> 
> Indeed. That's quite subtle.

net/core/filter.c does :

A /= K;

Why is this working in generic code (if K == 1), not in s390 one ?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15 14:21         ` Eric Dumazet
@ 2014-01-15 14:25           ` Eric Dumazet
  2014-01-15 14:50             ` [PATCH v2 " Eric Dumazet
  2014-01-15 15:35             ` [PATCH " Martin Schwidefsky
  2014-01-15 15:26           ` Martin Schwidefsky
  1 sibling, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-15 14:25 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Martin Schwidefsky, Hannes Frederic Sowa, netdev, dborkman,
	darkjames-ws, Mircea Gherzan, Russell King, Matt Evans

On Wed, 2014-01-15 at 06:21 -0800, Eric Dumazet wrote:
> On Wed, 2014-01-15 at 11:51 +0100, Heiko Carstens wrote:
> > On Wed, Jan 15, 2014 at 09:13:22AM +0100, Martin Schwidefsky wrote:
> > > On Wed, 15 Jan 2014 09:00:07 +0100
> > > Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> > > 
> > > > On Tue, Jan 14, 2014 at 11:02:41PM -0800, Eric Dumazet wrote:
> > > > > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> > > > > index 16871da37371..e349dc7d0992 100644
> > > > > --- a/arch/s390/net/bpf_jit_comp.c
> > > > > +++ b/arch/s390/net/bpf_jit_comp.c
> > > > > @@ -371,11 +371,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
> > > > >  		/* dr %r4,%r12 */
> > > > >  		EMIT2(0x1d4c);
> > > > >  		break;
> > > > > -	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
> > > > > -		/* m %r4,<d(K)>(%r13) */
> > > > > -		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
> > > > > -		/* lr %r5,%r4 */
> > > > > -		EMIT2(0x1854);
> > > > > +	case BPF_S_ALU_DIV_K: /* A /= K */
> > > > > +		/* lhi %r4,0 */
> > > > > +		EMIT4(0xa7480000);
> > > > > +		/* d %r4,<d(K)>(%r13) */
> > > > > +		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
> > > > >  		break;
> > > > 
> > > > The s390 part looks good.
> > > 
> > > Does it? The divide instruction is signed, for the special
> > > case of K==1 this can now cause an exception if the quotient
> > > gets too large. We should add a check for K==1 and do nothing
> > > in this case. With a divisor of at least 2 the result will
> > > stay in the limit.
> > 
> > Indeed. That's quite subtle.
> 
> net/core/filter.c does :
> 
> A /= K;
> 
> Why is this working in generic code (if K == 1), not in s390 one ?

Note that I copied code found in BPF_S_ALU_MOD_K, so this one would need
a fix as well.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v2 net] bpf: do not use reciprocal divide
  2014-01-15 14:25           ` Eric Dumazet
@ 2014-01-15 14:50             ` Eric Dumazet
  2014-01-15 15:10               ` Matt Evans
  2014-01-16  1:02               ` David Miller
  2014-01-15 15:35             ` [PATCH " Martin Schwidefsky
  1 sibling, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-15 14:50 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Martin Schwidefsky, Hannes Frederic Sowa, netdev, dborkman,
	darkjames-ws, Mircea Gherzan, Russell King, Matt Evans

From: Eric Dumazet <edumazet@google.com>

At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
were not correct. (off by one in some cases)
http://www.wireshark.org/~darkjames/reciprocal-buggy.c

He could also show this with BPF:
http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c

The reciprocal divide in linux kernel is not generic enough,
lets remove its use in BPF, as it is not worth the pain with
current cpus.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Daniel Borkmann <dxchgb@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
---
v2: Fixed sparc code as David kindly suggested
    Added tests on K being 1 (divide by 1 is a nop
                              modulo by 1 clears A),
    as Martin Schwidefsky seems concerned by this case.

Please review arch code to make sure I made no mistake, thanks !

 arch/arm/net/bpf_jit_32.c       |    6 +++---
 arch/powerpc/net/bpf_jit_comp.c |    7 ++++---
 arch/s390/net/bpf_jit_comp.c    |   17 ++++++++++++-----
 arch/sparc/net/bpf_jit_comp.c   |   17 ++++++++++++++---
 arch/x86/net/bpf_jit_comp.c     |   14 ++++++++++----
 net/core/filter.c               |   30 ++----------------------------
 6 files changed, 45 insertions(+), 46 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 9ed155ad0f97..271b5e971568 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -641,10 +641,10 @@ load_ind:
 			emit(ARM_MUL(r_A, r_A, r_X), ctx);
 			break;
 		case BPF_S_ALU_DIV_K:
-			/* current k == reciprocal_value(userspace k) */
+			if (k == 1)
+				break;
 			emit_mov_i(r_scratch, k, ctx);
-			/* A = top 32 bits of the product */
-			emit(ARM_UMULL(r_scratch, r_A, r_A, r_scratch), ctx);
+			emit_udiv(r_A, r_A, r_scratch, ctx);
 			break;
 		case BPF_S_ALU_DIV_X:
 			update_on_xread(ctx);
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index ac3c2a10dafd..555034f8505e 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -223,10 +223,11 @@ static int bpf_jit_build_body(struct sk_filter *fp, u32 *image,
 			}
 			PPC_DIVWU(r_A, r_A, r_X);
 			break;
-		case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K); */
+		case BPF_S_ALU_DIV_K: /* A /= K */
+			if (K == 1)
+				break;
 			PPC_LI32(r_scratch1, K);
-			/* Top 32 bits of 64bit result -> A */
-			PPC_MULHWU(r_A, r_A, r_scratch1);
+			PPC_DIVWU(r_A, r_A, r_scratch1);
 			break;
 		case BPF_S_ALU_AND_X:
 			ctx->seen |= SEEN_XREG;
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 16871da37371..fc0fa77728e1 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -371,11 +371,13 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
 		/* dr %r4,%r12 */
 		EMIT2(0x1d4c);
 		break;
-	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
-		/* m %r4,<d(K)>(%r13) */
-		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
-		/* lr %r5,%r4 */
-		EMIT2(0x1854);
+	case BPF_S_ALU_DIV_K: /* A /= K */
+		if (K == 1)
+			break;
+		/* lhi %r4,0 */
+		EMIT4(0xa7480000);
+		/* d %r4,<d(K)>(%r13) */
+		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
 		break;
 	case BPF_S_ALU_MOD_X: /* A %= X */
 		jit->seen |= SEEN_XREG | SEEN_RET0;
@@ -391,6 +393,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
 		EMIT2(0x1854);
 		break;
 	case BPF_S_ALU_MOD_K: /* A %= K */
+		if (K == 1) {
+			/* lhi %r5,0 */
+			EMIT4(0xa7580000);
+			break;
+		}
 		/* lhi %r4,0 */
 		EMIT4(0xa7480000);
 		/* d %r4,<d(K)>(%r13) */
diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c
index 218b6b23c378..01fe9946d388 100644
--- a/arch/sparc/net/bpf_jit_comp.c
+++ b/arch/sparc/net/bpf_jit_comp.c
@@ -497,9 +497,20 @@ void bpf_jit_compile(struct sk_filter *fp)
 			case BPF_S_ALU_MUL_K:	/* A *= K */
 				emit_alu_K(MUL, K);
 				break;
-			case BPF_S_ALU_DIV_K:	/* A /= K */
-				emit_alu_K(MUL, K);
-				emit_read_y(r_A);
+			case BPF_S_ALU_DIV_K:	/* A /= K with K != 0*/
+				if (K == 1)
+					break;
+				emit_write_y(G0);
+#ifdef CONFIG_SPARC32
+				/* The Sparc v8 architecture requires
+				 * three instructions between a %y
+				 * register write and the first use.
+				 */
+				emit_nop();
+				emit_nop();
+				emit_nop();
+#endif
+				emit_alu_K(DIV, K);
 				break;
 			case BPF_S_ALU_DIV_X:	/* A /= X; */
 				emit_cmpi(r_X, 0);
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 26328e800869..4ed75dd81d05 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -359,15 +359,21 @@ void bpf_jit_compile(struct sk_filter *fp)
 				EMIT2(0x89, 0xd0);	/* mov %edx,%eax */
 				break;
 			case BPF_S_ALU_MOD_K: /* A %= K; */
+				if (K == 1) {
+					CLEAR_A();
+					break;
+				}
 				EMIT2(0x31, 0xd2);	/* xor %edx,%edx */
 				EMIT1(0xb9);EMIT(K, 4);	/* mov imm32,%ecx */
 				EMIT2(0xf7, 0xf1);	/* div %ecx */
 				EMIT2(0x89, 0xd0);	/* mov %edx,%eax */
 				break;
-			case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K); */
-				EMIT3(0x48, 0x69, 0xc0); /* imul imm32,%rax,%rax */
-				EMIT(K, 4);
-				EMIT4(0x48, 0xc1, 0xe8, 0x20); /* shr $0x20,%rax */
+			case BPF_S_ALU_DIV_K: /* A /= K */
+				if (K == 1)
+					break;
+				EMIT2(0x31, 0xd2);	/* xor %edx,%edx */
+				EMIT1(0xb9);EMIT(K, 4);	/* mov imm32,%ecx */
+				EMIT2(0xf7, 0xf1);	/* div %ecx */
 				break;
 			case BPF_S_ALU_AND_X:
 				seen |= SEEN_XREG;
diff --git a/net/core/filter.c b/net/core/filter.c
index 01b780856db2..ad30d626a5bd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -36,7 +36,6 @@
 #include <asm/uaccess.h>
 #include <asm/unaligned.h>
 #include <linux/filter.h>
-#include <linux/reciprocal_div.h>
 #include <linux/ratelimit.h>
 #include <linux/seccomp.h>
 #include <linux/if_vlan.h>
@@ -166,7 +165,7 @@ unsigned int sk_run_filter(const struct sk_buff *skb,
 			A /= X;
 			continue;
 		case BPF_S_ALU_DIV_K:
-			A = reciprocal_divide(A, K);
+			A /= K;
 			continue;
 		case BPF_S_ALU_MOD_X:
 			if (X == 0)
@@ -553,11 +552,6 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen)
 		/* Some instructions need special checks */
 		switch (code) {
 		case BPF_S_ALU_DIV_K:
-			/* check for division by zero */
-			if (ftest->k == 0)
-				return -EINVAL;
-			ftest->k = reciprocal_value(ftest->k);
-			break;
 		case BPF_S_ALU_MOD_K:
 			/* check for division by zero */
 			if (ftest->k == 0)
@@ -853,27 +847,7 @@ void sk_decode_filter(struct sock_filter *filt, struct sock_filter *to)
 	to->code = decodes[code];
 	to->jt = filt->jt;
 	to->jf = filt->jf;
-
-	if (code == BPF_S_ALU_DIV_K) {
-		/*
-		 * When loaded this rule user gave us X, which was
-		 * translated into R = r(X). Now we calculate the
-		 * RR = r(R) and report it back. If next time this
-		 * value is loaded and RRR = r(RR) is calculated
-		 * then the R == RRR will be true.
-		 *
-		 * One exception. X == 1 translates into R == 0 and
-		 * we can't calculate RR out of it with r().
-		 */
-
-		if (filt->k == 0)
-			to->k = 1;
-		else
-			to->k = reciprocal_value(filt->k);
-
-		BUG_ON(reciprocal_value(to->k) != filt->k);
-	} else
-		to->k = filt->k;
+	to->k = filt->k;
 }
 
 int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf, unsigned int len)

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] reciprocal_divide: correction/update of the algorithm
  2014-01-14 18:02 ` Randy Dunlap
@ 2014-01-15 15:02   ` Hannes Frederic Sowa
  0 siblings, 0 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-15 15:02 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: netdev, dborkman, eric.dumazet, linux-kernel, darkjames-ws

On Tue, Jan 14, 2014 at 10:02:19AM -0800, Randy Dunlap wrote:
> Just trivia (coding style and spelling):
> 
> > diff --git a/lib/reciprocal_div.c b/lib/reciprocal_div.c
> > index 75510e9..b741b30 100644
> > --- a/lib/reciprocal_div.c
> > +++ b/lib/reciprocal_div.c
> > @@ -1,11 +1,25 @@
> > +#include <linux/kernel.h>
> >  #include <asm/div64.h>
> >  #include <linux/reciprocal_div.h>
> >  #include <linux/export.h>
> >  
> > -u32 reciprocal_value(u32 k)
> > +/* For a description of the algorithmus please look at
> 
>                                algorithms
> 
> > + * linux/reciprocal_div.h
> > + */
> 
> and kernel coding style for multi-line comments is like so:
> 
> /*
>  * For a description of the algorithms, please look at
>  * linux/reciprocal_div.h
>  */
> 
> > +
> > +struct reciprocal_value reciprocal_value(u32 d)
> >  {

Thanks, fixed those stuff locally along with a better description. I am
used to netdev comments. ;)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15 14:16     ` Eric Dumazet
@ 2014-01-15 15:10       ` Heiko Carstens
  0 siblings, 0 replies; 32+ messages in thread
From: Heiko Carstens @ 2014-01-15 15:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, netdev, dborkman, darkjames-ws,
	Mircea Gherzan, Russell King, Matt Evans, Martin Schwidefsky

On Wed, Jan 15, 2014 at 06:16:38AM -0800, Eric Dumazet wrote:
> On Wed, 2014-01-15 at 09:00 +0100, Heiko Carstens wrote:
> > Are you sure you want to remove the k == 0 check? Is there something
> > else that would prevent a division by zero?
> 
> This is done by factoring the two cases, modulo and divide :
> 
>  vi +553 net/core/filter.c 
> 
>                 /* Some instructions need special checks */
>                 switch (code) {
>                 case BPF_S_ALU_DIV_K:
>                 case BPF_S_ALU_MOD_K:
>                         /* check for division by zero */
>                         if (ftest->k == 0)
>                                 return -EINVAL;
>                         break;

Oh, sorry. I missed the fallthrough.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 net] bpf: do not use reciprocal divide
  2014-01-15 14:50             ` [PATCH v2 " Eric Dumazet
@ 2014-01-15 15:10               ` Matt Evans
  2014-01-15 16:09                 ` Eric Dumazet
  2014-01-16  1:02               ` David Miller
  1 sibling, 1 reply; 32+ messages in thread
From: Matt Evans @ 2014-01-15 15:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Heiko Carstens, Martin Schwidefsky, Hannes Frederic Sowa, netdev,
	dborkman, darkjames-ws, Mircea Gherzan, Russell King

Hi Eric,

On 2014-01-15 14:50, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> At first Jakub Zawadzki noticed that some divisions by 
> reciprocal_divide
> were not correct. (off by one in some cases)
> http://www.wireshark.org/~darkjames/reciprocal-buggy.c
> 
> He could also show this with BPF:
> http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c
> 
> The reciprocal divide in linux kernel is not generic enough,
> lets remove its use in BPF, as it is not worth the pain with
> current cpus.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
> Cc: Mircea Gherzan <mgherzan@gmail.com>
> Cc: Daniel Borkmann <dxchgb@gmail.com>
> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Cc: Matt Evans <matt@ozlabs.org>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Cc: David S. Miller <davem@davemloft.net>
> ---
> v2: Fixed sparc code as David kindly suggested
>     Added tests on K being 1 (divide by 1 is a nop
>                               modulo by 1 clears A),
>     as Martin Schwidefsky seems concerned by this case.
> 
> Please review arch code to make sure I made no mistake, thanks !

PPC looks fine; I had a look at the core/ARM parts which also look good.

I'd forgotten where the DIV0 checking occurred, so I also benefited from 
your hint to Heiko. :)


Cheers,

Matt

> 
>  arch/arm/net/bpf_jit_32.c       |    6 +++---
>  arch/powerpc/net/bpf_jit_comp.c |    7 ++++---
>  arch/s390/net/bpf_jit_comp.c    |   17 ++++++++++++-----
>  arch/sparc/net/bpf_jit_comp.c   |   17 ++++++++++++++---
>  arch/x86/net/bpf_jit_comp.c     |   14 ++++++++++----
>  net/core/filter.c               |   30 ++----------------------------
>  6 files changed, 45 insertions(+), 46 deletions(-)
> 
> diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
> index 9ed155ad0f97..271b5e971568 100644
> --- a/arch/arm/net/bpf_jit_32.c
> +++ b/arch/arm/net/bpf_jit_32.c
> @@ -641,10 +641,10 @@ load_ind:
>  			emit(ARM_MUL(r_A, r_A, r_X), ctx);
>  			break;
>  		case BPF_S_ALU_DIV_K:
> -			/* current k == reciprocal_value(userspace k) */
> +			if (k == 1)
> +				break;
>  			emit_mov_i(r_scratch, k, ctx);
> -			/* A = top 32 bits of the product */
> -			emit(ARM_UMULL(r_scratch, r_A, r_A, r_scratch), ctx);
> +			emit_udiv(r_A, r_A, r_scratch, ctx);
>  			break;
>  		case BPF_S_ALU_DIV_X:
>  			update_on_xread(ctx);
> diff --git a/arch/powerpc/net/bpf_jit_comp.c 
> b/arch/powerpc/net/bpf_jit_comp.c
> index ac3c2a10dafd..555034f8505e 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -223,10 +223,11 @@ static int bpf_jit_build_body(struct sk_filter
> *fp, u32 *image,
>  			}
>  			PPC_DIVWU(r_A, r_A, r_X);
>  			break;
> -		case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K); */
> +		case BPF_S_ALU_DIV_K: /* A /= K */
> +			if (K == 1)
> +				break;
>  			PPC_LI32(r_scratch1, K);
> -			/* Top 32 bits of 64bit result -> A */
> -			PPC_MULHWU(r_A, r_A, r_scratch1);
> +			PPC_DIVWU(r_A, r_A, r_scratch1);
>  			break;
>  		case BPF_S_ALU_AND_X:
>  			ctx->seen |= SEEN_XREG;
> diff --git a/arch/s390/net/bpf_jit_comp.c 
> b/arch/s390/net/bpf_jit_comp.c
> index 16871da37371..fc0fa77728e1 100644
> --- a/arch/s390/net/bpf_jit_comp.c
> +++ b/arch/s390/net/bpf_jit_comp.c
> @@ -371,11 +371,13 @@ static int bpf_jit_insn(struct bpf_jit *jit,
> struct sock_filter *filter,
>  		/* dr %r4,%r12 */
>  		EMIT2(0x1d4c);
>  		break;
> -	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
> -		/* m %r4,<d(K)>(%r13) */
> -		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
> -		/* lr %r5,%r4 */
> -		EMIT2(0x1854);
> +	case BPF_S_ALU_DIV_K: /* A /= K */
> +		if (K == 1)
> +			break;
> +		/* lhi %r4,0 */
> +		EMIT4(0xa7480000);
> +		/* d %r4,<d(K)>(%r13) */
> +		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
>  		break;
>  	case BPF_S_ALU_MOD_X: /* A %= X */
>  		jit->seen |= SEEN_XREG | SEEN_RET0;
> @@ -391,6 +393,11 @@ static int bpf_jit_insn(struct bpf_jit *jit,
> struct sock_filter *filter,
>  		EMIT2(0x1854);
>  		break;
>  	case BPF_S_ALU_MOD_K: /* A %= K */
> +		if (K == 1) {
> +			/* lhi %r5,0 */
> +			EMIT4(0xa7580000);
> +			break;
> +		}
>  		/* lhi %r4,0 */
>  		EMIT4(0xa7480000);
>  		/* d %r4,<d(K)>(%r13) */
> diff --git a/arch/sparc/net/bpf_jit_comp.c 
> b/arch/sparc/net/bpf_jit_comp.c
> index 218b6b23c378..01fe9946d388 100644
> --- a/arch/sparc/net/bpf_jit_comp.c
> +++ b/arch/sparc/net/bpf_jit_comp.c
> @@ -497,9 +497,20 @@ void bpf_jit_compile(struct sk_filter *fp)
>  			case BPF_S_ALU_MUL_K:	/* A *= K */
>  				emit_alu_K(MUL, K);
>  				break;
> -			case BPF_S_ALU_DIV_K:	/* A /= K */
> -				emit_alu_K(MUL, K);
> -				emit_read_y(r_A);
> +			case BPF_S_ALU_DIV_K:	/* A /= K with K != 0*/
> +				if (K == 1)
> +					break;
> +				emit_write_y(G0);
> +#ifdef CONFIG_SPARC32
> +				/* The Sparc v8 architecture requires
> +				 * three instructions between a %y
> +				 * register write and the first use.
> +				 */
> +				emit_nop();
> +				emit_nop();
> +				emit_nop();
> +#endif
> +				emit_alu_K(DIV, K);
>  				break;
>  			case BPF_S_ALU_DIV_X:	/* A /= X; */
>  				emit_cmpi(r_X, 0);
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 26328e800869..4ed75dd81d05 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -359,15 +359,21 @@ void bpf_jit_compile(struct sk_filter *fp)
>  				EMIT2(0x89, 0xd0);	/* mov %edx,%eax */
>  				break;
>  			case BPF_S_ALU_MOD_K: /* A %= K; */
> +				if (K == 1) {
> +					CLEAR_A();
> +					break;
> +				}
>  				EMIT2(0x31, 0xd2);	/* xor %edx,%edx */
>  				EMIT1(0xb9);EMIT(K, 4);	/* mov imm32,%ecx */
>  				EMIT2(0xf7, 0xf1);	/* div %ecx */
>  				EMIT2(0x89, 0xd0);	/* mov %edx,%eax */
>  				break;
> -			case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K); */
> -				EMIT3(0x48, 0x69, 0xc0); /* imul imm32,%rax,%rax */
> -				EMIT(K, 4);
> -				EMIT4(0x48, 0xc1, 0xe8, 0x20); /* shr $0x20,%rax */
> +			case BPF_S_ALU_DIV_K: /* A /= K */
> +				if (K == 1)
> +					break;
> +				EMIT2(0x31, 0xd2);	/* xor %edx,%edx */
> +				EMIT1(0xb9);EMIT(K, 4);	/* mov imm32,%ecx */
> +				EMIT2(0xf7, 0xf1);	/* div %ecx */
>  				break;
>  			case BPF_S_ALU_AND_X:
>  				seen |= SEEN_XREG;
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 01b780856db2..ad30d626a5bd 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -36,7 +36,6 @@
>  #include <asm/uaccess.h>
>  #include <asm/unaligned.h>
>  #include <linux/filter.h>
> -#include <linux/reciprocal_div.h>
>  #include <linux/ratelimit.h>
>  #include <linux/seccomp.h>
>  #include <linux/if_vlan.h>
> @@ -166,7 +165,7 @@ unsigned int sk_run_filter(const struct sk_buff 
> *skb,
>  			A /= X;
>  			continue;
>  		case BPF_S_ALU_DIV_K:
> -			A = reciprocal_divide(A, K);
> +			A /= K;
>  			continue;
>  		case BPF_S_ALU_MOD_X:
>  			if (X == 0)
> @@ -553,11 +552,6 @@ int sk_chk_filter(struct sock_filter *filter,
> unsigned int flen)
>  		/* Some instructions need special checks */
>  		switch (code) {
>  		case BPF_S_ALU_DIV_K:
> -			/* check for division by zero */
> -			if (ftest->k == 0)
> -				return -EINVAL;
> -			ftest->k = reciprocal_value(ftest->k);
> -			break;
>  		case BPF_S_ALU_MOD_K:
>  			/* check for division by zero */
>  			if (ftest->k == 0)
> @@ -853,27 +847,7 @@ void sk_decode_filter(struct sock_filter *filt,
> struct sock_filter *to)
>  	to->code = decodes[code];
>  	to->jt = filt->jt;
>  	to->jf = filt->jf;
> -
> -	if (code == BPF_S_ALU_DIV_K) {
> -		/*
> -		 * When loaded this rule user gave us X, which was
> -		 * translated into R = r(X). Now we calculate the
> -		 * RR = r(R) and report it back. If next time this
> -		 * value is loaded and RRR = r(RR) is calculated
> -		 * then the R == RRR will be true.
> -		 *
> -		 * One exception. X == 1 translates into R == 0 and
> -		 * we can't calculate RR out of it with r().
> -		 */
> -
> -		if (filt->k == 0)
> -			to->k = 1;
> -		else
> -			to->k = reciprocal_value(filt->k);
> -
> -		BUG_ON(reciprocal_value(to->k) != filt->k);
> -	} else
> -		to->k = filt->k;
> +	to->k = filt->k;
>  }
> 
>  int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf,
> unsigned int len)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15 14:21         ` Eric Dumazet
  2014-01-15 14:25           ` Eric Dumazet
@ 2014-01-15 15:26           ` Martin Schwidefsky
  2014-01-15 16:07             ` Eric Dumazet
  1 sibling, 1 reply; 32+ messages in thread
From: Martin Schwidefsky @ 2014-01-15 15:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Heiko Carstens, Hannes Frederic Sowa, netdev, dborkman,
	darkjames-ws, Mircea Gherzan, Russell King, Matt Evans

On Wed, 15 Jan 2014 06:21:56 -0800
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Wed, 2014-01-15 at 11:51 +0100, Heiko Carstens wrote:
> > On Wed, Jan 15, 2014 at 09:13:22AM +0100, Martin Schwidefsky wrote:
> > > On Wed, 15 Jan 2014 09:00:07 +0100
> > > Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> > > 
> > > > On Tue, Jan 14, 2014 at 11:02:41PM -0800, Eric Dumazet wrote:
> > > > > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> > > > > index 16871da37371..e349dc7d0992 100644
> > > > > --- a/arch/s390/net/bpf_jit_comp.c
> > > > > +++ b/arch/s390/net/bpf_jit_comp.c
> > > > > @@ -371,11 +371,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
> > > > >  		/* dr %r4,%r12 */
> > > > >  		EMIT2(0x1d4c);
> > > > >  		break;
> > > > > -	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
> > > > > -		/* m %r4,<d(K)>(%r13) */
> > > > > -		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
> > > > > -		/* lr %r5,%r4 */
> > > > > -		EMIT2(0x1854);
> > > > > +	case BPF_S_ALU_DIV_K: /* A /= K */
> > > > > +		/* lhi %r4,0 */
> > > > > +		EMIT4(0xa7480000);
> > > > > +		/* d %r4,<d(K)>(%r13) */
> > > > > +		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
> > > > >  		break;
> > > > 
> > > > The s390 part looks good.
> > > 
> > > Does it? The divide instruction is signed, for the special
> > > case of K==1 this can now cause an exception if the quotient
> > > gets too large. We should add a check for K==1 and do nothing
> > > in this case. With a divisor of at least 2 the result will
> > > stay in the limit.
> > 
> > Indeed. That's quite subtle.
> 
> net/core/filter.c does :
> 
> A /= K;
> 
> Why is this working in generic code (if K == 1), not in s390 one ?

The C compiler naturally can to a u32/u32 division, it either uses
the "dlr" instruction which is unsigned, or uses a call to a function
to do the u32/u32 math. See the lovely code in arch/s390/lib/div64.c
for the kernel version of that code.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15 14:25           ` Eric Dumazet
  2014-01-15 14:50             ` [PATCH v2 " Eric Dumazet
@ 2014-01-15 15:35             ` Martin Schwidefsky
  1 sibling, 0 replies; 32+ messages in thread
From: Martin Schwidefsky @ 2014-01-15 15:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Heiko Carstens, Hannes Frederic Sowa, netdev, dborkman,
	darkjames-ws, Mircea Gherzan, Russell King, Matt Evans

On Wed, 15 Jan 2014 06:25:26 -0800
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Wed, 2014-01-15 at 06:21 -0800, Eric Dumazet wrote:
> > On Wed, 2014-01-15 at 11:51 +0100, Heiko Carstens wrote:
> > > On Wed, Jan 15, 2014 at 09:13:22AM +0100, Martin Schwidefsky wrote:
> > > > On Wed, 15 Jan 2014 09:00:07 +0100
> > > > Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> > > > 
> > > > > On Tue, Jan 14, 2014 at 11:02:41PM -0800, Eric Dumazet wrote:
> > > > > > diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> > > > > > index 16871da37371..e349dc7d0992 100644
> > > > > > --- a/arch/s390/net/bpf_jit_comp.c
> > > > > > +++ b/arch/s390/net/bpf_jit_comp.c
> > > > > > @@ -371,11 +371,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
> > > > > >  		/* dr %r4,%r12 */
> > > > > >  		EMIT2(0x1d4c);
> > > > > >  		break;
> > > > > > -	case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */
> > > > > > -		/* m %r4,<d(K)>(%r13) */
> > > > > > -		EMIT4_DISP(0x5c40d000, EMIT_CONST(K));
> > > > > > -		/* lr %r5,%r4 */
> > > > > > -		EMIT2(0x1854);
> > > > > > +	case BPF_S_ALU_DIV_K: /* A /= K */
> > > > > > +		/* lhi %r4,0 */
> > > > > > +		EMIT4(0xa7480000);
> > > > > > +		/* d %r4,<d(K)>(%r13) */
> > > > > > +		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
> > > > > >  		break;
> > > > > 
> > > > > The s390 part looks good.
> > > > 
> > > > Does it? The divide instruction is signed, for the special
> > > > case of K==1 this can now cause an exception if the quotient
> > > > gets too large. We should add a check for K==1 and do nothing
> > > > in this case. With a divisor of at least 2 the result will
> > > > stay in the limit.
> > > 
> > > Indeed. That's quite subtle.
> > 
> > net/core/filter.c does :
> > 
> > A /= K;
> > 
> > Why is this working in generic code (if K == 1), not in s390 one ?
> 
> Note that I copied code found in BPF_S_ALU_MOD_K, so this one would need
> a fix as well.

Hmm, that is true BPF_S_ALU_DIV_X, BPF_S_ALU_MOD_K and BPF_S_ALU_MOD_X all
suffer from the same problem. As the BPF jit is only used for 64-bit 
kernel the simplest solution is to replace the "dr" instruction with "dlr".

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net] bpf: do not use reciprocal divide
  2014-01-15 15:26           ` Martin Schwidefsky
@ 2014-01-15 16:07             ` Eric Dumazet
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-15 16:07 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Heiko Carstens, Hannes Frederic Sowa, netdev, dborkman,
	darkjames-ws, Mircea Gherzan, Russell King, Matt Evans

On Wed, 2014-01-15 at 16:26 +0100, Martin Schwidefsky wrote:

> The C compiler naturally can to a u32/u32 division, it either uses
> the "dlr" instruction which is unsigned, or uses a call to a function
> to do the u32/u32 math. See the lovely code in arch/s390/lib/div64.c
> for the kernel version of that code.

If the C code uses unsigned, then all arch jit code must use the same.

Behavior of the filter should not depend of jit being enabled or not ;)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 net] bpf: do not use reciprocal divide
  2014-01-15 15:10               ` Matt Evans
@ 2014-01-15 16:09                 ` Eric Dumazet
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-15 16:09 UTC (permalink / raw)
  To: Matt Evans
  Cc: Heiko Carstens, Martin Schwidefsky, Hannes Frederic Sowa, netdev,
	dborkman, darkjames-ws, Mircea Gherzan, Russell King

On Wed, 2014-01-15 at 15:10 +0000, Matt Evans wrote:
> Hi Eric,
...
> PPC looks fine; I had a look at the core/ARM parts which also look good.
> 
> I'd forgotten where the DIV0 checking occurred, so I also benefited from 
> your hint to Heiko. :)
> 

Thanks for reviewing !

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 net] bpf: do not use reciprocal divide
  2014-01-15 14:50             ` [PATCH v2 " Eric Dumazet
  2014-01-15 15:10               ` Matt Evans
@ 2014-01-16  1:02               ` David Miller
  2014-01-17  8:59                 ` Heiko Carstens
  1 sibling, 1 reply; 32+ messages in thread
From: David Miller @ 2014-01-16  1:02 UTC (permalink / raw)
  To: eric.dumazet
  Cc: heiko.carstens, schwidefsky, hannes, netdev, dborkman,
	darkjames-ws, mgherzan, rmk+kernel, matt

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 15 Jan 2014 06:50:07 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
> were not correct. (off by one in some cases)
> http://www.wireshark.org/~darkjames/reciprocal-buggy.c
> 
> He could also show this with BPF:
> http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c
> 
> The reciprocal divide in linux kernel is not generic enough,
> lets remove its use in BPF, as it is not worth the pain with
> current cpus.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>

Applied and queued up for -stable, thanks Eric.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 net] bpf: do not use reciprocal divide
  2014-01-16  1:02               ` David Miller
@ 2014-01-17  8:59                 ` Heiko Carstens
  2014-01-18  2:56                   ` David Miller
  0 siblings, 1 reply; 32+ messages in thread
From: Heiko Carstens @ 2014-01-17  8:59 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, schwidefsky, hannes, netdev, dborkman,
	darkjames-ws, mgherzan, rmk+kernel, matt

On Wed, Jan 15, 2014 at 05:02:30PM -0800, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 15 Jan 2014 06:50:07 -0800
> 
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
> > were not correct. (off by one in some cases)
> > http://www.wireshark.org/~darkjames/reciprocal-buggy.c
> > 
> > He could also show this with BPF:
> > http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c
> > 
> > The reciprocal divide in linux kernel is not generic enough,
> > lets remove its use in BPF, as it is not worth the pain with
> > current cpus.
> > 
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
> 
> Applied and queued up for -stable, thanks Eric.

While thinking a bit more the s390 variant is still broken with special
casing the "divide with 1" case (see below).

Could you please also apply the patch below to your tree? It would only
generate a merge conflict, that would need fixing, if it would sit in the
s390 tree.

>From bf0f2dc84dd3774944fc4dddef8c7e699277aa96 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Fri, 17 Jan 2014 09:37:15 +0100
Subject: [PATCH] s390/bpf,jit: fix 32 bit divisions, use unsigned divide
 instructions

The s390 bpf jit compiler emits the signed divide instructions "dr" and "d"
for unsigned divisions.
This can cause problems: the dividend will be zero extended to a 64 bit value
and the divisor is the 32 bit signed value as specified A or X accumulator,
even though A and X are supposed to be treated as unsigned values.

The divide instrunctions will generate an exception if the result cannot be
expressed with a 32 bit signed value.
This is the case if e.g. the dividend is 0xffffffff and the divisor either 1
or also 0xffffffff (signed: -1).

To avoid all these issues simply use unsigned divide instructions.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 arch/s390/net/bpf_jit_comp.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index fc0fa77728e1..708d60e40066 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -368,16 +368,16 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
 		EMIT4_PCREL(0xa7840000, (jit->ret0_ip - jit->prg));
 		/* lhi %r4,0 */
 		EMIT4(0xa7480000);
-		/* dr %r4,%r12 */
-		EMIT2(0x1d4c);
+		/* dlr %r4,%r12 */
+		EMIT4(0xb997004c);
 		break;
 	case BPF_S_ALU_DIV_K: /* A /= K */
 		if (K == 1)
 			break;
 		/* lhi %r4,0 */
 		EMIT4(0xa7480000);
-		/* d %r4,<d(K)>(%r13) */
-		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
+		/* dl %r4,<d(K)>(%r13) */
+		EMIT6_DISP(0xe340d000, 0x0097, EMIT_CONST(K));
 		break;
 	case BPF_S_ALU_MOD_X: /* A %= X */
 		jit->seen |= SEEN_XREG | SEEN_RET0;
@@ -387,8 +387,8 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
 		EMIT4_PCREL(0xa7840000, (jit->ret0_ip - jit->prg));
 		/* lhi %r4,0 */
 		EMIT4(0xa7480000);
-		/* dr %r4,%r12 */
-		EMIT2(0x1d4c);
+		/* dlr %r4,%r12 */
+		EMIT4(0xb997004c);
 		/* lr %r5,%r4 */
 		EMIT2(0x1854);
 		break;
@@ -400,8 +400,8 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter,
 		}
 		/* lhi %r4,0 */
 		EMIT4(0xa7480000);
-		/* d %r4,<d(K)>(%r13) */
-		EMIT4_DISP(0x5d40d000, EMIT_CONST(K));
+		/* dl %r4,<d(K)>(%r13) */
+		EMIT6_DISP(0xe340d000, 0x0097, EMIT_CONST(K));
 		/* lr %r5,%r4 */
 		EMIT2(0x1854);
 		break;
-- 
1.8.4.5

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 net] bpf: do not use reciprocal divide
  2014-01-17  8:59                 ` Heiko Carstens
@ 2014-01-18  2:56                   ` David Miller
  2014-01-18 10:12                     ` Heiko Carstens
  0 siblings, 1 reply; 32+ messages in thread
From: David Miller @ 2014-01-18  2:56 UTC (permalink / raw)
  To: heiko.carstens
  Cc: eric.dumazet, schwidefsky, hannes, netdev, dborkman,
	darkjames-ws, mgherzan, rmk+kernel, matt

From: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Fri, 17 Jan 2014 09:59:16 +0100

> Could you please also apply the patch below to your tree? It would only
> generate a merge conflict, that would need fixing, if it would sit in the
> s390 tree.

Applied and I queued it up for -stable so I can combine it with
Eric's original change when I submit it to -stable.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 net] bpf: do not use reciprocal divide
  2014-01-18  2:56                   ` David Miller
@ 2014-01-18 10:12                     ` Heiko Carstens
  0 siblings, 0 replies; 32+ messages in thread
From: Heiko Carstens @ 2014-01-18 10:12 UTC (permalink / raw)
  To: David Miller
  Cc: eric.dumazet, schwidefsky, hannes, netdev, dborkman,
	darkjames-ws, mgherzan, rmk+kernel, matt

On Fri, Jan 17, 2014 at 06:56:00PM -0800, David Miller wrote:
> From: Heiko Carstens <heiko.carstens@de.ibm.com>
> Date: Fri, 17 Jan 2014 09:59:16 +0100
> 
> > Could you please also apply the patch below to your tree? It would only
> > generate a merge conflict, that would need fixing, if it would sit in the
> > s390 tree.
> 
> Applied and I queued it up for -stable so I can combine it with
> Eric's original change when I submit it to -stable.

Great, thank you!

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2014-01-18 10:12 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-13 21:42 [PATCH RFC] reciprocal_divide: correction/update of the algorithm Hannes Frederic Sowa
2014-01-14 18:02 ` Randy Dunlap
2014-01-15 15:02   ` Hannes Frederic Sowa
2014-01-14 18:07 ` Eric Dumazet
2014-01-14 19:22   ` Austin S Hemmelgarn
2014-01-14 19:50     ` Eric Dumazet
2014-01-14 20:10       ` Hannes Frederic Sowa
2014-01-14 20:53       ` Austin S Hemmelgarn
2014-01-14 22:45         ` Eric Dumazet
2014-01-14 23:25           ` Borislav Petkov
2014-01-15  2:51             ` Austin S. Hemmelgarn
2014-01-14 22:39   ` Hannes Frederic Sowa
2014-01-15  7:02 ` [PATCH net] bpf: do not use reciprocal divide Eric Dumazet
2014-01-15  7:28   ` David Miller
2014-01-15  7:39     ` Eric Dumazet
2014-01-15  8:00   ` Heiko Carstens
2014-01-15  8:13     ` Martin Schwidefsky
2014-01-15 10:51       ` Heiko Carstens
2014-01-15 14:21         ` Eric Dumazet
2014-01-15 14:25           ` Eric Dumazet
2014-01-15 14:50             ` [PATCH v2 " Eric Dumazet
2014-01-15 15:10               ` Matt Evans
2014-01-15 16:09                 ` Eric Dumazet
2014-01-16  1:02               ` David Miller
2014-01-17  8:59                 ` Heiko Carstens
2014-01-18  2:56                   ` David Miller
2014-01-18 10:12                     ` Heiko Carstens
2014-01-15 15:35             ` [PATCH " Martin Schwidefsky
2014-01-15 15:26           ` Martin Schwidefsky
2014-01-15 16:07             ` Eric Dumazet
2014-01-15 14:16     ` Eric Dumazet
2014-01-15 15:10       ` Heiko Carstens

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).