* activeconns * weight overflowing 32-bit int
@ 2013-04-13  6:43 Simon Kirby
  2013-04-13 15:10 ` Julian Anastasov
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Kirby @ 2013-04-13  6:43 UTC (permalink / raw)
  To: lvs-devel; +Cc: Changli Gao

Hello!

We use lblc in some environments to try to maintain some cache locality.

We recently had some problems upgrading beyond 2.6.38 in one environment.
The cluster kept overloading real servers and showed flapping that didn't
occur on 2.6.38 and older. I was never able to figure this out, but I
think now I see the reason.

We need to use fairly high weights, since lblc requires this in order to
do rescheduling in the event of overload. In the event that we have 3000
activeconns to a real server and a weight of 3000, the next connection
will check to see if any other real server has 2*activeconns less than
its weight, and if so, reschedule by wlc.
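
For reference, the overload check in ip_vs_lblc.c looks roughly like
this (paraphrased from the scheduler, not a verbatim copy):

static inline int
is_overloaded(struct ip_vs_dest *dest, struct ip_vs_service *svc)
{
	if (atomic_read(&dest->activeconns) > atomic_read(&dest->weight)) {
		struct ip_vs_dest *d;

		/* Reschedule only if some other real server still has
		 * room: 2 * activeconns below its configured weight. */
		list_for_each_entry(d, &svc->destinations, n_list) {
			if (atomic_read(&d->activeconns) * 2 <
			    atomic_read(&d->weight))
				return 1;
		}
	}
	return 0;
}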

With b552f7e3a9524abcbcdf86f0a99b2be58e55a9c6, which "git tag --contains"
says appeared in 2.6.39-rc, the open-coded activeconns * 50 + inactconns
was changed to ip_vs_dest_conn_overhead() that matches the implementation
in ip_vs_wlc.c and others. The problem for us is that ip_vs_lblc.c uses
"int" (and wlc uses "unsigned int") for "loh" and "doh" variables that
the ip_vs_dest_conn_overhead() result is stored in, and then these are
multiplied by the weight.

ip_vs_dest_conn_overhead() uses (activeconns << 8) + inactconns (* 256
instead of * 50), so before where 3000 * 3000 * 50 would fit in an int,
3000 * 3000 * 256 does not.
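
(Worked out: 3000 activeconns << 8 is an overhead of 768000; multiplied
by a weight of 3000 that is 2,304,000,000, past INT_MAX = 2,147,483,647,
while the old 3000 * 50 * 3000 = 450,000,000 still fits.)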

We really don't care about inactconns, so removing the "<< 8" and just
using activeconns would work for us, but I suspect it must be there for a
raeason. "unsigned long" would fix the problem only for 64-bit arches.
Using __u64 would work everywhere, but perhaps be slow on 32-bit arches.
Thoughts?

Simon-


* Re: activeconns * weight overflowing 32-bit int
  2013-04-13  6:43 activeconns * weight overflowing 32-bit int Simon Kirby
@ 2013-04-13 15:10 ` Julian Anastasov
  2013-05-22  6:18   ` Julian Anastasov
  0 siblings, 1 reply; 16+ messages in thread
From: Julian Anastasov @ 2013-04-13 15:10 UTC (permalink / raw)
  To: Simon Kirby; +Cc: lvs-devel, Changli Gao


	Hello,

On Fri, 12 Apr 2013, Simon Kirby wrote:

> Hello!
> 
> We use lblc in some environments to try to maintain some cache locality.
> 
> We recently had some problems upgrading beyond 2.6.38 in one environment.
> The cluster kept overloading real servers and showed flapping that didn't
> occur on 2.6.38 and older. I was never able to figure this out, but I
> think now I see the reason.
> 
> We need to use fairly high weights, since lblc requires this in order to
> do rescheduling in the event of overload. In the event that we have 3000
> activeconns to a real server and a weight of 3000, the next connection
> will check to see if any other real server has 2*activeconns less than
> its weight, and if so, reschedule by wlc.
> 
> With b552f7e3a9524abcbcdf86f0a99b2be58e55a9c6, which "git tag --contains"
> says appeared in 2.6.39-rc, the open-coded activeconns * 50 + inactconns
> was changed to ip_vs_dest_conn_overhead() that matches the implementation
> in ip_vs_wlc.c and others. The problem for us is that ip_vs_lblc.c uses
> "int" (and wlc uses "unsigned int") for "loh" and "doh" variables that
> the ip_vs_dest_conn_overhead() result is stored in, and then these are
> multiplied by the weight.
> 
> ip_vs_dest_conn_overhead() uses (activeconns << 8) + inactconns (* 256
> instead of * 50), so before where 3000 * 3000 * 50 would fit in an int,
> 3000 * 3000 * 256 does not.

	There is no big difference between 50 and 256.

> We really don't care about inactconns, so removing the "<< 8" and just
> using activeconns would work for us, but I suspect it must be there for a
> reason. "unsigned long" would fix the problem only for 64-bit arches.
> Using __u64 would work everywhere, but perhaps be slow on 32-bit arches.
> Thoughts?

	Maybe we can avoid a 64-bit multiply with a
32*32=>64 optimization, for example:

-               if (loh * atomic_read(&dest->weight) >
-                   doh * atomic_read(&least->weight)) {
+               if ((__u64) loh * atomic_read(&dest->weight) >
+                   (__u64) doh * atomic_read(&least->weight)) {

	Maybe __s64/__u64 does not matter here. Can you
create and test such a patch for lblc and lblcr against the
ipvs-next or net-next tree? Such a change should also be
applied to other schedulers, but it does not look so critical.
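
	A standalone sketch of the 32x32=>64 pattern (illustrative only,
not the kernel code):

#include <stdint.h>

/* Casting one 32-bit operand to a 64-bit type widens the product
 * instead of truncating it; on x86 it can stay a single widening
 * multiply, leaving the 64-bit result in edx:eax. */
static inline int64_t weighted(int overhead, int weight)
{
	return (int64_t)overhead * weight;
}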

Regards

--
Julian Anastasov <ja@ssi.bg>


* Re: activeconns * weight overflowing 32-bit int
  2013-04-13 15:10 ` Julian Anastasov
@ 2013-05-22  6:18   ` Julian Anastasov
  2013-05-23 16:58     ` Simon Kirby
  0 siblings, 1 reply; 16+ messages in thread
From: Julian Anastasov @ 2013-05-22  6:18 UTC (permalink / raw)
  To: Simon Kirby; +Cc: lvs-devel, Changli Gao


	Hello,

On Sat, 13 Apr 2013, Julian Anastasov wrote:

> On Fri, 12 Apr 2013, Simon Kirby wrote:
> 
> > Hello!
> > 
> > We use lblc in some environments to try to maintain some cache locality.
> > 
> > We recently had some problems upgrading beyond 2.6.38 in one environment.
> > The cluster kept overloading real servers and showed flapping that didn't
> > occur on 2.6.38 and older. I was never able to figure this out, but I
> > think now I see the reason.
> > 
> > We need to use fairly high weights, since lblc requires this in order to
> > do rescheduling in the event of overload. In the event that we have 3000
> > activeconns to a real server and a weight of 3000, the next connection
> > will check to see if any other real server has 2*activeconns less than
> > its weight, and if so, reschedule by wlc.
> > 
> > With b552f7e3a9524abcbcdf86f0a99b2be58e55a9c6, which "git tag --contains"
> > says appeared in 2.6.39-rc, the open-coded activeconns * 50 + inactconns
> > was changed to ip_vs_dest_conn_overhead() that matches the implementation
> > in ip_vs_wlc.c and others. The problem for us is that ip_vs_lblc.c uses
> > "int" (and wlc uses "unsigned int") for "loh" and "doh" variables that
> > the ip_vs_dest_conn_overhead() result is stored in, and then these are
> > multiplied by the weight.
> > 
> > ip_vs_dest_conn_overhead() uses (activeconns << 8) + inactconns (* 256
> > instead of * 50), so before where 3000 * 3000 * 50 would fit in an int,
> > 3000 * 3000 * 256 does not.
> 
> 	There is no big difference between 50 and 256.
> 
> > We really don't care about inactconns, so removing the "<< 8" and just
> > using activeconns would work for us, but I suspect it must be there for a
> > reason. "unsigned long" would fix the problem only for 64-bit arches.
> > Using __u64 would work everywhere, but perhaps be slow on 32-bit arches.
> > Thoughts?
> 
> 	Maybe we can avoid a 64-bit multiply with a
> 32*32=>64 optimization, for example:
> 
> -               if (loh * atomic_read(&dest->weight) >
> -                   doh * atomic_read(&least->weight)) {
> +               if ((__u64) loh * atomic_read(&dest->weight) >
> +                   (__u64) doh * atomic_read(&least->weight)) {
> 
> 	Maybe __s64/__u64 does not matter here. Can you
> create and test such a patch for lblc and lblcr against the
> ipvs-next or net-next tree? Such a change should also be
> applied to other schedulers, but it does not look so critical.

	Any progress on this problem?

Regards

--
Julian Anastasov <ja@ssi.bg>


* Re: activeconns * weight overflowing 32-bit int
  2013-05-22  6:18   ` Julian Anastasov
@ 2013-05-23 16:58     ` Simon Kirby
  2013-05-23 20:44       ` Julian Anastasov
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Kirby @ 2013-05-23 16:58 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: lvs-devel, Changli Gao

On Wed, May 22, 2013 at 09:18:03AM +0300, Julian Anastasov wrote:

> > > With b552f7e3a9524abcbcdf86f0a99b2be58e55a9c6, which "git tag --contains"
> > > says appeared in 2.6.39-rc, the open-coded activeconns * 50 + inactconns
> > > was changed to ip_vs_dest_conn_overhead() that matches the implementation
> > > in ip_vs_wlc.c and others. The problem for us is that ip_vs_lblc.c uses
> > > "int" (and wlc uses "unsigned int") for "loh" and "doh" variables that
> > > the ip_vs_dest_conn_overhead() result is stored in, and then these are
> > > multiplied by the weight.
> > > 
> > > ip_vs_dest_conn_overhead() uses (activeconns << 8) + inactconns (* 256
> > > instead of * 50), so before where 3000 * 3000 * 50 would fit in an int,
> > > 3000 * 3000 * 256 does not.
> > 
> > 	There is no big difference between 50 and 256.
> > 
> > > We really don't care about inactconns, so removing the "<< 8" and just
> > > using activeconns would work for us, but I suspect it must be there for a
> > > reason. "unsigned long" would fix the problem only for 64-bit arches.
> > > Using __u64 would work everywhere, but perhaps be slow on 32-bit arches.
> > > Thoughts?
> > 
> > 	Maybe we can avoid a 64-bit multiply with a
> > 32*32=>64 optimization, for example:
> > 
> > -               if (loh * atomic_read(&dest->weight) >
> > -                   doh * atomic_read(&least->weight)) {
> > +               if ((__u64) loh * atomic_read(&dest->weight) >
> > +                   (__u64) doh * atomic_read(&least->weight)) {
> > 
> > 	Maybe __s64/__u64 does not matter here. Can you
> > create and test such a patch for lblc and lblcr against the
> > ipvs-next or net-next tree? Such a change should also be
> > applied to other schedulers, but it does not look so critical.
> 
> 	Any progress on this problem?

Hello!

Yes, see: http://0x.ca/sim/ref/3.9-ipvs/

The case of just (__u64) on i386 looks not very good, but making the
weight also unsigned (__u32) seems to improve things. I set up a test
harness (ipvs.c) and disassembled i386 and amd64 compiler outputs for
both. The only reason I haven't submitted it yet is that I haven't yet
confirmed that it fixes our problem in production, though it did seem to
work in testing. Will follow up shortly.
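
The comparison under test boils down to something like this (a
hypothetical sketch; the actual ipvs.c is at the URL above):

typedef unsigned long long __u64;

/* loh/doh are connection overheads; the weights come from
 * atomic_read(), i.e. plain int. */
int least_is_better(int loh, int doh, int dest_w, int least_w)
{
	return (__u64)loh * dest_w > (__u64)doh * least_w;
}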

Simon-


* Re: activeconns * weight overflowing 32-bit int
  2013-05-23 16:58     ` Simon Kirby
@ 2013-05-23 20:44       ` Julian Anastasov
  2013-05-24  0:43         ` Simon Kirby
  0 siblings, 1 reply; 16+ messages in thread
From: Julian Anastasov @ 2013-05-23 20:44 UTC (permalink / raw)
  To: Simon Kirby; +Cc: lvs-devel, Changli Gao


	Hello,

On Thu, 23 May 2013, Simon Kirby wrote:

> Hello!
> 
> Yes, see: http://0x.ca/sim/ref/3.9-ipvs/
> 
> The case of just (__u64) on i386 looks not very good, but making the
> weight also unsigned (__u32) seems to improve things. I set up a test

	Hm, only sizeof(long int) should differ between
32-bit and 64-bit platforms; atomic_t is always int, int
is returned from atomic_read, and int is always 4 bytes.

> harness (ipvs.c) and disassembled i386 and amd64 compiler outputs for
> both. The only reason I haven't submitted it yet is that I haven't yet
> confirmed that it fixes our problem in production, though it did seem to
> work in testing. Will follow up shortly.

	Last time I also checked the assembler output from x86-32
and it looked good. I used 'make net/netfilter/ipvs/ip_vs_wlc.s'
to generate asm output, note the 's' extension.

	Maybe the problem in your ipvs.c file comes from the
fact that __u64 is unsigned long long on 32-bit platforms.
I changed the code to use 'unsigned long' as follows:

#if 0
typedef unsigned long long __u64;
#else
typedef unsigned long __u64;
#endif

	and the x86-32 platform works correctly.
x86-64 works for both cases, 'unsigned long long' and
'unsigned long', but x86-32 generates many mul operations
for the 'unsigned long long' case, which is not used in
the kernel. That is why I didn't notice such a problem.

	So, I think we should be safe by adding
(__u64) without any (__u32); in the next days I'll check
the asm output again. Maybe (__u32) before weight ensures
a 32x32 multiply, but it should not be needed.

Regards

--
Julian Anastasov <ja@ssi.bg>


* Re: activeconns * weight overflowing 32-bit int
  2013-05-23 20:44       ` Julian Anastasov
@ 2013-05-24  0:43         ` Simon Kirby
  2013-05-24  8:11           ` Julian Anastasov
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Kirby @ 2013-05-24  0:43 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: lvs-devel, Changli Gao

On Thu, May 23, 2013 at 11:44:22PM +0300, Julian Anastasov wrote:

> 	Hello,
> 
> On Thu, 23 May 2013, Simon Kirby wrote:
> 
> > Hello!
> > 
> > Yes, see: http://0x.ca/sim/ref/3.9-ipvs/
> > 
> > The case of just (__u64) on i386 looks not very good, but making the
> > weight also unsigned (__u32) seems to improve things. I set up a test
> 
> 	Hm, only sizeof(long int) should differ between
> 32-bit and 64-bit platforms; atomic_t is always int, int
> is returned from atomic_read, and int is always 4 bytes.

It's always 4 bytes, but gcc does more stuff with u64 * s32 than it does
with u64 * u32. atomic_t seems to be int, so s32 on either arch. :)

For the u64 * u32 case, it is using the single-operand form of imul, which
puts a 64-bit result in edx:eax, and then does an expanded unsigned
comparison, as expected. The u64 * s32 case uses a mix of 2 imul and 2
mul and some other things.
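
Schematically, the single-operand form is something like this
(illustrative AT&T syntax, not the exact gcc output):

	movl	loh, %eax
	imull	weight		# 64-bit product lands in edx:eax
	...			# then compare edx:eax piecewise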

> > harness (ipvs.c) and disassembled i386 and amd64 compiler outputs for
> > both. The only reason I haven't submitted it yet is that I haven't yet
> > confirmed that it fixes our problem in production, though it did seem to
> > work in testing. Will follow up shortly.
> 
> 	Last time I also checked the assembler output from x86-32
> and it looked good. I used 'make net/netfilter/ipvs/ip_vs_wlc.s'
> to generate asm output, note the 's' extension.
> 
> 	Maybe the problem in your ipvs.c file comes from the
> fact that __u64 is unsigned long long on 32-bit platforms.
> I changed the code to use 'unsigned long' as follows:
> 
> #if 0
> typedef unsigned long long __u64;
> #else
> typedef unsigned long __u64;
> #endif
> 
> 	and the x86-32 platform works correctly.
> x86-64 works for both cases, 'unsigned long long' and
> 'unsigned long', but x86-32 generates many mul operations
> for the 'unsigned long long' case, which is not used in
> the kernel. That is why I didn't notice such a problem.
> 
> 	So, I think we should be safe by adding
> (__u64) without any (__u32); in the next days I'll check
> the asm output again. Maybe (__u32) before weight ensures
> a 32x32 multiply, but it should not be needed.

Hmm, I was comparing atomic_t being s32 versus u32, not __u64 being 'unsigned long long' versus 'unsigned long'. :)
Anyway, the .s results are much easier to read, and (closer to) reality!
I did a comparison with (__u64)loh * atomic_read(&dest->weight) versus
(__u64)loh * (__u32)atomic_read(&dest->weight) on both arches and uploaded
them to http://0x.ca/sim/ref/3.9-ipvs/. It's not a huge difference, but I
prefer the shorter/faster version. ;)

Simon-


* Re: activeconns * weight overflowing 32-bit int
  2013-05-24  0:43         ` Simon Kirby
@ 2013-05-24  8:11           ` Julian Anastasov
  2013-08-05  6:10             ` Julian Anastasov
  2013-08-06  2:41             ` Simon Kirby
  0 siblings, 2 replies; 16+ messages in thread
From: Julian Anastasov @ 2013-05-24  8:11 UTC (permalink / raw)
  To: Simon Kirby; +Cc: lvs-devel, Changli Gao


	Hello,

On Thu, 23 May 2013, Simon Kirby wrote:

> > 	Last time I also checked the assembler output from x86-32
> > and it looked good. I used 'make net/netfilter/ipvs/ip_vs_wlc.s'
> > to generate asm output, note the 's' extension.
> > 
> > 	Maybe the problem in your ipvs.c file comes from the
> > fact that __u64 is unsigned long long on 32-bit platforms.
> > I changed the code to use 'unsigned long' as follows:
> > 
> > #if 0
> > typedef unsigned long long __u64;
> > #else
> > typedef unsigned long __u64;
> > #endif
> > 
> > 	and the x86-32 platform works correctly.
> > x86-64 works for both cases, 'unsigned long long' and
> > 'unsigned long', but x86-32 generates many mul operations
> > for the 'unsigned long long' case, which is not used in
> > the kernel. That is why I didn't notice such a problem.
> > 
> > 	So, I think we should be safe by adding
> > (__u64) without any (__u32); in the next days I'll check
> > the asm output again. Maybe (__u32) before weight ensures
> > a 32x32 multiply, but it should not be needed.
> 
> Hmm, I was comparing atomic_t being s32 versus u32, not __u64 being 'unsigned long long' versus 'unsigned long'. :)
> Anyway, the .s results are much easier to read, and (closer to) reality!
> I did a comparison with (__u64)loh * atomic_read(&dest->weight) versus
> (__u64)loh * (__u32)atomic_read(&dest->weight) on both arches and uploaded
> them to http://0x.ca/sim/ref/3.9-ipvs/. It's not a huge difference, but I
> prefer the shorter/faster version. ;)

	I now see why your patch shows a difference compared
to my tests a month ago. This change is the culprit:

-       int loh, doh;
+       unsigned int loh, doh;

	It effectively changes the operation from:

(__u64/__s64) int * int

	into

(__u64) unsigned int * int

	that is why you fix it by using __u32:

(__u64) unsigned int * unsigned int

	so that both operands are of the same 4-byte signedness.

	I think we should keep loh and doh as int; both of the
following solutions should generate a 32x32 multiply:

1. same as my first email:

int loh, doh;

(__u64/__s64) loh * atomic_read(&dest->weight)

	In this case I see only one difference between
__u64 and __s64:

-       jb      .L41    #,
-       ja      .L79    #,
+       jl      .L41    #,
+       jg      .L79    #,

2. Your patch:

unsigned int loh, doh;

(__u64) loh * (__u32) atomic_read(&dest->weight)
or
(__s64) loh * (__u32) atomic_read(&dest->weight)

	Both solutions generate code that differs only
in imul vs. mul. On the internet I see that imul is
preferred/faster than mul. That is why I prefer solution 1;
it has fewer casts.

	So, I think you can change your patch as follows:

1. Use int for loh, doh. Note that some schedulers
use 'unsigned int' and should be patched for this
definition: NQ, SED, WLC

2. Use (__u64) prefix only, no (__u32) before atomic_read:
LBLC, LBLCR, NQ, SED, WLC

	(__u64) loh * atomic_read(&dest->weight) ...
	(__u64) doh * ...

3. Explain in commit message that we find the
result64=int32*int32 faster than result64=uint32*uint32
and far better than using 64*64 multiply which is
a bit slower on older CPUs.

Regards

--
Julian Anastasov <ja@ssi.bg>


* Re: activeconns * weight overflowing 32-bit int
  2013-05-24  8:11           ` Julian Anastasov
@ 2013-08-05  6:10             ` Julian Anastasov
  2013-08-06  2:41             ` Simon Kirby
  1 sibling, 0 replies; 16+ messages in thread
From: Julian Anastasov @ 2013-08-05  6:10 UTC (permalink / raw)
  To: Simon Kirby; +Cc: lvs-devel, Changli Gao


	Hello,

On Fri, 24 May 2013, Julian Anastasov wrote:

> On Thu, 23 May 2013, Simon Kirby wrote:

> > Hmm, I was comparing atomic_t being s32 versus u32, not __u64 being 'unsigned long long' versus 'unsigned long'. :)
> > Anyway, the .s results are much easier to read, and (closer to) reality!
> > I did a comparison with (__u64)loh * atomic_read(&dest->weight) versus
> > (__u64)loh * (__u32)atomic_read(&dest->weight) on both arches and uploaded
> > them to http://0x.ca/sim/ref/3.9-ipvs/. It's not a huge difference, but I
> > prefer the shorter/faster version. ;)
> 
> 	I now see why your patch shows a difference compared
> to my tests a month ago. This change is the culprit:
> 
> -       int loh, doh;
> +       unsigned int loh, doh;
> 
> 	It effectively changes the operation from:
> 
> (__u64/__s64) int * int
> 
> 	into
> 
> (__u64) unsigned int * int
> 
> 	that is why you fix it by using __u32:
> 
> (__u64) unsigned int * unsigned int
> 
> 	so that both operands are of the same 4-byte signedness.
> 
> 	I think we should keep loh and doh as int; both of the
> following solutions should generate a 32x32 multiply:
> 
> 1. same as my first email:
> 
> int loh, doh;
> 
> (__u64/__s64) loh * atomic_read(&dest->weight)
> 
> 	In this case I see only one difference between
> __u64 and __s64:
> 
> -       jb      .L41    #,
> -       ja      .L79    #,
> +       jl      .L41    #,
> +       jg      .L79    #,
> 
> 2. Your patch:
> 
> unsigned int loh, doh;
> 
> (__u64) loh * (__u32) atomic_read(&dest->weight)
> or
> (__s64) loh * (__u32) atomic_read(&dest->weight)
> 
> 	Both solutions generate code that differs only
> in imul vs. mul. On the internet I see that imul is
> preferred/faster than mul. That is why I prefer solution 1;
> it has fewer casts.
> 
> 	So, I think you can change your patch as follows:
> 
> 1. Use int for loh, doh. Note that some schedulers
> use 'unsigned int' and should be patched for this
> definition: NQ, SED, WLC
> 
> 2. Use (__u64) prefix only, no (__u32) before atomic_read:
> LBLC, LBLCR, NQ, SED, WLC
> 
> 	(__u64) loh * atomic_read(&dest->weight) ...
> 	(__u64) doh * ...
> 
> 3. Explain in commit message that we find the
> result64=int32*int32 faster than result64=uint32*uint32
> and far better than using 64*64 multiply which is
> a bit slower on older CPUs.

	Simon, any progress on this change? I can
continue and finish it if you prefer.

Regards

--
Julian Anastasov <ja@ssi.bg>


* Re: activeconns * weight overflowing 32-bit int
  2013-05-24  8:11           ` Julian Anastasov
  2013-08-05  6:10             ` Julian Anastasov
@ 2013-08-06  2:41             ` Simon Kirby
  2013-08-06  6:45               ` Julian Anastasov
  1 sibling, 1 reply; 16+ messages in thread
From: Simon Kirby @ 2013-08-06  2:41 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: lvs-devel, Changli Gao

Hello!

On Fri, May 24, 2013 at 11:11:35AM +0300, Julian Anastasov wrote:

> On Thu, 23 May 2013, Simon Kirby wrote:
> 
> > > 	Last time I also checked the assembler output from x86-32
> > > and it looked good. I used 'make net/netfilter/ipvs/ip_vs_wlc.s'
> > > to generate asm output, note the 's' extension.
> > > 
> > > 	Maybe the problem in your ipvs.c file comes from the
> > > fact that __u64 is unsigned long long on 32-bit platforms.
> > > I changed the code to use 'unsigned long' as follows:
> > > 
> > > #if 0
> > > typedef unsigned long long __u64;
> > > #else
> > > typedef unsigned long __u64;
> > > #endif
> > > 
> > > 	and the x86-32 platform works correctly.
> > > x86-64 works for both cases, 'unsigned long long' and
> > > 'unsigned long', but x86-32 generates many mul operations
> > > for the 'unsigned long long' case, which is not used in
> > > the kernel. That is why I didn't notice such a problem.
> > > 
> > > 	So, I think we should be safe by adding
> > > (__u64) without any (__u32); in the next days I'll check
> > > the asm output again. Maybe (__u32) before weight ensures
> > > a 32x32 multiply, but it should not be needed.
> > 
> > Hmm, I was comparing atomic_t being s32 versus u32, not __u64 being 'unsigned long long' versus 'unsigned long'. :)
> > Anyway, the .s results are much easier to read, and (closer to) reality!
> > I did a comparison with (__u64)loh * atomic_read(&dest->weight) versus
> > (__u64)loh * (__u32)atomic_read(&dest->weight) on both arches and uploaded
> > them to http://0x.ca/sim/ref/3.9-ipvs/. It's not a huge difference, but I
> > prefer the shorter/faster version. ;)
> 
> 	I now see why your patch shows a difference compared
> to my tests a month ago. This change is the culprit:
> 
> -       int loh, doh;
> +       unsigned int loh, doh;
> 
> 	It effectively changes the operation from:
> 
> (__u64/__s64) int * int
> 
> 	into
> 
> (__u64) unsigned int * int
> 
> 	that is why you fix it by using __u32:
> 
> (__u64) unsigned int * unsigned int
> 
> 	so that both operands are of the same 4-byte signedness.
> 
> 	I think we should keep loh and doh as int; both of the
> following solutions should generate a 32x32 multiply:
> 
> 1. same as my first email:
> 
> int loh, doh;
> 
> (__u64/__s64) loh * atomic_read(&dest->weight)
> 
> 	In this case I see only one difference between
> __u64 and __s64:
> 
> -       jb      .L41    #,
> -       ja      .L79    #,
> +       jl      .L41    #,
> +       jg      .L79    #,
> 
> 2. Your patch:
> 
> unsigned int loh, doh;
> 
> (__u64) loh * (__u32) atomic_read(&dest->weight)
> or
> (__s64) loh * (__u32) atomic_read(&dest->weight)

Did you mean here "(__s64) loh * (__s32) atomic_read(&dest->sweight)"?

If not, the results for me on GCC 4.7.2 were what I posted at
http://0x.ca/sim/ref/3.9-ipvs/.

> 	Both solutions generate code that differs only
> in imul vs. mul. On the internet I see that imul is
> preferred/faster than mul. That is why I prefer solution 1;
> it has fewer casts.
> 
> 	So, I think you can change your patch as follows:
> 
> 1. Use int for loh, doh. Note that some schedulers
> use 'unsigned int' and should be patched for this
> definition: NQ, SED, WLC
> 
> 2. Use (__u64) prefix only, no (__u32) before atomic_read:
> LBLC, LBLCR, NQ, SED, WLC
> 
> 	(__u64) loh * atomic_read(&dest->weight) ...
> 	(__u64) doh * ...
> 
> 3. Explain in commit message that we find the
> result64=int32*int32 faster than result64=uint32*uint32
> and far better than using 64*64 multiply which is
> a bit slower on older CPUs.

I found that u64*u32 was faster than u64*s32, but I didn't check s64*s32.
I checked now and found that u64*u32 and s64*s32 have the same number of
output instructions on i386, with just the different signedness tests.
Both actually use imul, since it's only the comparison afterwards that
cares about signedness.

But what actually makes sense? Do negative weights ever make sense?

Simon-


* Re: activeconns * weight overflowing 32-bit int
  2013-08-06  2:41             ` Simon Kirby
@ 2013-08-06  6:45               ` Julian Anastasov
  2013-08-08 23:54                 ` [PATCH] ipvs: Use 64-bit comparisons (connections * weight) to avoid overflow Simon Kirby
  0 siblings, 1 reply; 16+ messages in thread
From: Julian Anastasov @ 2013-08-06  6:45 UTC (permalink / raw)
  To: Simon Kirby; +Cc: lvs-devel, Changli Gao


	Hello,

On Mon, 5 Aug 2013, Simon Kirby wrote:

> > 1. same as my first email:
> > 
> > int loh, doh;
> > 
> > (__u64/__s64) loh * atomic_read(&dest->weight)
> > 
> > 	In this case I see only one difference between
> > __u64 and __s64:
> > 
> > -       jb      .L41    #,
> > -       ja      .L79    #,
> > +       jl      .L41    #,
> > +       jg      .L79    #,
> > 
> > 2. Your patch:
> > 
> > unsigned int loh, doh;
> > 
> > (__u64) loh * (__u32) atomic_read(&dest->weight)
> > or
> > (__s64) loh * (__u32) atomic_read(&dest->weight)
> 
> Did you mean here "(__s64) loh * (__s32) atomic_read(&dest->sweight)"?
> 
> If not, the results for me on GCC 4.7.2 were what I posted at
> http://0x.ca/sim/ref/3.9-ipvs/.

	Reading your reply, I see that the key issue you are
missing here is that the type of loh/doh matters; it should
match the type returned by atomic_read (and its signedness).
As atomic_t is int, we prefer loh and doh to be int.
The __u64/__s64 cast before loh/doh does not matter at all.
If both operands are not of the same signedness, the
32*32 optimization is not applied.
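
	For illustration (assuming gcc on x86-32; not taken from the
.s files discussed above):

typedef long long __s64;
typedef unsigned long long __u64;

void example(int i1, int i2, unsigned int u1, unsigned int u2)
{
	__s64 a = (__s64)i1 * i2; /* int * int: one widening imul */
	__u64 b = (__u64)u1 * u2; /* uint * uint: one widening mul */
	__u64 c = (__u64)u1 * i2; /* mixed: i2 is converted to __u64
				     first, so the 32*32=>64 shortcut
				     is not applied */
	(void)a; (void)b; (void)c;
}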

> > 	Both solutions generate code that differs only
> > in imul vs. mul. On the internet I see that imul is
> > preferred/faster than mul. That is why I prefer solution 1;
> > it has fewer casts.
> > 
> > 	So, I think you can change your patch as follows:
> > 
> > 1. Use int for loh, doh. Note that some schedulers
> > use 'unsigned int' and should be patched for this
> > definition: NQ, SED, WLC
> > 
> > 2. Use (__u64) prefix only, no (__u32) before atomic_read:
> > LBLC, LBLCR, NQ, SED, WLC
> > 
> > 	(__u64) loh * atomic_read(&dest->weight) ...
> > 	(__u64) doh * ...
> > 
> > 3. Explain in commit message that we find the
> > result64=int32*int32 faster than result64=uint32*uint32
> > and far better than using 64*64 multiply which is
> > a bit slower on older CPUs.
> 
> I found that u64*u32 was faster than u64*s32, but I didn't check s64*s32.

	It is faster because your loh/doh are u32 and a single mul is
generated, while for u64*s32 we have u64=u32*s32, which obviously
is not implemented with a single imul/mul. Our goal is a single imul.

> I checked now and found that u64*u32 and s64*s32 have the same number of
> output instructions on i386, with just the different signedness tests.
> Both actually use imul, since it's only the comparison afterwards that
> cares about signedness.

	IIRC, (u64) u32 * u32 uses mul if you compile
with optimizations (-O).

> But what actually makes sense? Do negative weights ever make sense?

	The __u64/__s64 cast does not matter because both
operands are positive values. As a result, this solution
looks better:

int loh, doh;

(__s64) loh * atomic_read(&dest->weight)

	because:

- both operands are 'int' => no extra casts before atomic_read

- 'int', not 'unsigned int' because imul is better than mul

Regards

--
Julian Anastasov <ja@ssi.bg>


* [PATCH] ipvs: Use 64-bit comparisons (connections * weight) to avoid overflow
  2013-08-06  6:45               ` Julian Anastasov
@ 2013-08-08 23:54                 ` Simon Kirby
  2013-08-09  9:02                   ` Julian Anastasov
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Kirby @ 2013-08-08 23:54 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: lvs-devel, Changli Gao

On Tue, Aug 06, 2013 at 09:45:24AM +0300, Julian Anastasov wrote:

> On Mon, 5 Aug 2013, Simon Kirby wrote:
> 
> > > 1. same as my first email:
> > > 
> > > int loh, doh;
> > > 
> > > (__u64/__s64) loh * atomic_read(&dest->weight)
> > > 
> > > 	In this case I see only one difference between
> > > __u64 and __s64:
> > > 
> > > -       jb      .L41    #,
> > > -       ja      .L79    #,
> > > +       jl      .L41    #,
> > > +       jg      .L79    #,
> > > 
> > > 2. Your patch:
> > > 
> > > unsigned int loh, doh;
> > > 
> > > (__u64) loh * (__u32) atomic_read(&dest->weight)
> > > or
> > > (__s64) loh * (__u32) atomic_read(&dest->weight)
> > 
> > Did you mean here "(__s64) loh * (__s32) atomic_read(&dest->weight)"?
> > 
> > If not, the results for me on GCC 4.7.2 were what I posted at
> > http://0x.ca/sim/ref/3.9-ipvs/.
> 
> 	Reading your reply, I see that the key issue you are
> missing here is that the type of loh/doh matters; it should
> match the type returned by atomic_read (and its signedness).

I'm not missing it, I was just trying to determine if the weight _should_
be signed or not. :) I've never seen a negative weight documented
anywhere.

> 	It is faster because your loh/doh are u32 and a single mul is
> generated, while for u64*s32 we have u64=u32*s32, which obviously
> is not implemented with a single imul/mul. Our goal is a single imul.

Ok, well, it turns out that GCC doesn't care and uses the single-operand
instruction form of imull anyway :) ->

sim@simonk:/d/linux$ diff -u <(sed 's/#.*//' /tmp/u64_u32.s) <(sed 's/#.*//' /tmp/s64_s32.s)
--- /dev/fd/63	2013-08-08 16:39:10.692645777 -0700
+++ /dev/fd/62	2013-08-08 16:39:10.692645777 -0700
@@ -125,22 +125,22 @@
 	testb	$2, %al	
 	jne	.L8	
 	movl	216(%ebx), %eax	
-	movl	220(%ebx), %esi	
+	movl	220(%ebx), %edx	
 	sall	$8, %eax	
-	addl	%eax, %esi	
+	leal	(%eax,%edx), %esi	
 	movl	44(%ebx), %eax	
 	movl	%eax, -24(%ebp)	
 	movl	44(%ecx), %eax	
 	movl	%eax, -28(%ebp)	
 	movl	-24(%ebp), %eax	
-	mull	-32(%ebp)	
+	imull	-32(%ebp)	
 	movl	%eax, -24(%ebp)	
 	movl	-28(%ebp), %eax	
 	movl	%edx, -20(%ebp)	
-	mull	%esi	
+	imull	%esi	
 	cmpl	%edx, -20(%ebp)	
-	ja	.L11	
-	jb	.L8	
+	jg	.L11	
+	jl	.L8	
 	cmpl	%eax, -24(%ebp)	
 	jbe	.L8	
 .L11:

That is the only output difference from ip_vs_wlc.c with (signed) int loh
and doh as opposed to unsigned int (and the atomic_t to u32 cast).

> > But what actually makes sense? Do negative weights ever make sense?
> 
> 	The __u64/__s64 cast does not matter because both
> operands are positive values. As a result, this solution
> looks better:
> 
> int loh, doh;
> 
> (__s64) loh * atomic_read(&dest->weight)
> 
> 	because:
> 
> - both operands are 'int' => no extra casts before atomic_read
> 
> - 'int', not 'unsigned int' because imul is better than mul

Ok, here is a patch (on current ipvs-next) that makes everything "int"
and adds only a (__s64) cast in front of the loh and doh during
multiplication to solve the original overflow problem.

Simon-


Some scheduling modules such as lblc and lblcr require weight to be as
high as the maximum number of expected active connections. Meanwhile,
commit b552f7e3a9524abcbcdf, which appeared in 2.6.39-rc, cleaned up the
consideration of inactconns and activeconns to always count activeconns
as 256 times more important than inactconns.

In our case, this exposed an integer overflow because we regularly exceed
3000 active connections to a real server. A weight of 3000 * 256 * 3000
connections overflows the signed integer used when determining when to
reschedule. The original factor of 50 did not overflow, though was close.

On amd64, this merely changes the multiply and comparison instructions to
64-bit. On x86, the 64-bit result is already present from imull, so only
a few more comparisons are emitted.

Signed-off-by: Simon Kirby <sim@hostway.ca>
---
 include/net/ip_vs.h              |  2 +-
 net/netfilter/ipvs/ip_vs_lblc.c  |  4 ++--
 net/netfilter/ipvs/ip_vs_lblcr.c | 12 ++++++------
 net/netfilter/ipvs/ip_vs_nq.c    |  6 +++---
 net/netfilter/ipvs/ip_vs_sed.c   |  6 +++---
 net/netfilter/ipvs/ip_vs_wlc.c   |  6 +++---
 6 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index f0d70f0..fe782ed 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1649,7 +1649,7 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
 /* CONFIG_IP_VS_NFCT */
 #endif
 
-static inline unsigned int
+static inline int
 ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
 {
 	/*
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 1383b0e..eb814bf 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -443,8 +443,8 @@ __ip_vs_lblc_schedule(struct ip_vs_service *svc)
 			continue;
 
 		doh = ip_vs_dest_conn_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index 5199448..e65f7c5 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -200,8 +200,8 @@ static inline struct ip_vs_dest *ip_vs_dest_set_min(struct ip_vs_dest_set *set)
 			continue;
 
 		doh = ip_vs_dest_conn_overhead(dest);
-		if ((loh * atomic_read(&dest->weight) >
-		     doh * atomic_read(&least->weight))
+		if (((__s64)loh * atomic_read(&dest->weight) >
+		     (__s64)doh * atomic_read(&least->weight))
 		    && (dest->flags & IP_VS_DEST_F_AVAILABLE)) {
 			least = dest;
 			loh = doh;
@@ -246,8 +246,8 @@ static inline struct ip_vs_dest *ip_vs_dest_set_max(struct ip_vs_dest_set *set)
 		dest = rcu_dereference_protected(e->dest, 1);
 		doh = ip_vs_dest_conn_overhead(dest);
 		/* moh/mw < doh/dw ==> moh*dw < doh*mw, where mw,dw>0 */
-		if ((moh * atomic_read(&dest->weight) <
-		     doh * atomic_read(&most->weight))
+		if (((__s64)moh * atomic_read(&dest->weight) <
+		     (__s64)doh * atomic_read(&most->weight))
 		    && (atomic_read(&dest->weight) > 0)) {
 			most = dest;
 			moh = doh;
@@ -611,8 +611,8 @@ __ip_vs_lblcr_schedule(struct ip_vs_service *svc)
 			continue;
 
 		doh = ip_vs_dest_conn_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_nq.c b/net/netfilter/ipvs/ip_vs_nq.c
index d8d9860..368b23e 100644
--- a/net/netfilter/ipvs/ip_vs_nq.c
+++ b/net/netfilter/ipvs/ip_vs_nq.c
@@ -59,7 +59,7 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		  struct ip_vs_iphdr *iph)
 {
 	struct ip_vs_dest *dest, *least = NULL;
-	unsigned int loh = 0, doh;
+	int loh = 0, doh;
 
 	IP_VS_DBG(6, "%s(): Scheduling...\n", __func__);
 
@@ -92,8 +92,8 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		}
 
 		if (!least ||
-		    (loh * atomic_read(&dest->weight) >
-		     doh * atomic_read(&least->weight))) {
+		    ((__s64)loh * atomic_read(&dest->weight) >
+		     (__s64)doh * atomic_read(&least->weight))) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_sed.c b/net/netfilter/ipvs/ip_vs_sed.c
index a5284cc..f90e7e6 100644
--- a/net/netfilter/ipvs/ip_vs_sed.c
+++ b/net/netfilter/ipvs/ip_vs_sed.c
@@ -63,7 +63,7 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		   struct ip_vs_iphdr *iph)
 {
 	struct ip_vs_dest *dest, *least;
-	unsigned int loh, doh;
+	int loh, doh;
 
 	IP_VS_DBG(6, "%s(): Scheduling...\n", __func__);
 
@@ -99,8 +99,8 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		if (dest->flags & IP_VS_DEST_F_OVERLOAD)
 			continue;
 		doh = ip_vs_sed_dest_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_wlc.c b/net/netfilter/ipvs/ip_vs_wlc.c
index 6dc1fa1..b5b4650 100644
--- a/net/netfilter/ipvs/ip_vs_wlc.c
+++ b/net/netfilter/ipvs/ip_vs_wlc.c
@@ -35,7 +35,7 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		   struct ip_vs_iphdr *iph)
 {
 	struct ip_vs_dest *dest, *least;
-	unsigned int loh, doh;
+	int loh, doh;
 
 	IP_VS_DBG(6, "ip_vs_wlc_schedule(): Scheduling...\n");
 
@@ -71,8 +71,8 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		if (dest->flags & IP_VS_DEST_F_OVERLOAD)
 			continue;
 		doh = ip_vs_dest_conn_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
-- 
1.8.4.rc1


* Re: [PATCH] ipvs: Use 64-bit comparisons (connections * weight) to avoid overflow
  2013-08-08 23:54                 ` [PATCH] ipvs: Use 64-bit comparisons (connections * weight) to avoid overflow Simon Kirby
@ 2013-08-09  9:02                   ` Julian Anastasov
  2013-08-10  8:26                     ` [PATCH v2] ipvs: fix overflow on dest weight multiply Simon Kirby
  0 siblings, 1 reply; 16+ messages in thread
From: Julian Anastasov @ 2013-08-09  9:02 UTC (permalink / raw)
  To: Simon Kirby; +Cc: lvs-devel, Changli Gao


	Hello,

On Thu, 8 Aug 2013, Simon Kirby wrote:

> On Tue, Aug 06, 2013 at 09:45:24AM +0300, Julian Anastasov wrote:
> 
> > 	Reading your reply I see that the key issue you are
> > missing here is that the type of loh/doh matters, it should
> > match the type of atomic_read (and its signedness).
> 
> I'm not missing it, I was just trying to determine if the weight _should_
> be signed or not. :) I've never seen a negative weight documented
> anywhere.

	Weight should be >= 0, there is a
'server weight less than zero' error otherwise.

> > 	It is faster because your loh/doh are u32 and a single mul is
> > generated, while for u64*s32 we have u64=u32*s32, which obviously
> > is not implemented with a single imul/mul. Our goal is a single imul.
> 
> Ok, well, it turns out that GCC doesn't care and uses the single-operand
> instruction form of imull anyway :) ->

	OK, so everything is as expected :)

> Ok, here is a patch (on current ipvs-next) that makes everything "int"
> and adds only a (__s64) cast in front of the loh and doh during
> multiplication to solve the original overflow problem.

	Very good, thanks!

> Simon-
> 
> 
> Some scheduling modules such as lblc and lblcr require weight to be as
> high as the maximum number of expected active connections. Meanwhile,
> commit b552f7e3a9524abcbcdf, which appeared in 2.6.39-rc, cleaned up the
> consideration of inactconns and activeconns to always count activeconns
> as 256 times more important than inactconns.
> 
> In our case, this exposed an integer overflow because we regularly exceed
> 3000 active connections to a real server. A weight of 3000 * 256 * 3000
> connections overflows the signed integer used when determining when to
> reschedule. The original factor of 50 did not overflow, though was close.
> 
> On amd64, this merely changes the multiply and comparison instructions to
> 64-bit. On x86, the 64-bit result is already present from imull, so only
> a few more comparisons are emitted.
> 
> Signed-off-by: Simon Kirby <sim@hostway.ca>

	Looks good to me, even better if you add a space
between the "(__s64)" cast and "loh"/"doh".

	But after your fix for ip_vs_dest_conn_overhead
I see that ip_vs_nq_dest_overhead and ip_vs_sed_dest_overhead
also need to return int instead of unsigned int. I'll ack
v2 with these changes.

	Also, a shorter subject is preferred; you can use
'ipvs: fix overflow on dest weight multiply' or something
else that you feel is better. '()' and '*' do not look
good in a subject. Thanks!

> ---
>  include/net/ip_vs.h              |  2 +-
>  net/netfilter/ipvs/ip_vs_lblc.c  |  4 ++--
>  net/netfilter/ipvs/ip_vs_lblcr.c | 12 ++++++------
>  net/netfilter/ipvs/ip_vs_nq.c    |  6 +++---
>  net/netfilter/ipvs/ip_vs_sed.c   |  6 +++---
>  net/netfilter/ipvs/ip_vs_wlc.c   |  6 +++---
>  6 files changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index f0d70f0..fe782ed 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -1649,7 +1649,7 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
>  /* CONFIG_IP_VS_NFCT */
>  #endif
>  
> -static inline unsigned int
> +static inline int
>  ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
>  {
>  	/*

Regards

--
Julian Anastasov <ja@ssi.bg>


* [PATCH v2] ipvs: fix overflow on dest weight multiply
  2013-08-09  9:02                   ` Julian Anastasov
@ 2013-08-10  8:26                     ` Simon Kirby
  2013-08-10 12:31                       ` Julian Anastasov
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Kirby @ 2013-08-10  8:26 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: lvs-devel, Changli Gao

On Fri, Aug 09, 2013 at 12:02:11PM +0300, Julian Anastasov wrote:

> > > 	It is faster because your loh/doh are u32 and a single mul is
> > > generated, while for u64*s32 we have u64=u32*s32, which obviously
> > > is not implemented with a single imul/mul. Our goal is a single imul.
> > 
> > Ok, well, it turns out that GCC doesn't care and uses the single-operand
> > instruction form of imull anyway :) ->
> 
> 	OK, so everything is as expected :)
> 
> > Ok, here is a patch (on current ipvs-next) that makes everything "int"
> > and adds only a (__s64) cast in front of the loh and doh during
> > multiplication to solve the original overflow problem.
> 
> 	Very good, thanks!
> 
> > Some scheduling modules such as lblc and lblcr require weight to be as
> > high as the maximum number of expected active connections. Meanwhile,
> > commit b552f7e3a9524abcbcdf, which appeared in 2.6.39-rc, cleaned up the
> > consideration of inactconns and activeconns to always count activeconns
> > as 256 times more important than inactconns.
> > 
> > In our case, this exposed an integer overflow because we regularly exceed
> > 3000 active connections to a real server. A weight of 3000 * 256 * 3000
> > connections overflows the signed integer used when determining when to
> > reschedule. The original factor of 50 did not overflow, though was close.
> > 
> > On amd64, this merely changes the multiply and comparison instructions to
> > 64-bit. On x86, the 64-bit result is already present from imull, so only
> > a few more comparisons are emitted.
> > 
> > Signed-off-by: Simon Kirby <sim@hostway.ca>
> 
> 	Looks good to me, even if you add space
> between "(__s64)" cast and "loh"/"doh".

I think (__s64)loh * doh makes more sense as the cast applies to the
variable before the multiply is evaluated.

> 	But after your fix for ip_vs_dest_conn_overhead
> > I see that ip_vs_nq_dest_overhead and ip_vs_sed_dest_overhead
> > also need to return int instead of unsigned int. I'll ack
> v2 with these changes.

Ok, fixed. :)

> 	Also, shorter subject is preferred, you can use
> 'ipvs: fix overflow on dest weight multiply' or something
> else that you feel is better, '()' and '*' does not look
> good in subject. Thanks!

-- 8< --

Schedulers such as lblc and lblcr require the weight to be as high as the
maximum number of active connections. In commit b552f7e3a9524abcbcdf, the
consideration of inactconns and activeconns was cleaned up to always
count activeconns as 256 times more important than inactconns. In cases
where 3000 or more connections are expected, a weight of 3000 * 256 *
3000 connections overflows the 32-bit signed result used to determine if
rescheduling is required.

On amd64, this merely changes the multiply and comparison instructions to
64-bit. On x86, a 64-bit result is already present from imull, so only
a few more comparison instructions are emitted.

Signed-off-by: Simon Kirby <sim@hostway.ca>
---
 include/net/ip_vs.h              |  2 +-
 net/netfilter/ipvs/ip_vs_lblc.c  |  4 ++--
 net/netfilter/ipvs/ip_vs_lblcr.c | 12 ++++++------
 net/netfilter/ipvs/ip_vs_nq.c    |  8 ++++----
 net/netfilter/ipvs/ip_vs_sed.c   |  8 ++++----
 net/netfilter/ipvs/ip_vs_wlc.c   |  6 +++---
 6 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index f0d70f0..fe782ed 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1649,7 +1649,7 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
 /* CONFIG_IP_VS_NFCT */
 #endif
 
-static inline unsigned int
+static inline int
 ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
 {
 	/*
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 1383b0e..eb814bf 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -443,8 +443,8 @@ __ip_vs_lblc_schedule(struct ip_vs_service *svc)
 			continue;
 
 		doh = ip_vs_dest_conn_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index 5199448..e65f7c5 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -200,8 +200,8 @@ static inline struct ip_vs_dest *ip_vs_dest_set_min(struct ip_vs_dest_set *set)
 			continue;
 
 		doh = ip_vs_dest_conn_overhead(dest);
-		if ((loh * atomic_read(&dest->weight) >
-		     doh * atomic_read(&least->weight))
+		if (((__s64)loh * atomic_read(&dest->weight) >
+		     (__s64)doh * atomic_read(&least->weight))
 		    && (dest->flags & IP_VS_DEST_F_AVAILABLE)) {
 			least = dest;
 			loh = doh;
@@ -246,8 +246,8 @@ static inline struct ip_vs_dest *ip_vs_dest_set_max(struct ip_vs_dest_set *set)
 		dest = rcu_dereference_protected(e->dest, 1);
 		doh = ip_vs_dest_conn_overhead(dest);
 		/* moh/mw < doh/dw ==> moh*dw < doh*mw, where mw,dw>0 */
-		if ((moh * atomic_read(&dest->weight) <
-		     doh * atomic_read(&most->weight))
+		if (((__s64)moh * atomic_read(&dest->weight) <
+		     (__s64)doh * atomic_read(&most->weight))
 		    && (atomic_read(&dest->weight) > 0)) {
 			most = dest;
 			moh = doh;
@@ -611,8 +611,8 @@ __ip_vs_lblcr_schedule(struct ip_vs_service *svc)
 			continue;
 
 		doh = ip_vs_dest_conn_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_nq.c b/net/netfilter/ipvs/ip_vs_nq.c
index d8d9860..961a6de 100644
--- a/net/netfilter/ipvs/ip_vs_nq.c
+++ b/net/netfilter/ipvs/ip_vs_nq.c
@@ -40,7 +40,7 @@
 #include <net/ip_vs.h>
 
 
-static inline unsigned int
+static inline int
 ip_vs_nq_dest_overhead(struct ip_vs_dest *dest)
 {
 	/*
@@ -59,7 +59,7 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		  struct ip_vs_iphdr *iph)
 {
 	struct ip_vs_dest *dest, *least = NULL;
-	unsigned int loh = 0, doh;
+	int loh = 0, doh;
 
 	IP_VS_DBG(6, "%s(): Scheduling...\n", __func__);
 
@@ -92,8 +92,8 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		}
 
 		if (!least ||
-		    (loh * atomic_read(&dest->weight) >
-		     doh * atomic_read(&least->weight))) {
+		    ((__s64)loh * atomic_read(&dest->weight) >
+		     (__s64)doh * atomic_read(&least->weight))) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_sed.c b/net/netfilter/ipvs/ip_vs_sed.c
index a5284cc..e446b9f 100644
--- a/net/netfilter/ipvs/ip_vs_sed.c
+++ b/net/netfilter/ipvs/ip_vs_sed.c
@@ -44,7 +44,7 @@
 #include <net/ip_vs.h>
 
 
-static inline unsigned int
+static inline int
 ip_vs_sed_dest_overhead(struct ip_vs_dest *dest)
 {
 	/*
@@ -63,7 +63,7 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		   struct ip_vs_iphdr *iph)
 {
 	struct ip_vs_dest *dest, *least;
-	unsigned int loh, doh;
+	int loh, doh;
 
 	IP_VS_DBG(6, "%s(): Scheduling...\n", __func__);
 
@@ -99,8 +99,8 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		if (dest->flags & IP_VS_DEST_F_OVERLOAD)
 			continue;
 		doh = ip_vs_sed_dest_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_wlc.c b/net/netfilter/ipvs/ip_vs_wlc.c
index 6dc1fa1..b5b4650 100644
--- a/net/netfilter/ipvs/ip_vs_wlc.c
+++ b/net/netfilter/ipvs/ip_vs_wlc.c
@@ -35,7 +35,7 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		   struct ip_vs_iphdr *iph)
 {
 	struct ip_vs_dest *dest, *least;
-	unsigned int loh, doh;
+	int loh, doh;
 
 	IP_VS_DBG(6, "ip_vs_wlc_schedule(): Scheduling...\n");
 
@@ -71,8 +71,8 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		if (dest->flags & IP_VS_DEST_F_OVERLOAD)
 			continue;
 		doh = ip_vs_dest_conn_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
-- 
1.8.4.rc1



* Re: [PATCH v2] ipvs: fix overflow on dest weight multiply
  2013-08-10  8:26                     ` [PATCH v2] ipvs: fix overflow on dest weight multiply Simon Kirby
@ 2013-08-10 12:31                       ` Julian Anastasov
  2013-08-13  2:23                         ` Simon Horman
  0 siblings, 1 reply; 16+ messages in thread
From: Julian Anastasov @ 2013-08-10 12:31 UTC (permalink / raw)
  To: Simon Kirby; +Cc: lvs-devel, Changli Gao, Simon Horman


	Hello,

On Sat, 10 Aug 2013, Simon Kirby wrote:

> On Fri, Aug 09, 2013 at 12:02:11PM +0300, Julian Anastasov wrote:
> 
> > 	Looks good to me, even if you add space
> > between "(__s64)" cast and "loh"/"doh".
> 
> I think (__s64)loh * doh makes more sense as the cast applies to the
> variable before the multiply is evaluated.

	OK

> > 	But after your fix for ip_vs_dest_conn_overhead
> > > I see that ip_vs_nq_dest_overhead and ip_vs_sed_dest_overhead
> > > also need to return int instead of unsigned int. I'll ack
> > v2 with these changes.
> 
> Ok, fixed. :)

	Thanks!

> > 	Also, shorter subject is preferred, you can use
> > 'ipvs: fix overflow on dest weight multiply' or something
> > else that you feel is better, '()' and '*' does not look
> > good in subject. Thanks!
> 
> -- 8< --
> 
> Schedulers such as lblc and lblcr require the weight to be as high as the
> maximum number of active connections. In commit b552f7e3a9524abcbcdf, the
> consideration of inactconns and activeconns was cleaned up to always
> count activeconns as 256 times more important than inactconns. In cases
> where 3000 or more connections are expected, a weight of 3000 * 256 *
> 3000 connections overflows the 32-bit signed result used to determine if
> rescheduling is required.
> 
> On amd64, this merely changes the multiply and comparison instructions to
> 64-bit. On x86, a 64-bit result is already present from imull, so only
> a few more comparison instructions are emitted.
> 
> Signed-off-by: Simon Kirby <sim@hostway.ca>

Acked-by: Julian Anastasov <ja@ssi.bg>

	Horms, please apply!

> ---
>  include/net/ip_vs.h              |  2 +-
>  net/netfilter/ipvs/ip_vs_lblc.c  |  4 ++--
>  net/netfilter/ipvs/ip_vs_lblcr.c | 12 ++++++------
>  net/netfilter/ipvs/ip_vs_nq.c    |  8 ++++----
>  net/netfilter/ipvs/ip_vs_sed.c   |  8 ++++----
>  net/netfilter/ipvs/ip_vs_wlc.c   |  6 +++---
>  6 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
> index f0d70f0..fe782ed 100644
> --- a/include/net/ip_vs.h
> +++ b/include/net/ip_vs.h
> @@ -1649,7 +1649,7 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
>  /* CONFIG_IP_VS_NFCT */
>  #endif
>  
> -static inline unsigned int
> +static inline int
>  ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
>  {
>  	/*
> diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
> index 1383b0e..eb814bf 100644
> --- a/net/netfilter/ipvs/ip_vs_lblc.c
> +++ b/net/netfilter/ipvs/ip_vs_lblc.c
> @@ -443,8 +443,8 @@ __ip_vs_lblc_schedule(struct ip_vs_service *svc)
>  			continue;
>  
>  		doh = ip_vs_dest_conn_overhead(dest);
> -		if (loh * atomic_read(&dest->weight) >
> -		    doh * atomic_read(&least->weight)) {
> +		if ((__s64)loh * atomic_read(&dest->weight) >
> +		    (__s64)doh * atomic_read(&least->weight)) {
>  			least = dest;
>  			loh = doh;
>  		}
> diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
> index 5199448..e65f7c5 100644
> --- a/net/netfilter/ipvs/ip_vs_lblcr.c
> +++ b/net/netfilter/ipvs/ip_vs_lblcr.c
> @@ -200,8 +200,8 @@ static inline struct ip_vs_dest *ip_vs_dest_set_min(struct ip_vs_dest_set *set)
>  			continue;
>  
>  		doh = ip_vs_dest_conn_overhead(dest);
> -		if ((loh * atomic_read(&dest->weight) >
> -		     doh * atomic_read(&least->weight))
> +		if (((__s64)loh * atomic_read(&dest->weight) >
> +		     (__s64)doh * atomic_read(&least->weight))
>  		    && (dest->flags & IP_VS_DEST_F_AVAILABLE)) {
>  			least = dest;
>  			loh = doh;
> @@ -246,8 +246,8 @@ static inline struct ip_vs_dest *ip_vs_dest_set_max(struct ip_vs_dest_set *set)
>  		dest = rcu_dereference_protected(e->dest, 1);
>  		doh = ip_vs_dest_conn_overhead(dest);
>  		/* moh/mw < doh/dw ==> moh*dw < doh*mw, where mw,dw>0 */
> -		if ((moh * atomic_read(&dest->weight) <
> -		     doh * atomic_read(&most->weight))
> +		if (((__s64)moh * atomic_read(&dest->weight) <
> +		     (__s64)doh * atomic_read(&most->weight))
>  		    && (atomic_read(&dest->weight) > 0)) {
>  			most = dest;
>  			moh = doh;
> @@ -611,8 +611,8 @@ __ip_vs_lblcr_schedule(struct ip_vs_service *svc)
>  			continue;
>  
>  		doh = ip_vs_dest_conn_overhead(dest);
> -		if (loh * atomic_read(&dest->weight) >
> -		    doh * atomic_read(&least->weight)) {
> +		if ((__s64)loh * atomic_read(&dest->weight) >
> +		    (__s64)doh * atomic_read(&least->weight)) {
>  			least = dest;
>  			loh = doh;
>  		}
> diff --git a/net/netfilter/ipvs/ip_vs_nq.c b/net/netfilter/ipvs/ip_vs_nq.c
> index d8d9860..961a6de 100644
> --- a/net/netfilter/ipvs/ip_vs_nq.c
> +++ b/net/netfilter/ipvs/ip_vs_nq.c
> @@ -40,7 +40,7 @@
>  #include <net/ip_vs.h>
>  
>  
> -static inline unsigned int
> +static inline int
>  ip_vs_nq_dest_overhead(struct ip_vs_dest *dest)
>  {
>  	/*
> @@ -59,7 +59,7 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
>  		  struct ip_vs_iphdr *iph)
>  {
>  	struct ip_vs_dest *dest, *least = NULL;
> -	unsigned int loh = 0, doh;
> +	int loh = 0, doh;
>  
>  	IP_VS_DBG(6, "%s(): Scheduling...\n", __func__);
>  
> @@ -92,8 +92,8 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
>  		}
>  
>  		if (!least ||
> -		    (loh * atomic_read(&dest->weight) >
> -		     doh * atomic_read(&least->weight))) {
> +		    ((__s64)loh * atomic_read(&dest->weight) >
> +		     (__s64)doh * atomic_read(&least->weight))) {
>  			least = dest;
>  			loh = doh;
>  		}
> diff --git a/net/netfilter/ipvs/ip_vs_sed.c b/net/netfilter/ipvs/ip_vs_sed.c
> index a5284cc..e446b9f 100644
> --- a/net/netfilter/ipvs/ip_vs_sed.c
> +++ b/net/netfilter/ipvs/ip_vs_sed.c
> @@ -44,7 +44,7 @@
>  #include <net/ip_vs.h>
>  
>  
> -static inline unsigned int
> +static inline int
>  ip_vs_sed_dest_overhead(struct ip_vs_dest *dest)
>  {
>  	/*
> @@ -63,7 +63,7 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
>  		   struct ip_vs_iphdr *iph)
>  {
>  	struct ip_vs_dest *dest, *least;
> -	unsigned int loh, doh;
> +	int loh, doh;
>  
>  	IP_VS_DBG(6, "%s(): Scheduling...\n", __func__);
>  
> @@ -99,8 +99,8 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
>  		if (dest->flags & IP_VS_DEST_F_OVERLOAD)
>  			continue;
>  		doh = ip_vs_sed_dest_overhead(dest);
> -		if (loh * atomic_read(&dest->weight) >
> -		    doh * atomic_read(&least->weight)) {
> +		if ((__s64)loh * atomic_read(&dest->weight) >
> +		    (__s64)doh * atomic_read(&least->weight)) {
>  			least = dest;
>  			loh = doh;
>  		}
> diff --git a/net/netfilter/ipvs/ip_vs_wlc.c b/net/netfilter/ipvs/ip_vs_wlc.c
> index 6dc1fa1..b5b4650 100644
> --- a/net/netfilter/ipvs/ip_vs_wlc.c
> +++ b/net/netfilter/ipvs/ip_vs_wlc.c
> @@ -35,7 +35,7 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
>  		   struct ip_vs_iphdr *iph)
>  {
>  	struct ip_vs_dest *dest, *least;
> -	unsigned int loh, doh;
> +	int loh, doh;
>  
>  	IP_VS_DBG(6, "ip_vs_wlc_schedule(): Scheduling...\n");
>  
> @@ -71,8 +71,8 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
>  		if (dest->flags & IP_VS_DEST_F_OVERLOAD)
>  			continue;
>  		doh = ip_vs_dest_conn_overhead(dest);
> -		if (loh * atomic_read(&dest->weight) >
> -		    doh * atomic_read(&least->weight)) {
> +		if ((__s64)loh * atomic_read(&dest->weight) >
> +		    (__s64)doh * atomic_read(&least->weight)) {
>  			least = dest;
>  			loh = doh;
>  		}
> -- 
> 1.8.4.rc1

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 16+ messages in thread
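
To make the overflow concrete, here is a minimal userspace sketch of the
arithmetic in the commit message above (plain C with int64_t standing in
for the kernel's __s64; the variable names are illustrative, not the
kernel's):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	int weight = 3000;		/* dest->weight */
	int activeconns = 3000;
	/* ip_vs_dest_conn_overhead() result with inactconns == 0 */
	int overhead = activeconns << 8;	/* 768000 */

	/* 768000 * 3000 = 2304000000 does not fit in a 32-bit int
	 * (INT_MAX is 2147483647); the old code computed exactly this
	 * product. Done here via uint32_t to show the wrapped value
	 * without relying on signed-overflow behaviour.
	 */
	int bad = (int)((uint32_t)overhead * (uint32_t)weight);

	/* 64-bit multiply, as in the patch: the cast widens one
	 * operand, so the whole multiply is performed in 64 bits.
	 */
	int64_t good = (int64_t)overhead * weight;

	printf("32-bit: %d\n", bad);			/* -1990967296 */
	printf("64-bit: %lld\n", (long long)good);	/* 2304000000 */
	return 0;
}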

* Re: [PATCH v2] ipvs: fix overflow on dest weight multiply
  2013-08-10 12:31                       ` Julian Anastasov
@ 2013-08-13  2:23                         ` Simon Horman
  2013-08-13  4:45                           ` Simon Horman
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Horman @ 2013-08-13  2:23 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: Simon Kirby, lvs-devel, Changli Gao

On Sat, Aug 10, 2013 at 03:31:03PM +0300, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Sat, 10 Aug 2013, Simon Kirby wrote:
> 
> > On Fri, Aug 09, 2013 at 12:02:11PM +0300, Julian Anastasov wrote:
> > 
> > > 	Looks good to me, though it would read better with a space
> > > between the "(__s64)" cast and "loh"/"doh".
> > 
> > I think (__s64)loh * doh makes more sense as the cast applies to the
> > variable before the multiply is evaluated.
> 
> 	OK
> 
> > > 	But after your fix for ip_vs_dest_conn_overhead
> > > I see that also ip_vs_nq_dest_overhead and ip_vs_sed_dest_overhead
> > > need to return int instead of unsigned int. I'll ack
> > > v2 with these changes.
> > 
> > Ok, fixed. :)
> 
> 	Thanks!
> 
> > > 	Also, a shorter subject is preferred; you can use
> > > 'ipvs: fix overflow on dest weight multiply' or something
> > > else that you feel is better, as '()' and '*' do not look
> > > good in a subject. Thanks!
> > 
> > -- 8< --
> > 
> > Schedulers such as lblc and lblcr require the weight to be as high as the
> > maximum number of active connections. In commit b552f7e3a9524abcbcdf, the
> > consideration of inactconns and activeconns was cleaned up to always
> > count activeconns as 256 times more important than inactconns. In cases
> > where 3000 or more connections are expected, a weight of 3000 * 256 *
> > 3000 connections overflows the 32-bit signed result used to determine if
> > rescheduling is required.
> > 
> > On amd64, this merely changes the multiply and comparison instructions to
> > 64-bit. On x86, a 64-bit result is already present from imull, so only
> > a few more comparison instructions are emitted.
> > 
> > Signed-off-by: Simon Kirby <sim@hostway.ca>
> 
> Acked-by: Julian Anastasov <ja@ssi.bg>
> 
> 	Horms, please apply!

Sure, will do.

I am on vacation until the 21st and thus my net access is somewhat
sporadic. I apologise that this may delay me pushing this patch to
ipvs-next.

^ permalink raw reply	[flat|nested] 16+ messages in thread
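
The cast placement discussed in the exchange above matters because of
C's usual arithmetic conversions: casting one operand widens the whole
multiply, while casting the product converts it only after the 32-bit
multiply has already (possibly) overflowed. A minimal sketch, assuming
plain C with int64_t in place of __s64:

#include <stdint.h>

/* loh, doh and the weights are 32-bit ints, as in the schedulers. */
static int64_t wide_product(int loh, int dw)
{
	/* As in the patch: (__s64)loh converts the operand first,
	 * so the multiply itself is performed in 64 bits.
	 */
	return (int64_t)loh * dw;
}

static int64_t late_cast(int loh, int dw)
{
	/* Too late: loh * dw is evaluated as a 32-bit multiply and
	 * may already have overflowed before the cast applies.
	 */
	return (int64_t)(loh * dw);
}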

* Re: [PATCH v2] ipvs: fix overflow on dest weight multiply
  2013-08-13  2:23                         ` Simon Horman
@ 2013-08-13  4:45                           ` Simon Horman
  0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2013-08-13  4:45 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: Simon Kirby, lvs-devel, Changli Gao

On Tue, Aug 13, 2013 at 12:23:48PM +1000, Simon Horman wrote:
> On Sat, Aug 10, 2013 at 03:31:03PM +0300, Julian Anastasov wrote:
> > 
> > 	Hello,
> > 
> > On Sat, 10 Aug 2013, Simon Kirby wrote:
> > 
> > > On Fri, Aug 09, 2013 at 12:02:11PM +0300, Julian Anastasov wrote:
> > > 
> > > > 	Looks good to me, though it would read better with a space
> > > > between the "(__s64)" cast and "loh"/"doh".
> > > 
> > > I think (__s64)loh * doh makes more sense as the cast applies to the
> > > variable before the multiply is evaluated.
> > 
> > 	OK
> > 
> > > > 	But after your fix for ip_vs_dest_conn_overhead
> > > > I see that also ip_vs_nq_dest_overhead and ip_vs_sed_dest_overhead
> > > > need to return int instead of unsigned int. I'll ack
> > > > v2 with these changes.
> > > 
> > > Ok, fixed. :)
> > 
> > 	Thanks!
> > 
> > > > 	Also, a shorter subject is preferred; you can use
> > > > 'ipvs: fix overflow on dest weight multiply' or something
> > > > else that you feel is better, as '()' and '*' do not look
> > > > good in a subject. Thanks!
> > > 
> > > -- 8< --
> > > 
> > > Schedulers such as lblc and lblcr require the weight to be as high as the
> > > maximum number of active connections. In commit b552f7e3a9524abcbcdf, the
> > > consideration of inactconns and activeconns was cleaned up to always
> > > count activeconns as 256 times more important than inactconns. In cases
> > > where 3000 or more connections are expected, a weight of 3000 * 256 *
> > > 3000 connections overflows the 32-bit signed result used to determine if
> > > rescheduling is required.
> > > 
> > > On amd64, this merely changes the multiply and comparison instructions to
> > > 64-bit. On x86, a 64-bit result is already present from imull, so only
> > > a few more comparison instructions are emitted.
> > > 
> > > Signed-off-by: Simon Kirby <sim@hostway.ca>
> > 
> > Acked-by: Julian Anastasov <ja@ssi.bg>
> > 
> > 	Horms, please apply!
> 
> Sure, will do.
> 
> I am on vacation until the 21st and thus my net access is somewhat
> sporadic. I apologise that this may delay me pushing this patch to
> ipvs-next.

I have pushed the following. Thanks everyone.

commit 6b702e9baa89684949c1e181941aec3d3305d73f
Author: Simon Kirby <sim@hostway.ca>
Date:   Sat Aug 10 01:26:18 2013 -0700

    ipvs: fix overflow on dest weight multiply
    
    Schedulers such as lblc and lblcr require the weight to be as high as the
    maximum number of active connections. In commit b552f7e3a9524abcbcdf, the
    consideration of inactconns and activeconns was cleaned up to always
    count activeconns as 256 times more important than inactconns. In cases
    where 3000 or more connections are expected, a weight of 3000 * 256 *
    3000 connections overflows the 32-bit signed result used to determine if
    rescheduling is required.
    
    On amd64, this merely changes the multiply and comparison instructions to
    64-bit. On x86, a 64-bit result is already present from imull, so only
    a few more comparison instructions are emitted.
    
    Signed-off-by: Simon Kirby <sim@hostway.ca>
    Acked-by: Julian Anastasov <ja@ssi.bg>
    Signed-off-by: Simon Horman <horms@verge.net.au>

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index f0d70f0..fe782ed 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1649,7 +1649,7 @@ static inline void ip_vs_conn_drop_conntrack(struct ip_vs_conn *cp)
 /* CONFIG_IP_VS_NFCT */
 #endif
 
-static inline unsigned int
+static inline int
 ip_vs_dest_conn_overhead(struct ip_vs_dest *dest)
 {
 	/*
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 1383b0e..eb814bf 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -443,8 +443,8 @@ __ip_vs_lblc_schedule(struct ip_vs_service *svc)
 			continue;
 
 		doh = ip_vs_dest_conn_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index 5199448..e65f7c5 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -200,8 +200,8 @@ static inline struct ip_vs_dest *ip_vs_dest_set_min(struct ip_vs_dest_set *set)
 			continue;
 
 		doh = ip_vs_dest_conn_overhead(dest);
-		if ((loh * atomic_read(&dest->weight) >
-		     doh * atomic_read(&least->weight))
+		if (((__s64)loh * atomic_read(&dest->weight) >
+		     (__s64)doh * atomic_read(&least->weight))
 		    && (dest->flags & IP_VS_DEST_F_AVAILABLE)) {
 			least = dest;
 			loh = doh;
@@ -246,8 +246,8 @@ static inline struct ip_vs_dest *ip_vs_dest_set_max(struct ip_vs_dest_set *set)
 		dest = rcu_dereference_protected(e->dest, 1);
 		doh = ip_vs_dest_conn_overhead(dest);
 		/* moh/mw < doh/dw ==> moh*dw < doh*mw, where mw,dw>0 */
-		if ((moh * atomic_read(&dest->weight) <
-		     doh * atomic_read(&most->weight))
+		if (((__s64)moh * atomic_read(&dest->weight) <
+		     (__s64)doh * atomic_read(&most->weight))
 		    && (atomic_read(&dest->weight) > 0)) {
 			most = dest;
 			moh = doh;
@@ -611,8 +611,8 @@ __ip_vs_lblcr_schedule(struct ip_vs_service *svc)
 			continue;
 
 		doh = ip_vs_dest_conn_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_nq.c b/net/netfilter/ipvs/ip_vs_nq.c
index d8d9860..961a6de 100644
--- a/net/netfilter/ipvs/ip_vs_nq.c
+++ b/net/netfilter/ipvs/ip_vs_nq.c
@@ -40,7 +40,7 @@
 #include <net/ip_vs.h>
 
 
-static inline unsigned int
+static inline int
 ip_vs_nq_dest_overhead(struct ip_vs_dest *dest)
 {
 	/*
@@ -59,7 +59,7 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		  struct ip_vs_iphdr *iph)
 {
 	struct ip_vs_dest *dest, *least = NULL;
-	unsigned int loh = 0, doh;
+	int loh = 0, doh;
 
 	IP_VS_DBG(6, "%s(): Scheduling...\n", __func__);
 
@@ -92,8 +92,8 @@ ip_vs_nq_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		}
 
 		if (!least ||
-		    (loh * atomic_read(&dest->weight) >
-		     doh * atomic_read(&least->weight))) {
+		    ((__s64)loh * atomic_read(&dest->weight) >
+		     (__s64)doh * atomic_read(&least->weight))) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_sed.c b/net/netfilter/ipvs/ip_vs_sed.c
index a5284cc..e446b9f 100644
--- a/net/netfilter/ipvs/ip_vs_sed.c
+++ b/net/netfilter/ipvs/ip_vs_sed.c
@@ -44,7 +44,7 @@
 #include <net/ip_vs.h>
 
 
-static inline unsigned int
+static inline int
 ip_vs_sed_dest_overhead(struct ip_vs_dest *dest)
 {
 	/*
@@ -63,7 +63,7 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		   struct ip_vs_iphdr *iph)
 {
 	struct ip_vs_dest *dest, *least;
-	unsigned int loh, doh;
+	int loh, doh;
 
 	IP_VS_DBG(6, "%s(): Scheduling...\n", __func__);
 
@@ -99,8 +99,8 @@ ip_vs_sed_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		if (dest->flags & IP_VS_DEST_F_OVERLOAD)
 			continue;
 		doh = ip_vs_sed_dest_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}
diff --git a/net/netfilter/ipvs/ip_vs_wlc.c b/net/netfilter/ipvs/ip_vs_wlc.c
index 6dc1fa1..b5b4650 100644
--- a/net/netfilter/ipvs/ip_vs_wlc.c
+++ b/net/netfilter/ipvs/ip_vs_wlc.c
@@ -35,7 +35,7 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		   struct ip_vs_iphdr *iph)
 {
 	struct ip_vs_dest *dest, *least;
-	unsigned int loh, doh;
+	int loh, doh;
 
 	IP_VS_DBG(6, "ip_vs_wlc_schedule(): Scheduling...\n");
 
@@ -71,8 +71,8 @@ ip_vs_wlc_schedule(struct ip_vs_service *svc, const struct sk_buff *skb,
 		if (dest->flags & IP_VS_DEST_F_OVERLOAD)
 			continue;
 		doh = ip_vs_dest_conn_overhead(dest);
-		if (loh * atomic_read(&dest->weight) >
-		    doh * atomic_read(&least->weight)) {
+		if ((__s64)loh * atomic_read(&dest->weight) >
+		    (__s64)doh * atomic_read(&least->weight)) {
 			least = dest;
 			loh = doh;
 		}

^ permalink raw reply related	[flat|nested] 16+ messages in thread
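
All of the hunks above follow one pattern: pick the destination with the
lowest overhead-to-weight ratio by cross-multiplying, since
loh/lw > doh/dw <=> loh*dw > doh*lw for positive weights -- a division
is avoided, but the products must fit, which is why they are widened to
64 bits. A minimal userspace sketch of that selection loop (hypothetical
struct and names, not the kernel's struct ip_vs_dest):

#include <stdint.h>
#include <stddef.h>

struct dest {
	int weight;
	int overhead;	/* e.g. (activeconns << 8) + inactconns */
};

/* Return the entry with the lowest overhead/weight ratio, or NULL.
 * The comparison mirrors the patched schedulers: products are taken
 * in 64 bits so large overhead * weight values cannot wrap.
 */
static struct dest *pick_least(struct dest *d, size_t n)
{
	struct dest *least = NULL;
	int loh = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int doh;

		if (d[i].weight <= 0)
			continue;
		doh = d[i].overhead;
		if (!least ||
		    (int64_t)loh * d[i].weight >
		    (int64_t)doh * least->weight) {
			least = &d[i];
			loh = doh;
		}
	}
	return least;
}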

end of thread, other threads:[~2013-08-13  4:45 UTC | newest]

Thread overview: 16+ messages
-- links below jump to the message on this page --
2013-04-13  6:43 activeconns * weight overflowing 32-bit int Simon Kirby
2013-04-13 15:10 ` Julian Anastasov
2013-05-22  6:18   ` Julian Anastasov
2013-05-23 16:58     ` Simon Kirby
2013-05-23 20:44       ` Julian Anastasov
2013-05-24  0:43         ` Simon Kirby
2013-05-24  8:11           ` Julian Anastasov
2013-08-05  6:10             ` Julian Anastasov
2013-08-06  2:41             ` Simon Kirby
2013-08-06  6:45               ` Julian Anastasov
2013-08-08 23:54                 ` [PATCH] ipvs: Use 64-bit comparisons (connections * weight) to avoid overflow Simon Kirby
2013-08-09  9:02                   ` Julian Anastasov
2013-08-10  8:26                     ` [PATCH v2] ipvs: fix overflow on dest weight multiply Simon Kirby
2013-08-10 12:31                       ` Julian Anastasov
2013-08-13  2:23                         ` Simon Horman
2013-08-13  4:45                           ` Simon Horman
