netdev.vger.kernel.org archive mirror
* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
@ 2015-11-02 20:48 Ani Sinha
  0 siblings, 0 replies; 12+ messages in thread
From: Ani Sinha @ 2015-11-02 20:48 UTC (permalink / raw)
  To: Pablo Neira Ayuso, netfilter-devel, netfilter, netdev, Ani Sinha

> On Thu, Oct 29, 2015 at 6:21 PM, Neal P. Murphy
> <neal.p.murphy@alum.wpi.edu> wrote:
> > On Thu, 29 Oct 2015 17:01:24 -0700
> > Ani Sinha <ani@arista.com> wrote:
> >
> >> On Wed, Oct 28, 2015 at 11:40 PM, Neal P. Murphy
> >> <neal.p.murphy@alum.wpi.edu> wrote:
> >> > On Wed, 28 Oct 2015 02:36:50 -0400
> >> > "Neal P. Murphy" <neal.p.murphy@alum.wpi.edu> wrote:
> >> >
> >> >> On Mon, 26 Oct 2015 21:06:33 +0100
> >> >> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> >> >>
> >> >> > Hi,
> >> >> >
> >> >> > On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
> >> >> > > netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
> >> >> >
> >> >> > Please, no need to Cc everyone here. Please, submit your Netfilter
> >> >> > patches to netfilter-devel@vger.kernel.org.
> >> >> >
> >> >> > Moreover, it would be great if the subject includes something
> >> >> > descriptive on what you need, for this I'd suggest:
> >> >> >
> >> >> > [PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
> >> >> >
> >> >> > I'm including Neal P. Murphy, he said he would help testing these
> >> >> > backports, getting a Tested-by: tag usually speeds up things too.
> >> >>
> >> >
> >> > I've probably done about as much seat-of-the-pants testing as I can. All opening/closing the same destination IP/port.
> >> >
> >> > Host: Debian Jessie, 8-core Vishera 8350 at 4.4 GHz, 16GiB RAM at (I think) 2100MHz.
> >> >
> >> > Traffic generator 1: 6-CPU KVM running 64-bit Smoothwall Express 3.1 (linux 3.4.109 without these patches), with 8GiB RAM and 9GiB swap. Packets sent across PURPLE (to bypass NAT and firewall).
> >> >
> >> > Traffic generator 2: 32-bit KVM running Smoothwall Express 3.1 (linux 3.4.110 with these patches), 3GiB RAM and minimal swap.
> >> >
> >> > In the first set of tests, generator 1's traffic passed through Generator 2 as a NATting firewall, to the host's web server. In the second set of tests, generator 2's traffic went through NAT to the host's web server.
> >> >
> >> > The load tests:
> >> >   - 2500 processes using 2500 addresses and random src ports
> >> >   - 2500 processes using 2500 addresses and the same src port
> >> >   - 2500 processes using the same src address and port
> >> >
> >> > I also tested using stock NF timeouts and using 1 second timeouts.
> >> >
> >> > Bandwidth used got as high as 16Mb/s for some tests.
> >> >
> >> > Conntracks got up to 200 000 or so or bounced between 1 and 2, depending on the test and the timeouts.
> >> >
> >> > I did not reproduce the problem these patches solve. But more importantly, I saw no problems at all. Each time I terminated a test, RAM usage returned to about that of post-boot; so there were no apparent memory leaks. No kernel messages and no netfilter messages appeared during the tests.
> >> >
> >> > If I have time, I suppose I could run another set of tests: 2500 source processes using 2500 addresses times 200 ports to connect to 2500 addresses times 200 ports on a destination system. Each process opens 200 sockets, then closes them, and repeats ad infinitum. But I might have to be clever, since I can't run 500 000 processes; I could run 20 VMs, which would get it down to about 12 000 processes per VM. And I might have to figure out how to allow processes on the destination system to open hundreds or thousands of sockets.
> >>
> >> Should I resend the patch with a Tested-by: tag?
> >
> > ... Oh, wait. Not yet. The dawn just broke over ol' Marblehead here. I only tested TCP; I need to hammer UDP, too.
> >
> > Can I set the timeouts to zero? Or is one as low as I can go?
>
> Any progress with testing ?

I applied the 'hammer' through a firewall with the patch. I used TCP,
UDP and ICMP.

I don't know if the patch fixes the problem. But I'm reasonably sure
that it did not break normal operations.

To test a different problem I fixed (a memory leak in my 64-bit
counter patch for xt_ACCOUNT), I tested 60,000 addresses (most of a
/16) through the firewall. Again, no troubles.

I observed only two odd things, both likely unrelated to your patch.
First, when I started the TCP test and then added the UDP test, only
TCP would come through; if I stopped and restarted the TCP test, only
UDP would come through. I suspect this is due to buffering; it's just
a behaviour I haven't encountered since I started using Linux many
years ago (around '98). Second, when I started the test, the firewall
would lose contact with the upstream F/W's apcupsd daemon; again,
this is likely down to the nature of the test, which probably floods
the input and output queues.


I'd say you can probably resend with Tested-by.

Neal


* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
  2015-11-04 23:46   ` Ani Sinha
@ 2015-11-06 11:09     ` Pablo Neira Ayuso
  0 siblings, 0 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2015-11-06 11:09 UTC (permalink / raw)
  To: Ani Sinha
  Cc: David S. Miller, netfilter-devel, netdev, Florian Westphal,
	Zefan Li, Neal P. Murphy

On Wed, Nov 04, 2015 at 03:46:54PM -0800, Ani Sinha wrote:
> (removed a bunch of people from CC list)
> 
> On Mon, Oct 26, 2015 at 1:06 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> 
> > Then we can review and, if no major concerns, I can submit this to
> > -stable.
> 
> Now that Neal has sufficiently tested the patches, is it OK to apply
> to -stable or do you guys want me to do anything more?

I'll pass this up to -stable ASAP.

Thanks.


* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
  2015-10-26 20:06 ` Pablo Neira Ayuso
  2015-10-28  6:36   ` Neal P. Murphy
       [not found]   ` <17709_1446014232_t9S6b5a3017652_20151028023650.7b76098f@playground>
@ 2015-11-04 23:46   ` Ani Sinha
  2015-11-06 11:09     ` Pablo Neira Ayuso
  2 siblings, 1 reply; 12+ messages in thread
From: Ani Sinha @ 2015-11-04 23:46 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: David S. Miller, netfilter-devel, netdev, Florian Westphal,
	Zefan Li, Neal P. Murphy

(removed a bunch of people from CC list)

On Mon, Oct 26, 2015 at 1:06 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:

> Then we can review and, if no major concerns, I can submit this to
> -stable.

Now that Neal has sufficiently tested the patches, is it OK to apply
to -stable or do you guys want me to do anything more?


* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
  2015-10-30  1:21         ` Neal P. Murphy
@ 2015-10-30 15:37           ` Ani Sinha
  0 siblings, 0 replies; 12+ messages in thread
From: Ani Sinha @ 2015-10-30 15:37 UTC (permalink / raw)
  To: Neal P. Murphy; +Cc: netfilter, Pablo Neira Ayuso, netfilter-devel, netdev

On Thu, Oct 29, 2015 at 6:21 PM, Neal P. Murphy
<neal.p.murphy@alum.wpi.edu> wrote:
> On Thu, 29 Oct 2015 17:01:24 -0700
> Ani Sinha <ani@arista.com> wrote:
>
>> On Wed, Oct 28, 2015 at 11:40 PM, Neal P. Murphy
>> <neal.p.murphy@alum.wpi.edu> wrote:
>> > On Wed, 28 Oct 2015 02:36:50 -0400
>> > "Neal P. Murphy" <neal.p.murphy@alum.wpi.edu> wrote:
>> >
>> >> On Mon, 26 Oct 2015 21:06:33 +0100
>> >> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>> >>
>> >> > Hi,
>> >> >
>> >> > On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
>> >> > > netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
>> >> >
>> >> > Please, no need to Cc everyone here. Please, submit your Netfilter
>> >> > patches to netfilter-devel@vger.kernel.org.
>> >> >
>> >> > Moreover, it would be great if the subject includes something
>> >> > descriptive on what you need, for this I'd suggest:
>> >> >
>> >> > [PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
>> >> >
>> >> > I'm including Neal P. Murphy, he said he would help testing these
>> >> > backports, getting a Tested-by: tag usually speeds up things too.
>> >>
>> >
>> > I've probably done about as much seat-of-the-pants testing as I can. All opening/closing the same destination IP/port.
>> >
>> > Host: Debian Jessie, 8-core Vishera 8350 at 4.4 GHz, 16GiB RAM at (I think) 2100MHz.
>> >
>> > Traffic generator 1: 6-CPU KVM running 64-bit Smoothwall Express 3.1 (linux 3.4.109 without these patches), with 8GiB RAM and 9GiB swap. Packets sent across PURPLE (to bypass NAT and firewall).
>> >
>> > Traffic generator 2: 32-bit KVM running Smoothwall Express 3.1 (linux 3.4.110 with these patches), 3GiB RAM and minimal swap.
>> >
>> > In the first set of tests, generator 1's traffic passed through Generator 2 as a NATting firewall, to the host's web server. In the second set of tests, generator 2's traffic went through NAT to the host's web server.
>> >
>> > The load tests:
>> >   - 2500 processes using 2500 addresses and random src ports
>> >   - 2500 processes using 2500 addresses and the same src port
>> >   - 2500 processes using the same src address and port
>> >
>> > I also tested using stock NF timeouts and using 1 second timeouts.
>> >
>> > Bandwidth used got as high as 16Mb/s for some tests.
>> >
>> > Conntracks got up to 200 000 or so or bounced between 1 and 2, depending on the test and the timeouts.
>> >
>> > I did not reproduce the problem these patches solve. But more importantly, I saw no problems at all. Each time I terminated a test, RAM usage returned to about that of post-boot; so there were no apparent memory leaks. No kernel messages and no netfilter messages appeared during the tests.
>> >
>> > If I have time, I suppose I could run another set of tests: 2500 source processes using 2500 addresses times 200 ports to connect to 2500 addresses times 200 ports on a destination system. Each process opens 200 sockets, then closes them, and repeats ad infinitum. But I might have to be clever, since I can't run 500 000 processes; I could run 20 VMs, which would get it down to about 12 000 processes per VM. And I might have to figure out how to allow processes on the destination system to open hundreds or thousands of sockets.
>>
>> Should I resend the patch with a Tested-by: tag?
>
> ... Oh, wait. Not yet. The dawn just broke over ol' Marblehead here. I only tested TCP; I need to hammer UDP, too.
>
> Can I set the timeouts to zero? Or is one as low as I can go?

I don't see any assertion or check against 0-second timeouts, so you
can try it; your conntrack entries will just be flushed constantly.
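
For what it's worth, a simplified sketch of why zero is accepted: in
the 3.4-era timer-based implementation the per-entry timeout is an
ordinary kernel timer armed relative to jiffies, so nothing
special-cases zero; the entry is simply scheduled to die on the next
tick. (This is an illustrative reconstruction with a hypothetical
helper name, not a verbatim excerpt from nf_conntrack_core.c.)

/* Sketch: how an entry's timeout is (re)armed in the timer-based
 * scheme.  extra_jiffies comes from the configured protocol timeout;
 * 0 is never rejected, it just means the timer fires on the next tick
 * and death_by_timeout() reaps the entry. */
static void ct_refresh_timeout(struct nf_conn *ct,
			       unsigned long extra_jiffies)
{
	if (!nf_ct_is_confirmed(ct))
		/* not yet in the hash: timer is armed at confirm time */
		ct->timeout.expires = extra_jiffies;
	else
		mod_timer_pending(&ct->timeout, jiffies + extra_jiffies);
}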

>
> N


* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
  2015-10-30  0:01       ` Ani Sinha
@ 2015-10-30  1:21         ` Neal P. Murphy
  2015-10-30 15:37           ` Ani Sinha
  0 siblings, 1 reply; 12+ messages in thread
From: Neal P. Murphy @ 2015-10-30  1:21 UTC (permalink / raw)
  To: Ani Sinha, netfilter; +Cc: Pablo Neira Ayuso, netfilter-devel, netdev

On Thu, 29 Oct 2015 17:01:24 -0700
Ani Sinha <ani@arista.com> wrote:

> On Wed, Oct 28, 2015 at 11:40 PM, Neal P. Murphy
> <neal.p.murphy@alum.wpi.edu> wrote:
> > On Wed, 28 Oct 2015 02:36:50 -0400
> > "Neal P. Murphy" <neal.p.murphy@alum.wpi.edu> wrote:
> >
> >> On Mon, 26 Oct 2015 21:06:33 +0100
> >> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> >>
> >> > Hi,
> >> >
> >> > On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
> >> > > netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
> >> >
> >> > Please, no need to Cc everyone here. Please, submit your Netfilter
> >> > patches to netfilter-devel@vger.kernel.org.
> >> >
> >> > Moreover, it would be great if the subject includes something
> >> > descriptive on what you need, for this I'd suggest:
> >> >
> >> > [PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
> >> >
> >> > I'm including Neal P. Murphy, he said he would help testing these
> >> > backports, getting a Tested-by: tag usually speeds up things too.
> >>
> >
> > I've probably done about as much seat-of-the-pants testing as I can. All opening/closing the same destination IP/port.
> >
> > Host: Debian Jessie, 8-core Vishera 8350 at 4.4 GHz, 16GiB RAM at (I think) 2100MHz.
> >
> > Traffic generator 1: 6-CPU KVM running 64-bit Smoothwall Express 3.1 (linux 3.4.109 without these patches), with 8GiB RAM and 9GiB swap. Packets sent across PURPLE (to bypass NAT and firewall).
> >
> > Traffic generator 2: 32-bit KVM running Smoothwall Express 3.1 (linux 3.4.110 with these patches), 3GiB RAM and minimal swap.
> >
> > In the first set of tests, generator 1's traffic passed through Generator 2 as a NATting firewall, to the host's web server. In the second set of tests, generator 2's traffic went through NAT to the host's web server.
> >
> > The load tests:
> >   - 2500 processes using 2500 addresses and random src ports
> >   - 2500 processes using 2500 addresses and the same src port
> >   - 2500 processes using the same src address and port
> >
> > I also tested using stock NF timeouts and using 1 second timeouts.
> >
> > Bandwidth used got as high as 16Mb/s for some tests.
> >
> > Conntracks got up to 200 000 or so or bounced between 1 and 2, depending on the test and the timeouts.
> >
> > I did not reproduce the problem these patches solve. But more importantly, I saw no problems at all. Each time I terminated a test, RAM usage returned to about that of post-boot; so there were no apparent memory leaks. No kernel messages and no netfilter messages appeared during the tests.
> >
> > If I have time, I suppose I could run another set of tests: 2500 source processes using 2500 addresses times 200 ports to connect to 2500 addresses times 200 ports on a destination system. Each process opens 200 sockets, then closes them, and repeats ad infinitum. But I might have to be clever, since I can't run 500 000 processes; I could run 20 VMs, which would get it down to about 12 000 processes per VM. And I might have to figure out how to allow processes on the destination system to open hundreds or thousands of sockets.
> 
> Should I resend the patch with a Tested-by: tag?

... Oh, wait. Not yet. The dawn just broke over ol' Marblehead here. I only tested TCP; I need to hammer UDP, too.

Can I set the timeouts to zero? Or is one as low as I can go?

N


* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
  2015-10-29  6:40     ` Neal P. Murphy
@ 2015-10-30  0:01       ` Ani Sinha
  2015-10-30  1:21         ` Neal P. Murphy
  0 siblings, 1 reply; 12+ messages in thread
From: Ani Sinha @ 2015-10-30  0:01 UTC (permalink / raw)
  To: Neal P. Murphy; +Cc: Pablo Neira Ayuso, netfilter-devel, netfilter, netdev

On Wed, Oct 28, 2015 at 11:40 PM, Neal P. Murphy
<neal.p.murphy@alum.wpi.edu> wrote:
> On Wed, 28 Oct 2015 02:36:50 -0400
> "Neal P. Murphy" <neal.p.murphy@alum.wpi.edu> wrote:
>
>> On Mon, 26 Oct 2015 21:06:33 +0100
>> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>>
>> > Hi,
>> >
>> > On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
>> > > netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
>> >
>> > Please, no need to Cc everyone here. Please, submit your Netfilter
>> > patches to netfilter-devel@vger.kernel.org.
>> >
>> > Moreover, it would be great if the subject includes something
>> > descriptive on what you need, for this I'd suggest:
>> >
>> > [PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
>> >
>> > I'm including Neal P. Murphy, he said he would help testing these
>> > backports, getting a Tested-by: tag usually speeds up things too.
>>
>
> I've probably done about as much seat-of-the-pants testing as I can. All opening/closing the same destination IP/port.
>
> Host: Debian Jessie, 8-core Vishera 8350 at 4.4 GHz, 16GiB RAM at (I think) 2100MHz.
>
> Traffic generator 1: 6-CPU KVM running 64-bit Smoothwall Express 3.1 (linux 3.4.109 without these patches), with 8GiB RAM and 9GiB swap. Packets sent across PURPLE (to bypass NAT and firewall).
>
> Traffic generator 2: 32-bit KVM running Smoothwall Express 3.1 (linux 3.4.110 with these patches), 3GiB RAM and minimal swap.
>
> In the first set of tests, generator 1's traffic passed through Generator 2 as a NATting firewall, to the host's web server. In the second set of tests, generator 2's traffic went through NAT to the host's web server.
>
> The load tests:
>   - 2500 processes using 2500 addresses and random src ports
>   - 2500 processes using 2500 addresses and the same src port
>   - 2500 processes using the same src address and port
>
> I also tested using stock NF timeouts and using 1 second timeouts.
>
> Bandwidth used got as high as 16Mb/s for some tests.
>
> Conntracks got up to 200 000 or so or bounced between 1 and 2, depending on the test and the timeouts.
>
> I did not reproduce the problem these patches solve. But more importantly, I saw no problems at all. Each time I terminated a test, RAM usage returned to about that of post-boot; so there were no apparent memory leaks. No kernel messages and no netfilter messages appeared during the tests.
>
> If I have time, I suppose I could run another set of tests: 2500 source processes using 2500 addresses times 200 ports to connect to 2500 addresses times 200 ports on a destination system. Each process opens 200 sockets, then closes them, and repeats ad infinitum. But I might have to be clever, since I can't run 500 000 processes; I could run 20 VMs, which would get it down to about 12 000 processes per VM. And I might have to figure out how to allow processes on the destination system to open hundreds or thousands of sockets.

Should I resend the patch with a Tested-by: tag?


* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
       [not found]   ` <17709_1446014232_t9S6b5a3017652_20151028023650.7b76098f@playground>
@ 2015-10-29  6:40     ` Neal P. Murphy
  2015-10-30  0:01       ` Ani Sinha
  0 siblings, 1 reply; 12+ messages in thread
From: Neal P. Murphy @ 2015-10-29  6:40 UTC (permalink / raw)
  To: Ani Sinha; +Cc: Pablo Neira Ayuso, netfilter-devel, netfilter, netdev

On Wed, 28 Oct 2015 02:36:50 -0400
"Neal P. Murphy" <neal.p.murphy@alum.wpi.edu> wrote:

> On Mon, 26 Oct 2015 21:06:33 +0100
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> 
> > Hi,
> > 
> > On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
> > > netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
> > 
> > Please, no need to Cc everyone here. Please, submit your Netfilter
> > patches to netfilter-devel@vger.kernel.org.
> > 
> > Moreover, it would be great if the subject includes something
> > descriptive on what you need, for this I'd suggest:
> > 
> > [PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
> > 
> > I'm including Neal P. Murphy, he said he would help testing these
> > backports, getting a Tested-by: tag usually speeds up things too.
> 

I've probably done about as much seat-of-the-pants testing as I can. All opening/closing the same destination IP/port.

Host: Debian Jessie, 8-core Vishera 8350 at 4.4 GHz, 16GiB RAM at (I think) 2100MHz.

Traffic generator 1: 6-CPU KVM running 64-bit Smoothwall Express 3.1 (linux 3.4.109 without these patches), with 8GiB RAM and 9GiB swap. Packets sent across PURPLE (to bypass NAT and firewall).

Traffic generator 2: 32-bit KVM running Smoothwall Express 3.1 (linux 3.4.110 with these patches), 3GiB RAM and minimal swap.

In the first set of tests, generator 1's traffic passed through Generator 2 as a NATting firewall, to the host's web server. In the second set of tests, generator 2's traffic went through NAT to the host's web server.

The load tests:
  - 2500 processes using 2500 addresses and random src ports
  - 2500 processes using 2500 addresses and the same src port
  - 2500 processes using the same src address and port

I also tested using stock NF timeouts and using 1 second timeouts.

Bandwidth used got as high as 16Mb/s for some tests.

Conntracks got up to 200 000 or so or bounced between 1 and 2, depending on the test and the timeouts.

I did not reproduce the problem these patches solve. But more importantly, I saw no problems at all. Each time I terminated a test, RAM usage returned to about that of post-boot; so there were no apparent memory leaks. No kernel messages and no netfilter messages appeared during the tests.

If I have time, I suppose I could run another set of tests: 2500 source processes using 2500 addresses times 200 ports to connect to 2500 addresses times 200 ports on a destination system. Each process opens 200 sockets, then closes them, and repeats ad infinitum. But I might have to be clever, since I can't run 500 000 processes; I could run 20 VMs, which would get it down to about 12 000 processes per VM. And I might have to figure out how to allow processes on the destination system to open hundreds or thousands of sockets.

N


* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
  2015-10-26 20:06 ` Pablo Neira Ayuso
@ 2015-10-28  6:36   ` Neal P. Murphy
       [not found]   ` <17709_1446014232_t9S6b5a3017652_20151028023650.7b76098f@playground>
  2015-11-04 23:46   ` Ani Sinha
  2 siblings, 0 replies; 12+ messages in thread
From: Neal P. Murphy @ 2015-10-28  6:36 UTC (permalink / raw)
  To: Ani Sinha; +Cc: Pablo Neira Ayuso, netfilter-devel, netfilter, netdev

On Mon, 26 Oct 2015 21:06:33 +0100
Pablo Neira Ayuso <pablo@netfilter.org> wrote:

> Hi,
> 
> On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
> > netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
> 
> Please, no need to Cc everyone here. Please, submit your Netfilter
> patches to netfilter-devel@vger.kernel.org.
> 
> Moreover, it would be great if the subject includes something
> descriptive on what you need, for this I'd suggest:
> 
> [PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
> 
> I'm including Neal P. Murphy, he said he would help testing these
> backports, getting a Tested-by: tag usually speeds up things too.

I hammered it a couple nights ago. First test was 5000 processes on 6 SMP CPUs opening and closing a port on a 'remote' host using the usual random source ports. Only got up to 32000 conntracks. The generator was a 64-bit Smoothwall KVM without the patch. The traffic passed through a 32-bit Smoothwall KVM with the patch. The target was on the VM host. No problems encountered. I suspect I didn't come close to triggering the original problem. Second test was a couple thousand processes all using the same source IP and port and dest IP and port. Still no problems. But these were perl scripts (and they used lots of RAM); perhaps a short C program would let me run more.

Any ideas on how I might test it more brutally?
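
On the "short C program" idea above, a minimal sketch of a forking
connect/close hammer that should scale much further than the perl
scripts (target address, port and process count are arguments; the
fixed-source-port and per-source-address variations from the earlier
tests would still need to be added):

/* hammer.c - fork N workers that connect to <ip>:<port> and close as
 * fast as possible; each cycle creates and tears down one conntrack.
 * Build: cc -O2 -o hammer hammer.c
 * Run:   ./hammer 192.0.2.1 80 5000
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

static void hammer(const struct sockaddr_in *dst)
{
	for (;;) {
		int fd = socket(AF_INET, SOCK_STREAM, 0);

		if (fd < 0)
			continue;	/* fd exhaustion: just retry */
		connect(fd, (const struct sockaddr *)dst, sizeof(*dst));
		close(fd);
	}
}

int main(int argc, char **argv)
{
	struct sockaddr_in dst;
	int i, nproc;

	if (argc != 4) {
		fprintf(stderr, "usage: %s <ip> <port> <nproc>\n", argv[0]);
		return 1;
	}
	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(atoi(argv[2]));
	inet_pton(AF_INET, argv[1], &dst.sin_addr);
	nproc = atoi(argv[3]);

	for (i = 0; i < nproc; i++)
		if (fork() == 0)
			hammer(&dst);
	for (i = 0; i < nproc; i++)
		wait(NULL);
	return 0;
}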

N


* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
  2015-10-26 18:55 Ani Sinha
@ 2015-10-26 20:06 ` Pablo Neira Ayuso
  2015-10-28  6:36   ` Neal P. Murphy
                     ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Pablo Neira Ayuso @ 2015-10-26 20:06 UTC (permalink / raw)
  To: Ani Sinha
  Cc: Patrick McHardy, David S. Miller, netfilter-devel, netfilter,
	netdev, coreteam, Florian Westphal, Jozsef Kadlecsik,
	Cyrill Gorcunov, Zefan Li, Eric Dumazet, Neal P. Murphy

Hi,

On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
> netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

Please, no need to Cc everyone here. Please, submit your Netfilter
patches to netfilter-devel@vger.kernel.org.

Moreover, it would be great if the subject includes something
descriptive on what you need, for this I'd suggest:

[PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

I'm including Neal P. Murphy, he said he would help testing these
backports, getting a Tested-by: tag usually speeds up things too.

The burden is usually huge here; the easier you make it for us, the better.
Then we can review and, if no major concerns, I can submit this to
-stable.

Let me know if you have any other questions,
Thanks.


* [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
@ 2015-10-26 18:55 Ani Sinha
  2015-10-26 20:06 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 12+ messages in thread
From: Ani Sinha @ 2015-10-26 18:55 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Patrick McHardy, David S. Miller,
	netfilter-devel, netfilter, netdev, coreteam, Florian Westphal,
	Pablo Neira Ayuso, Patrick McHardy, Jozsef Kadlecsik,
	Cyrill Gorcunov, Ani Sinha, Zefan Li, Eric Dumazet

netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

Let's look at destroy_conntrack:

hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);

net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by RCU, so readers look up conntracks without
locks.
A conntrack can be removed from the hash while a few readers are still
using it. The conntrack is then released, and another thread creates a
conntrack at the same address with an equal tuple.
After this, a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it cannot be used by two
threads concurrently. In this case a BUG_ON may be triggered from
nf_nat_setup_info().
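
(To make the SLAB_DESTROY_BY_RCU hazard concrete, here is a generic
sketch of the lookup discipline such caches require; obj_lookup_rcu()
and obj_put() are hypothetical helpers, and this is not an excerpt
from nf_conntrack. Memory stays type-stable for RCU readers, but an
object can be freed and recycled as a new object of the same type at
any moment, so the key must be re-checked after the reference has
been taken.)

struct obj {
	atomic_t use;
	u32 key;
};

static struct obj *obj_find(u32 key)
{
	struct obj *o;

	rcu_read_lock();
begin:
	o = obj_lookup_rcu(key);	/* lockless hash walk */
	if (o) {
		if (!atomic_inc_not_zero(&o->use))
			goto begin;	/* object freed under us: retry */
		if (o->key != key) {	/* slot recycled for a new object */
			obj_put(o);
			goto begin;
		}
	}
	rcu_read_unlock();
	return o;
}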

Florian Westphal suggested checking the confirm bit too. I think it's
right.

task 1			task 2			task 3
			nf_conntrack_find_get
			 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
						__nf_conntrack_alloc
						 kmem_cache_alloc
						 memset(&ct->tuplehash[IP_CT_DIR_MAX],
			 if (nf_ct_is_dying(ct))
			 if (!nf_ct_tuple_equal()

I'm not sure that I have ever seen this race condition in real life.
Currently we are investigating a bug that is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently;
we don't have any other explanation for this.

<2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
<4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>]  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
<4>[46267.085549] Call Trace:
<4>[46267.085622]  [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
<4>[46267.085697]  [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
<4>[46267.085770]  [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
<4>[46267.085843]  [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
<4>[46267.085919]  [<ffffffff814841b9>] nf_iterate+0x69/0xb0
<4>[46267.085991]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086063]  [<ffffffff81484374>] nf_hook_slow+0x74/0x110
<4>[46267.086133]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086207]  [<ffffffff814b5890>] ? dst_output+0x0/0x20
<4>[46267.086277]  [<ffffffff81495204>] ip_output+0xa4/0xc0
<4>[46267.086346]  [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
<4>[46267.086419]  [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
<4>[46267.086491]  [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
<4>[46267.086562]  [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
<4>[46267.086638]  [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
<4>[46267.086712]  [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
<4>[46267.086785]  [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
<4>[46267.086858]  [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
<4>[46267.086936]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087006]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087081]  [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
<4>[46267.087151]  [<ffffffff81445599>] sys_sendto+0x139/0x190
<4>[46267.087229]  [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
<4>[46267.087303]  [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
<4>[46267.087378]  [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
<4>[46267.087454]  [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
<4>[46267.087531]  [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
<4>[46267.087607]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
<1>[46267.088023] RIP  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ani Sinha <ani@arista.com>
---
 net/netfilter/nf_conntrack_core.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 9a46908..fd0f7a3 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -309,6 +309,21 @@ static void death_by_timeout(unsigned long ul_conntrack)
 	nf_ct_put(ct);
 }
 
+static inline bool
+nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
+			const struct nf_conntrack_tuple *tuple,
+			u16 zone)
+{
+	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+
+	/* A conntrack can be recreated with the equal tuple,
+	 * so we need to check that the conntrack is confirmed
+	 */
+	return nf_ct_tuple_equal(tuple, &h->tuple) &&
+		nf_ct_zone(ct) == zone &&
+		nf_ct_is_confirmed(ct);
+}
+
 /*
  * Warning :
  * - Caller must take a reference on returned object
@@ -330,8 +345,7 @@ ____nf_conntrack_find(struct net *net, u16 zone,
 	local_bh_disable();
 begin:
 	hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[bucket], hnnode) {
-		if (nf_ct_tuple_equal(tuple, &h->tuple) &&
-		    nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)) == zone) {
+		if (nf_ct_key_equal(h, tuple, zone)) {
 			NF_CT_STAT_INC(net, found);
 			local_bh_enable();
 			return h;
@@ -378,8 +392,7 @@ begin:
 			     !atomic_inc_not_zero(&ct->ct_general.use)))
 			h = NULL;
 		else {
-			if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple) ||
-				     nf_ct_zone(ct) != zone)) {
+			if (unlikely(!nf_ct_key_equal(h, tuple, zone))) {
 				nf_ct_put(ct);
 				goto begin;
 			}
-- 
1.8.1.4



* Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
  2015-10-24 18:27 Ani Sinha
@ 2015-10-24 18:31 ` Ani Sinha
  0 siblings, 0 replies; 12+ messages in thread
From: Ani Sinha @ 2015-10-24 18:31 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Patrick McHardy, David S. Miller,
	netfilter-devel, coreteam, Ani Sinha, netdev, Eric Dumazet,
	Florian Westphal

Please refer to the thread "linux 3.4.43 : kernel crash at
__nf_conntrack_confirm" on netdev for context.

thanks

On Sat, Oct 24, 2015 at 11:27 AM, Ani Sinha <ani@arista.com> wrote:
> netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
>
> Let's look at destroy_conntrack:
>
> hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
> ...
> nf_conntrack_free(ct)
>         kmem_cache_free(net->ct.nf_conntrack_cachep, ct);
>
> net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.
>
> The hash is protected by RCU, so readers look up conntracks without
> locks.
> A conntrack can be removed from the hash while a few readers are still
> using it. The conntrack is then released, and another thread creates a
> conntrack at the same address with an equal tuple.
> After this, a reader starts to validate the conntrack:
> * It's not dying, because a new conntrack was created
> * nf_ct_tuple_equal() returns true.
>
> But this conntrack is not initialized yet, so it cannot be used by two
> threads concurrently. In this case a BUG_ON may be triggered from
> nf_nat_setup_info().
>
> Florian Westphal suggested checking the confirm bit too. I think it's
> right.
>
> task 1                  task 2                  task 3
>                         nf_conntrack_find_get
>                          ____nf_conntrack_find
> destroy_conntrack
>  hlist_nulls_del_rcu
>  nf_conntrack_free
>  kmem_cache_free
>                                                 __nf_conntrack_alloc
>                                                  kmem_cache_alloc
>                                                  memset(&ct->tuplehash[IP_CT_DIR_MAX],
>                          if (nf_ct_is_dying(ct))
>                          if (!nf_ct_tuple_equal()
>
> I'm not sure that I have ever seen this race condition in real life.
> Currently we are investigating a bug that is reproduced on a few nodes.
> In our case one conntrack is initialized from a few tasks concurrently;
> we don't have any other explanation for this.
>
> <2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
> ...
> <4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>]  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
> ...
> <4>[46267.085549] Call Trace:
> <4>[46267.085622]  [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
> <4>[46267.085697]  [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
> <4>[46267.085770]  [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
> <4>[46267.085843]  [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
> <4>[46267.085919]  [<ffffffff814841b9>] nf_iterate+0x69/0xb0
> <4>[46267.085991]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
> <4>[46267.086063]  [<ffffffff81484374>] nf_hook_slow+0x74/0x110
> <4>[46267.086133]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
> <4>[46267.086207]  [<ffffffff814b5890>] ? dst_output+0x0/0x20
> <4>[46267.086277]  [<ffffffff81495204>] ip_output+0xa4/0xc0
> <4>[46267.086346]  [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
> <4>[46267.086419]  [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
> <4>[46267.086491]  [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
> <4>[46267.086562]  [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
> <4>[46267.086638]  [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
> <4>[46267.086712]  [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
> <4>[46267.086785]  [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
> <4>[46267.086858]  [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
> <4>[46267.086936]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
> <4>[46267.087006]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
> <4>[46267.087081]  [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
> <4>[46267.087151]  [<ffffffff81445599>] sys_sendto+0x139/0x190
> <4>[46267.087229]  [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
> <4>[46267.087303]  [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
> <4>[46267.087378]  [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
> <4>[46267.087454]  [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
> <4>[46267.087531]  [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
> <4>[46267.087607]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
> <4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
> <1>[46267.088023] RIP  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590
>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Florian Westphal <fw@strlen.de>
> Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Cyrill Gorcunov <gorcunov@openvz.org>
> Signed-off-by: Andrey Vagin <avagin@openvz.org>
> Acked-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> Signed-off-by: Ani Sinha <ani@arista.com>
> ---
>  net/netfilter/nf_conntrack_core.c | 21 +++++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 9a46908..fd0f7a3 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -309,6 +309,21 @@ static void death_by_timeout(unsigned long ul_conntrack)
>         nf_ct_put(ct);
>  }
>
> +static inline bool
> +nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
> +                       const struct nf_conntrack_tuple *tuple,
> +                       u16 zone)
> +{
> +       struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
> +
> +       /* A conntrack can be recreated with the equal tuple,
> +        * so we need to check that the conntrack is confirmed
> +        */
> +       return nf_ct_tuple_equal(tuple, &h->tuple) &&
> +               nf_ct_zone(ct) == zone &&
> +               nf_ct_is_confirmed(ct);
> +}
> +
>  /*
>   * Warning :
>   * - Caller must take a reference on returned object
> @@ -330,8 +345,7 @@ ____nf_conntrack_find(struct net *net, u16 zone,
>         local_bh_disable();
>  begin:
>         hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[bucket], hnnode) {
> -               if (nf_ct_tuple_equal(tuple, &h->tuple) &&
> -                   nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)) == zone) {
> +               if (nf_ct_key_equal(h, tuple, zone)) {
>                         NF_CT_STAT_INC(net, found);
>                         local_bh_enable();
>                         return h;
> @@ -378,8 +392,7 @@ begin:
>                              !atomic_inc_not_zero(&ct->ct_general.use)))
>                         h = NULL;
>                 else {
> -                       if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple) ||
> -                                    nf_ct_zone(ct) != zone)) {
> +                       if (unlikely(!nf_ct_key_equal(h, tuple, zone))) {
>                                 nf_ct_put(ct);
>                                 goto begin;
>                         }
> --
> 1.8.1.4
>


* [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.
@ 2015-10-24 18:27 Ani Sinha
  2015-10-24 18:31 ` Ani Sinha
  0 siblings, 1 reply; 12+ messages in thread
From: Ani Sinha @ 2015-10-24 18:27 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Patrick McHardy, David S. Miller,
	netfilter-devel, coreteam, ani, netdev, Eric Dumazet,
	Florian Westphal, Pablo Neira Ayuso

netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

Let's look at destroy_conntrack:

hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
	kmem_cache_free(net->ct.nf_conntrack_cachep, ct);

net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by RCU, so readers look up conntracks without
locks.
A conntrack can be removed from the hash while a few readers are still
using it. The conntrack is then released, and another thread creates a
conntrack at the same address with an equal tuple.
After this, a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it cannot be used by two
threads concurrently. In this case a BUG_ON may be triggered from
nf_nat_setup_info().

Florian Westphal suggested checking the confirm bit too. I think it's
right.

task 1			task 2			task 3
			nf_conntrack_find_get
			 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
						__nf_conntrack_alloc
						 kmem_cache_alloc
						 memset(&ct->tuplehash[IP_CT_DIR_MAX],
			 if (nf_ct_is_dying(ct))
			 if (!nf_ct_tuple_equal()

I'm not sure that I have ever seen this race condition in real life.
Currently we are investigating a bug that is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently;
we don't have any other explanation for this.

<2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
<4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>]  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
<4>[46267.085549] Call Trace:
<4>[46267.085622]  [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
<4>[46267.085697]  [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
<4>[46267.085770]  [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
<4>[46267.085843]  [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
<4>[46267.085919]  [<ffffffff814841b9>] nf_iterate+0x69/0xb0
<4>[46267.085991]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086063]  [<ffffffff81484374>] nf_hook_slow+0x74/0x110
<4>[46267.086133]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086207]  [<ffffffff814b5890>] ? dst_output+0x0/0x20
<4>[46267.086277]  [<ffffffff81495204>] ip_output+0xa4/0xc0
<4>[46267.086346]  [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
<4>[46267.086419]  [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
<4>[46267.086491]  [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
<4>[46267.086562]  [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
<4>[46267.086638]  [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
<4>[46267.086712]  [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
<4>[46267.086785]  [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
<4>[46267.086858]  [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
<4>[46267.086936]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087006]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087081]  [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
<4>[46267.087151]  [<ffffffff81445599>] sys_sendto+0x139/0x190
<4>[46267.087229]  [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
<4>[46267.087303]  [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
<4>[46267.087378]  [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
<4>[46267.087454]  [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
<4>[46267.087531]  [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
<4>[46267.087607]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
<1>[46267.088023] RIP  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ani Sinha <ani@arista.com>
---
 net/netfilter/nf_conntrack_core.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 9a46908..fd0f7a3 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -309,6 +309,21 @@ static void death_by_timeout(unsigned long ul_conntrack)
 	nf_ct_put(ct);
 }
 
+static inline bool
+nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
+			const struct nf_conntrack_tuple *tuple,
+			u16 zone)
+{
+	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+
+	/* A conntrack can be recreated with the equal tuple,
+	 * so we need to check that the conntrack is confirmed
+	 */
+	return nf_ct_tuple_equal(tuple, &h->tuple) &&
+		nf_ct_zone(ct) == zone &&
+		nf_ct_is_confirmed(ct);
+}
+
 /*
  * Warning :
  * - Caller must take a reference on returned object
@@ -330,8 +345,7 @@ ____nf_conntrack_find(struct net *net, u16 zone,
 	local_bh_disable();
 begin:
 	hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[bucket], hnnode) {
-		if (nf_ct_tuple_equal(tuple, &h->tuple) &&
-		    nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)) == zone) {
+		if (nf_ct_key_equal(h, tuple, zone)) {
 			NF_CT_STAT_INC(net, found);
 			local_bh_enable();
 			return h;
@@ -378,8 +392,7 @@ begin:
 			     !atomic_inc_not_zero(&ct->ct_general.use)))
 			h = NULL;
 		else {
-			if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple) ||
-				     nf_ct_zone(ct) != zone)) {
+			if (unlikely(!nf_ct_key_equal(h, tuple, zone))) {
 				nf_ct_put(ct);
 				goto begin;
 			}
-- 
1.8.1.4

