All of lore.kernel.org
 help / color / mirror / Atom feed
* Multicast routing stops functioning after 4G multicast packets recived.
@ 2013-12-19 14:48 Bob Falken
  2013-12-19 15:09 ` Hannes Frederic Sowa
  2014-01-04 19:55 ` Julian Anastasov
  0 siblings, 2 replies; 32+ messages in thread
From: Bob Falken @ 2013-12-19 14:48 UTC (permalink / raw)
  To: netdev

Hello, I have an issue after kernel 2.6.37 and above.
If i roll back to kernel 2.6.36.4 everything is fine.
if recive more than 4294967295 multicast packets, the kernel does not register the multicast packets. and multicast routing does not functioning.
(Tested bouth FIB_HASH and FIB_TRIE)   
Tested with xorp and pimd. 
I have abount 24 multicast groups, and it takes me about 17hours to reproduce the issue after a reboot.
Reboot is reqired to fix the issue. (Tested to stop/start pimd/xorp, reload network module for the network interface "e1000e",
Used birdge adapter and remove bridge adapter and readd bridge adapter to clear counters. none of thouse solves the issue.)

When the packet count for the input interface goes over 4294967295 packets. 
br0 Link encap:Ethernet HWaddr XX:XX:XX:XX:XX:XX 
 inet addr:10.255.255.1 Bcast:10.255.255.255 Mask:255.255.255.0 
 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 
 RX packets:4295483655 errors:0 dropped:0 overruns:0 frame:0 
 TX packets:4210 errors:0 dropped:0 overruns:0 carrier:0 
 collisions:0 txqueuelen:0 
 RX bytes:3350477250900 (3.0 TiB) TX bytes:240828 (235.1 KiB) 

"/proc/net/ip_mr_cache" goes blank. 

This is the state of "/proc/net/ip_mr_cache" a couple of seconds before the stop: 
Group Origin Iif Pkts Bytes Wrong Oifs 
0A0202EF 0BFFFF0A 1 181610713 4217402668 0 
090202EF 0BFFFF0A 1 181610715 4217404228 0 
080202EF 0BFFFF0A 1 181610716 4217405008 0 
070202EF 0BFFFF0A 1 181610716 4217405008 0 
180202EF 0DFFFF0A 1 181611351 4217900308 0 
170202EF 0DFFFF0A 1 181611353 4217901868 0 
120202EF 0CFFFF0A 1 170885816 146950304 0 
020202EF 0AFFFF0A 1 181609636 4216562608 0 
160202EF 0DFFFF0A 1 181611358 4217905768 0 
150202EF 0DFFFF0A 1 181611359 4217906548 0 
060202EF 0AFFFF0A 1 181609641 4216566508 0 
140202EF 0DFFFF0A 1 181611364 4217910448 0 
050202EF 0AFFFF0A 1 181609644 4216568848 0 
130202EF 0DFFFF0A 1 181611366 4217912008 0 
040202EF 0AFFFF0A 1 181609649 4216572748 0 
030202EF 0AFFFF0A 1 181609651 4216574308 0 0:1 
110202EF 0CFFFF0A 1 170885832 146962784 0 
010202EF 0AFFFF0A 1 181609652 4216575088 0 0:1 
100202EF 0CFFFF0A 1 170885837 146966684 0 
0F0202EF 0CFFFF0A 1 170885838 146967464 0 
0E0202EF 0CFFFF0A 1 170885841 146969804 0 
0D0202EF 0CFFFF0A 1 170885842 146970584 0 
0C0202EF 0BFFFF0A 1 181610767 4217444788 0 
0B0202EF 0BFFFF0A 1 181610772 4217448688 0 
i have also tested the following kernels:linux-2.6.38.8 
linux-2.6.39.4 
linux-3.2.53 
linux-3.10.24 
linux-3.11.9
I can reproduce the issue on all of them. 

dmesg does provide any messages when this issue occure.

My arcutecture is "amd64".

I think there might be a variable in the kernel that get overflown. though i cannot be sure as im not a programmer. 

Let me know if you need more debug information.

Please help.  

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2013-12-19 14:48 Multicast routing stops functioning after 4G multicast packets recived Bob Falken
@ 2013-12-19 15:09 ` Hannes Frederic Sowa
  2013-12-19 15:15   ` Ben Greear
  2014-01-04 19:55 ` Julian Anastasov
  1 sibling, 1 reply; 32+ messages in thread
From: Hannes Frederic Sowa @ 2013-12-19 15:09 UTC (permalink / raw)
  To: Bob Falken; +Cc: netdev

On Thu, Dec 19, 2013 at 03:48:16PM +0100, Bob Falken wrote:
> Hello, I have an issue after kernel 2.6.37 and above.
> If i roll back to kernel 2.6.36.4 everything is fine.
> if recive more than 4294967295 multicast packets, the kernel does not register the multicast packets. and multicast routing does not functioning.
> (Tested bouth FIB_HASH and FIB_TRIE)   
> Tested with xorp and pimd. 
> I have abount 24 multicast groups, and it takes me about 17hours to reproduce the issue after a reboot.
> Reboot is reqired to fix the issue. (Tested to stop/start pimd/xorp, reload network module for the network interface "e1000e",
> Used birdge adapter and remove bridge adapter and readd bridge adapter to clear counters. none of thouse solves the issue.)

Please test this with a recent kernel. 2.6.37 is really old and you normally
won't get good support here with such old kernels.

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2013-12-19 15:09 ` Hannes Frederic Sowa
@ 2013-12-19 15:15   ` Ben Greear
  2013-12-19 15:48     ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Ben Greear @ 2013-12-19 15:15 UTC (permalink / raw)
  To: Bob Falken, netdev

On 12/19/2013 07:09 AM, Hannes Frederic Sowa wrote:
> On Thu, Dec 19, 2013 at 03:48:16PM +0100, Bob Falken wrote:
>> Hello, I have an issue after kernel 2.6.37 and above.
>> If i roll back to kernel 2.6.36.4 everything is fine.
>> if recive more than 4294967295 multicast packets, the kernel does not register the multicast packets. and multicast routing does not functioning.
>> (Tested bouth FIB_HASH and FIB_TRIE)
>> Tested with xorp and pimd.
>> I have abount 24 multicast groups, and it takes me about 17hours to reproduce the issue after a reboot.
>> Reboot is reqired to fix the issue. (Tested to stop/start pimd/xorp, reload network module for the network interface "e1000e",
>> Used birdge adapter and remove bridge adapter and readd bridge adapter to clear counters. none of thouse solves the issue.)
>
> Please test this with a recent kernel. 2.6.37 is really old and you normally
> won't get good support here with such old kernels.

Note that he did test up to 3.11.9 and it still showed failures.

Thanks,
Ben

>
> Greetings,
>
>    Hannes
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2013-12-19 15:15   ` Ben Greear
@ 2013-12-19 15:48     ` Hannes Frederic Sowa
  0 siblings, 0 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2013-12-19 15:48 UTC (permalink / raw)
  To: Ben Greear; +Cc: Bob Falken, netdev

On Thu, Dec 19, 2013 at 07:15:37AM -0800, Ben Greear wrote:
> On 12/19/2013 07:09 AM, Hannes Frederic Sowa wrote:
> >On Thu, Dec 19, 2013 at 03:48:16PM +0100, Bob Falken wrote:
> >>Hello, I have an issue after kernel 2.6.37 and above.
> >>If i roll back to kernel 2.6.36.4 everything is fine.
> >>if recive more than 4294967295 multicast packets, the kernel does not 
> >>register the multicast packets. and multicast routing does not 
> >>functioning.
> >>(Tested bouth FIB_HASH and FIB_TRIE)
> >>Tested with xorp and pimd.
> >>I have abount 24 multicast groups, and it takes me about 17hours to 
> >>reproduce the issue after a reboot.
> >>Reboot is reqired to fix the issue. (Tested to stop/start pimd/xorp, 
> >>reload network module for the network interface "e1000e",
> >>Used birdge adapter and remove bridge adapter and readd bridge adapter to 
> >>clear counters. none of thouse solves the issue.)
> >
> >Please test this with a recent kernel. 2.6.37 is really old and you 
> >normally
> >won't get good support here with such old kernels.
> 
> Note that he did test up to 3.11.9 and it still showed failures.

Oh sorry, I did not read to the end. ;)

An interesting hint could be to use dropwatch or perf script net_dropmonitor
to check where the fragments get dropped. Also nstat could give additional
hints where something might get wrong. Please use a recent kernel while
debugging this issue. Maybe a patch can get backported later.

Thanks,

  Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2013-12-19 14:48 Multicast routing stops functioning after 4G multicast packets recived Bob Falken
  2013-12-19 15:09 ` Hannes Frederic Sowa
@ 2014-01-04 19:55 ` Julian Anastasov
  2014-01-04 23:38   ` Hannes Frederic Sowa
  1 sibling, 1 reply; 32+ messages in thread
From: Julian Anastasov @ 2014-01-04 19:55 UTC (permalink / raw)
  To: Bob Falken; +Cc: netdev

[-- Attachment #1: Type: TEXT/PLAIN, Size: 849 bytes --]


	Hello,

On Thu, 19 Dec 2013, Bob Falken wrote:

> Hello, I have an issue after kernel 2.6.37 and above.
> If i roll back to kernel 2.6.36.4 everything is fine.
> if recive more than 4294967295 multicast packets, the kernel does not register the multicast packets. and multicast routing does not functioning.

...

> I think there might be a variable in the kernel that get overflown. though i cannot be sure as im not a programmer. 
> 
> Let me know if you need more debug information.
> 
> Please help.  

	As Hannes guessed, may be it is really the missing
flags = FIB_LOOKUP_NOREF in ipmr_fib_lookup() when calling
fib_rules_lookup(), we have an atomic_inc_not_zero() there that
can stop to work after 4G lookups when rule's refcnt reaches 0.
As result, fib_rules_lookup() can start to return -ESRCH.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-04 19:55 ` Julian Anastasov
@ 2014-01-04 23:38   ` Hannes Frederic Sowa
  2014-01-05  8:56     ` Julian Anastasov
  0 siblings, 1 reply; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-04 23:38 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: Bob Falken, netdev

On Sat, Jan 04, 2014 at 09:55:51PM +0200, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Thu, 19 Dec 2013, Bob Falken wrote:
> 
> > Hello, I have an issue after kernel 2.6.37 and above.
> > If i roll back to kernel 2.6.36.4 everything is fine.
> > if recive more than 4294967295 multicast packets, the kernel does not register the multicast packets. and multicast routing does not functioning.
> 
> ...
> 
> > I think there might be a variable in the kernel that get overflown. though i cannot be sure as im not a programmer. 
> > 
> > Let me know if you need more debug information.
> > 
> > Please help.  
> 
> 	As Hannes guessed, may be it is really the missing
> flags = FIB_LOOKUP_NOREF in ipmr_fib_lookup() when calling
> fib_rules_lookup(), we have an atomic_inc_not_zero() there that
> can stop to work after 4G lookups when rule's refcnt reaches 0.
> As result, fib_rules_lookup() can start to return -ESRCH.

I guess we should just try. I somehow forgot to look after that. Thanks
for reminding, Julian.

Bob, can you try with this patch again?

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 421a249..b9b3472 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
 static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
 			   struct mr_table **mrt)
 {
-	struct ipmr_result res;
-	struct fib_lookup_arg arg = { .result = &res, };
 	int err;
+	struct ipmr_result res;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+		.flags = FIB_LOOKUP_NOREF,
+	};
 
 	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
 			       flowi4_to_flowi(flp4), 0, &arg);

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-04 23:38   ` Hannes Frederic Sowa
@ 2014-01-05  8:56     ` Julian Anastasov
  2014-01-05 10:41       ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Julian Anastasov @ 2014-01-05  8:56 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Bob Falken, netdev


	Hello,

On Sun, 5 Jan 2014, Hannes Frederic Sowa wrote:

> I guess we should just try. I somehow forgot to look after that. Thanks
> for reminding, Julian.

	If that patch solves the problem, do not forget
to include same fix for ip6mr_fib_lookup(). As for net/decnet/,
it looks ok, it uses fib_rule_put() in dn_fib_res_put().

> Bob, can you try with this patch again?
> 
> diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> index 421a249..b9b3472 100644
> --- a/net/ipv4/ipmr.c
> +++ b/net/ipv4/ipmr.c
> @@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
>  static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
>  			   struct mr_table **mrt)
>  {
> -	struct ipmr_result res;
> -	struct fib_lookup_arg arg = { .result = &res, };
>  	int err;
> +	struct ipmr_result res;
> +	struct fib_lookup_arg arg = {
> +		.result = &res,
> +		.flags = FIB_LOOKUP_NOREF,
> +	};
>  
>  	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
>  			       flowi4_to_flowi(flp4), 0, &arg);

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-05  8:56     ` Julian Anastasov
@ 2014-01-05 10:41       ` Hannes Frederic Sowa
  2014-01-05 19:12         ` Eric Dumazet
  0 siblings, 1 reply; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-05 10:41 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: Bob Falken, netdev

On Sun, Jan 05, 2014 at 10:56:55AM +0200, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Sun, 5 Jan 2014, Hannes Frederic Sowa wrote:
> 
> > I guess we should just try. I somehow forgot to look after that. Thanks
> > for reminding, Julian.
> 
> 	If that patch solves the problem, do not forget
> to include same fix for ip6mr_fib_lookup(). As for net/decnet/,
> it looks ok, it uses fib_rule_put() in dn_fib_res_put().

Thanks, I had IPv6 on my radar but did not check for other callers.

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-05 10:41       ` Hannes Frederic Sowa
@ 2014-01-05 19:12         ` Eric Dumazet
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-05 19:12 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Julian Anastasov, Bob Falken, netdev

On Sun, 2014-01-05 at 11:41 +0100, Hannes Frederic Sowa wrote:
> On Sun, Jan 05, 2014 at 10:56:55AM +0200, Julian Anastasov wrote:
> > 
> > 	Hello,
> > 
> > On Sun, 5 Jan 2014, Hannes Frederic Sowa wrote:
> > 
> > > I guess we should just try. I somehow forgot to look after that. Thanks
> > > for reminding, Julian.
> > 
> > 	If that patch solves the problem, do not forget
> > to include same fix for ip6mr_fib_lookup(). As for net/decnet/,
> > it looks ok, it uses fib_rule_put() in dn_fib_res_put().
> 
> Thanks, I had IPv6 on my radar but did not check for other callers.

BTW, I wonder if the old kernels (2.6.32 & 2.6.34) need something as
well ?

Before commit ebc0ffae5dfb444 and commit 7fa7cb7109d0, it looks like we
had an overflow in fib_rule_get() and refcnt was wrapping.

So maybe we should add a fib_rule_put() for stable kernels, and use
FIB_LOOKUP_NOREF on next kernels ...

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-12  7:42             ` Hannes Frederic Sowa
@ 2014-01-13  0:56               ` Eric Dumazet
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Dumazet @ 2014-01-13  0:56 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf

On Sun, 2014-01-12 at 08:42 +0100, Hannes Frederic Sowa wrote:

> Hm, Eric. If we do that we can just specifiy FIB_LOOKUP_NOREF
> unconditionally. FIB_LOOKUP_NOREF has no other side effects on a ipmr
> lookup as taking the reference on the rule, which we would drop after
> that.
> 
> So we would actually be going back to the first patch in this thread. I
> guess it is just a matter of style?

Hi Hannes, please submit a formal patch, so that we can have a proper
ground for discussion (I guess I'll only add my Acked-by)

Thanks !

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-10  7:50           ` Hannes Frederic Sowa
@ 2014-01-12  7:42             ` Hannes Frederic Sowa
  2014-01-13  0:56               ` Eric Dumazet
  0 siblings, 1 reply; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-12  7:42 UTC (permalink / raw)
  To: Eric Dumazet, Bob Falken, Julian Anastasov, netdev, kaber, tgraf

On Fri, Jan 10, 2014 at 08:50:05AM +0100, Hannes Frederic Sowa wrote:
> > > Its a bit late here, so maybe following is just stupid :
> > > Cant we do the fib_rule_put() inside ipmr_fib_lookup() ?
> > 
> > We could add bool noref to ipmr_fib_lookup indicating we want to drop
> > reference to rule just after lookup.
> > 
> > I'll check if freeing a rule has additional side-effects on dependencies
> > in reg_vif_xmit. That would be a nice solution actually, thanks!
> 
> Hmm, rule holds a reference to the net namespace in use. I don't know
> if we want to add this special case. I guess net-namespace reference
> cannot be removed while processing ndo_start_xmit callback but I don't
> like this special case somehow. But I guess it is possible.
> 
> Your opinion on that?

Hm, Eric. If we do that we can just specifiy FIB_LOOKUP_NOREF
unconditionally. FIB_LOOKUP_NOREF has no other side effects on a ipmr
lookup as taking the reference on the rule, which we would drop after
that.

So we would actually be going back to the first patch in this thread. I
guess it is just a matter of style?

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
@ 2014-01-12  0:25 Bob Falken
  0 siblings, 0 replies; 32+ messages in thread
From: Bob Falken @ 2014-01-12  0:25 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Julian Anastasov, netdev, kaber, tgraf, eric.dumazet

Hey Hannes, The updated patch you posted on the 7th worked as expected (as far as i know, did not have any obvious issues.) :-)

I'll run a full test starting on Monday/Thursday, and report back after I have reached 2^32 RX packets. 

I will then run a longer test (3-4 weeks), and also check that TX is ok after reaching 2^32 packets. 

Thanks again for all the help.

Best Regards.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-10  7:43         ` Hannes Frederic Sowa
@ 2014-01-10  7:50           ` Hannes Frederic Sowa
  2014-01-12  7:42             ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-10  7:50 UTC (permalink / raw)
  To: Eric Dumazet, Bob Falken, Julian Anastasov, netdev, kaber, tgraf

On Fri, Jan 10, 2014 at 08:43:25AM +0100, Hannes Frederic Sowa wrote:
> On Thu, Jan 09, 2014 at 11:32:59PM -0800, Eric Dumazet wrote:
> > On Fri, 2014-01-10 at 08:10 +0100, Hannes Frederic Sowa wrote:
> > > On Thu, Jan 09, 2014 at 11:01:46PM -0800, Eric Dumazet wrote:
> > > > Its not clear to me why you expand ipmr_fib_lookup()
> > > > 
> > > > Is there something wrong with existing code ?
> > > 
> > > There are three users of ipmr_fib_lookup, two of them are in rcu_read_lock
> > > section, one is not.
> > > 
> > > ipmr_fib_lookup does not pass down arg.rule reference, so I don't have a
> > > chance to call fib_rule_put(arg.rule) on it. Thus I left ipmr_fib_lookup,
> > > just adding FIB_LOOKUP_NOREF and expanding ipmr_fib_lookup into the
> > > other function so I still have access to arg.rule to decrement the
> > > reference counter.
> > > 
> > > Do you agree?
> > 
> > Hmm, I see the problem now.
> > 
> > What about adding a parameter to ipmr_fib_lookup(),
> > to keep its spirit ?
> > 
> > ipmr_fib_lookup(net, &fl4, &mrt);
> > ->
> > ipmr_fib_lookup(net, &fl4, &mrt, &rule);
> > 
> > Since ipmr_rt_fib_lookup() has the same rule leak, no ?
> 
> No, ipmr_rt_fib_lookup is fine. This function gets called only from
> rcu read locked section and we don't take table reference because of
> FIB_LOOKUP_NOREF, so we don't need to put reference counter on arg.table.

arg.rule not table, actually.

> We could add the additional argument, just ignoring it in ipmr_rt_fib_lookup.
> 
> > 
> > Its a bit late here, so maybe following is just stupid :
> > Cant we do the fib_rule_put() inside ipmr_fib_lookup() ?
> 
> We could add bool noref to ipmr_fib_lookup indicating we want to drop
> reference to rule just after lookup.
> 
> I'll check if freeing a rule has additional side-effects on dependencies
> in reg_vif_xmit. That would be a nice solution actually, thanks!

Hmm, rule holds a reference to the net namespace in use. I don't know
if we want to add this special case. I guess net-namespace reference
cannot be removed while processing ndo_start_xmit callback but I don't
like this special case somehow. But I guess it is possible.

Your opinion on that?

Thanks,

  Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-10  7:32       ` Eric Dumazet
@ 2014-01-10  7:43         ` Hannes Frederic Sowa
  2014-01-10  7:50           ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-10  7:43 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf

On Thu, Jan 09, 2014 at 11:32:59PM -0800, Eric Dumazet wrote:
> On Fri, 2014-01-10 at 08:10 +0100, Hannes Frederic Sowa wrote:
> > On Thu, Jan 09, 2014 at 11:01:46PM -0800, Eric Dumazet wrote:
> > > Its not clear to me why you expand ipmr_fib_lookup()
> > > 
> > > Is there something wrong with existing code ?
> > 
> > There are three users of ipmr_fib_lookup, two of them are in rcu_read_lock
> > section, one is not.
> > 
> > ipmr_fib_lookup does not pass down arg.rule reference, so I don't have a
> > chance to call fib_rule_put(arg.rule) on it. Thus I left ipmr_fib_lookup,
> > just adding FIB_LOOKUP_NOREF and expanding ipmr_fib_lookup into the
> > other function so I still have access to arg.rule to decrement the
> > reference counter.
> > 
> > Do you agree?
> 
> Hmm, I see the problem now.
> 
> What about adding a parameter to ipmr_fib_lookup(),
> to keep its spirit ?
> 
> ipmr_fib_lookup(net, &fl4, &mrt);
> ->
> ipmr_fib_lookup(net, &fl4, &mrt, &rule);
> 
> Since ipmr_rt_fib_lookup() has the same rule leak, no ?

No, ipmr_rt_fib_lookup is fine. This function gets called only from
rcu read locked section and we don't take table reference because of
FIB_LOOKUP_NOREF, so we don't need to put reference counter on arg.table.

We could add the additional argument, just ignoring it in ipmr_rt_fib_lookup.

> 
> Its a bit late here, so maybe following is just stupid :
> Cant we do the fib_rule_put() inside ipmr_fib_lookup() ?

We could add bool noref to ipmr_fib_lookup indicating we want to drop
reference to rule just after lookup.

I'll check if freeing a rule has additional side-effects on dependencies
in reg_vif_xmit. That would be a nice solution actually, thanks!

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-10  7:10     ` Hannes Frederic Sowa
@ 2014-01-10  7:32       ` Eric Dumazet
  2014-01-10  7:43         ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Eric Dumazet @ 2014-01-10  7:32 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf

On Fri, 2014-01-10 at 08:10 +0100, Hannes Frederic Sowa wrote:
> On Thu, Jan 09, 2014 at 11:01:46PM -0800, Eric Dumazet wrote:
> > Its not clear to me why you expand ipmr_fib_lookup()
> > 
> > Is there something wrong with existing code ?
> 
> There are three users of ipmr_fib_lookup, two of them are in rcu_read_lock
> section, one is not.
> 
> ipmr_fib_lookup does not pass down arg.rule reference, so I don't have a
> chance to call fib_rule_put(arg.rule) on it. Thus I left ipmr_fib_lookup,
> just adding FIB_LOOKUP_NOREF and expanding ipmr_fib_lookup into the
> other function so I still have access to arg.rule to decrement the
> reference counter.
> 
> Do you agree?

Hmm, I see the problem now.

What about adding a parameter to ipmr_fib_lookup(),
to keep its spirit ?

ipmr_fib_lookup(net, &fl4, &mrt);
->
ipmr_fib_lookup(net, &fl4, &mrt, &rule);

Since ipmr_rt_fib_lookup() has the same rule leak, no ?

Its a bit late here, so maybe following is just stupid :
Cant we do the fib_rule_put() inside ipmr_fib_lookup() ?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-10  7:01   ` Eric Dumazet
@ 2014-01-10  7:10     ` Hannes Frederic Sowa
  2014-01-10  7:32       ` Eric Dumazet
  0 siblings, 1 reply; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-10  7:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf

On Thu, Jan 09, 2014 at 11:01:46PM -0800, Eric Dumazet wrote:
> Its not clear to me why you expand ipmr_fib_lookup()
> 
> Is there something wrong with existing code ?

There are three users of ipmr_fib_lookup, two of them are in rcu_read_lock
section, one is not.

ipmr_fib_lookup does not pass down arg.rule reference, so I don't have a
chance to call fib_rule_put(arg.rule) on it. Thus I left ipmr_fib_lookup,
just adding FIB_LOOKUP_NOREF and expanding ipmr_fib_lookup into the
other function so I still have access to arg.rule to decrement the
reference counter.

Do you agree?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-10  6:36 ` Hannes Frederic Sowa
@ 2014-01-10  7:01   ` Eric Dumazet
  2014-01-10  7:10     ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Eric Dumazet @ 2014-01-10  7:01 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf

On Fri, 2014-01-10 at 07:36 +0100, Hannes Frederic Sowa wrote:

> Ok, so I am proposing this patch. Only difference from the RFC is that
> I removed the superfluous arg.rule NULL-pointer checks (I hate if they
> are superfluous and they always seem to spread ;) ).
> 
> Maybe you could test this one instead and David could pick it up as soon
> as your results are in.
> 
> I'll also look for the stable kernels where FIB_LOOKUP_NOREF is not
> yet available.
> 
> Thank you,
> 
>   Hannes
> 
> [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding
> 
> When introducing multiple table support for multicast forwarding in
> IPv4 and IPv6, necessary fib_rules_put reference count decrements were
> forgotten.
> 
> Bob Falken reported that after 4G packets, multicast forwarding stopped
> working. This was because of a rule reference counter overflow which
> freed the rule as soon as the overflow happend.
> 
> So, use FIB_LOOKUP_NOREF if we are already in a RCU protected section and
> correctly deal with reference counter if not (called from ndo_start_xmit).
> 
> Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
> Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
> Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: Thomas Graf <tgraf@suug.ch>
> Cc: Julian Anastasov <ja@ssi.bg>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
>  net/ipv4/ipmr.c  | 23 +++++++++++++++++------
>  net/ipv6/ip6mr.c | 21 +++++++++++++++------
>  2 files changed, 32 insertions(+), 12 deletions(-)
> 
> diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> index 421a249..c8d0857 100644
> --- a/net/ipv4/ipmr.c
> +++ b/net/ipv4/ipmr.c
> @@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
>  static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
>  			   struct mr_table **mrt)
>  {
> -	struct ipmr_result res;
> -	struct fib_lookup_arg arg = { .result = &res, };
>  	int err;
> +	struct ipmr_result res;
> +	struct fib_lookup_arg arg = {
> +		.result = &res,
> +		.flags = FIB_LOOKUP_NOREF,
> +	};
>  
>  	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
>  			       flowi4_to_flowi(flp4), 0, &arg);
> @@ -448,16 +451,22 @@ failure:
>  
>  static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
> +	int err;
> +	struct ipmr_result res;
>  	struct net *net = dev_net(dev);
> -	struct mr_table *mrt;
> +
> +	struct fib_lookup_arg arg = {
> +		.result = &res,
> +	};
> +
>  	struct flowi4 fl4 = {
>  		.flowi4_oif	= dev->ifindex,
>  		.flowi4_iif	= skb->skb_iif,
>  		.flowi4_mark	= skb->mark,
>  	};
> -	int err;
>  
> -	err = ipmr_fib_lookup(net, &fl4, &mrt);
> +	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
> +			       flowi4_to_flowi(&fl4), 0, &arg);

Its not clear to me why you expand ipmr_fib_lookup()

Is there something wrong with existing code ?

Its not mentioned in changelog

>  	if (err < 0) {
>  		kfree_skb(skb);
>  		return err;
> @@ -466,9 +475,11 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
>  	read_lock(&mrt_lock);
>  	dev->stats.tx_bytes += skb->len;
>  	dev->stats.tx_packets++;
> -	ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
> +	ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
> +			  IGMPMSG_WHOLEPKT);
>  	read_unlock(&mrt_lock);
>  	kfree_skb(skb);
> +	fib_rule_put(arg.rule);

This is the one line that is really missing, patch could be smaller.

>  	return NETDEV_TX_OK;
>  }
>  
> diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
> index f365310..38347a3 100644
> --- a/net/ipv6/ip6mr.c
> +++ b/net/ipv6/ip6mr.c
> @@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
>  static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6,
>  			    struct mr6_table **mrt)
>  {
> -	struct ip6mr_result res;
> -	struct fib_lookup_arg arg = { .result = &res, };
>  	int err;
> +	struct ip6mr_result res;
> +	struct fib_lookup_arg arg = {
> +		.result = &res,
> +		.flags = FIB_LOOKUP_NOREF,
> +	};
>  
>  	err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
>  			       flowi6_to_flowi(flp6), 0, &arg);
> @@ -693,16 +696,20 @@ static const struct inet6_protocol pim6_protocol = {
>  static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
>  				      struct net_device *dev)
>  {
> +	int err;
> +	struct ip6mr_result res;
>  	struct net *net = dev_net(dev);
> -	struct mr6_table *mrt;
>  	struct flowi6 fl6 = {
>  		.flowi6_oif	= dev->ifindex,
>  		.flowi6_iif	= skb->skb_iif,
>  		.flowi6_mark	= skb->mark,
>  	};
> -	int err;
> +	struct fib_lookup_arg arg = {
> +		.result = &res,
> +	};
>  
> -	err = ip6mr_fib_lookup(net, &fl6, &mrt);
> +	err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
> +			flowi6_to_flowi(&fl6), 0, &arg);


same remark here.

>  	if (err < 0) {
>  		kfree_skb(skb);
>  		return err;
> @@ -711,9 +718,11 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
>  	read_lock(&mrt_lock);
>  	dev->stats.tx_bytes += skb->len;
>  	dev->stats.tx_packets++;
> -	ip6mr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, MRT6MSG_WHOLEPKT);
> +	ip6mr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
> +			   MRT6MSG_WHOLEPKT);
>  	read_unlock(&mrt_lock);
>  	kfree_skb(skb);
> +	fib_rule_put(arg.rule);
>  	return NETDEV_TX_OK;
>  }
>  

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-09 20:14 Bob Falken
@ 2014-01-10  6:36 ` Hannes Frederic Sowa
  2014-01-10  7:01   ` Eric Dumazet
  0 siblings, 1 reply; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-10  6:36 UTC (permalink / raw)
  To: Bob Falken; +Cc: Julian Anastasov, netdev, kaber, tgraf, eric.dumazet

On Thu, Jan 09, 2014 at 09:14:11PM +0100, Bob Falken wrote:
> Hello,
> Testing this patch as im typing this. will check status in about 12hours.
> Unfortuantly, I dont have any receivers avaialble for requesting the multicast stream on the edge point anymore. 
> So there is not TX traffic a.t.m..
> 
> I will have a better test-lab available next week. (hopefully).

Ok, so I am proposing this patch. Only difference from the RFC is that
I removed the superfluous arg.rule NULL-pointer checks (I hate if they
are superfluous and they always seem to spread ;) ).

Maybe you could test this one instead and David could pick it up as soon
as your results are in.

I'll also look for the stable kernels where FIB_LOOKUP_NOREF is not
yet available.

Thank you,

  Hannes

[PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding

When introducing multiple table support for multicast forwarding in
IPv4 and IPv6, necessary fib_rules_put reference count decrements were
forgotten.

Bob Falken reported that after 4G packets, multicast forwarding stopped
working. This was because of a rule reference counter overflow which
freed the rule as soon as the overflow happend.

So, use FIB_LOOKUP_NOREF if we are already in a RCU protected section and
correctly deal with reference counter if not (called from ndo_start_xmit).

Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv4/ipmr.c  | 23 +++++++++++++++++------
 net/ipv6/ip6mr.c | 21 +++++++++++++++------
 2 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 421a249..c8d0857 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
 static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
 			   struct mr_table **mrt)
 {
-	struct ipmr_result res;
-	struct fib_lookup_arg arg = { .result = &res, };
 	int err;
+	struct ipmr_result res;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+		.flags = FIB_LOOKUP_NOREF,
+	};
 
 	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
 			       flowi4_to_flowi(flp4), 0, &arg);
@@ -448,16 +451,22 @@ failure:
 
 static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
 {
+	int err;
+	struct ipmr_result res;
 	struct net *net = dev_net(dev);
-	struct mr_table *mrt;
+
+	struct fib_lookup_arg arg = {
+		.result = &res,
+	};
+
 	struct flowi4 fl4 = {
 		.flowi4_oif	= dev->ifindex,
 		.flowi4_iif	= skb->skb_iif,
 		.flowi4_mark	= skb->mark,
 	};
-	int err;
 
-	err = ipmr_fib_lookup(net, &fl4, &mrt);
+	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
+			       flowi4_to_flowi(&fl4), 0, &arg);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
@@ -466,9 +475,11 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
 	read_lock(&mrt_lock);
 	dev->stats.tx_bytes += skb->len;
 	dev->stats.tx_packets++;
-	ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
+	ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+			  IGMPMSG_WHOLEPKT);
 	read_unlock(&mrt_lock);
 	kfree_skb(skb);
+	fib_rule_put(arg.rule);
 	return NETDEV_TX_OK;
 }
 
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index f365310..38347a3 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
 static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6,
 			    struct mr6_table **mrt)
 {
-	struct ip6mr_result res;
-	struct fib_lookup_arg arg = { .result = &res, };
 	int err;
+	struct ip6mr_result res;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+		.flags = FIB_LOOKUP_NOREF,
+	};
 
 	err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
 			       flowi6_to_flowi(flp6), 0, &arg);
@@ -693,16 +696,20 @@ static const struct inet6_protocol pim6_protocol = {
 static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 				      struct net_device *dev)
 {
+	int err;
+	struct ip6mr_result res;
 	struct net *net = dev_net(dev);
-	struct mr6_table *mrt;
 	struct flowi6 fl6 = {
 		.flowi6_oif	= dev->ifindex,
 		.flowi6_iif	= skb->skb_iif,
 		.flowi6_mark	= skb->mark,
 	};
-	int err;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+	};
 
-	err = ip6mr_fib_lookup(net, &fl6, &mrt);
+	err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
+			flowi6_to_flowi(&fl6), 0, &arg);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
@@ -711,9 +718,11 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 	read_lock(&mrt_lock);
 	dev->stats.tx_bytes += skb->len;
 	dev->stats.tx_packets++;
-	ip6mr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, MRT6MSG_WHOLEPKT);
+	ip6mr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+			   MRT6MSG_WHOLEPKT);
 	read_unlock(&mrt_lock);
 	kfree_skb(skb);
+	fib_rule_put(arg.rule);
 	return NETDEV_TX_OK;
 }
 
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
@ 2014-01-09 20:14 Bob Falken
  2014-01-10  6:36 ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Bob Falken @ 2014-01-09 20:14 UTC (permalink / raw)
  To: Hannes Frederic Sowa, Julian Anastasov, netdev

Hello,
Testing this patch as im typing this. will check status in about 12hours.
Unfortuantly, I dont have any receivers avaialble for requesting the multicast stream on the edge point anymore. 
So there is not TX traffic a.t.m..

I will have a better test-lab available next week. (hopefully).

Thanks again.
----- Original Message -----
From: Hannes Frederic Sowa
Sent: 01/07/14 09:20 PM
To: Bob Falken, Julian Anastasov, netdev@vger.kernel.org
Subject: Re: Multicast routing stops functioning after 4G multicast packets recived.
 On Tue, Jan 07, 2014 at 09:11:47PM +0100, Hannes Frederic Sowa wrote:
> On Tue, Jan 07, 2014 at 06:43:22PM +0100, Hannes Frederic Sowa wrote:
> > On Tue, Jan 07, 2014 at 06:01:44PM +0100, Bob Falken wrote:
> > > Hello,
> > > 
> > > I patched, kernel 3.2.53 yesterday,
> > > 8774002632packets, and ~9.1TB later, the multicast routing seems to function properly. :)
> > > 
> > > Kudos for fixing this.
> > > 
> > > I will keep checking the next days.
> > > 
> > > As for IPv6 MR, my current setup i.e: the Multicast source, does not support IPv6, so cannot do a check for that natively.
> > > 
> > > Unless I can translate IPv4 multicast into IPv6 multicast easily, using some iptable prerouting rules(?).
> > 
> > Thank you for testing!
> > 
> > I'll review the RCU regions again and will prepare the patches (before
> > introduction of ebc0ffae5dfb44 ("fib: RCU conversion of fib_lookup()")
> > and after.
> 
> It seems ip(6)mr_fib_lookup is not always called from rcu section
> (ndo_start_xmit), so I had to restructure a bit. Could you retest this
> patch as preparation for a submission to stable? Thanks!
> 
> RCU conversion can be done later then.

Broken patch, sorry. Please try this one:

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 421a249..e5e9071 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
 static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
 struct mr_table **mrt)
 {
- struct ipmr_result res;
- struct fib_lookup_arg arg = { .result = &res};
 int err;
+ struct ipmr_result res;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ .flags = FIB_LOOKUP_NOREF,
+ };
 
 err = fib_rules_lookup(net->ipv4.mr_rules_ops,
 flowi4_to_flowi(flp4), 0, &arg);
@@ -448,16 +451,22 @@ failure:
 
 static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
 {
+ int err;
+ struct ipmr_result res;
 struct net *net = dev_net(dev);
- struct mr_table *mrt;
+
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ };
+
 struct flowi4 fl4 = {
 .flowi4_oif = dev->ifindex,
 .flowi4_iif = skb->skb_iif,
 .flowi4_mark = skb->mark,
 };
- int err;
 
- err = ipmr_fib_lookup(net, &fl4, &mrt);
+ err = fib_rules_lookup(net->ipv4.mr_rules_ops,
+ flowi4_to_flowi(&fl4), 0, &arg);
 if (err < 0) {
 kfree_skb(skb);
 return err;
@@ -466,9 +475,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
 read_lock(&mrt_lock);
 dev->stats.tx_bytes += skb->len;
 dev->stats.tx_packets++;
- ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
+ ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+ IGMPMSG_WHOLEPKT);
 read_unlock(&mrt_lock);
 kfree_skb(skb);
+ if (arg.rule)
+ fib_rule_put(arg.rule);
 return NETDEV_TX_OK;
 }
 
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index f365310..45ec621 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
 static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6,
 struct mr6_table **mrt)
 {
- struct ip6mr_result res;
- struct fib_lookup_arg arg = { .result = &res};
 int err;
+ struct ip6mr_result res;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ .flags = FIB_LOOKUP_NOREF,
+ };
 
 err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
 flowi6_to_flowi(flp6), 0, &arg);
@@ -693,16 +696,20 @@ static const struct inet6_protocol pim6_protocol = {
 static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 struct net_device *dev)
 {
+ int err;
+ struct ip6mr_result res;
 struct net *net = dev_net(dev);
- struct mr6_table *mrt;
 struct flowi6 fl6 = {
 .flowi6_oif = dev->ifindex,
 .flowi6_iif = skb->skb_iif,
 .flowi6_mark = skb->mark,
 };
- int err;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ };
 
- err = ip6mr_fib_lookup(net, &fl6, &mrt);
+ err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
+ flowi6_to_flowi(&fl6), 0, &arg);
 if (err < 0) {
 kfree_skb(skb);
 return err;
@@ -711,9 +718,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 read_lock(&mrt_lock);
 dev->stats.tx_bytes += skb->len;
 dev->stats.tx_packets++;
- ip6mr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, MRT6MSG_WHOLEPKT);
+ ip6mr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+ MRT6MSG_WHOLEPKT);
 read_unlock(&mrt_lock);
 kfree_skb(skb);
+ if (arg.rule)
+ fib_rule_put(arg.rule);
 return NETDEV_TX_OK;
 } 

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-07 20:26     ` Eric Dumazet
@ 2014-01-07 20:29       ` Hannes Frederic Sowa
  0 siblings, 0 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-07 20:29 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Bob Falken, Julian Anastasov, netdev

On Tue, Jan 07, 2014 at 12:26:53PM -0800, Eric Dumazet wrote:
> On Tue, 2014-01-07 at 21:11 +0100, Hannes Frederic Sowa wrote:
> 
> > It seems ip(6)mr_fib_lookup is not always called  from rcu section
> > (ndo_start_xmit), so I had to restructure a bit. Could you retest this
> > patch as preparation for a submission to stable? Thanks!
> > 
> > RCU conversion can be done later then.
> > 
> > diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> > index 421a249..9ae4ae7 100644
> > --- a/net/ipv4/ipmr.c
> > +++ b/net/ipv4/ipmr.c
> 
> >  
> >  static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
> >  {
> > +	int err;
> > +	struct ipmr_result res;
> >  	struct net *net = dev_net(dev);
> > -	struct mr_table *mrt;
> > +
> > +	struct fib_lookup_arg arg = {
> > +		.result = &res,
> > +		.flags = FIB_LOOKUP_NOREF,
> > +	};
> > +
> >  	struct flowi4 fl4 = {
> >  		.flowi4_oif	= dev->ifindex,
> >  		.flowi4_iif	= skb->skb_iif,
> >  		.flowi4_mark	= skb->mark,
> >  	};
> > -	int err;
> 
> Technically speaking, I don't think reg_vif_xmit() is enclosed
> in rcu_read_lock() section.
> 
> >  
> > -	err = ipmr_fib_lookup(net, &fl4, &mrt);
> > +	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
> > +			       flowi4_to_flowi(&fl4), 0, &arg);
> >  	if (err < 0) {
> >  		kfree_skb(skb);
> >  		return err;
> > @@ -466,9 +476,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
> >  	read_lock(&mrt_lock);
> >  	dev->stats.tx_bytes += skb->len;
> >  	dev->stats.tx_packets++;
> > -	ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
> > +	ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
> > +			  IGMPMSG_WHOLEPKT);
> >  	read_unlock(&mrt_lock);
> >  	kfree_skb(skb);
> > +	if (arg.rule)
> > +		fib_rule_put(arg.rule);
> >  	return NETDEV_TX_OK;
> >  }
> >  
> 
> Hmm... Are you sure you meant to use FIB_LOOKUP_NOREF in
> reg_vif_xmit() ?

Nope, that's the reason for the second patch. Should have mentioned that.

Thanks for the review,

  Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-07 20:11   ` Hannes Frederic Sowa
  2014-01-07 20:20     ` Hannes Frederic Sowa
@ 2014-01-07 20:26     ` Eric Dumazet
  2014-01-07 20:29       ` Hannes Frederic Sowa
  1 sibling, 1 reply; 32+ messages in thread
From: Eric Dumazet @ 2014-01-07 20:26 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Bob Falken, Julian Anastasov, netdev

On Tue, 2014-01-07 at 21:11 +0100, Hannes Frederic Sowa wrote:

> It seems ip(6)mr_fib_lookup is not always called  from rcu section
> (ndo_start_xmit), so I had to restructure a bit. Could you retest this
> patch as preparation for a submission to stable? Thanks!
> 
> RCU conversion can be done later then.
> 
> diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> index 421a249..9ae4ae7 100644
> --- a/net/ipv4/ipmr.c
> +++ b/net/ipv4/ipmr.c

>  
>  static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
> +	int err;
> +	struct ipmr_result res;
>  	struct net *net = dev_net(dev);
> -	struct mr_table *mrt;
> +
> +	struct fib_lookup_arg arg = {
> +		.result = &res,
> +		.flags = FIB_LOOKUP_NOREF,
> +	};
> +
>  	struct flowi4 fl4 = {
>  		.flowi4_oif	= dev->ifindex,
>  		.flowi4_iif	= skb->skb_iif,
>  		.flowi4_mark	= skb->mark,
>  	};
> -	int err;

Technically speaking, I don't think reg_vif_xmit() is enclosed
in rcu_read_lock() section.

>  
> -	err = ipmr_fib_lookup(net, &fl4, &mrt);
> +	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
> +			       flowi4_to_flowi(&fl4), 0, &arg);
>  	if (err < 0) {
>  		kfree_skb(skb);
>  		return err;
> @@ -466,9 +476,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
>  	read_lock(&mrt_lock);
>  	dev->stats.tx_bytes += skb->len;
>  	dev->stats.tx_packets++;
> -	ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
> +	ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
> +			  IGMPMSG_WHOLEPKT);
>  	read_unlock(&mrt_lock);
>  	kfree_skb(skb);
> +	if (arg.rule)
> +		fib_rule_put(arg.rule);
>  	return NETDEV_TX_OK;
>  }
>  

Hmm... Are you sure you meant to use FIB_LOOKUP_NOREF in
reg_vif_xmit() ?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-07 20:11   ` Hannes Frederic Sowa
@ 2014-01-07 20:20     ` Hannes Frederic Sowa
  2014-01-07 20:26     ` Eric Dumazet
  1 sibling, 0 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-07 20:20 UTC (permalink / raw)
  To: Bob Falken, Julian Anastasov, netdev

On Tue, Jan 07, 2014 at 09:11:47PM +0100, Hannes Frederic Sowa wrote:
> On Tue, Jan 07, 2014 at 06:43:22PM +0100, Hannes Frederic Sowa wrote:
> > On Tue, Jan 07, 2014 at 06:01:44PM +0100, Bob Falken wrote:
> > > Hello,
> > > 
> > > I patched, kernel 3.2.53 yesterday,
> > > 8774002632packets, and ~9.1TB later, the multicast routing seems to function properly. :)
> > > 
> > > Kudos for fixing this.
> > > 
> > > I will keep checking the next days.
> > > 
> > > As for IPv6 MR, my current setup i.e: the Multicast source, does not support IPv6, so cannot do a check for that natively.
> > > 
> > > Unless I can translate IPv4 multicast into IPv6 multicast easily, using some iptable prerouting rules(?).
> > 
> > Thank you for testing!
> > 
> > I'll review the RCU regions again and will prepare the patches (before
> > introduction of ebc0ffae5dfb44 ("fib: RCU conversion of fib_lookup()")
> > and after.
> 
> It seems ip(6)mr_fib_lookup is not always called  from rcu section
> (ndo_start_xmit), so I had to restructure a bit. Could you retest this
> patch as preparation for a submission to stable? Thanks!
> 
> RCU conversion can be done later then.

Broken patch, sorry. Please try this one:

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 421a249..e5e9071 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
 static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
 			   struct mr_table **mrt)
 {
-	struct ipmr_result res;
-	struct fib_lookup_arg arg = { .result = &res, };
 	int err;
+	struct ipmr_result res;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+		.flags = FIB_LOOKUP_NOREF,
+	};
 
 	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
 			       flowi4_to_flowi(flp4), 0, &arg);
@@ -448,16 +451,22 @@ failure:
 
 static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
 {
+	int err;
+	struct ipmr_result res;
 	struct net *net = dev_net(dev);
-	struct mr_table *mrt;
+
+	struct fib_lookup_arg arg = {
+		.result = &res,
+	};
+
 	struct flowi4 fl4 = {
 		.flowi4_oif	= dev->ifindex,
 		.flowi4_iif	= skb->skb_iif,
 		.flowi4_mark	= skb->mark,
 	};
-	int err;
 
-	err = ipmr_fib_lookup(net, &fl4, &mrt);
+	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
+			       flowi4_to_flowi(&fl4), 0, &arg);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
@@ -466,9 +475,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
 	read_lock(&mrt_lock);
 	dev->stats.tx_bytes += skb->len;
 	dev->stats.tx_packets++;
-	ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
+	ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+			  IGMPMSG_WHOLEPKT);
 	read_unlock(&mrt_lock);
 	kfree_skb(skb);
+	if (arg.rule)
+		fib_rule_put(arg.rule);
 	return NETDEV_TX_OK;
 }
 
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index f365310..45ec621 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
 static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6,
 			    struct mr6_table **mrt)
 {
-	struct ip6mr_result res;
-	struct fib_lookup_arg arg = { .result = &res, };
 	int err;
+	struct ip6mr_result res;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+		.flags = FIB_LOOKUP_NOREF,
+	};
 
 	err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
 			       flowi6_to_flowi(flp6), 0, &arg);
@@ -693,16 +696,20 @@ static const struct inet6_protocol pim6_protocol = {
 static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 				      struct net_device *dev)
 {
+	int err;
+	struct ip6mr_result res;
 	struct net *net = dev_net(dev);
-	struct mr6_table *mrt;
 	struct flowi6 fl6 = {
 		.flowi6_oif	= dev->ifindex,
 		.flowi6_iif	= skb->skb_iif,
 		.flowi6_mark	= skb->mark,
 	};
-	int err;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+	};
 
-	err = ip6mr_fib_lookup(net, &fl6, &mrt);
+	err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
+			flowi6_to_flowi(&fl6), 0, &arg);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
@@ -711,9 +718,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 	read_lock(&mrt_lock);
 	dev->stats.tx_bytes += skb->len;
 	dev->stats.tx_packets++;
-	ip6mr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, MRT6MSG_WHOLEPKT);
+	ip6mr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+			   MRT6MSG_WHOLEPKT);
 	read_unlock(&mrt_lock);
 	kfree_skb(skb);
+	if (arg.rule)
+		fib_rule_put(arg.rule);
 	return NETDEV_TX_OK;
 }
 

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-07 17:43 ` Hannes Frederic Sowa
@ 2014-01-07 20:11   ` Hannes Frederic Sowa
  2014-01-07 20:20     ` Hannes Frederic Sowa
  2014-01-07 20:26     ` Eric Dumazet
  0 siblings, 2 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-07 20:11 UTC (permalink / raw)
  To: Bob Falken, Julian Anastasov, netdev

On Tue, Jan 07, 2014 at 06:43:22PM +0100, Hannes Frederic Sowa wrote:
> On Tue, Jan 07, 2014 at 06:01:44PM +0100, Bob Falken wrote:
> > Hello,
> > 
> > I patched, kernel 3.2.53 yesterday,
> > 8774002632packets, and ~9.1TB later, the multicast routing seems to function properly. :)
> > 
> > Kudos for fixing this.
> > 
> > I will keep checking the next days.
> > 
> > As for IPv6 MR, my current setup i.e: the Multicast source, does not support IPv6, so cannot do a check for that natively.
> > 
> > Unless I can translate IPv4 multicast into IPv6 multicast easily, using some iptable prerouting rules(?).
> 
> Thank you for testing!
> 
> I'll review the RCU regions again and will prepare the patches (before
> introduction of ebc0ffae5dfb44 ("fib: RCU conversion of fib_lookup()")
> and after.

It seems ip(6)mr_fib_lookup is not always called  from rcu section
(ndo_start_xmit), so I had to restructure a bit. Could you retest this
patch as preparation for a submission to stable? Thanks!

RCU conversion can be done later then.

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 421a249..9ae4ae7 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
 static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
 			   struct mr_table **mrt)
 {
-	struct ipmr_result res;
-	struct fib_lookup_arg arg = { .result = &res, };
 	int err;
+	struct ipmr_result res;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+		.flags = FIB_LOOKUP_NOREF,
+	};
 
 	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
 			       flowi4_to_flowi(flp4), 0, &arg);
@@ -448,16 +451,23 @@ failure:
 
 static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
 {
+	int err;
+	struct ipmr_result res;
 	struct net *net = dev_net(dev);
-	struct mr_table *mrt;
+
+	struct fib_lookup_arg arg = {
+		.result = &res,
+		.flags = FIB_LOOKUP_NOREF,
+	};
+
 	struct flowi4 fl4 = {
 		.flowi4_oif	= dev->ifindex,
 		.flowi4_iif	= skb->skb_iif,
 		.flowi4_mark	= skb->mark,
 	};
-	int err;
 
-	err = ipmr_fib_lookup(net, &fl4, &mrt);
+	err = fib_rules_lookup(net->ipv4.mr_rules_ops,
+			       flowi4_to_flowi(&fl4), 0, &arg);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
@@ -466,9 +476,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
 	read_lock(&mrt_lock);
 	dev->stats.tx_bytes += skb->len;
 	dev->stats.tx_packets++;
-	ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
+	ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+			  IGMPMSG_WHOLEPKT);
 	read_unlock(&mrt_lock);
 	kfree_skb(skb);
+	if (arg.rule)
+		fib_rule_put(arg.rule);
 	return NETDEV_TX_OK;
 }
 
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index f365310..45ec621 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
 static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6,
 			    struct mr6_table **mrt)
 {
-	struct ip6mr_result res;
-	struct fib_lookup_arg arg = { .result = &res, };
 	int err;
+	struct ip6mr_result res;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+		.flags = FIB_LOOKUP_NOREF,
+	};
 
 	err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
 			       flowi6_to_flowi(flp6), 0, &arg);
@@ -693,16 +696,20 @@ static const struct inet6_protocol pim6_protocol = {
 static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 				      struct net_device *dev)
 {
+	int err;
+	struct ip6mr_result res;
 	struct net *net = dev_net(dev);
-	struct mr6_table *mrt;
 	struct flowi6 fl6 = {
 		.flowi6_oif	= dev->ifindex,
 		.flowi6_iif	= skb->skb_iif,
 		.flowi6_mark	= skb->mark,
 	};
-	int err;
+	struct fib_lookup_arg arg = {
+		.result = &res,
+	};
 
-	err = ip6mr_fib_lookup(net, &fl6, &mrt);
+	err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
+			flowi6_to_flowi(&fl6), 0, &arg);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
@@ -711,9 +718,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
 	read_lock(&mrt_lock);
 	dev->stats.tx_bytes += skb->len;
 	dev->stats.tx_packets++;
-	ip6mr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, MRT6MSG_WHOLEPKT);
+	ip6mr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+			   MRT6MSG_WHOLEPKT);
 	read_unlock(&mrt_lock);
 	kfree_skb(skb);
+	if (arg.rule)
+		fib_rule_put(arg.rule);
 	return NETDEV_TX_OK;
 }
 

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2014-01-07 17:01 Bob Falken
@ 2014-01-07 17:43 ` Hannes Frederic Sowa
  2014-01-07 20:11   ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-07 17:43 UTC (permalink / raw)
  To: Bob Falken; +Cc: Julian Anastasov, netdev

On Tue, Jan 07, 2014 at 06:01:44PM +0100, Bob Falken wrote:
> Hello,
> 
> I patched, kernel 3.2.53 yesterday,
> 8774002632packets, and ~9.1TB later, the multicast routing seems to function properly. :)
> 
> Kudos for fixing this.
> 
> I will keep checking the next days.
> 
> As for IPv6 MR, my current setup i.e: the Multicast source, does not support IPv6, so cannot do a check for that natively.
> 
> Unless I can translate IPv4 multicast into IPv6 multicast easily, using some iptable prerouting rules(?).

Thank you for testing!

I'll review the RCU regions again and will prepare the patches (before
introduction of ebc0ffae5dfb44 ("fib: RCU conversion of fib_lookup()")
and after.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
@ 2014-01-07 17:01 Bob Falken
  2014-01-07 17:43 ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Bob Falken @ 2014-01-07 17:01 UTC (permalink / raw)
  To: Hannes Frederic Sowa, Julian Anastasov; +Cc: netdev

Hello,

I patched, kernel 3.2.53 yesterday,
8774002632packets, and ~9.1TB later, the multicast routing seems to function properly. :)

Kudos for fixing this.

I will keep checking the next days.

As for IPv6 MR, my current setup i.e: the Multicast source, does not support IPv6, so cannot do a check for that natively.

Unless I can translate IPv4 multicast into IPv6 multicast easily, using some iptable prerouting rules(?).

Thanks again.


----- Original Message -----
From: Hannes Frederic Sowa
Sent: 01/05/14 12:38 AM
To: Julian Anastasov
Subject: Re: Multicast routing stops functioning after 4G multicast packets recived.
 On Sat, Jan 04, 2014 at 09:55:51PM +0200, Julian Anastasov wrote:
> 
> Hello,
> 
> On Thu, 19 Dec 2013, Bob Falken wrote:
> 
> > Hello, I have an issue after kernel 2.6.37 and above.
> > If i roll back to kernel 2.6.36.4 everything is fine.
> > if recive more than 4294967295 multicast packets, the kernel does not register the multicast packets. and multicast routing does not functioning.
> 
> ...
> 
> > I think there might be a variable in the kernel that get overflown. though i cannot be sure as im not a programmer. 
> > 
> > Let me know if you need more debug information.
> > 
> > Please help.  
> 
> As Hannes guessed, may be it is really the missing
> flags = FIB_LOOKUP_NOREF in ipmr_fib_lookup() when calling
> fib_rules_lookup(), we have an atomic_inc_not_zero() there that
> can stop to work after 4G lookups when rule's refcnt reaches 0.
> As result, fib_rules_lookup() can start to return -ESRCH.

I guess we should just try. I somehow forgot to look after that. Thanks
for reminding, Julian.

Bob, can you try with this patch again?

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 421a249..b9b3472 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
 static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
 struct mr_table **mrt)
 {
- struct ipmr_result res;
- struct fib_lookup_arg arg = { .result = &res};
 int err;
+ struct ipmr_result res;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ .flags = FIB_LOOKUP_NOREF,
+ };
 
 err = fib_rules_lookup(net->ipv4.mr_rules_ops,
 flowi4_to_flowi(flp4), 0, &arg); 

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
@ 2014-01-04 18:53 Bob Falken
  0 siblings, 0 replies; 32+ messages in thread
From: Bob Falken @ 2014-01-04 18:53 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Eric Dumazet, Ben Greear, netdev

Hello, I compiled kernel 3.2.53 with the lates stable grsec patch, and enabled the CONFIG_PAX_REFCOUNT.

Unfortuantly the server lockedup after about 13hours so i did not get a dmesg event. forgot to disable vty blank, so don't know the reason for the server lockup.(could just be a bad kernel build).

I have rebuildt the kernel and will report back with the result, should have some results in about 16hours, unless the server locksup again.


----- Original Message -----
From: Hannes Frederic Sowa
Sent: 01/03/14 08:37 AM
To: Bob Falken
Subject: Re: Multicast routing stops functioning after 4G multicast packets recived.
 On Sat, Dec 21, 2013 at 11:35:00PM +0100, Bob Falken wrote:
> On Thu, Dec 19, 2013 at 09:24:18AM -0800, Eric Dumazet wrote:
> > On Thu, 2013-12-19 at 17:28 +0100, Bob Falken wrote:
> > > The only reason why i give information about 2.6.36.4 is that its the
> > > only latest kernel that was functioning properly.
> > > i.e kernel >= 2.6.37 is not woking. so its a bisecting of the kernel
> > > versions to help a coder see when/where the isse was implemented in
> > > the kernel.
> > > 
> > > I do not need a backport patch for an old kernel, I generally only
> > > need the issue looked into and get fixed so that I dont have to use an
> > > old kernel. :)
> > > 
> > > I have no issue reproducing the issue on the recent kernels. however i
> > > have not tried the GIT kernel.
> > > 
> > > I restarted the server just a moment ago. i will install and run
> > > dropwatch and provide feedback in about 17hours. 
> > 
> > You said that "cat /proc/net/ip_mr_cache" gives nothing at all after
> > 2^32 packets ?
> > 
> > Thats a bit scary ... maybe a 32bit refcnt overflow, because of some
> > imbalance...
> 
> That's my thought, too. :/
> 
> The ipmr.c rcu conversion happend in 2.6.37. 
> 

Just as a follow-up. You could verify if we actually have a refcount overflow
if you try a grsec kernel with PAX_REFCOUNT enabled. If a refcoount overflow
happens, dmesg should show then.

Greetings,

 Hannes 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2013-12-21 22:35 Bob Falken
@ 2014-01-03  7:37 ` Hannes Frederic Sowa
  0 siblings, 0 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-03  7:37 UTC (permalink / raw)
  To: Bob Falken; +Cc: Eric Dumazet, Ben Greear, netdev

On Sat, Dec 21, 2013 at 11:35:00PM +0100, Bob Falken wrote:
>  On Thu, Dec 19, 2013 at 09:24:18AM -0800, Eric Dumazet wrote:
> > On Thu, 2013-12-19 at 17:28 +0100, Bob Falken wrote:
> > > The only reason why i give information about 2.6.36.4 is that its the
> > > only latest kernel that was functioning properly.
> > > i.e kernel >= 2.6.37 is not woking. so its a bisecting of the kernel
> > > versions to help a coder see when/where the isse was implemented in
> > > the kernel.
> > > 
> > > I do not need a backport patch for an old kernel, I generally only
> > > need the issue looked into and get fixed so that I dont have to use an
> > > old kernel. :)
> > > 
> > > I have no issue reproducing the issue on the recent kernels. however i
> > > have not tried the GIT kernel.
> > > 
> > > I restarted the server just a moment ago. i will install and run
> > > dropwatch and provide feedback in about 17hours. 
> > 
> > You said that "cat /proc/net/ip_mr_cache" gives nothing at all after
> > 2^32 packets ?
> > 
> > Thats a bit scary ... maybe a 32bit refcnt overflow, because of some
> > imbalance...
> 
> That's my thought, too. :/
> 
> The ipmr.c rcu conversion happend in 2.6.37. 
> 

Just as a follow-up. You could verify if we actually have a refcount overflow
if you try a grsec kernel with PAX_REFCOUNT enabled. If a refcoount overflow
happens, dmesg should show then.

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2013-12-19 17:24 ` Eric Dumazet
  2013-12-19 17:32   ` Hannes Frederic Sowa
@ 2013-12-22  3:10   ` Hannes Frederic Sowa
  1 sibling, 0 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2013-12-22  3:10 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Bob Falken, Ben Greear, netdev

On Thu, Dec 19, 2013 at 09:24:18AM -0800, Eric Dumazet wrote:
> On Thu, 2013-12-19 at 17:28 +0100, Bob Falken wrote:
> > The only reason why i give information about 2.6.36.4 is that its the
> > only latest kernel that was functioning properly.
> > i.e kernel >= 2.6.37 is not woking. so its a bisecting of the kernel
> > versions to help a coder see when/where the isse was implemented in
> > the kernel.
> > 
> > I do not need a backport patch for an old kernel, I generally only
> > need the issue looked into and get fixed so that I dont have to use an
> > old kernel. :)
> > 
> > I have no issue reproducing the issue on the recent kernels. however i
> > have not tried the GIT kernel.
> > 
> > I restarted the server just a moment ago. i will install and run
> > dropwatch and provide feedback in about 17hours. 
> 
> You said that "cat /proc/net/ip_mr_cache" gives nothing at all after
> 2^32 packets ?
> 
> Thats a bit scary ... maybe a 32bit refcnt overflow, because of some
> imbalance...

I have the feeling it may have something to do with FIB_LOOKUP_NOREF. More
tomorrow...

Greetings,

  Hannes

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
@ 2013-12-21 22:35 Bob Falken
  2014-01-03  7:37 ` Hannes Frederic Sowa
  0 siblings, 1 reply; 32+ messages in thread
From: Bob Falken @ 2013-12-21 22:35 UTC (permalink / raw)
  To: Hannes Frederic Sowa, Eric Dumazet; +Cc: Ben Greear, netdev

OK, so at the exact time that the incoming interface for multicast packet count reaches 2^32, 
the /proc/net/ip_mr_cache stops updating. 
after a while, one by one the multicast groups in ip_mr_cache disapperes, and after 227sec all of them are gone. 


perf script net_dropmonitor:
-----------
# ========
# captured on: Sat Dec 21 23:27:37 2013
# ========
#
Starting trace (Ctrl-C to dump results)
Warning:
Processed 788648 events and lost 118 chunks!
 
Check IO/CPU overload!
 
Gathering kallsyms data
35200/35200
                 LOCATION                    OFFSET                     COUNT
                   _stext      18446744071578845580                         6
                   _stext      18446744071578843536                    785790
                   _stext      18446744071578843530                         1
 
 
 
-------------
 
netstat -s:
Ip:
    622406 total packets received
    2 with invalid addresses
    0 forwarded
    0 incoming packets discarded
    599574 incoming packets delivered
    520762 requests sent out
    8 dropped because of missing route
Icmp:
    19361 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
        echo requests: 8415
        echo replies: 10946
    19361 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        echo request: 10946
        echo replies: 8415
IcmpMsg:
        InType0: 10946
        InType8: 8415
        OutType0: 8415
        OutType8: 10946
Tcp:
    15 active connections openings
    15 passive connection openings
    0 failed connection attempts
    0 connection resets received
    29 connections established
    477938 segments received
    482321 segments send out
    4 segments retransmited
    0 bad segments received.
    0 resets sent
Udp:
    586 packets received
    0 packets to unknown port received.
    0 packet receive errors
    649 packets sent
UdpLite:
TcpExt:
    15862 delayed acks sent
    Quick ack mode was activated 1 times
    1 packets directly queued to recvmsg prequeue.
    390374 packet headers predicted
    1767 acknowledgments not containing data payload received
    58169 predicted acknowledgments
    4 congestion windows recovered without slow start after partial ack
    4 other TCP timeouts
    1 DSACKs sent for old packets
    4 DSACKs received
    TCPSackShiftFallback: 3
IpExt:
    InNoRoutes: 1
    InMcastPkts: 40015
    OutMcastPkts: 18427
    InBcastPkts: 80035
    InOctets: 1116615859
    OutOctets: 33742922
    InMcastOctets: 1046924948
    OutMcastOctets: 734556
    InBcastOctets: 7255577
  
 
--------------------- 
----- Original Message -----
From: Hannes Frederic Sowa
Sent: 12/19/13 06:32 PM
To: Eric Dumazet
Subject: Re: Multicast routing stops functioning after 4G multicast packets recived.
 On Thu, Dec 19, 2013 at 09:24:18AM -0800, Eric Dumazet wrote:
> On Thu, 2013-12-19 at 17:28 +0100, Bob Falken wrote:
> > The only reason why i give information about 2.6.36.4 is that its the
> > only latest kernel that was functioning properly.
> > i.e kernel >= 2.6.37 is not woking. so its a bisecting of the kernel
> > versions to help a coder see when/where the isse was implemented in
> > the kernel.
> > 
> > I do not need a backport patch for an old kernel, I generally only
> > need the issue looked into and get fixed so that I dont have to use an
> > old kernel. :)
> > 
> > I have no issue reproducing the issue on the recent kernels. however i
> > have not tried the GIT kernel.
> > 
> > I restarted the server just a moment ago. i will install and run
> > dropwatch and provide feedback in about 17hours. 
> 
> You said that "cat /proc/net/ip_mr_cache" gives nothing at all after
> 2^32 packets ?
> 
> Thats a bit scary ... maybe a 32bit refcnt overflow, because of some
> imbalance...

That's my thought, too. :/

The ipmr.c rcu conversion happend in 2.6.37. 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2013-12-19 17:24 ` Eric Dumazet
@ 2013-12-19 17:32   ` Hannes Frederic Sowa
  2013-12-22  3:10   ` Hannes Frederic Sowa
  1 sibling, 0 replies; 32+ messages in thread
From: Hannes Frederic Sowa @ 2013-12-19 17:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Bob Falken, Ben Greear, netdev

On Thu, Dec 19, 2013 at 09:24:18AM -0800, Eric Dumazet wrote:
> On Thu, 2013-12-19 at 17:28 +0100, Bob Falken wrote:
> > The only reason why i give information about 2.6.36.4 is that its the
> > only latest kernel that was functioning properly.
> > i.e kernel >= 2.6.37 is not woking. so its a bisecting of the kernel
> > versions to help a coder see when/where the isse was implemented in
> > the kernel.
> > 
> > I do not need a backport patch for an old kernel, I generally only
> > need the issue looked into and get fixed so that I dont have to use an
> > old kernel. :)
> > 
> > I have no issue reproducing the issue on the recent kernels. however i
> > have not tried the GIT kernel.
> > 
> > I restarted the server just a moment ago. i will install and run
> > dropwatch and provide feedback in about 17hours. 
> 
> You said that "cat /proc/net/ip_mr_cache" gives nothing at all after
> 2^32 packets ?
> 
> Thats a bit scary ... maybe a 32bit refcnt overflow, because of some
> imbalance...

That's my thought, too. :/

The ipmr.c rcu conversion happend in 2.6.37.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
  2013-12-19 16:28 Bob Falken
@ 2013-12-19 17:24 ` Eric Dumazet
  2013-12-19 17:32   ` Hannes Frederic Sowa
  2013-12-22  3:10   ` Hannes Frederic Sowa
  0 siblings, 2 replies; 32+ messages in thread
From: Eric Dumazet @ 2013-12-19 17:24 UTC (permalink / raw)
  To: Bob Falken; +Cc: Hannes Frederic Sowa, Ben Greear, netdev

On Thu, 2013-12-19 at 17:28 +0100, Bob Falken wrote:
> The only reason why i give information about 2.6.36.4 is that its the
> only latest kernel that was functioning properly.
> i.e kernel >= 2.6.37 is not woking. so its a bisecting of the kernel
> versions to help a coder see when/where the isse was implemented in
> the kernel.
> 
> I do not need a backport patch for an old kernel, I generally only
> need the issue looked into and get fixed so that I dont have to use an
> old kernel. :)
> 
> I have no issue reproducing the issue on the recent kernels. however i
> have not tried the GIT kernel.
> 
> I restarted the server just a moment ago. i will install and run
> dropwatch and provide feedback in about 17hours. 

You said that "cat /proc/net/ip_mr_cache" gives nothing at all after
2^32 packets ?

Thats a bit scary ... maybe a 32bit refcnt overflow, because of some
imbalance...

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Multicast routing stops functioning after 4G multicast packets recived.
@ 2013-12-19 16:28 Bob Falken
  2013-12-19 17:24 ` Eric Dumazet
  0 siblings, 1 reply; 32+ messages in thread
From: Bob Falken @ 2013-12-19 16:28 UTC (permalink / raw)
  To: Hannes Frederic Sowa, Ben Greear; +Cc: netdev

The only reason why i give information about 2.6.36.4 is that its the only latest kernel that was functioning properly.
i.e kernel >= 2.6.37 is not woking. so its a bisecting of the kernel versions to help a coder see when/where the isse was implemented in the kernel.

I do not need a backport patch for an old kernel, I generally only need the issue looked into and get fixed so that I dont have to use an old kernel. :)

I have no issue reproducing the issue on the recent kernels. however i have not tried the GIT kernel.

I restarted the server just a moment ago. i will install and run dropwatch and provide feedback in about 17hours. 

Thanks
----- Original Message -----
From: Hannes Frederic Sowa
Sent: 12/19/13 04:48 PM
To: Ben Greear
Subject: Re: Multicast routing stops functioning after 4G multicast packets recived.
 On Thu, Dec 19, 2013 at 07:15:37AM -0800, Ben Greear wrote:
> On 12/19/2013 07:09 AM, Hannes Frederic Sowa wrote:
> >On Thu, Dec 19, 2013 at 03:48:16PM +0100, Bob Falken wrote:
> >>Hello, I have an issue after kernel 2.6.37 and above.
> >>If i roll back to kernel 2.6.36.4 everything is fine.
> >>if recive more than 4294967295 multicast packets, the kernel does not 
> >>register the multicast packets. and multicast routing does not 
> >>functioning.
> >>(Tested bouth FIB_HASH and FIB_TRIE)
> >>Tested with xorp and pimd.
> >>I have abount 24 multicast groups, and it takes me about 17hours to 
> >>reproduce the issue after a reboot.
> >>Reboot is reqired to fix the issue. (Tested to stop/start pimd/xorp, 
> >>reload network module for the network interface "e1000e",
> >>Used birdge adapter and remove bridge adapter and readd bridge adapter to 
> >>clear counters. none of thouse solves the issue.)
> >
> >Please test this with a recent kernel. 2.6.37 is really old and you 
> >normally
> >won't get good support here with such old kernels.
> 
> Note that he did test up to 3.11.9 and it still showed failures.

Oh sorry, I did not read to the end. ;)

An interesting hint could be to use dropwatch or perf script net_dropmonitor
to check where the fragments get dropped. Also nstat could give additional
hints where something might get wrong. Please use a recent kernel while
debugging this issue. Maybe a patch can get backported later.

Thanks,

 Hannes 

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2014-01-13  0:56 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-19 14:48 Multicast routing stops functioning after 4G multicast packets recived Bob Falken
2013-12-19 15:09 ` Hannes Frederic Sowa
2013-12-19 15:15   ` Ben Greear
2013-12-19 15:48     ` Hannes Frederic Sowa
2014-01-04 19:55 ` Julian Anastasov
2014-01-04 23:38   ` Hannes Frederic Sowa
2014-01-05  8:56     ` Julian Anastasov
2014-01-05 10:41       ` Hannes Frederic Sowa
2014-01-05 19:12         ` Eric Dumazet
2013-12-19 16:28 Bob Falken
2013-12-19 17:24 ` Eric Dumazet
2013-12-19 17:32   ` Hannes Frederic Sowa
2013-12-22  3:10   ` Hannes Frederic Sowa
2013-12-21 22:35 Bob Falken
2014-01-03  7:37 ` Hannes Frederic Sowa
2014-01-04 18:53 Bob Falken
2014-01-07 17:01 Bob Falken
2014-01-07 17:43 ` Hannes Frederic Sowa
2014-01-07 20:11   ` Hannes Frederic Sowa
2014-01-07 20:20     ` Hannes Frederic Sowa
2014-01-07 20:26     ` Eric Dumazet
2014-01-07 20:29       ` Hannes Frederic Sowa
2014-01-09 20:14 Bob Falken
2014-01-10  6:36 ` Hannes Frederic Sowa
2014-01-10  7:01   ` Eric Dumazet
2014-01-10  7:10     ` Hannes Frederic Sowa
2014-01-10  7:32       ` Eric Dumazet
2014-01-10  7:43         ` Hannes Frederic Sowa
2014-01-10  7:50           ` Hannes Frederic Sowa
2014-01-12  7:42             ` Hannes Frederic Sowa
2014-01-13  0:56               ` Eric Dumazet
2014-01-12  0:25 Bob Falken

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.