* Re: Multicast routing stops functioning after 4G multicast packets recived.
@ 2014-01-09 20:14 Bob Falken
2014-01-10 6:36 ` Hannes Frederic Sowa
0 siblings, 1 reply; 13+ messages in thread
From: Bob Falken @ 2014-01-09 20:14 UTC (permalink / raw)
To: Hannes Frederic Sowa, Julian Anastasov, netdev
Hello,
Testing this patch as im typing this. will check status in about 12hours.
Unfortuantly, I dont have any receivers avaialble for requesting the multicast stream on the edge point anymore.
So there is not TX traffic a.t.m..
I will have a better test-lab available next week. (hopefully).
Thanks again.
----- Original Message -----
From: Hannes Frederic Sowa
Sent: 01/07/14 09:20 PM
To: Bob Falken, Julian Anastasov, netdev@vger.kernel.org
Subject: Re: Multicast routing stops functioning after 4G multicast packets recived.
On Tue, Jan 07, 2014 at 09:11:47PM +0100, Hannes Frederic Sowa wrote:
> On Tue, Jan 07, 2014 at 06:43:22PM +0100, Hannes Frederic Sowa wrote:
> > On Tue, Jan 07, 2014 at 06:01:44PM +0100, Bob Falken wrote:
> > > Hello,
> > >
> > > I patched, kernel 3.2.53 yesterday,
> > > 8774002632packets, and ~9.1TB later, the multicast routing seems to function properly. :)
> > >
> > > Kudos for fixing this.
> > >
> > > I will keep checking the next days.
> > >
> > > As for IPv6 MR, my current setup i.e: the Multicast source, does not support IPv6, so cannot do a check for that natively.
> > >
> > > Unless I can translate IPv4 multicast into IPv6 multicast easily, using some iptable prerouting rules(?).
> >
> > Thank you for testing!
> >
> > I'll review the RCU regions again and will prepare the patches (before
> > introduction of ebc0ffae5dfb44 ("fib: RCU conversion of fib_lookup()")
> > and after.
>
> It seems ip(6)mr_fib_lookup is not always called from rcu section
> (ndo_start_xmit), so I had to restructure a bit. Could you retest this
> patch as preparation for a submission to stable? Thanks!
>
> RCU conversion can be done later then.
Broken patch, sorry. Please try this one:
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 421a249..e5e9071 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
struct mr_table **mrt)
{
- struct ipmr_result res;
- struct fib_lookup_arg arg = { .result = &res};
int err;
+ struct ipmr_result res;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ .flags = FIB_LOOKUP_NOREF,
+ };
err = fib_rules_lookup(net->ipv4.mr_rules_ops,
flowi4_to_flowi(flp4), 0, &arg);
@@ -448,16 +451,22 @@ failure:
static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
{
+ int err;
+ struct ipmr_result res;
struct net *net = dev_net(dev);
- struct mr_table *mrt;
+
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ };
+
struct flowi4 fl4 = {
.flowi4_oif = dev->ifindex,
.flowi4_iif = skb->skb_iif,
.flowi4_mark = skb->mark,
};
- int err;
- err = ipmr_fib_lookup(net, &fl4, &mrt);
+ err = fib_rules_lookup(net->ipv4.mr_rules_ops,
+ flowi4_to_flowi(&fl4), 0, &arg);
if (err < 0) {
kfree_skb(skb);
return err;
@@ -466,9 +475,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
read_lock(&mrt_lock);
dev->stats.tx_bytes += skb->len;
dev->stats.tx_packets++;
- ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
+ ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+ IGMPMSG_WHOLEPKT);
read_unlock(&mrt_lock);
kfree_skb(skb);
+ if (arg.rule)
+ fib_rule_put(arg.rule);
return NETDEV_TX_OK;
}
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index f365310..45ec621 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6,
struct mr6_table **mrt)
{
- struct ip6mr_result res;
- struct fib_lookup_arg arg = { .result = &res};
int err;
+ struct ip6mr_result res;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ .flags = FIB_LOOKUP_NOREF,
+ };
err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
flowi6_to_flowi(flp6), 0, &arg);
@@ -693,16 +696,20 @@ static const struct inet6_protocol pim6_protocol = {
static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
struct net_device *dev)
{
+ int err;
+ struct ip6mr_result res;
struct net *net = dev_net(dev);
- struct mr6_table *mrt;
struct flowi6 fl6 = {
.flowi6_oif = dev->ifindex,
.flowi6_iif = skb->skb_iif,
.flowi6_mark = skb->mark,
};
- int err;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ };
- err = ip6mr_fib_lookup(net, &fl6, &mrt);
+ err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
+ flowi6_to_flowi(&fl6), 0, &arg);
if (err < 0) {
kfree_skb(skb);
return err;
@@ -711,9 +718,12 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
read_lock(&mrt_lock);
dev->stats.tx_bytes += skb->len;
dev->stats.tx_packets++;
- ip6mr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, MRT6MSG_WHOLEPKT);
+ ip6mr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+ MRT6MSG_WHOLEPKT);
read_unlock(&mrt_lock);
kfree_skb(skb);
+ if (arg.rule)
+ fib_rule_put(arg.rule);
return NETDEV_TX_OK;
}
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Multicast routing stops functioning after 4G multicast packets recived.
2014-01-09 20:14 Multicast routing stops functioning after 4G multicast packets recived Bob Falken
@ 2014-01-10 6:36 ` Hannes Frederic Sowa
2014-01-10 7:01 ` Eric Dumazet
0 siblings, 1 reply; 13+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-10 6:36 UTC (permalink / raw)
To: Bob Falken; +Cc: Julian Anastasov, netdev, kaber, tgraf, eric.dumazet
On Thu, Jan 09, 2014 at 09:14:11PM +0100, Bob Falken wrote:
> Hello,
> Testing this patch as im typing this. will check status in about 12hours.
> Unfortuantly, I dont have any receivers avaialble for requesting the multicast stream on the edge point anymore.
> So there is not TX traffic a.t.m..
>
> I will have a better test-lab available next week. (hopefully).
Ok, so I am proposing this patch. Only difference from the RFC is that
I removed the superfluous arg.rule NULL-pointer checks (I hate if they
are superfluous and they always seem to spread ;) ).
Maybe you could test this one instead and David could pick it up as soon
as your results are in.
I'll also look for the stable kernels where FIB_LOOKUP_NOREF is not
yet available.
Thank you,
Hannes
[PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding
When introducing multiple table support for multicast forwarding in
IPv4 and IPv6, necessary fib_rules_put reference count decrements were
forgotten.
Bob Falken reported that after 4G packets, multicast forwarding stopped
working. This was because of a rule reference counter overflow which
freed the rule as soon as the overflow happend.
So, use FIB_LOOKUP_NOREF if we are already in a RCU protected section and
correctly deal with reference counter if not (called from ndo_start_xmit).
Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
net/ipv4/ipmr.c | 23 +++++++++++++++++------
net/ipv6/ip6mr.c | 21 +++++++++++++++------
2 files changed, 32 insertions(+), 12 deletions(-)
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 421a249..c8d0857 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
struct mr_table **mrt)
{
- struct ipmr_result res;
- struct fib_lookup_arg arg = { .result = &res, };
int err;
+ struct ipmr_result res;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ .flags = FIB_LOOKUP_NOREF,
+ };
err = fib_rules_lookup(net->ipv4.mr_rules_ops,
flowi4_to_flowi(flp4), 0, &arg);
@@ -448,16 +451,22 @@ failure:
static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
{
+ int err;
+ struct ipmr_result res;
struct net *net = dev_net(dev);
- struct mr_table *mrt;
+
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ };
+
struct flowi4 fl4 = {
.flowi4_oif = dev->ifindex,
.flowi4_iif = skb->skb_iif,
.flowi4_mark = skb->mark,
};
- int err;
- err = ipmr_fib_lookup(net, &fl4, &mrt);
+ err = fib_rules_lookup(net->ipv4.mr_rules_ops,
+ flowi4_to_flowi(&fl4), 0, &arg);
if (err < 0) {
kfree_skb(skb);
return err;
@@ -466,9 +475,11 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
read_lock(&mrt_lock);
dev->stats.tx_bytes += skb->len;
dev->stats.tx_packets++;
- ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
+ ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+ IGMPMSG_WHOLEPKT);
read_unlock(&mrt_lock);
kfree_skb(skb);
+ fib_rule_put(arg.rule);
return NETDEV_TX_OK;
}
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index f365310..38347a3 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6,
struct mr6_table **mrt)
{
- struct ip6mr_result res;
- struct fib_lookup_arg arg = { .result = &res, };
int err;
+ struct ip6mr_result res;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ .flags = FIB_LOOKUP_NOREF,
+ };
err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
flowi6_to_flowi(flp6), 0, &arg);
@@ -693,16 +696,20 @@ static const struct inet6_protocol pim6_protocol = {
static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
struct net_device *dev)
{
+ int err;
+ struct ip6mr_result res;
struct net *net = dev_net(dev);
- struct mr6_table *mrt;
struct flowi6 fl6 = {
.flowi6_oif = dev->ifindex,
.flowi6_iif = skb->skb_iif,
.flowi6_mark = skb->mark,
};
- int err;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ };
- err = ip6mr_fib_lookup(net, &fl6, &mrt);
+ err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
+ flowi6_to_flowi(&fl6), 0, &arg);
if (err < 0) {
kfree_skb(skb);
return err;
@@ -711,9 +718,11 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
read_lock(&mrt_lock);
dev->stats.tx_bytes += skb->len;
dev->stats.tx_packets++;
- ip6mr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, MRT6MSG_WHOLEPKT);
+ ip6mr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
+ MRT6MSG_WHOLEPKT);
read_unlock(&mrt_lock);
kfree_skb(skb);
+ fib_rule_put(arg.rule);
return NETDEV_TX_OK;
}
--
1.8.4.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Multicast routing stops functioning after 4G multicast packets recived.
2014-01-10 6:36 ` Hannes Frederic Sowa
@ 2014-01-10 7:01 ` Eric Dumazet
2014-01-10 7:10 ` Hannes Frederic Sowa
0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2014-01-10 7:01 UTC (permalink / raw)
To: Hannes Frederic Sowa; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf
On Fri, 2014-01-10 at 07:36 +0100, Hannes Frederic Sowa wrote:
> Ok, so I am proposing this patch. Only difference from the RFC is that
> I removed the superfluous arg.rule NULL-pointer checks (I hate if they
> are superfluous and they always seem to spread ;) ).
>
> Maybe you could test this one instead and David could pick it up as soon
> as your results are in.
>
> I'll also look for the stable kernels where FIB_LOOKUP_NOREF is not
> yet available.
>
> Thank you,
>
> Hannes
>
> [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding
>
> When introducing multiple table support for multicast forwarding in
> IPv4 and IPv6, necessary fib_rules_put reference count decrements were
> forgotten.
>
> Bob Falken reported that after 4G packets, multicast forwarding stopped
> working. This was because of a rule reference counter overflow which
> freed the rule as soon as the overflow happend.
>
> So, use FIB_LOOKUP_NOREF if we are already in a RCU protected section and
> correctly deal with reference counter if not (called from ndo_start_xmit).
>
> Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
> Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
> Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: Thomas Graf <tgraf@suug.ch>
> Cc: Julian Anastasov <ja@ssi.bg>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> net/ipv4/ipmr.c | 23 +++++++++++++++++------
> net/ipv6/ip6mr.c | 21 +++++++++++++++------
> 2 files changed, 32 insertions(+), 12 deletions(-)
>
> diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> index 421a249..c8d0857 100644
> --- a/net/ipv4/ipmr.c
> +++ b/net/ipv4/ipmr.c
> @@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
> static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
> struct mr_table **mrt)
> {
> - struct ipmr_result res;
> - struct fib_lookup_arg arg = { .result = &res, };
> int err;
> + struct ipmr_result res;
> + struct fib_lookup_arg arg = {
> + .result = &res,
> + .flags = FIB_LOOKUP_NOREF,
> + };
>
> err = fib_rules_lookup(net->ipv4.mr_rules_ops,
> flowi4_to_flowi(flp4), 0, &arg);
> @@ -448,16 +451,22 @@ failure:
>
> static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
> {
> + int err;
> + struct ipmr_result res;
> struct net *net = dev_net(dev);
> - struct mr_table *mrt;
> +
> + struct fib_lookup_arg arg = {
> + .result = &res,
> + };
> +
> struct flowi4 fl4 = {
> .flowi4_oif = dev->ifindex,
> .flowi4_iif = skb->skb_iif,
> .flowi4_mark = skb->mark,
> };
> - int err;
>
> - err = ipmr_fib_lookup(net, &fl4, &mrt);
> + err = fib_rules_lookup(net->ipv4.mr_rules_ops,
> + flowi4_to_flowi(&fl4), 0, &arg);
Its not clear to me why you expand ipmr_fib_lookup()
Is there something wrong with existing code ?
Its not mentioned in changelog
> if (err < 0) {
> kfree_skb(skb);
> return err;
> @@ -466,9 +475,11 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb, struct net_device *dev)
> read_lock(&mrt_lock);
> dev->stats.tx_bytes += skb->len;
> dev->stats.tx_packets++;
> - ipmr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, IGMPMSG_WHOLEPKT);
> + ipmr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
> + IGMPMSG_WHOLEPKT);
> read_unlock(&mrt_lock);
> kfree_skb(skb);
> + fib_rule_put(arg.rule);
This is the one line that is really missing, patch could be smaller.
> return NETDEV_TX_OK;
> }
>
> diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
> index f365310..38347a3 100644
> --- a/net/ipv6/ip6mr.c
> +++ b/net/ipv6/ip6mr.c
> @@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
> static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6,
> struct mr6_table **mrt)
> {
> - struct ip6mr_result res;
> - struct fib_lookup_arg arg = { .result = &res, };
> int err;
> + struct ip6mr_result res;
> + struct fib_lookup_arg arg = {
> + .result = &res,
> + .flags = FIB_LOOKUP_NOREF,
> + };
>
> err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
> flowi6_to_flowi(flp6), 0, &arg);
> @@ -693,16 +696,20 @@ static const struct inet6_protocol pim6_protocol = {
> static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
> struct net_device *dev)
> {
> + int err;
> + struct ip6mr_result res;
> struct net *net = dev_net(dev);
> - struct mr6_table *mrt;
> struct flowi6 fl6 = {
> .flowi6_oif = dev->ifindex,
> .flowi6_iif = skb->skb_iif,
> .flowi6_mark = skb->mark,
> };
> - int err;
> + struct fib_lookup_arg arg = {
> + .result = &res,
> + };
>
> - err = ip6mr_fib_lookup(net, &fl6, &mrt);
> + err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
> + flowi6_to_flowi(&fl6), 0, &arg);
same remark here.
> if (err < 0) {
> kfree_skb(skb);
> return err;
> @@ -711,9 +718,11 @@ static netdev_tx_t reg_vif_xmit(struct sk_buff *skb,
> read_lock(&mrt_lock);
> dev->stats.tx_bytes += skb->len;
> dev->stats.tx_packets++;
> - ip6mr_cache_report(mrt, skb, mrt->mroute_reg_vif_num, MRT6MSG_WHOLEPKT);
> + ip6mr_cache_report(res.mrt, skb, res.mrt->mroute_reg_vif_num,
> + MRT6MSG_WHOLEPKT);
> read_unlock(&mrt_lock);
> kfree_skb(skb);
> + fib_rule_put(arg.rule);
> return NETDEV_TX_OK;
> }
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Multicast routing stops functioning after 4G multicast packets recived.
2014-01-10 7:01 ` Eric Dumazet
@ 2014-01-10 7:10 ` Hannes Frederic Sowa
2014-01-10 7:32 ` Eric Dumazet
0 siblings, 1 reply; 13+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-10 7:10 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf
On Thu, Jan 09, 2014 at 11:01:46PM -0800, Eric Dumazet wrote:
> Its not clear to me why you expand ipmr_fib_lookup()
>
> Is there something wrong with existing code ?
There are three users of ipmr_fib_lookup, two of them are in rcu_read_lock
section, one is not.
ipmr_fib_lookup does not pass down arg.rule reference, so I don't have a
chance to call fib_rule_put(arg.rule) on it. Thus I left ipmr_fib_lookup,
just adding FIB_LOOKUP_NOREF and expanding ipmr_fib_lookup into the
other function so I still have access to arg.rule to decrement the
reference counter.
Do you agree?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Multicast routing stops functioning after 4G multicast packets recived.
2014-01-10 7:10 ` Hannes Frederic Sowa
@ 2014-01-10 7:32 ` Eric Dumazet
2014-01-10 7:43 ` Hannes Frederic Sowa
0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2014-01-10 7:32 UTC (permalink / raw)
To: Hannes Frederic Sowa; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf
On Fri, 2014-01-10 at 08:10 +0100, Hannes Frederic Sowa wrote:
> On Thu, Jan 09, 2014 at 11:01:46PM -0800, Eric Dumazet wrote:
> > Its not clear to me why you expand ipmr_fib_lookup()
> >
> > Is there something wrong with existing code ?
>
> There are three users of ipmr_fib_lookup, two of them are in rcu_read_lock
> section, one is not.
>
> ipmr_fib_lookup does not pass down arg.rule reference, so I don't have a
> chance to call fib_rule_put(arg.rule) on it. Thus I left ipmr_fib_lookup,
> just adding FIB_LOOKUP_NOREF and expanding ipmr_fib_lookup into the
> other function so I still have access to arg.rule to decrement the
> reference counter.
>
> Do you agree?
Hmm, I see the problem now.
What about adding a parameter to ipmr_fib_lookup(),
to keep its spirit ?
ipmr_fib_lookup(net, &fl4, &mrt);
->
ipmr_fib_lookup(net, &fl4, &mrt, &rule);
Since ipmr_rt_fib_lookup() has the same rule leak, no ?
Its a bit late here, so maybe following is just stupid :
Cant we do the fib_rule_put() inside ipmr_fib_lookup() ?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Multicast routing stops functioning after 4G multicast packets recived.
2014-01-10 7:32 ` Eric Dumazet
@ 2014-01-10 7:43 ` Hannes Frederic Sowa
2014-01-10 7:50 ` Hannes Frederic Sowa
0 siblings, 1 reply; 13+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-10 7:43 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf
On Thu, Jan 09, 2014 at 11:32:59PM -0800, Eric Dumazet wrote:
> On Fri, 2014-01-10 at 08:10 +0100, Hannes Frederic Sowa wrote:
> > On Thu, Jan 09, 2014 at 11:01:46PM -0800, Eric Dumazet wrote:
> > > Its not clear to me why you expand ipmr_fib_lookup()
> > >
> > > Is there something wrong with existing code ?
> >
> > There are three users of ipmr_fib_lookup, two of them are in rcu_read_lock
> > section, one is not.
> >
> > ipmr_fib_lookup does not pass down arg.rule reference, so I don't have a
> > chance to call fib_rule_put(arg.rule) on it. Thus I left ipmr_fib_lookup,
> > just adding FIB_LOOKUP_NOREF and expanding ipmr_fib_lookup into the
> > other function so I still have access to arg.rule to decrement the
> > reference counter.
> >
> > Do you agree?
>
> Hmm, I see the problem now.
>
> What about adding a parameter to ipmr_fib_lookup(),
> to keep its spirit ?
>
> ipmr_fib_lookup(net, &fl4, &mrt);
> ->
> ipmr_fib_lookup(net, &fl4, &mrt, &rule);
>
> Since ipmr_rt_fib_lookup() has the same rule leak, no ?
No, ipmr_rt_fib_lookup is fine. This function gets called only from
rcu read locked section and we don't take table reference because of
FIB_LOOKUP_NOREF, so we don't need to put reference counter on arg.table.
We could add the additional argument, just ignoring it in ipmr_rt_fib_lookup.
>
> Its a bit late here, so maybe following is just stupid :
> Cant we do the fib_rule_put() inside ipmr_fib_lookup() ?
We could add bool noref to ipmr_fib_lookup indicating we want to drop
reference to rule just after lookup.
I'll check if freeing a rule has additional side-effects on dependencies
in reg_vif_xmit. That would be a nice solution actually, thanks!
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Multicast routing stops functioning after 4G multicast packets recived.
2014-01-10 7:43 ` Hannes Frederic Sowa
@ 2014-01-10 7:50 ` Hannes Frederic Sowa
2014-01-12 7:42 ` Hannes Frederic Sowa
0 siblings, 1 reply; 13+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-10 7:50 UTC (permalink / raw)
To: Eric Dumazet, Bob Falken, Julian Anastasov, netdev, kaber, tgraf
On Fri, Jan 10, 2014 at 08:43:25AM +0100, Hannes Frederic Sowa wrote:
> On Thu, Jan 09, 2014 at 11:32:59PM -0800, Eric Dumazet wrote:
> > On Fri, 2014-01-10 at 08:10 +0100, Hannes Frederic Sowa wrote:
> > > On Thu, Jan 09, 2014 at 11:01:46PM -0800, Eric Dumazet wrote:
> > > > Its not clear to me why you expand ipmr_fib_lookup()
> > > >
> > > > Is there something wrong with existing code ?
> > >
> > > There are three users of ipmr_fib_lookup, two of them are in rcu_read_lock
> > > section, one is not.
> > >
> > > ipmr_fib_lookup does not pass down arg.rule reference, so I don't have a
> > > chance to call fib_rule_put(arg.rule) on it. Thus I left ipmr_fib_lookup,
> > > just adding FIB_LOOKUP_NOREF and expanding ipmr_fib_lookup into the
> > > other function so I still have access to arg.rule to decrement the
> > > reference counter.
> > >
> > > Do you agree?
> >
> > Hmm, I see the problem now.
> >
> > What about adding a parameter to ipmr_fib_lookup(),
> > to keep its spirit ?
> >
> > ipmr_fib_lookup(net, &fl4, &mrt);
> > ->
> > ipmr_fib_lookup(net, &fl4, &mrt, &rule);
> >
> > Since ipmr_rt_fib_lookup() has the same rule leak, no ?
>
> No, ipmr_rt_fib_lookup is fine. This function gets called only from
> rcu read locked section and we don't take table reference because of
> FIB_LOOKUP_NOREF, so we don't need to put reference counter on arg.table.
arg.rule not table, actually.
> We could add the additional argument, just ignoring it in ipmr_rt_fib_lookup.
>
> >
> > Its a bit late here, so maybe following is just stupid :
> > Cant we do the fib_rule_put() inside ipmr_fib_lookup() ?
>
> We could add bool noref to ipmr_fib_lookup indicating we want to drop
> reference to rule just after lookup.
>
> I'll check if freeing a rule has additional side-effects on dependencies
> in reg_vif_xmit. That would be a nice solution actually, thanks!
Hmm, rule holds a reference to the net namespace in use. I don't know
if we want to add this special case. I guess net-namespace reference
cannot be removed while processing ndo_start_xmit callback but I don't
like this special case somehow. But I guess it is possible.
Your opinion on that?
Thanks,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Multicast routing stops functioning after 4G multicast packets recived.
2014-01-10 7:50 ` Hannes Frederic Sowa
@ 2014-01-12 7:42 ` Hannes Frederic Sowa
2014-01-13 0:56 ` Eric Dumazet
0 siblings, 1 reply; 13+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-12 7:42 UTC (permalink / raw)
To: Eric Dumazet, Bob Falken, Julian Anastasov, netdev, kaber, tgraf
On Fri, Jan 10, 2014 at 08:50:05AM +0100, Hannes Frederic Sowa wrote:
> > > Its a bit late here, so maybe following is just stupid :
> > > Cant we do the fib_rule_put() inside ipmr_fib_lookup() ?
> >
> > We could add bool noref to ipmr_fib_lookup indicating we want to drop
> > reference to rule just after lookup.
> >
> > I'll check if freeing a rule has additional side-effects on dependencies
> > in reg_vif_xmit. That would be a nice solution actually, thanks!
>
> Hmm, rule holds a reference to the net namespace in use. I don't know
> if we want to add this special case. I guess net-namespace reference
> cannot be removed while processing ndo_start_xmit callback but I don't
> like this special case somehow. But I guess it is possible.
>
> Your opinion on that?
Hm, Eric. If we do that we can just specifiy FIB_LOOKUP_NOREF
unconditionally. FIB_LOOKUP_NOREF has no other side effects on a ipmr
lookup as taking the reference on the rule, which we would drop after
that.
So we would actually be going back to the first patch in this thread. I
guess it is just a matter of style?
Greetings,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Multicast routing stops functioning after 4G multicast packets recived.
2014-01-12 7:42 ` Hannes Frederic Sowa
@ 2014-01-13 0:56 ` Eric Dumazet
2014-01-13 1:45 ` [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding Hannes Frederic Sowa
0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2014-01-13 0:56 UTC (permalink / raw)
To: Hannes Frederic Sowa; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf
On Sun, 2014-01-12 at 08:42 +0100, Hannes Frederic Sowa wrote:
> Hm, Eric. If we do that we can just specifiy FIB_LOOKUP_NOREF
> unconditionally. FIB_LOOKUP_NOREF has no other side effects on a ipmr
> lookup as taking the reference on the rule, which we would drop after
> that.
>
> So we would actually be going back to the first patch in this thread. I
> guess it is just a matter of style?
Hi Hannes, please submit a formal patch, so that we can have a proper
ground for discussion (I guess I'll only add my Acked-by)
Thanks !
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding
2014-01-13 0:56 ` Eric Dumazet
@ 2014-01-13 1:45 ` Hannes Frederic Sowa
2014-01-13 15:18 ` Eric Dumazet
2014-01-15 1:40 ` David Miller
0 siblings, 2 replies; 13+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-13 1:45 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf
Bob Falken reported that after 4G packets, multicast forwarding stopped
working. This was because of a rule reference counter overflow which
freed the rule as soon as the overflow happend.
This patch solves this by adding the FIB_LOOKUP_NOREF flag to
fib_rules_lookup calls. This is safe even from non-rcu locked sections
as in this case the flag only implies not taking a reference to the rule,
which we don't need at all.
Rules only hold references to the namespace, which are guaranteed to be
available during the call of the non-rcu protected function reg_vif_xmit
because of the interface reference which itself holds a reference to
the net namespace.
Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
Bob Falken already tested this patch, as it is similar to my first
attempt but the additional and similar fix for ipv6.
We need an additional fix for kernels without FIB_LOOKUP_NOREF, but I'll
move that to tomorrow, as it is already late here.
net/ipv4/ipmr.c | 7 +++++--
net/ipv6/ip6mr.c | 7 +++++--
2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 421a249..b9b3472 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id)
static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4,
struct mr_table **mrt)
{
- struct ipmr_result res;
- struct fib_lookup_arg arg = { .result = &res, };
int err;
+ struct ipmr_result res;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ .flags = FIB_LOOKUP_NOREF,
+ };
err = fib_rules_lookup(net->ipv4.mr_rules_ops,
flowi4_to_flowi(flp4), 0, &arg);
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index f365310..0eb4038 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id)
static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6,
struct mr6_table **mrt)
{
- struct ip6mr_result res;
- struct fib_lookup_arg arg = { .result = &res, };
int err;
+ struct ip6mr_result res;
+ struct fib_lookup_arg arg = {
+ .result = &res,
+ .flags = FIB_LOOKUP_NOREF,
+ };
err = fib_rules_lookup(net->ipv6.mr6_rules_ops,
flowi6_to_flowi(flp6), 0, &arg);
--
1.8.4.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding
2014-01-13 1:45 ` [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding Hannes Frederic Sowa
@ 2014-01-13 15:18 ` Eric Dumazet
2014-01-13 23:38 ` Hannes Frederic Sowa
2014-01-15 1:40 ` David Miller
1 sibling, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2014-01-13 15:18 UTC (permalink / raw)
To: Hannes Frederic Sowa; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf
On Mon, 2014-01-13 at 02:45 +0100, Hannes Frederic Sowa wrote:
> Bob Falken reported that after 4G packets, multicast forwarding stopped
> working. This was because of a rule reference counter overflow which
> freed the rule as soon as the overflow happend.
>
> This patch solves this by adding the FIB_LOOKUP_NOREF flag to
> fib_rules_lookup calls. This is safe even from non-rcu locked sections
> as in this case the flag only implies not taking a reference to the rule,
> which we don't need at all.
We need to not forget this when/if we remove FIB_LOOKUP_NOREF flag,
as all callers use it : We'll have to keep rcu_read_lock() in
fib_rules_lookup()
>
> Rules only hold references to the namespace, which are guaranteed to be
> available during the call of the non-rcu protected function reg_vif_xmit
> because of the interface reference which itself holds a reference to
> the net namespace.
>
> Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
> Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
> Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: Thomas Graf <tgraf@suug.ch>
> Cc: Julian Anastasov <ja@ssi.bg>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---
> Bob Falken already tested this patch, as it is similar to my first
> attempt but the additional and similar fix for ipv6.
>
> We need an additional fix for kernels without FIB_LOOKUP_NOREF, but I'll
> move that to tomorrow, as it is already late here.
Acked-by: Eric Dumazet <edumazet@google.com>
Thanks !
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding
2014-01-13 15:18 ` Eric Dumazet
@ 2014-01-13 23:38 ` Hannes Frederic Sowa
0 siblings, 0 replies; 13+ messages in thread
From: Hannes Frederic Sowa @ 2014-01-13 23:38 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Bob Falken, Julian Anastasov, netdev, kaber, tgraf
On Mon, Jan 13, 2014 at 07:18:34AM -0800, Eric Dumazet wrote:
> On Mon, 2014-01-13 at 02:45 +0100, Hannes Frederic Sowa wrote:
> > Bob Falken reported that after 4G packets, multicast forwarding stopped
> > working. This was because of a rule reference counter overflow which
> > freed the rule as soon as the overflow happend.
> >
> > This patch solves this by adding the FIB_LOOKUP_NOREF flag to
> > fib_rules_lookup calls. This is safe even from non-rcu locked sections
> > as in this case the flag only implies not taking a reference to the rule,
> > which we don't need at all.
>
> We need to not forget this when/if we remove FIB_LOOKUP_NOREF flag,
> as all callers use it : We'll have to keep rcu_read_lock() in
> fib_rules_lookup()
Yes. Also maybe there is a tiny race between ipmr_rules_exit and reg_vif_xmit.
mrt table might get cleaned up while in reg_vif_xmit and initialisation code
has an error and calls ipmr_rules_exit, because mr_tables are not reference
counted and not managed by rcu (but reg_vif_xmit is also not rcuified). This
could happen if an error happens while initialisation I don't think net
namespace destruction is a problem here, as reg_vif_xmit still has a valid
reference to the interface and thus the namespace.
> > Rules only hold references to the namespace, which are guaranteed to be
> > available during the call of the non-rcu protected function reg_vif_xmit
> > because of the interface reference which itself holds a reference to
> > the net namespace.
> >
> > Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
> > Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
> > Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com>
> > Cc: Patrick McHardy <kaber@trash.net>
> > Cc: Thomas Graf <tgraf@suug.ch>
> > Cc: Julian Anastasov <ja@ssi.bg>
> > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > ---
> > Bob Falken already tested this patch, as it is similar to my first
> > attempt but the additional and similar fix for ipv6.
> >
> > We need an additional fix for kernels without FIB_LOOKUP_NOREF, but I'll
> > move that to tomorrow, as it is already late here.
>
> Acked-by: Eric Dumazet <edumazet@google.com>
Thanks for the review!
I don't think we need an additional patch for this for the longterm kernels
2.6.32 and 2.6.34, as rule support for ipmr and ip6mr only entered in 2.6.35.
RCUification of fib_lookup happened in 2.6.37. So this patch should cover all
current stable kernels.
Greetings,
Hannes
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding
2014-01-13 1:45 ` [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding Hannes Frederic Sowa
2014-01-13 15:18 ` Eric Dumazet
@ 2014-01-15 1:40 ` David Miller
1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2014-01-15 1:40 UTC (permalink / raw)
To: hannes; +Cc: eric.dumazet, NetFestivalHaveFun, ja, netdev, kaber, tgraf
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Mon, 13 Jan 2014 02:45:22 +0100
> Bob Falken reported that after 4G packets, multicast forwarding stopped
> working. This was because of a rule reference counter overflow which
> freed the rule as soon as the overflow happend.
>
> This patch solves this by adding the FIB_LOOKUP_NOREF flag to
> fib_rules_lookup calls. This is safe even from non-rcu locked sections
> as in this case the flag only implies not taking a reference to the rule,
> which we don't need at all.
>
> Rules only hold references to the namespace, which are guaranteed to be
> available during the call of the non-rcu protected function reg_vif_xmit
> because of the interface reference which itself holds a reference to
> the net namespace.
>
> Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables")
> Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables")
> Reported-by: Bob Falken <NetFestivalHaveFun@gmx.com>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: Thomas Graf <tgraf@suug.ch>
> Cc: Julian Anastasov <ja@ssi.bg>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Applied and queued up for -stable, th anks.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2014-01-15 1:40 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-09 20:14 Multicast routing stops functioning after 4G multicast packets recived Bob Falken
2014-01-10 6:36 ` Hannes Frederic Sowa
2014-01-10 7:01 ` Eric Dumazet
2014-01-10 7:10 ` Hannes Frederic Sowa
2014-01-10 7:32 ` Eric Dumazet
2014-01-10 7:43 ` Hannes Frederic Sowa
2014-01-10 7:50 ` Hannes Frederic Sowa
2014-01-12 7:42 ` Hannes Frederic Sowa
2014-01-13 0:56 ` Eric Dumazet
2014-01-13 1:45 ` [PATCH net] net: avoid reference counter overflows on fib_rules in multicast forwarding Hannes Frederic Sowa
2014-01-13 15:18 ` Eric Dumazet
2014-01-13 23:38 ` Hannes Frederic Sowa
2014-01-15 1:40 ` David Miller
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.