netdev.vger.kernel.org archive mirror
* kernel BUG in ipmr_queue_xmit()
@ 2015-10-29 23:55 Ani Sinha
  2015-10-30  0:15 ` Florian Westphal
  0 siblings, 1 reply; 14+ messages in thread
From: Ani Sinha @ 2015-10-29 23:55 UTC (permalink / raw)
  To: netdev, Patrick McHardy, Hideaki YOSHIFUJI, James Morris,
	Alexey Kuznetsov, David S. Miller, ani, fruggeri

Hi guys:

We are noticing the following kernel BUG in 3.18 kernel. The 
code path that leads to the crash is the following :

 ip_mroute_setsockopt()
  ->ipmr_mfc_add()
      ->ipmr_cache_resolve()
        ->ip_mr_forward()
           -> ipmr_queue_xmit()
             -> ipmr_forward_finish()
               ->IP_INC_STATS_BH()
                  -> SNMP_INC_STATS64_BH()
                    -> SNMP_INC_STATS_BH()
                          -> __this_cpu_inc()
                              -> __this_cpu_add()
                                  -> __this_cpu_preempt_check()
                                     -> check_preemption_disabled()

I have verified that preempt_count() is 0 when the crash happens.
Is anyone else seeing the same crash in the latest upstream code? I dug 
around a little bit and it does not look like there were any fixes that 
went into post 3.18 kernel which could have disabled preemption in this 
code path but I could be wrong. 

thoughts?

[  499.991221] BUG: using __this_cpu_add() in preemptible [00000000] code: KernelMfib/2758
[  500.086877] caller is __this_cpu_preempt_check+0x13/0x15
[  500.086884] CPU: 0 PID: 2758 Comm: KernelMfib Tainted: P           O   3.18.19.Ar-2716649.EosKernelnextcolonafix #2
[  500.086891]  ffffffff8170eaca ffff880110d1b788 ffffffff81482b2a 0000000000000000
[  500.086906]  0000000000000000 ffff880110d1b7b8 ffffffff812010ae ffff880007cab800
[  500.086912]  ffff88001a060800 ffff88013a899108 ffff880108b84240 ffff880110d1b7c8
[  500.086918] Call Trace:
[  500.086926] [<ffffffff81482b2a>] dump_stack+0x52/0x80
[  500.086931] [<ffffffff812010ae>] check_preemption_disabled+0xce/0xe1
[  500.086936] [<ffffffff812010d4>] __this_cpu_preempt_check+0x13/0x15
[  500.086942] [<ffffffff81419d60>] ipmr_queue_xmit+0x647/0x70c
[  500.086947] [<ffffffff8141a154>] ip_mr_forward+0x32f/0x34e
[  500.086953] [<ffffffff8141af76>] ip_mroute_setsockopt+0xe03/0x108c
[  500.086959] [<ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[  500.086967] [<ffffffff810e6974>] ? pollwake+0x4d/0x51
[  500.086972] [<ffffffff81058ac0>] ? default_wake_function+0x0/0xf
[  500.086977] [<ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[  500.086981] [<ffffffff810613d9>] ? __wake_up_common+0x45/0x77
[  500.086987] [<ffffffff81486ea9>] ? _raw_spin_unlock_irqrestore+0x1d/0x32
[  500.086991] [<ffffffff810618bc>] ? __wake_up_sync_key+0x4a/0x53
[  500.086996] [<ffffffff8139a519>] ? sock_def_readable+0x71/0x75
[  500.087002] [<ffffffff813dd226>] do_ip_setsockopt+0x9d/0xb55
[  500.087008] [<ffffffff81429818>] ? unix_seqpacket_sendmsg+0x3f/0x41
[  500.087012] [<ffffffff813963fe>] ? sock_sendmsg+0x6d/0x86
[  500.087017] [<ffffffff813959d4>] ? sockfd_lookup_light+0x12/0x5d
[  500.087021] [<ffffffff8139650a>] ? SyS_sendto+0xf3/0x11b
[  500.087025] [<ffffffff810d5738>] ? new_sync_read+0x82/0xaa
[  500.087030] [<ffffffff813ddd19>] compat_ip_setsockopt+0x3b/0x99
[  500.087034] [<ffffffff813fb24a>] compat_raw_setsockopt+0x11/0x32
[  500.087038] [<ffffffff81399052>] compat_sock_common_setsockopt+0x18/0x1f
[  500.087043] [<ffffffff813c4d05>] compat_SyS_setsockopt+0x1a9/0x1cf
[  500.087048] [<ffffffff813c4149>] compat_SyS_socketcall+0x180/0x1e3
[  500.087054] [<ffffffff81488ea1>] cstar_dispatch+0x7/0x1e


-Ani
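
For readers unfamiliar with the check that fires in the trace above, the
following is a minimal userspace sketch, not kernel code, of why a plain
load/add/store on a per-CPU counter is unsafe when the task can be
preempted. All names (model_percpu_counter, model_two_increments, and so
on) are hypothetical, and the interleaving is replayed deterministically
in a single thread instead of relying on real scheduling.

```c
#include <stdint.h>

#define MODEL_NR_CPUS 2

/* Hypothetical userspace stand-in for a per-CPU statistics counter. */
struct model_percpu_counter {
	uint64_t val[MODEL_NR_CPUS];
};

/* The two halves of a non-atomic "load, add, store" increment. */
static uint64_t model_load(const struct model_percpu_counter *c, int cpu)
{
	return c->val[cpu];
}

static void model_store(struct model_percpu_counter *c, int cpu, uint64_t v)
{
	c->val[cpu] = v;
}

/*
 * Replay two increments of CPU 0's slot, one by task A and one by task B.
 * If preemptible is nonzero, A is interrupted between its load and its
 * store. Returns the final value of CPU 0's slot.
 */
static uint64_t model_two_increments(int preemptible)
{
	struct model_percpu_counter c = { { 0, 0 } };
	uint64_t a_tmp;

	/* Task A starts its increment on CPU 0 and loads the old value. */
	a_tmp = model_load(&c, 0);

	if (preemptible) {
		/* A is preempted; task B runs a full increment on CPU 0. */
		model_store(&c, 0, model_load(&c, 0) + 1);
		/* A resumes and stores its stale result: B's update is lost. */
		model_store(&c, 0, a_tmp + 1);
	} else {
		/* A cannot be preempted, so it finishes first, then B runs. */
		model_store(&c, 0, a_tmp + 1);
		model_store(&c, 0, model_load(&c, 0) + 1);
	}
	return c.val[0];
}
```

In this model one of the two increments is lost when preemption is
possible, and both survive when it is excluded. On some architectures the
real __this_cpu_add() may compile to a single instruction, but the debug
check reports the misuse regardless.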


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-29 23:55 kernel BUG in ipmr_queue_xmit() Ani Sinha
@ 2015-10-30  0:15 ` Florian Westphal
  2015-10-30  1:41   ` Ani Sinha
  0 siblings, 1 reply; 14+ messages in thread
From: Florian Westphal @ 2015-10-30  0:15 UTC (permalink / raw)
  To: Ani Sinha; +Cc: netdev, ani, fruggeri

Ani Sinha <ani@arista.com> wrote:

[ trimmed CC list ]

> We are noticing the following kernel BUG in 3.18 kernel. The 
> code path that leads to the crash is the following :
> 
>  ip_mroute_setsockopt()
>   ->ipmr_mfc_add()
>       ->ipmr_cache_resolve()
>         ->ip_mr_forward()
>            -> ipmr_queue_xmit()
>              -> ipmr_forward_finish()
>                ->IP_INC_STATS_BH()
>                   -> SNMP_INC_STATS64_BH()
>                     -> SNMP_INC_STATS_BH()
>                           -> __this_cpu_inc()
>                               -> __this_cpu_add()
>                                   -> __this_cpu_preempt_check()
>                                      -> check_preemption_disabled()
> 
> I have verified that preempt_count() is 0 when the crash happens.
> Is anyone else seeing the same crash in the latest upstream code? I dug 
> around a little bit and it does not look like there were any fixes that 
> went into post 3.18 kernel which could have disabled preemption in this 
> code path but I could be wrong. 
> 
> thoughts?

Send a patch to preempt_disable before ip_mr_forward call in the affected
setsockopt path?


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-30  0:15 ` Florian Westphal
@ 2015-10-30  1:41   ` Ani Sinha
  2015-10-30  4:15     ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Ani Sinha @ 2015-10-30  1:41 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Ani Sinha, netdev, ani, fruggeri



On Fri, 30 Oct 2015, Florian Westphal wrote:

> Ani Sinha <ani@arista.com> wrote:
> 
> [ trimmed CC list ]
> 
> > We are noticing the following kernel BUG in 3.18 kernel. The 
> > code path that leads to the crash is the following :
> > 
> >  ip_mroute_setsockopt()
> >   ->ipmr_mfc_add()
> >       ->ipmr_cache_resolve()
> >         ->ip_mr_forward()
> >            -> ipmr_queue_xmit()
> >              -> ipmr_forward_finish()
> >                ->IP_INC_STATS_BH()
> >                   -> SNMP_INC_STATS64_BH()
> >                     -> SNMP_INC_STATS_BH()
> >                           -> __this_cpu_inc()
> >                               -> __this_cpu_add()
> >                                   -> __this_cpu_preempt_check()
> >                                      -> check_preemption_disabled()
> > 
> > I have verified that preempt_count() is 0 when the crash happens.
> > Is anyone else seeing the same crash in the latest upstream code? I dug 
> > around a little bit and it does not look like there were any fixes that 
> > went into post 3.18 kernel which could have disabled preemption in this 
> > code path but I could be wrong. 
> > 
> > thoughts?
> 
> Send a patch to preempt_disable before ip_mr_forward call in the affected
> setsockopt path?
> 

From bfa982b5f8d91294d724486542163d3db5e6908a Mon Sep 17 00:00:00 2001
From: Ani Sinha <ani@arista.com>
Date: Thu, 29 Oct 2015 18:09:20 -0700
Subject: [PATCH 1/1] ipmr: fix a kernel BUG() due to calling __this_cpu_add()
 in preemptible  context. Reproduced in 3.18.19 kernel version.

BUG: using __this_cpu_add() in preemptible [00000000] code: KernelMfib/2758
caller is __this_cpu_preempt_check+0x13/0x15
CPU: 0 PID: 2758 Comm: KernelMfib Tainted: P       O   3.18.19 #2
 ffffffff8170eaca ffff880110d1b788 ffffffff81482b2a 0000000000000000
 0000000000000000 ffff880110d1b7b8 ffffffff812010ae ffff880007cab800
 ffff88001a060800 ffff88013a899108 ffff880108b84240 ffff880110d1b7c8
Call Trace:
[<ffffffff81482b2a>] dump_stack+0x52/0x80
[<ffffffff812010ae>] check_preemption_disabled+0xce/0xe1
[<ffffffff812010d4>] __this_cpu_preempt_check+0x13/0x15
[<ffffffff81419d60>] ipmr_queue_xmit+0x647/0x70c
[<ffffffff8141a154>] ip_mr_forward+0x32f/0x34e
[<ffffffff8141af76>] ip_mroute_setsockopt+0xe03/0x108c
[<ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[<ffffffff810e6974>] ? pollwake+0x4d/0x51
[<ffffffff81058ac0>] ? default_wake_function+0x0/0xf
[<ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[<ffffffff810613d9>] ? __wake_up_common+0x45/0x77
[<ffffffff81486ea9>] ? _raw_spin_unlock_irqrestore+0x1d/0x32
[<ffffffff810618bc>] ? __wake_up_sync_key+0x4a/0x53
[<ffffffff8139a519>] ? sock_def_readable+0x71/0x75
[<ffffffff813dd226>] do_ip_setsockopt+0x9d/0xb55
[<ffffffff81429818>] ? unix_seqpacket_sendmsg+0x3f/0x41
[<ffffffff813963fe>] ? sock_sendmsg+0x6d/0x86
[<ffffffff813959d4>] ? sockfd_lookup_light+0x12/0x5d
[<ffffffff8139650a>] ? SyS_sendto+0xf3/0x11b
[<ffffffff810d5738>] ? new_sync_read+0x82/0xaa
[<ffffffff813ddd19>] compat_ip_setsockopt+0x3b/0x99
[<ffffffff813fb24a>] compat_raw_setsockopt+0x11/0x32
[<ffffffff81399052>] compat_sock_common_setsockopt+0x18/0x1f
[<ffffffff813c4d05>] compat_SyS_setsockopt+0x1a9/0x1cf
[<ffffffff813c4149>] compat_SyS_socketcall+0x180/0x1e3
[<ffffffff81488ea1>] cstar_dispatch+0x7/0x1e

Signed-off-by: Ani Sinha <ani@arista.com>
---
 net/ipv4/ipmr.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 866ee89..48df3cc 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -936,7 +936,9 @@ static void ipmr_cache_resolve(struct net *net, struct mr_table *mrt,
 
 			rtnl_unicast(skb, net, NETLINK_CB(skb).portid);
 		} else {
+			preempt_disable();
 			ip_mr_forward(net, mrt, skb, c, 0);
+			preempt_enable();
 		}
 	}
 }
-- 
1.8.1.4


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-30  1:41   ` Ani Sinha
@ 2015-10-30  4:15     ` Eric Dumazet
  2015-10-30 10:36       ` Florian Westphal
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2015-10-30  4:15 UTC (permalink / raw)
  To: Ani Sinha; +Cc: Florian Westphal, netdev, ani, fruggeri

On Thu, 2015-10-29 at 18:41 -0700, Ani Sinha wrote:

> 
> Signed-off-by: Ani Sinha <ani@arista.com>
> ---
>  net/ipv4/ipmr.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> index 866ee89..48df3cc 100644
> --- a/net/ipv4/ipmr.c
> +++ b/net/ipv4/ipmr.c
> @@ -936,7 +936,9 @@ static void ipmr_cache_resolve(struct net *net, struct mr_table *mrt,
>  
>  			rtnl_unicast(skb, net, NETLINK_CB(skb).portid);
>  		} else {
> +			preempt_disable();
>  			ip_mr_forward(net, mrt, skb, c, 0);
> +			preempt_enable();
>  		}
>  	}
>  }

I do not believe this fix is correct.

Better replace the 
IP_INC_STATS_BH() by IP_INC_STATS()

and IP_ADD_STATS_BH() by IP_ADD_STATS()


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-30  4:15     ` Eric Dumazet
@ 2015-10-30 10:36       ` Florian Westphal
  2015-10-30 10:40         ` Hannes Frederic Sowa
  0 siblings, 1 reply; 14+ messages in thread
From: Florian Westphal @ 2015-10-30 10:36 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Ani Sinha, Florian Westphal, netdev, ani, fruggeri

Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > Signed-off-by: Ani Sinha <ani@arista.com>
> > ---
> >  net/ipv4/ipmr.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> > index 866ee89..48df3cc 100644
> > --- a/net/ipv4/ipmr.c
> > +++ b/net/ipv4/ipmr.c
> > @@ -936,7 +936,9 @@ static void ipmr_cache_resolve(struct net *net, struct mr_table *mrt,
> >  
> >  			rtnl_unicast(skb, net, NETLINK_CB(skb).portid);
> >  		} else {
> > +			preempt_disable();
> >  			ip_mr_forward(net, mrt, skb, c, 0);
> > +			preempt_enable();
> >  		}
> >  	}
> >  }
> 
> I do not believe this fix is correct.

Yes, sorry.  I should have suggested local_bh_disable instead.

> Better replace the
> IP_INC_STATS_BH() by IP_INC_STATS()
>
> and IP_ADD_STATS_BH() by IP_ADD_STATS()

Hmm, what's the rationale for this?

Note that the IP_ADD_STATS_BH in question is unconditional (not in
error path).  It seems that it's virtually always called from softirq
except in the setsockopt case.

Thanks Eric.


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-30 10:36       ` Florian Westphal
@ 2015-10-30 10:40         ` Hannes Frederic Sowa
  2015-10-30 10:48           ` Florian Westphal
  0 siblings, 1 reply; 14+ messages in thread
From: Hannes Frederic Sowa @ 2015-10-30 10:40 UTC (permalink / raw)
  To: Florian Westphal, Eric Dumazet; +Cc: Ani Sinha, netdev, ani, fruggeri

On Fri, Oct 30, 2015, at 11:36, Florian Westphal wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > Signed-off-by: Ani Sinha <ani@arista.com>
> > > ---
> > >  net/ipv4/ipmr.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> > > index 866ee89..48df3cc 100644
> > > --- a/net/ipv4/ipmr.c
> > > +++ b/net/ipv4/ipmr.c
> > > @@ -936,7 +936,9 @@ static void ipmr_cache_resolve(struct net *net, struct mr_table *mrt,
> > >  
> > >  			rtnl_unicast(skb, net, NETLINK_CB(skb).portid);
> > >  		} else {
> > > +			preempt_disable();
> > >  			ip_mr_forward(net, mrt, skb, c, 0);
> > > +			preempt_enable();
> > >  		}
> > >  	}
> > >  }
> > 
> > I do not believe this fix is correct.
> 
> Yes, sorry.  I should have suggested local_bh_disable instead.
> 
> > Better replace the
> > IP_INC_STATS_BH() by IP_INC_STATS()
> >
> > and IP_ADD_STATS_BH() by IP_ADD_STATS()
> 
> Hmm, what's the rationale for this?
> 
> Note that the IP_ADD_STATS_BH in question is unconditional (not in
> error path).  It seems that it's virtually always called from softirq
> except in the setsockopt case.

The naming of the functions is bad if you compare them to e.g.
spin_lock_bh.

STATS_BH can only be used from bottom half and the normal ones (without
_BH) can be called from everywhere. It is a common pattern in the
kernel.

Eric's proposal is correct.

Bye,
Hannes
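
The naming contract described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: the
context struct and the model_* functions are hypothetical stand-ins, not
the kernel macros, and "protecting itself" is modeled by bumping a depth
counter, not by any real preemption control.

```c
#include <stdint.h>

/* Hypothetical model of execution context; not kernel state. */
struct model_ctx {
	int in_softirq;     /* nonzero when running in bottom-half context */
	int preempt_depth;  /* nonzero when preemption is disabled */
};

static uint64_t model_counter;

/*
 * _BH flavour: the caller must already be non-preemptible.
 * Returns -1 on misuse, which is where a debug check would fire.
 */
static int model_inc_stats_bh(const struct model_ctx *ctx)
{
	if (!ctx->in_softirq && ctx->preempt_depth == 0)
		return -1;
	model_counter++;
	return 0;
}

/* Plain flavour: protects itself, so it may be called from any context. */
static int model_inc_stats(struct model_ctx *ctx)
{
	int ret;

	ctx->preempt_depth++;           /* stand-in for disabling preemption */
	ret = model_inc_stats_bh(ctx);
	ctx->preempt_depth--;           /* stand-in for re-enabling it */
	return ret;
}
```

In this model, calling the _BH flavour from process context (as the
setsockopt path does) is rejected, while the plain flavour succeeds from
either context.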


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-30 10:40         ` Hannes Frederic Sowa
@ 2015-10-30 10:48           ` Florian Westphal
  2015-10-30 11:00             ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Florian Westphal @ 2015-10-30 10:48 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: Florian Westphal, Eric Dumazet, Ani Sinha, netdev, ani, fruggeri

Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
> > > > @@ -936,7 +936,9 @@ static void ipmr_cache_resolve(struct net *net, struct mr_table *mrt,
> > > >  
> > > >  			rtnl_unicast(skb, net, NETLINK_CB(skb).portid);
> > > >  		} else {
> > > > +			preempt_disable();
> > > >  			ip_mr_forward(net, mrt, skb, c, 0);
> > > > +			preempt_enable();
> > > >  		}
> > > >  	}
> > > >  }
> > > 
> > > I do not believe this fix is correct.
> > 
> > Yes, sorry.  I should have suggested local_bh_disable instead.
> > 
> > > Better replace the
> > > IP_INC_STATS_BH() by IP_INC_STATS()
> > >
> > > and IP_ADD_STATS_BH() by IP_ADD_STATS()
> > 
> > Hmm, what's the rationale for this?
> > 
> > Note that the IP_ADD_STATS_BH in question is unconditional (not in
> > error path).  It seems that it's virtually always called from softirq
> > except in the setsockopt case.
> 
> The naming of the functions is bad if you compare them to e.g.
> spin_lock_bh.
> 
> STATS_BH can only be used from bottom half and the normal ones (without
> _BH) can be called from everywhere. It is a common pattern in the
> kernel.
> 
> Eric's proposal is correct.

Yes, it's correct but it results in 4 additional bh on/off calls
for the common case, hence my question.

Moving the one ip_mr_forward into bh-off keeps the bh-disable thing
in the setsockopt path.


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-30 10:48           ` Florian Westphal
@ 2015-10-30 11:00             ` Eric Dumazet
  2015-10-30 17:47               ` Ani Sinha
  0 siblings, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2015-10-30 11:00 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Hannes Frederic Sowa, Ani Sinha, netdev, ani, fruggeri

On Fri, 2015-10-30 at 11:48 +0100, Florian Westphal wrote:
> Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
> > > > > @@ -936,7 +936,9 @@ static void ipmr_cache_resolve(struct net *net, struct mr_table *mrt,
> > > > >  
> > > > >  			rtnl_unicast(skb, net, NETLINK_CB(skb).portid);
> > > > >  		} else {
> > > > > +			preempt_disable();
> > > > >  			ip_mr_forward(net, mrt, skb, c, 0);
> > > > > +			preempt_enable();
> > > > >  		}
> > > > >  	}
> > > > >  }
> > > > 
> > > > I do not believe this fix is correct.
> > > 
> > > Yes, sorry.  I should have suggested local_bh_disable instead.
> > > 
> > > > Better replace the
> > > > IP_INC_STATS_BH() by IP_INC_STATS()
> > > >
> > > > and IP_ADD_STATS_BH() by IP_ADD_STATS()
> > > 
> > > Hmm, what's the rationale for this?
> > > 
> > > Note that the IP_ADD_STATS_BH in question is unconditional (not in
> > > error path).  It seems that it's virtually always called from softirq
> > > except in the setsockopt case.
> > 
> > The naming of the functions is bad if you compare them to e.g.
> > spin_lock_bh.
> > 
> > STATS_BH can only be used from bottom half and the normal ones (without
> > _BH) can be called from everywhere. It is a common pattern in the
> > kernel.
> > 
> > Eric's proposal is correct.
> 
> Yes, it's correct but it results in 4 additional bh on/off calls
> for the common case, hence my question.
> 
> Moving the one ip_mr_forward into bh-off keeps the bh-disable thing
> in the setsockopt path.

I have no idea how long the ip_mr_forward(net, mrt, skb, c, 0)
section is, or whether GFP_KERNEL allocations are attempted in this path.

The proposed fix might add other regressions.

I do not want to spend time auditing this code that nobody uses.

Meanwhile, on x86, IP_INC_STATS() does not use additional bh on/off calls.

In general, we should disable interrupts (even soft ones) only for a
limited amount of time.


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-30 11:00             ` Eric Dumazet
@ 2015-10-30 17:47               ` Ani Sinha
  2015-10-30 19:12                 ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Ani Sinha @ 2015-10-30 17:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Florian Westphal, Hannes Frederic Sowa, netdev, Ani Sinha, fruggeri

On Fri, Oct 30, 2015 at 4:00 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2015-10-30 at 11:48 +0100, Florian Westphal wrote:
>> Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
>> > > > > @@ -936,7 +936,9 @@ static void ipmr_cache_resolve(struct net *net, struct mr_table *mrt,
>> > > > >
>> > > > >                       rtnl_unicast(skb, net, NETLINK_CB(skb).portid);
>> > > > >               } else {
>> > > > > +                     preempt_disable();
>> > > > >                       ip_mr_forward(net, mrt, skb, c, 0);
>> > > > > +                     preempt_enable();
>> > > > >               }
>> > > > >       }
>> > > > >  }
>> > > >
>> > > > I do not believe this fix is correct.
>> > >
>> > > Yes, sorry.  I should have suggested local_bh_disable instead.
>> > >
>> > > > Better replace the
>> > > > IP_INC_STATS_BH() by IP_INC_STATS()
>> > > >
>> > > > and IP_ADD_STATS_BH() by IP_ADD_STATS()
>> > >
>> > > Hmm, what's the rationale for this?
>> > >
>> > > Note that the IP_ADD_STATS_BH in question is unconditional (not in
>> > > error path).  It seems that it's virtually always called from softirq
>> > > except in the setsockopt case.
>> >
>> > The naming of the functions is bad if you compare them to e.g.
>> > spin_lock_bh.
>> >
>> > STATS_BH can only be used from bottom half and the normal ones (without
>> > _BH) can be called from everywhere. It is a common pattern in the
>> > kernel.
>> >
>> > Eric's proposal is correct.
>>
>> Yes, it's correct but it results in 4 additional bh on/off calls
>> for the common case, hence my question.
>>
>> Moving the one ip_mr_forward into bh-off keeps the bh-disable thing
>> in the setsockopt path.
>
> I have no idea how long the ip_mr_forward(net, mrt, skb, c, 0)
> section is, or whether GFP_KERNEL allocations are attempted in this path.
>
> The proposed fix might add other regressions.
>
> I do not want to spend time auditing this code that nobody uses.
>
> Meanwhile, on x86, IP_INC_STATS() does not use additional bh on/off calls.
>

for 32 bit archs, it does in SNMP_ADD_STATS64_USER()


> In general, we should disable interrupts (even soft ones) only for a
> limited amount of time.
>
>
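
As background for the 32-bit remark above: a 64-bit counter on a 32-bit
machine is written as two 32-bit halves, so a reader can observe a torn
value unless updates are bracketed by some form of sequence counter, and
the writer must not be interrupted by another writer on the same CPU. The
following is a single-threaded userspace sketch of that idea only. The
names are hypothetical, there are no memory barriers, and it is not the
kernel's u64_stats_sync implementation.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit statistic stored as two 32-bit halves. */
struct model_stat64 {
	unsigned int seq;  /* odd while an update is in progress */
	uint32_t lo, hi;
};

static void model_update_begin(struct model_stat64 *s) { s->seq++; }
static void model_update_end(struct model_stat64 *s)   { s->seq++; }

/* Writer: add delta, bracketing the two half-word stores. */
static void model_add(struct model_stat64 *s, uint64_t delta)
{
	uint64_t v = ((uint64_t)s->hi << 32) | s->lo;

	v += delta;
	model_update_begin(s);
	s->lo = (uint32_t)v;
	s->hi = (uint32_t)(v >> 32);
	model_update_end(s);
}

/*
 * Reader: returns 1 and fills *out if the snapshot is consistent,
 * or 0 if an update was in progress and the caller must retry.
 */
static int model_try_read(const struct model_stat64 *s, uint64_t *out)
{
	unsigned int start = s->seq;
	uint64_t v = ((uint64_t)s->hi << 32) | s->lo;

	if ((start & 1) || s->seq != start)
		return 0;
	*out = v;
	return 1;
}
```

A reader that arrives between model_update_begin() and model_update_end()
sees an odd sequence number and retries, which is why the writer side has
to complete without being re-entered on the same CPU.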


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-30 17:47               ` Ani Sinha
@ 2015-10-30 19:12                 ` Eric Dumazet
  2015-10-30 21:10                   ` Ani Sinha
  2015-10-30 23:54                   ` [PATCH 1/1] ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context Ani Sinha
  0 siblings, 2 replies; 14+ messages in thread
From: Eric Dumazet @ 2015-10-30 19:12 UTC (permalink / raw)
  To: Ani Sinha
  Cc: Florian Westphal, Hannes Frederic Sowa, netdev, Ani Sinha, fruggeri

On Fri, 2015-10-30 at 10:47 -0700, Ani Sinha wrote:

> for 32 bit archs, it does in SNMP_ADD_STATS64_USER()

Sure. But x86 these days is 64bit, at 99 % maybe.

We do not make changes that look 'maybe better' for i486 or i586.

Just do the same as multiple similar patches did.

Example :

757efd32d5ce31f67193cc0e6a56e4dffcc42fb1

Thanks.


* Re: kernel BUG in ipmr_queue_xmit()
  2015-10-30 19:12                 ` Eric Dumazet
@ 2015-10-30 21:10                   ` Ani Sinha
  2015-10-30 23:54                   ` [PATCH 1/1] ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context Ani Sinha
  1 sibling, 0 replies; 14+ messages in thread
From: Ani Sinha @ 2015-10-30 21:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Florian Westphal, Hannes Frederic Sowa, netdev, Ani Sinha, fruggeri

On Fri, Oct 30, 2015 at 12:12 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2015-10-30 at 10:47 -0700, Ani Sinha wrote:
>
>> for 32 bit archs, it does in SNMP_ADD_STATS64_USER()
>
> Sure. But x86 these days is 64bit, at 99 % maybe.
>
> We do not make changes that look 'maybe better' for i486 or i586.
>
> Just do the same as multiple similar patches did.
>
> Example :
>
> 757efd32d5ce31f67193cc0e6a56e4dffcc42fb1

OK, thanks for pointing me to this. It seems we have a precedent for this.
I will go ahead and send a patch as per your suggestion.


* [PATCH 1/1] ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context.
  2015-10-30 19:12                 ` Eric Dumazet
  2015-10-30 21:10                   ` Ani Sinha
@ 2015-10-30 23:54                   ` Ani Sinha
  2015-11-01 22:35                     ` Eric Dumazet
  2015-11-02 20:57                     ` David Miller
  1 sibling, 2 replies; 14+ messages in thread
From: Ani Sinha @ 2015-10-30 23:54 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ani Sinha, Florian Westphal, Hannes Frederic Sowa, netdev,
	Ani Sinha, fruggeri

Fixes the following kernel BUG :

BUG: using __this_cpu_add() in preemptible [00000000] code: bash/2758
caller is __this_cpu_preempt_check+0x13/0x15
CPU: 0 PID: 2758 Comm: bash Tainted: P           O   3.18.19 #2
 ffffffff8170eaca ffff880110d1b788 ffffffff81482b2a 0000000000000000
 0000000000000000 ffff880110d1b7b8 ffffffff812010ae ffff880007cab800
 ffff88001a060800 ffff88013a899108 ffff880108b84240 ffff880110d1b7c8
Call Trace:
[<ffffffff81482b2a>] dump_stack+0x52/0x80
[<ffffffff812010ae>] check_preemption_disabled+0xce/0xe1
[<ffffffff812010d4>] __this_cpu_preempt_check+0x13/0x15
[<ffffffff81419d60>] ipmr_queue_xmit+0x647/0x70c
[<ffffffff8141a154>] ip_mr_forward+0x32f/0x34e
[<ffffffff8141af76>] ip_mroute_setsockopt+0xe03/0x108c
[<ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[<ffffffff810e6974>] ? pollwake+0x4d/0x51
[<ffffffff81058ac0>] ? default_wake_function+0x0/0xf
[<ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[<ffffffff810613d9>] ? __wake_up_common+0x45/0x77
[<ffffffff81486ea9>] ? _raw_spin_unlock_irqrestore+0x1d/0x32
[<ffffffff810618bc>] ? __wake_up_sync_key+0x4a/0x53
[<ffffffff8139a519>] ? sock_def_readable+0x71/0x75
[<ffffffff813dd226>] do_ip_setsockopt+0x9d/0xb55
[<ffffffff81429818>] ? unix_seqpacket_sendmsg+0x3f/0x41
[<ffffffff813963fe>] ? sock_sendmsg+0x6d/0x86
[<ffffffff813959d4>] ? sockfd_lookup_light+0x12/0x5d
[<ffffffff8139650a>] ? SyS_sendto+0xf3/0x11b
[<ffffffff810d5738>] ? new_sync_read+0x82/0xaa
[<ffffffff813ddd19>] compat_ip_setsockopt+0x3b/0x99
[<ffffffff813fb24a>] compat_raw_setsockopt+0x11/0x32
[<ffffffff81399052>] compat_sock_common_setsockopt+0x18/0x1f
[<ffffffff813c4d05>] compat_SyS_setsockopt+0x1a9/0x1cf
[<ffffffff813c4149>] compat_SyS_socketcall+0x180/0x1e3
[<ffffffff81488ea1>] cstar_dispatch+0x7/0x1e

Signed-off-by: Ani Sinha <ani@arista.com>
---
 net/ipv4/ipmr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 866ee89..8e8203d 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1682,8 +1682,8 @@ static inline int ipmr_forward_finish(struct sock *sk, struct sk_buff *skb)
 {
 	struct ip_options *opt = &(IPCB(skb)->opt);
 
-	IP_INC_STATS_BH(dev_net(skb_dst(skb)->dev), IPSTATS_MIB_OUTFORWDATAGRAMS);
-	IP_ADD_STATS_BH(dev_net(skb_dst(skb)->dev), IPSTATS_MIB_OUTOCTETS, skb->len);
+	IP_INC_STATS(dev_net(skb_dst(skb)->dev), IPSTATS_MIB_OUTFORWDATAGRAMS);
+	IP_ADD_STATS(dev_net(skb_dst(skb)->dev), IPSTATS_MIB_OUTOCTETS, skb->len);
 
 	if (unlikely(opt->optlen))
 		ip_forward_options(skb);
@@ -1745,7 +1745,7 @@ static void ipmr_queue_xmit(struct net *net, struct mr_table *mrt,
 		 * to blackhole.
 		 */
 
-		IP_INC_STATS_BH(dev_net(dev), IPSTATS_MIB_FRAGFAILS);
+		IP_INC_STATS(dev_net(dev), IPSTATS_MIB_FRAGFAILS);
 		ip_rt_put(rt);
 		goto out_free;
 	}
-- 
1.8.1.4


* Re: [PATCH 1/1] ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context.
  2015-10-30 23:54                   ` [PATCH 1/1] ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context Ani Sinha
@ 2015-11-01 22:35                     ` Eric Dumazet
  2015-11-02 20:57                     ` David Miller
  1 sibling, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2015-11-01 22:35 UTC (permalink / raw)
  To: Ani Sinha
  Cc: Florian Westphal, Hannes Frederic Sowa, netdev, Ani Sinha, fruggeri

On Fri, 2015-10-30 at 16:54 -0700, Ani Sinha wrote:
> Fixes the following kernel BUG :

> Signed-off-by: Ani Sinha <ani@arista.com>
> ---
>  net/ipv4/ipmr.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>


* Re: [PATCH 1/1] ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context.
  2015-10-30 23:54                   ` [PATCH 1/1] ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context Ani Sinha
  2015-11-01 22:35                     ` Eric Dumazet
@ 2015-11-02 20:57                     ` David Miller
  1 sibling, 0 replies; 14+ messages in thread
From: David Miller @ 2015-11-02 20:57 UTC (permalink / raw)
  To: ani; +Cc: eric.dumazet, fw, hannes, netdev, ani, fruggeri

From: Ani Sinha <ani@arista.com>
Date: Fri, 30 Oct 2015 16:54:31 -0700 (PDT)

> Fixes the following kernel BUG :
 ...
> Signed-off-by: Ani Sinha <ani@arista.com>

Applied and queued up for -stable, thanks.


end of thread, other threads:[~2015-11-02 20:57 UTC | newest]

Thread overview: 14+ messages
2015-10-29 23:55 kernel BUG in ipmr_queue_xmit() Ani Sinha
2015-10-30  0:15 ` Florian Westphal
2015-10-30  1:41   ` Ani Sinha
2015-10-30  4:15     ` Eric Dumazet
2015-10-30 10:36       ` Florian Westphal
2015-10-30 10:40         ` Hannes Frederic Sowa
2015-10-30 10:48           ` Florian Westphal
2015-10-30 11:00             ` Eric Dumazet
2015-10-30 17:47               ` Ani Sinha
2015-10-30 19:12                 ` Eric Dumazet
2015-10-30 21:10                   ` Ani Sinha
2015-10-30 23:54                   ` [PATCH 1/1] ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context Ani Sinha
2015-11-01 22:35                     ` Eric Dumazet
2015-11-02 20:57                     ` David Miller
