netdev.vger.kernel.org archive mirror
* [PATCH stable 3.2 3.4] ipv4: disable bh while doing route gc
@ 2014-10-13 16:20 Marcelo Ricardo Leitner
  2014-10-13 16:52 ` David Miller
  2014-10-13 17:51 ` [PATCH stable v3.2 v3.4] " David Miller
  0 siblings, 2 replies; 11+ messages in thread
From: Marcelo Ricardo Leitner @ 2014-10-13 16:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, hannes

Further tests revealed that moving the garbage collector to a work
queue and protecting it with a spinlock may leave the system prone to
soft lockups if the bottom half gets very busy.

It was reproduced with a set of firewall rules that REJECTed packets. If
the NIC bottom half handler ends up running on the same CPU that is
running the garbage collector on a very large cache, the garbage
collector will not be able to do its job due to the amount of work
needed for handling the REJECTs and also won't reschedule.

The fix is to disable bottom halves during garbage collection, as was
originally the case (most calls to it came from softirqs).

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---

Notes:
    Hi Dave,
    
    This is needed for stable 3.2 and 3.4, as those are the ones to which
    we applied the previous patches:
        ipv4: move route garbage collector to work queue
        ipv4: avoid parallel route cache gc executions
    
    Thanks.

 net/ipv4/route.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 9e7909eef8d10107008f8d629f9f2d75fde52eb2..6c34bc98bce7147cf6c242439f2afeb9adf28c72 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -998,7 +998,7 @@ static void __do_rt_garbage_collect(int elasticity, int min_interval)
 	 * do not make it too frequently.
 	 */
 
-	spin_lock(&rt_gc_lock);
+	spin_lock_bh(&rt_gc_lock);
 
 	RT_CACHE_STAT_INC(gc_total);
 
@@ -1101,7 +1101,7 @@ work_done:
 	    dst_entries_get_slow(&ipv4_dst_ops) < ipv4_dst_ops.gc_thresh)
 		expire = ip_rt_gc_timeout;
 out:
-	spin_unlock(&rt_gc_lock);
+	spin_unlock_bh(&rt_gc_lock);
 }
 
 static void __rt_garbage_collect(struct work_struct *w)
-- 
1.9.3

* Re: [PATCH stable 3.2 3.4] ipv4: disable bh while doing route gc
  2014-10-13 16:20 [PATCH stable 3.2 3.4] ipv4: disable bh while doing route gc Marcelo Ricardo Leitner
@ 2014-10-13 16:52 ` David Miller
  2014-10-13 16:58   ` Marcelo Ricardo Leitner
  2014-10-13 17:51 ` [PATCH stable v3.2 v3.4] " David Miller
  1 sibling, 1 reply; 11+ messages in thread
From: David Miller @ 2014-10-13 16:52 UTC (permalink / raw)
  To: mleitner; +Cc: netdev, hannes

From: Marcelo Ricardo Leitner <mleitner@redhat.com>
Date: Mon, 13 Oct 2014 13:20:38 -0300

> Further tests revealed that after moving the garbage collector to a work
> queue and protecting it with a spinlock may leave the system prone to
> soft lockups if bottom half gets very busy.
> 
> It was reproduced with a set of firewall rules that REJECTed packets. If
> the NIC bottom half handler ends up running on the same CPU that is
> running the garbage collector on a very large cache, the garbage
> collector will not be able to do its job due to the amount of work
> needed for handling the REJECTs and also won't reschedule.
> 
> The fix is to disable bottom half during the garbage collecting, as it
> already was in the first place (most calls to it came from softirqs).
> 
> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Please add my:

Acked-by: David S. Miller <davem@davemloft.net>

and submit this directly to -stable, thanks.

* Re: [PATCH stable 3.2 3.4] ipv4: disable bh while doing route gc
  2014-10-13 16:52 ` David Miller
@ 2014-10-13 16:58   ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 11+ messages in thread
From: Marcelo Ricardo Leitner @ 2014-10-13 16:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, hannes

On 13-10-2014 13:52, David Miller wrote:
> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
> Date: Mon, 13 Oct 2014 13:20:38 -0300
>
>> Further tests revealed that after moving the garbage collector to a work
>> queue and protecting it with a spinlock may leave the system prone to
>> soft lockups if bottom half gets very busy.
>>
>> It was reproduced with a set of firewall rules that REJECTed packets. If
>> the NIC bottom half handler ends up running on the same CPU that is
>> running the garbage collector on a very large cache, the garbage
>> collector will not be able to do its job due to the amount of work
>> needed for handling the REJECTs and also won't reschedule.
>>
>> The fix is to disable bottom half during the garbage collecting, as it
>> already was in the first place (most calls to it came from softirqs).
>>
>> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
>> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>
> Please add my:
>
> Acked-by: David S. Miller <davem@davemloft.net>
>
> and submit this directly to -stable, thanks.

Will do. Thanks.

Marcelo

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
  2014-10-13 16:20 [PATCH stable 3.2 3.4] ipv4: disable bh while doing route gc Marcelo Ricardo Leitner
  2014-10-13 16:52 ` David Miller
@ 2014-10-13 17:51 ` David Miller
  2014-10-20  3:09   ` Ben Hutchings
  1 sibling, 1 reply; 11+ messages in thread
From: David Miller @ 2014-10-13 17:51 UTC (permalink / raw)
  To: mleitner; +Cc: stable, netdev, hannes

From: Marcelo Ricardo Leitner <mleitner@redhat.com>
Date: Mon, 13 Oct 2014 14:03:30 -0300

> Further tests revealed that after moving the garbage collector to a work
> queue and protecting it with a spinlock may leave the system prone to
> soft lockups if bottom half gets very busy.
> 
> It was reproduced with a set of firewall rules that REJECTed packets. If
> the NIC bottom half handler ends up running on the same CPU that is
> running the garbage collector on a very large cache, the garbage
> collector will not be able to do its job due to the amount of work
> needed for handling the REJECTs and also won't reschedule.
> 
> The fix is to disable bottom half during the garbage collecting, as it
> already was in the first place (most calls to it came from softirqs).
> 
> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Acked-by: David S. Miller <davem@davemloft.net>
> Cc: stable@vger.kernel.org

-stable folks, please integrate this directly, thanks!

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
  2014-10-13 17:51 ` [PATCH stable v3.2 v3.4] " David Miller
@ 2014-10-20  3:09   ` Ben Hutchings
  2014-10-20  4:23     ` David Miller
  2014-10-21 19:08     ` Marcelo Ricardo Leitner
  0 siblings, 2 replies; 11+ messages in thread
From: Ben Hutchings @ 2014-10-20  3:09 UTC (permalink / raw)
  To: David Miller; +Cc: mleitner, stable, netdev, hannes

On Mon, 2014-10-13 at 13:51 -0400, David Miller wrote:
> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
> Date: Mon, 13 Oct 2014 14:03:30 -0300
> 
> > Further tests revealed that after moving the garbage collector to a work
> > queue and protecting it with a spinlock may leave the system prone to
> > soft lockups if bottom half gets very busy.
> > 
> > It was reproduced with a set of firewall rules that REJECTed packets. If
> > the NIC bottom half handler ends up running on the same CPU that is
> > running the garbage collector on a very large cache, the garbage
> > collector will not be able to do its job due to the amount of work
> > needed for handling the REJECTs and also won't reschedule.
> > 
> > The fix is to disable bottom half during the garbage collecting, as it
> > already was in the first place (most calls to it came from softirqs).
> > 
> > Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> > Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> > Acked-by: David S. Miller <davem@davemloft.net>
> > Cc: stable@vger.kernel.org
> 
> -stable folks, please integrate this directly, thanks!

I've applied this and the previous two patches mentioned ('ipv4: move
route garbage collector to work queue' and 'ipv4: avoid parallel route
cache gc executions').  But I didn't get the other two from you.  The
last batch of networking fixes I received and applied was dated
2014-08-07, and the next one I've seen is dated 2014-10-11 and has
nothing for 3.2 or 3.4.  Did I miss one between these?

Ben.

-- 
Ben Hutchings
[W]e found...that it wasn't as easy to get programs right as we had thought.
... I realized that a large part of my life from then on was going to be spent
in finding mistakes in my own programs. - Maurice Wilkes, 1949

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
  2014-10-20  3:09   ` Ben Hutchings
@ 2014-10-20  4:23     ` David Miller
  2014-10-20 12:23       ` Ben Hutchings
  2014-10-21 19:08     ` Marcelo Ricardo Leitner
  1 sibling, 1 reply; 11+ messages in thread
From: David Miller @ 2014-10-20  4:23 UTC (permalink / raw)
  To: ben; +Cc: mleitner, stable, netdev, hannes

From: Ben Hutchings <ben@decadent.org.uk>
Date: Mon, 20 Oct 2014 04:09:41 +0100

> I've applied this and the previous two patches mentioned ('ipv4: move
> route garbage collector to work queue' and 'ipv4: avoid parallel route
> cache gc executions').  But I didn't get the other two from you.  The
> last batch of networking fixes I received and applied was dated
> 2014-08-07, and the next one I've seen is dated 2014-10-11 and has
> nothing for 3.2 or 3.4.  Did I miss one between these?

I'm at the point where I'm personally not going to go back more than
four releases; anything more than that is ridiculous.

And this time that was 3.17, 3.16, 3.14, and 3.10

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
  2014-10-20  4:23     ` David Miller
@ 2014-10-20 12:23       ` Ben Hutchings
  2014-10-20 15:39         ` David Miller
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Hutchings @ 2014-10-20 12:23 UTC (permalink / raw)
  To: David Miller; +Cc: mleitner, stable, netdev, hannes

On Mon, 2014-10-20 at 00:23 -0400, David Miller wrote:
> From: Ben Hutchings <ben@decadent.org.uk>
> Date: Mon, 20 Oct 2014 04:09:41 +0100
> 
> > I've applied this and the previous two patches mentioned ('ipv4: move
> > route garbage collector to work queue' and 'ipv4: avoid parallel route
> > cache gc executions').  But I didn't get the other two from you.  The
> > last batch of networking fixes I received and applied was dated
> > 2014-08-07, and the next one I've seen is dated 2014-10-11 and has
> > nothing for 3.2 or 3.4.  Did I miss one between these?
> 
> I'm at the point where I'm personally not going to go back more than
> four releases; anything more than that is ridiculous.
> 
> And this time that was 3.17, 3.16, 3.14, and 3.10

OK.  I appreciate all the work you've done to backport to 3.2, but would
also have appreciated an explicit note to say you were dropping the
earlier versions.

Thanks, Ben.

-- 
Ben Hutchings
[W]e found...that it wasn't as easy to get programs right as we had thought.
... I realized that a large part of my life from then on was going to be spent
in finding mistakes in my own programs. - Maurice Wilkes, 1949

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
  2014-10-20 12:23       ` Ben Hutchings
@ 2014-10-20 15:39         ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2014-10-20 15:39 UTC (permalink / raw)
  To: ben; +Cc: mleitner, stable, netdev, hannes

From: Ben Hutchings <ben@decadent.org.uk>
Date: Mon, 20 Oct 2014 13:23:28 +0100

> On Mon, 2014-10-20 at 00:23 -0400, David Miller wrote:
>> From: Ben Hutchings <ben@decadent.org.uk>
>> Date: Mon, 20 Oct 2014 04:09:41 +0100
>> 
>> > I've applied this and the previous two patches mentioned ('ipv4: move
>> > route garbage collector to work queue' and 'ipv4: avoid parallel route
>> > cache gc executions').  But I didn't get the other two from you.  The
>> > last batch of networking fixes I received and applied was dated
>> > 2014-08-07, and the next one I've seen is dated 2014-10-11 and has
>> > nothing for 3.2 or 3.4.  Did I miss one between these?
>> 
>> I'm at the point where I'm personally not going to go back more than
>> four releases; anything more than that is ridiculous.
>> 
>> And this time that was 3.17, 3.16, 3.14, and 3.10
> 
> OK.  I appreciate all the work you've done to backport to 3.2, but would
> also have appreciated an explicit note to say you were dropping the
> earlier versions.

My apologies, but as time goes on and more -stable releases go active
I will certainly be tail dropping to keep it within 4 releases.

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
  2014-10-20  3:09   ` Ben Hutchings
  2014-10-20  4:23     ` David Miller
@ 2014-10-21 19:08     ` Marcelo Ricardo Leitner
  2014-10-21 19:32       ` Ben Hutchings
  1 sibling, 1 reply; 11+ messages in thread
From: Marcelo Ricardo Leitner @ 2014-10-21 19:08 UTC (permalink / raw)
  To: Ben Hutchings, David Miller; +Cc: stable, netdev, hannes

On 20-10-2014 01:09, Ben Hutchings wrote:
> On Mon, 2014-10-13 at 13:51 -0400, David Miller wrote:
>> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
>> Date: Mon, 13 Oct 2014 14:03:30 -0300
>>
>>> Further tests revealed that after moving the garbage collector to a work
>>> queue and protecting it with a spinlock may leave the system prone to
>>> soft lockups if bottom half gets very busy.
>>>
>>> It was reproduced with a set of firewall rules that REJECTed packets. If
>>> the NIC bottom half handler ends up running on the same CPU that is
>>> running the garbage collector on a very large cache, the garbage
>>> collector will not be able to do its job due to the amount of work
>>> needed for handling the REJECTs and also won't reschedule.
>>>
>>> The fix is to disable bottom half during the garbage collecting, as it
>>> already was in the first place (most calls to it came from softirqs).
>>>
>>> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>> Acked-by: David S. Miller <davem@davemloft.net>
>>> Cc: stable@vger.kernel.org
>>
>> -stable folks, please integrate this directly, thanks!
>
> I've applied this and the previous two patches mentioned ('ipv4: move
> route garbage collector to work queue' and 'ipv4: avoid parallel route
> cache gc executions').  But I didn't get the other two from you.  The
> last batch of networking fixes I received and applied was dated
> 2014-08-07, and the next one I've seen is dated 2014-10-11 and has
> nothing for 3.2 or 3.4.  Did I miss one between these?

Sorry to ask, Ben, but where did you apply them? I'm not seeing the commits on
linux-stable.git and couldn't find their summaries anywhere else.

Thanks,
Marcelo

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
  2014-10-21 19:08     ` Marcelo Ricardo Leitner
@ 2014-10-21 19:32       ` Ben Hutchings
  2014-10-21 19:51         ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 11+ messages in thread
From: Ben Hutchings @ 2014-10-21 19:32 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: David Miller, stable, netdev, hannes

On Tue, 2014-10-21 at 17:08 -0200, Marcelo Ricardo Leitner wrote:
> On 20-10-2014 01:09, Ben Hutchings wrote:
> > On Mon, 2014-10-13 at 13:51 -0400, David Miller wrote:
> >> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
> >> Date: Mon, 13 Oct 2014 14:03:30 -0300
> >>
> >>> Further tests revealed that after moving the garbage collector to a work
> >>> queue and protecting it with a spinlock may leave the system prone to
> >>> soft lockups if bottom half gets very busy.
> >>>
> > >>> It was reproduced with a set of firewall rules that REJECTed packets. If
> >>> the NIC bottom half handler ends up running on the same CPU that is
> >>> running the garbage collector on a very large cache, the garbage
> >>> collector will not be able to do its job due to the amount of work
> >>> needed for handling the REJECTs and also won't reschedule.
> >>>
> >>> The fix is to disable bottom half during the garbage collecting, as it
> >>> already was in the first place (most calls to it came from softirqs).
> >>>
> >>> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
> >>> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> >>> Acked-by: David S. Miller <davem@davemloft.net>
> >>> Cc: stable@vger.kernel.org
> >>
> >> -stable folks, please integrate this directly, thanks!
> >
> > > I've applied this and the previous two patches mentioned ('ipv4: move
> > route garbage collector to work queue' and 'ipv4: avoid parallel route
> > cache gc executions').  But I didn't get the other two from you.  The
> > last batch of networking fixes I received and applied was dated
> > 2014-08-07, and the next one I've seen is dated 2014-10-11 and has
> > nothing for 3.2 or 3.4.  Did I miss one between these?
> 
> Sorry to ask Ben but, where did you apply them? I'm not seeing the commits on 
> linux-stable.git and couldn't find their summaries anywhere else.

They're in a patch queue that I've just pushed to:

git://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-3.2.y-queue.git

Ben.

-- 
Ben Hutchings
Reality is just a crutch for people who can't handle science fiction.

* Re: [PATCH stable v3.2 v3.4] ipv4: disable bh while doing route gc
  2014-10-21 19:32       ` Ben Hutchings
@ 2014-10-21 19:51         ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 11+ messages in thread
From: Marcelo Ricardo Leitner @ 2014-10-21 19:51 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David Miller, stable, netdev, hannes

On 21-10-2014 17:32, Ben Hutchings wrote:
> On Tue, 2014-10-21 at 17:08 -0200, Marcelo Ricardo Leitner wrote:
>> On 20-10-2014 01:09, Ben Hutchings wrote:
>>> On Mon, 2014-10-13 at 13:51 -0400, David Miller wrote:
>>>> From: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>> Date: Mon, 13 Oct 2014 14:03:30 -0300
>>>>
>>>>> Further tests revealed that after moving the garbage collector to a work
>>>>> queue and protecting it with a spinlock may leave the system prone to
>>>>> soft lockups if bottom half gets very busy.
>>>>>
>>>>> It was reproduced with a set of firewall rules that REJECTed packets. If
>>>>> the NIC bottom half handler ends up running on the same CPU that is
>>>>> running the garbage collector on a very large cache, the garbage
>>>>> collector will not be able to do its job due to the amount of work
>>>>> needed for handling the REJECTs and also won't reschedule.
>>>>>
>>>>> The fix is to disable bottom half during the garbage collecting, as it
>>>>> already was in the first place (most calls to it came from softirqs).
>>>>>
>>>>> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
>>>>> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>>>> Acked-by: David S. Miller <davem@davemloft.net>
>>>>> Cc: stable@vger.kernel.org
>>>>
>>>> -stable folks, please integrate this directly, thanks!
>>>
>>> I've applied this and the previous two patches mentioned ('ipv4: move
>>> route garbage collector to work queue' and 'ipv4: avoid parallel route
>>> cache gc executions').  But I didn't get the other two from you.  The
>>> last batch of networking fixes I received and applied was dated
>>> 2014-08-07, and the next one I've seen is dated 2014-10-11 and has
>>> nothing for 3.2 or 3.4.  Did I miss one between these?
>>
>> Sorry to ask Ben but, where did you apply them? I'm not seeing the commits on
>> linux-stable.git and couldn't find their summaries anywhere else.
>
> They're in a patch queue that I've just pushed to:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-3.2.y-queue.git

Cool, thanks!

Marcelo
