* BBR and TCP internal pacing causing interrupt storm with pfifo_fast
@ 2018-10-09 16:38 Gasper Zejn
  2018-10-09 17:00 ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Gasper Zejn @ 2018-10-09 16:38 UTC (permalink / raw)
  To: Kevin Yang, Eric Dumazet, netdev

Hello,

I am seeing interrupt storms of 100k-900k local timer interrupts when
switching between network devices or networks while TCP connections are
open and sch_fq is not in use (I was using pfifo_fast). Using sch_fq
makes the interrupt storm go away.

According to perf, the interrupts all called tcp_pace_kick, which seems
to return HRTIMER_NORESTART but apparently calls another function
somewhere that restarts the timer.
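
For anyone trying to confirm the same symptom, something like the
following works (a sketch; exact tracepoint availability may vary by
kernel and perf build):

```shell
# Local timer interrupt counts (the "LOC" rows) one second apart:
grep LOC: /proc/interrupts; sleep 1; grep LOC: /proc/interrupts

# Sample hrtimer expiries system-wide for a few seconds; with this bug
# the callback column is dominated by tcp_pace_kick:
perf record -a -g -e timer:hrtimer_expire_entry -- sleep 5
perf report --stdio | head -n 30
```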

The bug is fairly easy to reproduce. Congestion control needs to be BBR,
the network scheduler pfifo_fast, and there need to be open TCP
connections when changing networks in such a way that the TCP
connections cannot continue to work (e.g. different client IP
addresses). The more connections, the more interrupts. The connection
handling code causes an interrupt storm, which eventually subsides as
the connections time out. It is a bit annoying that the high interrupt
rate does not show up as load. I reproduced this with 4.18.12, but this
has been happening for some time with previous kernel versions too.
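
For reference, the environment described above can be set up with
something like this (a sketch; "eth0" stands in for the real interface):

```shell
# BBR as congestion control:
sysctl -w net.ipv4.tcp_congestion_control=bbr
# pfifo_fast instead of fq as root qdisc, so TCP falls back to its
# internal per-socket pacing timer:
tc qdisc replace dev eth0 root pfifo_fast
# Then open a number of TCP connections and change networks so the
# connections can no longer make progress (e.g. a new client IP).
```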


I'd like to thank you for the comment about using sch_fq with BBR above
the tcp_needs_internal_pacing function. It pointed me in the direction
of the workaround.


Kind regards,

Gasper Zejn

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: BBR and TCP internal pacing causing interrupt storm with pfifo_fast
  2018-10-09 16:38 BBR and TCP internal pacing causing interrupt storm with pfifo_fast Gasper Zejn
@ 2018-10-09 17:00 ` Eric Dumazet
  2018-10-09 17:22   ` Gasper Zejn
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2018-10-09 17:00 UTC (permalink / raw)
  To: Gasper Zejn, Kevin Yang, Eric Dumazet, netdev



On 10/09/2018 09:38 AM, Gasper Zejn wrote:
> Hello,
> 
> I am seeing interrupt storms of 100k-900k local timer interrupts when
> switching between network devices or networks while TCP connections are
> open and sch_fq is not in use (I was using pfifo_fast). Using sch_fq
> makes the interrupt storm go away.
> 

That is for what kind of traffic?

If your TCP flows send 100k-3M packets per second, then yes, the pacing
timers could be set up in the 100k-900k range.

> According to perf, the interrupts all called tcp_pace_kick, which seems
> to return HRTIMER_NORESTART but apparently calls another function
> somewhere that restarts the timer.
> 
> The bug is fairly easy to reproduce. Congestion control needs to be BBR,
> the network scheduler pfifo_fast, and there need to be open TCP
> connections when changing networks in such a way that the TCP
> connections cannot continue to work (e.g. different client IP
> addresses). The more connections, the more interrupts. The connection
> handling code causes an interrupt storm, which eventually subsides as
> the connections time out. It is a bit annoying that the high interrupt
> rate does not show up as load. I reproduced this with 4.18.12, but this
> has been happening for some time with previous kernel versions too.
> 
> 
> I'd like to thank you for the comment about using sch_fq with BBR above
> the tcp_needs_internal_pacing function. It pointed me in the direction
> of the workaround.
>

Well, BBR has been very clear about sch_fq being the best packet scheduler.

net/ipv4/tcp_bbr.c currently says :

/* ...
 *
 * NOTE: BBR might be used with the fq qdisc ("man tc-fq") with pacing enabled,
 * otherwise TCP stack falls back to an internal pacing using one high
 * resolution timer per TCP socket and may use more resources.
 */
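
In practice the setup that comment recommends boils down to something
like this (a sketch; "eth0" is a placeholder for the actual interface):

```shell
# Let fq do the pacing instead of a per-socket high resolution timer:
tc qdisc replace dev eth0 root fq
# Inspect the qdisc and its statistics:
tc -s qdisc show dev eth0
```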


* Re: BBR and TCP internal pacing causing interrupt storm with pfifo_fast
  2018-10-09 17:00 ` Eric Dumazet
@ 2018-10-09 17:22   ` Gasper Zejn
  2018-10-09 17:26     ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Gasper Zejn @ 2018-10-09 17:22 UTC (permalink / raw)
  To: Eric Dumazet, Kevin Yang, Eric Dumazet, netdev

On 09. 10. 2018 19:00, Eric Dumazet wrote:
>
> On 10/09/2018 09:38 AM, Gasper Zejn wrote:
>> Hello,
>>
>> I am seeing interrupt storms of 100k-900k local timer interrupts when
>> switching between network devices or networks while TCP connections are
>> open and sch_fq is not in use (I was using pfifo_fast). Using sch_fq
>> makes the interrupt storm go away.
>>
> That is for what kind of traffic?
>
> If your TCP flows send 100k-3M packets per second, then yes, the pacing
> timers could be set up in the 100k-900k range.
>
Traffic is nowhere near that range; think of a few browser tabs with
JavaScript-rich web pages open, mostly idle, for example Slack, Gmail or
TweetDeck. No significant packet rate is needed, just open connections.

>> According to perf, the interrupts all called tcp_pace_kick, which seems
>> to return HRTIMER_NORESTART but apparently calls another function
>> somewhere that restarts the timer.
>>
>> The bug is fairly easy to reproduce. Congestion control needs to be BBR,
>> the network scheduler pfifo_fast, and there need to be open TCP
>> connections when changing networks in such a way that the TCP
>> connections cannot continue to work (e.g. different client IP
>> addresses). The more connections, the more interrupts. The connection
>> handling code causes an interrupt storm, which eventually subsides as
>> the connections time out. It is a bit annoying that the high interrupt
>> rate does not show up as load. I reproduced this with 4.18.12, but this
>> has been happening for some time with previous kernel versions too.
>>
>>
>> I'd like to thank you for the comment about using sch_fq with BBR above
>> the tcp_needs_internal_pacing function. It pointed me in the direction
>> of the workaround.
>>
> Well, BBR has been very clear about sch_fq being the best packet scheduler.
>
> net/ipv4/tcp_bbr.c currently says :
>
> /* ...
>  *
>  * NOTE: BBR might be used with the fq qdisc ("man tc-fq") with pacing enabled,
>  * otherwise TCP stack falls back to an internal pacing using one high
>  * resolution timer per TCP socket and may use more resources.
>  */
>
I am not disputing that FQ is the best packet scheduler; it does seem,
however, that some effort has been made to make BBR work without FQ too.
Using more resources in that case is perfectly fine. But going from
about a thousand interrupts to a few hundred thousand (consuming most of
the CPU in the process) seems to indicate that a corner case was hit, as
this happens the moment the network is changed and not before.


* Re: BBR and TCP internal pacing causing interrupt storm with pfifo_fast
  2018-10-09 17:22   ` Gasper Zejn
@ 2018-10-09 17:26     ` Eric Dumazet
  2018-10-15 10:26       ` Gasper Zejn
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2018-10-09 17:26 UTC (permalink / raw)
  To: zelo.zejn; +Cc: Eric Dumazet, Kevin Yang, netdev

On Tue, Oct 9, 2018 at 10:22 AM Gasper Zejn <zelo.zejn@gmail.com> wrote:
>
> On 09. 10. 2018 19:00, Eric Dumazet wrote:
> >
> > On 10/09/2018 09:38 AM, Gasper Zejn wrote:
> >> Hello,
> >>
> >> I am seeing interrupt storms of 100k-900k local timer interrupts when
> >> switching between network devices or networks while TCP connections are
> >> open and sch_fq is not in use (I was using pfifo_fast). Using sch_fq
> >> makes the interrupt storm go away.
> >>
> > That is for what kind of traffic?
> >
> > If your TCP flows send 100k-3M packets per second, then yes, the pacing
> > timers could be set up in the 100k-900k range.
> >
> Traffic is nowhere near that range; think of a few browser tabs with
> JavaScript-rich web pages open, mostly idle, for example Slack, Gmail or
> TweetDeck. No significant packet rate is needed, just open connections.

No idea of what is going on really. A repro would be nice.


* Re: BBR and TCP internal pacing causing interrupt storm with pfifo_fast
  2018-10-09 17:26     ` Eric Dumazet
@ 2018-10-15 10:26       ` Gasper Zejn
  2018-10-15 14:50         ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Gasper Zejn @ 2018-10-15 10:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Eric Dumazet, Kevin Yang, netdev


On 09. 10. 2018 19:26, Eric Dumazet wrote:
> On Tue, Oct 9, 2018 at 10:22 AM Gasper Zejn <zelo.zejn@gmail.com> wrote:
>> On 09. 10. 2018 19:00, Eric Dumazet wrote:
>>> On 10/09/2018 09:38 AM, Gasper Zejn wrote:
>>>> Hello,
>>>>
>>>> I am seeing interrupt storms of 100k-900k local timer interrupts when
>>>> switching between network devices or networks while TCP connections are
>>>> open and sch_fq is not in use (I was using pfifo_fast). Using sch_fq
>>>> makes the interrupt storm go away.
>>>>
>>> That is for what kind of traffic?
>>>
>>> If your TCP flows send 100k-3M packets per second, then yes, the pacing
>>> timers could be set up in the 100k-900k range.
>>>
>> Traffic is nowhere near that range; think of a few browser tabs with
>> JavaScript-rich web pages open, mostly idle, for example Slack, Gmail or
>> TweetDeck. No significant packet rate is needed, just open connections.
> No idea of what is going on really. A repro would be nice.

I've tried to isolate the issue as best I could. There seems to be an
issue if the TCP socket has keepalive set and the send queue is not
empty when the route goes away.

https://github.com/zejn/bbr_pfifo_interrupts_issue
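
The conditions above can be sketched roughly like this (a hypothetical
simplification, not the actual code from the linked repository; the
route removal itself needs root and is only indicated in a comment):

```python
import socket

def make_stuck_keepalive_socket():
    # A throwaway local listener so that connect() succeeds.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    # The two socket-side reproducer conditions: SO_KEEPALIVE enabled
    # and data queued in the send buffer.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    client.connect(server.getsockname())
    client.sendall(b"data that will sit in the send queue")

    # The third condition happens outside this sketch (requires root):
    #   ip route del <route used by the connection>
    # after which the pacing timer storm starts on affected kernels.
    return server, client
```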

Hope this helps,
Gasper


* Re: BBR and TCP internal pacing causing interrupt storm with pfifo_fast
  2018-10-15 10:26       ` Gasper Zejn
@ 2018-10-15 14:50         ` Eric Dumazet
  2018-10-15 16:23           ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2018-10-15 14:50 UTC (permalink / raw)
  To: Gasper Zejn; +Cc: Eric Dumazet, Kevin Yang, netdev

On Mon, Oct 15, 2018 at 3:26 AM Gasper Zejn <zelo.zejn@gmail.com> wrote:
>
>
> I've tried to isolate the issue as best I could. There seems to be an
> issue if the TCP socket has keepalive set and the send queue is not
> empty when the route goes away.
>
> https://github.com/zejn/bbr_pfifo_interrupts_issue
>
> Hope this helps,
> Gasper

This is awesome, Gasper, I will take a look, thanks.

Note that we are about to send a patch series (targeting net-next) to
polish the EDT patch series that was merged last month for linux-4.20.
TCP internal pacing is going to be much better performance-wise.


* Re: BBR and TCP internal pacing causing interrupt storm with pfifo_fast
  2018-10-15 14:50         ` Eric Dumazet
@ 2018-10-15 16:23           ` Eric Dumazet
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2018-10-15 16:23 UTC (permalink / raw)
  To: Eric Dumazet, Gasper Zejn; +Cc: Eric Dumazet, Kevin Yang, netdev



On 10/15/2018 07:50 AM, Eric Dumazet wrote:
> On Mon, Oct 15, 2018 at 3:26 AM Gasper Zejn <zelo.zejn@gmail.com> wrote:
>>
>>
>> I've tried to isolate the issue as best I could. There seems to be an
>> issue if the TCP socket has keepalive set and the send queue is not
>> empty when the route goes away.
>>
>> https://github.com/zejn/bbr_pfifo_interrupts_issue
>>
>> Hope this helps,
>> Gasper
> 
> This is awesome, Gasper, I will take a look, thanks.
> 
> Note that we are about to send a patch series (targeting net-next) to
> polish the EDT patch series that was merged last month for linux-4.20.
> TCP internal pacing is going to be much better performance-wise.
> 

Yeah, I believe that commit c092dd5f4a7f4e4dbbcc8cf2e50b516bf07e432f
("tcp: switch tcp_internal_pacing() to tcp_wstamp_ns") has incidentally
fixed the issue.

That is because it calls tcp_internal_pacing() from
tcp_update_skb_after_send(), which is called only if the packet was
correctly sent by the IP layer.

Before this patch, tcp_internal_pacing() was called from
__tcp_transmit_skb() before we attempted to send the clone, and the
clone could be dropped by the IP layer (for lack of a route, for
example) right away.

So when the packet was not sent because of a route problem, the
high-resolution timer would fire soon after and the TCP xmit path would
be entered again, triggering this loop.
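
The loop can be illustrated with a toy model (purely illustrative
Python, not kernel code): if the pacing timer is re-armed before the
outcome of the transmit attempt is known, a send that always fails keeps
the timer firing; re-arming only after a successful transmit stops it.

```python
def pacing_timer_expiries(arm_before_send, send_ok, max_expiries=100000):
    # Toy model of the two behaviours described above.
    # arm_before_send=True mimics the old path, where the timer was armed
    # before knowing whether the IP layer kept the packet; False mimics
    # tcp_update_skb_after_send(), called only on a successful transmit.
    expiries = 0
    timer_armed = True                   # a pacing timer is pending
    while timer_armed and expiries < max_expiries:
        expiries += 1
        timer_armed = False              # tcp_pace_kick: HRTIMER_NORESTART
        if arm_before_send:
            timer_armed = True           # re-armed no matter what
            send_ok()                    # ...even though the send may fail
        elif send_ok():
            timer_armed = True           # re-armed only on a real transmit
    return expiries
```

With a dead route (send always fails) the old behaviour spins until some
other event breaks the loop; the new one lets the timer fire just once.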

I am going to send the 2nd round of EDT patches, so that you can try
David Miller's net-next tree with all the patches we believe are needed
for 4.20. Once proven to work, we might have to backport the series to
4.18 and 4.19.

Thanks !

