netdev.vger.kernel.org archive mirror
From: Yuchung Cheng <ycheng@google.com>
To: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: netdev <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Patrick McHardy <kaber@trash.net>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	James Morris <jmorris@namei.org>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [REGRESSION] tcp/ipv4: kernel panic because of (possible) division by zero
Date: Wed, 6 Jan 2016 10:19:10 -0800
Message-ID: <CAK6E8=dSRmf35AaC6mOcRaGSxn81CE76WiOfNx6jb5X8uhDGqw@mail.gmail.com>
In-Reply-To: <DDC4BD3D-A5F5-4EBF-8461-7C63ABC9F852@natalenko.name>

On Wed, Jan 6, 2016 at 8:50 AM, Oleksandr Natalenko
<oleksandr@natalenko.name> wrote:
>
> Unfortunately, the patch didn't help -- I've got the same stacktrace with slightly different offset (+3) within the function.
>
> Now trying to get full stacktrace via netconsole. Need more time.
>
> Meanwhile, any other ideas on what went wrong?

That's odd, because the patch already checks for and avoids the div0.
Can you post the stack trace and any kernel warnings?

One possibility is that tcp_cwnd_reduction() may set a cwnd of 0,
which then gets used to start another recovery phase. This may or may
not be the culprit of this div0 issue because I wasn't able to
reproduce exactly your issue on our servers. But I will post the fix
today and CC you.
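
To make the suspected failure mode concrete, here is a minimal user-space
sketch. It is not the kernel code: struct prr_state, prr_init() and
prr_sndcnt() are hypothetical names, and it assumes the division in
question is the proportional PRR step dividing by the cwnd recorded at
the start of recovery (tp->prior_cwnd). If a previous reduction left
cwnd at 0, the next recovery phase records prior_cwnd = 0 and the
divisor is zero unless it is checked first.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the PRR-related fields of
 * struct tcp_sock; this is a model, not the kernel structure. */
struct prr_state {
	uint32_t snd_cwnd;      /* current congestion window (packets) */
	uint32_t snd_ssthresh;  /* target window for this recovery */
	uint32_t prior_cwnd;    /* cwnd recorded when recovery started */
	uint32_t prr_delivered; /* packets delivered during recovery */
	uint32_t prr_out;       /* packets sent during recovery */
};

/* Start a recovery phase: remember the cwnd at entry.
 * If snd_cwnd is already 0 here, prior_cwnd becomes 0 as well. */
void prr_init(struct prr_state *s, uint32_t ssthresh)
{
	s->prior_cwnd = s->snd_cwnd;
	s->snd_ssthresh = ssthresh;
	s->prr_delivered = 0;
	s->prr_out = 0;
}

/* Proportional step: ceil(ssthresh * delivered / prior_cwnd) - prr_out.
 * Returns -1 instead of dividing when prior_cwnd is 0. */
int64_t prr_sndcnt(const struct prr_state *s)
{
	uint64_t dividend;

	if (s->prior_cwnd == 0)
		return -1; /* unguarded, the division below would trap */

	dividend = (uint64_t)s->snd_ssthresh * s->prr_delivered +
		   s->prior_cwnd - 1;
	return (int64_t)(dividend / s->prior_cwnd) - (int64_t)s->prr_out;
}
```

With snd_cwnd = 10, ssthresh = 5 and 4 packets delivered, prr_sndcnt()
returns 2. If snd_cwnd is then 0 when prr_init() runs again, prior_cwnd
is 0 and the guard returns -1 where an unchecked division would fault.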

>
>
> On December 22, 2015 4:10:32 AM EET, Yuchung Cheng <ycheng@google.com> wrote:
> >On Mon, Dec 21, 2015 at 12:25 PM, Oleksandr Natalenko
> ><oleksandr@natalenko.name> wrote:
> >> Commit 3759824da87b30ce7a35b4873b62b0ba38905ef5 (tcp: PRR uses CRB mode by
> >> default and SS mode conditionally) introduced changes to
> >> tcp_cwnd_reduction() in net/ipv4/tcp_input.c that possibly cause a division
> >> by zero and, therefore, a kernel panic in the interrupt handler [1].
> >>
> >> Reverting 3759824da87b30ce7a35b4873b62b0ba38905ef5 seems to fix the issue.
> >>
> >> I'm able to reproduce the issue on 4.3.0–4.3.3 about once every several
> >> days.
> >>
> >> What could be done to help in debugging this issue?
> >Do you have ECN enabled (i.e. sysctl net.ipv4.tcp_ecn > 0)?
> >
> >If so, I suspect an ACK carrying ECE during CA_Loss causes the socket to
> >enter the CWR state without calling tcp_init_cwnd_reduction() to set
> >tp->prior_cwnd. Can you try this debug / quick-fix patch and send me the
> >error message, if any?
> >
> >
> >>
> >> Regards,
> >>   Oleksandr.
> >>
> >> [1] http://i.piccy.info/i9/6f5cb187c4ff282d189f78c63f95af43/1450729403/283985/951663/panic.jpg
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.


Thread overview: 19+ messages
2015-12-21 20:25 [REGRESSION] tcp/ipv4: kernel panic because of (possible) division by zero Oleksandr Natalenko
2015-12-22  2:10 ` Yuchung Cheng
2015-12-22 20:13   ` Oleksandr Natalenko
2016-01-06 16:50   ` Oleksandr Natalenko
2016-01-06 18:19     ` Yuchung Cheng [this message]
2016-01-06 18:43       ` Yuchung Cheng
2016-01-06 18:49         ` Oleksandr Natalenko
2016-01-09 17:34         ` Oleksandr Natalenko
2016-01-10 10:23         ` Oleksandr Natalenko
2016-01-10 14:48           ` Neal Cardwell
2016-01-10 14:54             ` Neal Cardwell
2016-01-10 14:57               ` Oleksandr Natalenko
2016-01-10 14:57             ` Oleksandr Natalenko
2016-01-10 17:29               ` Neal Cardwell
2016-01-10 17:50                 ` Oleksandr Natalenko
2016-01-10 18:00                   ` Neal Cardwell
2016-01-10 21:56                 ` Oleksandr Natalenko
2016-01-11 18:47                   ` Neal Cardwell
2016-01-11 23:26                     ` Oleksandr Natalenko
