regressions.lists.linux.dev archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: Stefan Dietrich <roots@gmx.de>
Cc: Greg KH <greg@kroah.com>,
	netdev@vger.kernel.org, stable@vger.kernel.org,
	regressions@lists.linux.dev,
	Vinicius Costa Gomes <vinicius.gomes@intel.com>,
	Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	intel-wired-lan@lists.osuosl.org
Subject: Re: [REGRESSION] Kernel 5.15 reboots / freezes upon ifup/ifdown
Date: Wed, 24 Nov 2021 15:34:49 -0800
Message-ID: <20211124153449.72c9cfcd@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
In-Reply-To: <8119066974f099aa11f08a4dad3653ac0ba32cd6.camel@gmx.de>

On Wed, 24 Nov 2021 18:20:40 +0100 Stefan Dietrich wrote:
> Hi all,
> 
> six exciting hours and a lot of learning later, here it is.
> Symptomatically, the critical commit appears for me between
> 5.14.21-051421-generic and 5.15.0-051500rc2-generic - I did not find an
> amd64 build for rc1.
> 
> Please see the git-bisect output below and let me know how I may
> further assist in debugging!

Well, let's CC those involved, shall we? :)

Thanks for working thru the bisection!

> a90ec84837325df4b9a6798c2cc0df202b5680bd is the first bad commit
> commit a90ec84837325df4b9a6798c2cc0df202b5680bd
> Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> Date:   Mon Jul 26 20:36:57 2021 -0700
> 
>     igc: Add support for PTP getcrosststamp()
> 
>     i225 supports PCIe Precision Time Measurement (PTM), allowing us to
>     support the PTP_SYS_OFFSET_PRECISE ioctl() in the driver via the
>     getcrosststamp() function.
> 
>     The easiest way to expose the PTM registers would be to configure the PTM
>     dialogs to run periodically, but the PTP_SYS_OFFSET_PRECISE ioctl()
>     semantics are more aligned to using a kind of "one-shot" way of retrieving
>     the PTM timestamps. But this causes a bit more code to be written: the
>     trigger registers for the PTM dialogs are not cleared automatically.
> 
>     i225 can be configured to send "fake" packets with the PTM
>     information, adding support for handling these types of packets is
>     left for the future.
> 
>     PTM improves the accuracy of time synchronization, for example, using
>     phc2sys, while a simple application is sending packets as fast as
>     possible. First, without .getcrosststamp():
> 
>     phc2sys[191.382]: enp4s0 sys offset      -959 s2 freq    -454 delay   4492
>     phc2sys[191.482]: enp4s0 sys offset       798 s2 freq   +1015 delay   4069
>     phc2sys[191.583]: enp4s0 sys offset       962 s2 freq   +1418 delay   3849
>     phc2sys[191.683]: enp4s0 sys offset       924 s2 freq   +1669 delay   3753
>     phc2sys[191.783]: enp4s0 sys offset       664 s2 freq   +1686 delay   3349
>     phc2sys[191.883]: enp4s0 sys offset       218 s2 freq   +1439 delay   2585
>     phc2sys[191.983]: enp4s0 sys offset       761 s2 freq   +2048 delay   3750
>     phc2sys[192.083]: enp4s0 sys offset       756 s2 freq   +2271 delay   4061
>     phc2sys[192.183]: enp4s0 sys offset       809 s2 freq   +2551 delay   4384
>     phc2sys[192.283]: enp4s0 sys offset      -108 s2 freq   +1877 delay   2480
>     phc2sys[192.383]: enp4s0 sys offset     -1145 s2 freq    +807 delay   4438
>     phc2sys[192.484]: enp4s0 sys offset       571 s2 freq   +2180 delay   3849
>     phc2sys[192.584]: enp4s0 sys offset       241 s2 freq   +2021 delay   3389
>     phc2sys[192.684]: enp4s0 sys offset       405 s2 freq   +2257 delay   3829
>     phc2sys[192.784]: enp4s0 sys offset        17 s2 freq   +1991 delay   3273
>     phc2sys[192.884]: enp4s0 sys offset       152 s2 freq   +2131 delay   3948
>     phc2sys[192.984]: enp4s0 sys offset      -187 s2 freq   +1837 delay   3162
>     phc2sys[193.084]: enp4s0 sys offset     -1595 s2 freq    +373 delay   4557
>     phc2sys[193.184]: enp4s0 sys offset       107 s2 freq   +1597 delay   3740
>     phc2sys[193.284]: enp4s0 sys offset       199 s2 freq   +1721 delay   4010
>     phc2sys[193.385]: enp4s0 sys offset      -169 s2 freq   +1413 delay   3701
>     phc2sys[193.485]: enp4s0 sys offset       -47 s2 freq   +1484 delay   3581
>     phc2sys[193.585]: enp4s0 sys offset       -65 s2 freq   +1452 delay   3778
>     phc2sys[193.685]: enp4s0 sys offset        95 s2 freq   +1592 delay   3888
>     phc2sys[193.785]: enp4s0 sys offset       206 s2 freq   +1732 delay   4445
>     phc2sys[193.885]: enp4s0 sys offset      -652 s2 freq    +936 delay   2521
>     phc2sys[193.985]: enp4s0 sys offset      -203 s2 freq   +1189 delay   3391
>     phc2sys[194.085]: enp4s0 sys offset      -376 s2 freq    +955 delay   2951
>     phc2sys[194.185]: enp4s0 sys offset      -134 s2 freq   +1084 delay   3330
>     phc2sys[194.285]: enp4s0 sys offset       -22 s2 freq   +1156 delay   3479
>     phc2sys[194.386]: enp4s0 sys offset        32 s2 freq   +1204 delay   3602
>     phc2sys[194.486]: enp4s0 sys offset       122 s2 freq   +1303 delay   3731
> 
>     Statistics for this run (total of 2179 lines), in nanoseconds:
>       average: -1.12
>       stdev: 634.80
>       max: 1551
>       min: -2215
> 
>     With .getcrosststamp() via PCIe PTM:
> 
>     phc2sys[367.859]: enp4s0 sys offset         6 s2 freq   +1727 delay      0
>     phc2sys[367.959]: enp4s0 sys offset        -2 s2 freq   +1721 delay      0
>     phc2sys[368.059]: enp4s0 sys offset         5 s2 freq   +1727 delay      0
>     phc2sys[368.160]: enp4s0 sys offset        -1 s2 freq   +1723 delay      0
>     phc2sys[368.260]: enp4s0 sys offset        -4 s2 freq   +1719 delay      0
>     phc2sys[368.360]: enp4s0 sys offset        -5 s2 freq   +1717 delay      0
>     phc2sys[368.460]: enp4s0 sys offset         1 s2 freq   +1722 delay      0
>     phc2sys[368.560]: enp4s0 sys offset        -3 s2 freq   +1718 delay      0
>     phc2sys[368.660]: enp4s0 sys offset         5 s2 freq   +1725 delay      0
>     phc2sys[368.760]: enp4s0 sys offset        -1 s2 freq   +1721 delay      0
>     phc2sys[368.860]: enp4s0 sys offset         0 s2 freq   +1721 delay      0
>     phc2sys[368.960]: enp4s0 sys offset         0 s2 freq   +1721 delay      0
>     phc2sys[369.061]: enp4s0 sys offset         4 s2 freq   +1725 delay      0
>     phc2sys[369.161]: enp4s0 sys offset         1 s2 freq   +1724 delay      0
>     phc2sys[369.261]: enp4s0 sys offset         4 s2 freq   +1727 delay      0
>     phc2sys[369.361]: enp4s0 sys offset         8 s2 freq   +1732 delay      0
>     phc2sys[369.461]: enp4s0 sys offset         7 s2 freq   +1733 delay      0
>     phc2sys[369.561]: enp4s0 sys offset         4 s2 freq   +1733 delay      0
>     phc2sys[369.661]: enp4s0 sys offset         1 s2 freq   +1731 delay      0
>     phc2sys[369.761]: enp4s0 sys offset         1 s2 freq   +1731 delay      0
>     phc2sys[369.861]: enp4s0 sys offset        -5 s2 freq   +1725 delay      0
>     phc2sys[369.961]: enp4s0 sys offset        -4 s2 freq   +1725 delay      0
>     phc2sys[370.062]: enp4s0 sys offset         2 s2 freq   +1730 delay      0
>     phc2sys[370.162]: enp4s0 sys offset        -7 s2 freq   +1721 delay      0
>     phc2sys[370.262]: enp4s0 sys offset        -3 s2 freq   +1723 delay      0
>     phc2sys[370.362]: enp4s0 sys offset         1 s2 freq   +1726 delay      0
>     phc2sys[370.462]: enp4s0 sys offset        -3 s2 freq   +1723 delay      0
>     phc2sys[370.562]: enp4s0 sys offset        -1 s2 freq   +1724 delay      0
>     phc2sys[370.662]: enp4s0 sys offset        -4 s2 freq   +1720 delay      0
>     phc2sys[370.762]: enp4s0 sys offset        -7 s2 freq   +1716 delay      0
>     phc2sys[370.862]: enp4s0 sys offset        -2 s2 freq   +1719 delay      0
> 
>     Statistics for this run (total of 2179 lines), in nanoseconds:
>       average: 0.14
>       stdev: 5.03
>       max: 48
>       min: -27
> 
>     For reference, the statistics for runs without PCIe congestion show
>     that the improvements from enabling PTM are less dramatic. For two
>     runs of 16466 entries:
>       without PTM: avg -0.04 stdev 10.57 max 39 min -42
>       with PTM: avg 0.01 stdev 4.20 max 19 min -16
> 
>     One possible explanation is that when PTM is not enabled, and there's a lot
>     of traffic in the PCIe fabric, some register reads will take more time
>     than the others because of congestion on the PCIe fabric.
> 
>     When PTM is enabled, even if the PTM dialogs take more time to
>     complete under heavy traffic, the time measurements do not depend on
>     the time to read the registers.
> 
>     This was implemented following the i225 EAS version 0.993.
> 
>     Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
>     Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
>     Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
> 
>  drivers/net/ethernet/intel/igc/igc.h         |   1 +
>  drivers/net/ethernet/intel/igc/igc_defines.h |  31 +++++
>  drivers/net/ethernet/intel/igc/igc_ptp.c     | 179 +++++++++++++++++++++++++++
>  drivers/net/ethernet/intel/igc/igc_regs.h    |  23 ++++
>  4 files changed, 234 insertions(+)
> 
> 
> On Wed, 2021-11-24 at 08:33 +0100, Greg KH wrote:
> > On Wed, Nov 24, 2021 at 08:28:39AM +0100, Stefan Dietrich wrote:  
> > > Summary: When attempting to raise or shut down a NIC manually or via
> > > network-manager under 5.15, the machine reboots or freezes.
> > >
> > > Occurs with: 5.15.4-051504-generic and earlier 5.15 mainline (
> > > https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.4/) as well as
> > > liquorix flavours.
> > > Does not occur with: 5.14 and 5.13 (both with various flavours)  
> >
> > Can you use 'git bisect' between 5.14 and 5.15 to find the problem
> > commit?
> >
> > thanks,
> >
> > greg k-h  
> 
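
The per-run statistics in the quoted commit message (average, stdev, max, min of the phc2sys offsets) can be recomputed from a log with a small script. What follows is only a minimal sketch: the line format is assumed from the samples quoted above, the function name offset_stats is made up for illustration, and the commit does not say whether its stdev is the population or the sample form, so the population form used here is a guess.

```python
import re
import statistics

# Matches the "sys offset <n> s<state> freq <n> delay <n>" part of a phc2sys
# log line. The format is assumed from the samples quoted in the commit message.
LINE_RE = re.compile(
    r"sys offset\s+(-?\d+)\s+s\d+\s+freq\s+([+-]?\d+)\s+delay\s+(\d+)"
)


def offset_stats(lines):
    """Return summary statistics (in ns) for the offsets found in `lines`.

    Lines that do not look like phc2sys offset reports are skipped.
    """
    offsets = [int(m.group(1)) for m in map(LINE_RE.search, lines) if m]
    if not offsets:
        raise ValueError("no phc2sys offset samples found")
    return {
        "count": len(offsets),
        "average": statistics.fmean(offsets),
        # Assumption: population standard deviation; the commit does not
        # state which form was used for its "stdev" figure.
        "stdev": statistics.pstdev(offsets),
        "max": max(offsets),
        "min": min(offsets),
    }


if __name__ == "__main__":
    # Three lines taken from the "without .getcrosststamp()" sample above.
    sample = [
        "phc2sys[191.382]: enp4s0 sys offset      -959 s2 freq    -454 delay   4492",
        "phc2sys[191.482]: enp4s0 sys offset       798 s2 freq   +1015 delay   4069",
        "phc2sys[191.583]: enp4s0 sys offset       962 s2 freq   +1418 delay   3849",
    ]
    print(offset_stats(sample))
```

Feeding it the full log of a run, with wrapped lines rejoined first, should give figures comparable to the ones in the commit message, modulo the stdev convention.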


Thread overview: 26+ messages
2021-11-24  7:28 Stefan Dietrich
2021-11-24  7:33 ` Greg KH
2021-11-24  7:42   ` Stefan Dietrich
2021-11-24 17:20   ` Stefan Dietrich
2021-11-24 23:34     ` Jakub Kicinski [this message]
2021-11-25  1:07       ` Vinicius Costa Gomes
2021-11-25  1:13         ` Jakub Kicinski
2021-11-25  8:41         ` Stefan Dietrich
2021-12-01 11:45           ` Thorsten Leemhuis
2021-12-01 17:47             ` Vinicius Costa Gomes
2021-12-01 18:57               ` [PATCH] igc: Avoid possible deadlock during suspend/resume Vinicius Costa Gomes
2021-12-02  6:41                 ` Greg KH
2021-12-02  6:50                   ` Vinicius Costa Gomes
2021-12-02  8:34                 ` Stefan Dietrich
2021-12-02 22:34                   ` Vinicius Costa Gomes
2021-12-10  9:40                     ` Thorsten Leemhuis
2021-12-10 13:45                       ` Stefan Dietrich
2021-12-10 14:01                         ` Thorsten Leemhuis
2021-12-10 14:51                           ` Stefan Dietrich
2021-12-11  0:41                             ` Vinicius Costa Gomes
2021-12-11  9:50                               ` Stefan Dietrich
2021-12-13 18:32                                 ` Vinicius Costa Gomes
2021-12-14  6:39                                   ` Stefan Dietrich
2021-11-24  7:48 ` [REGRESSION] Kernel 5.15 reboots / freezes upon ifup/ifdown Thorsten Leemhuis
2021-11-25 11:15   ` Thorsten Leemhuis
2021-11-24  8:05 ` Stefan Dietrich
