From: <Patrick.Mclean@sony.com>
To: <greg@kroah.com>
Cc: <stable@vger.kernel.org>, <regressions@lists.linux.dev>,
<ayal@nvidia.com>, <saeedm@nvidia.com>, <netdev@vger.kernel.org>,
<leonro@nvidia.com>, <Aaron.U'ren@sony.com>,
<Russell.Brown@sony.com>, <Victor.Payno@sony.com>
Subject: Re: mlx5_core 5.10 stable series regression starting at 5.10.65
Date: Tue, 21 Sep 2021 22:22:57 +0000
Message-ID: <BY5PR13MB3604527F4A98D0F86B02AC98EEA19@BY5PR13MB3604.namprd13.prod.outlook.com>
In-Reply-To: <YUl8PKVz/em51KHR@kroah.com>
> On Mon, Sep 20, 2021 at 08:22:44PM +0000, Patrick.Mclean@sony.com wrote:
> > In 5.10 stable kernels since 5.10.65, certain mlx5 cards are no longer usable (relevant dmesg logs and lspci output are pasted below).
> >
> > Bisecting tracks the problem down to this commit:
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=fe6322774ca28669868a7e231e173e09f7422118
> >
> > Here is how lspci -nn identifies the cards:
> > 41:00.0 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
> > 41:00.1 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
> >
> > Here are the relevant dmesg logs:
> > [ 13.409473] mlx5_core 0000:41:00.0: firmware version: 16.31.1014
> > [ 13.415944] mlx5_core 0000:41:00.0: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)
> > [ 13.707425] mlx5_core 0000:41:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
> > [ 13.718221] mlx5_core 0000:41:00.0: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
> > [ 13.740607] mlx5_core 0000:41:00.0: Port module event: module 0, Cable plugged
> > [ 13.759857] mlx5_core 0000:41:00.0: mlx5_pcie_event:294:(pid 586): PCIe slot advertised sufficient power (75W).
> > [ 17.986973] mlx5_core 0000:41:00.0: E-Switch: cleanup
> > [ 18.686204] mlx5_core 0000:41:00.0: init_one:1371:(pid 803): mlx5_load_one failed with error code -22
> > [ 18.701352] mlx5_core: probe of 0000:41:00.0 failed with error -22
> > [ 18.727364] mlx5_core 0000:41:00.1: firmware version: 16.31.1014
> > [ 18.743853] mlx5_core 0000:41:00.1: 126.016 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x16 link)
> > [ 19.015349] mlx5_core 0000:41:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 24414Mbps
> > [ 19.025157] mlx5_core 0000:41:00.1: E-Switch: Total vports 2, per vport: max uc(128) max mc(2048)
> > [ 19.053569] mlx5_core 0000:41:00.1: Port module event: module 1, Cable unplugged
> > [ 19.062093] mlx5_core 0000:41:00.1: mlx5_pcie_event:294:(pid 591): PCIe slot advertised sufficient power (75W).
> > [ 22.826932] mlx5_core 0000:41:00.1: E-Switch: cleanup
> > [ 23.544747] mlx5_core 0000:41:00.1: init_one:1371:(pid 803): mlx5_load_one failed with error code -22
> > [ 23.555071] mlx5_core: probe of 0000:41:00.1 failed with error -22
> >
> > Please let me know if I can provide any further information.
>
> If you revert that single change, do things work properly?
Yes, things work properly after reverting that single change (tested with 5.10.67).
> Do newer kernels (5.14, 5.15-rc2) work properly for you as well?
We tested 5.14.6, and it works as expected.
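
When comparing kernels (5.10.65 and later, 5.10.67 with the revert, 5.14.6), it can help to check the boot log mechanically for the failure signature quoted above. The following is a minimal sketch, not part of the original report: the function name failed_probes, the regular expression, and the SAMPLE text are illustrative assumptions modelled on the pasted dmesg lines. It only recognizes the "probe of ... failed with error" line format shown in this thread.

```python
import errno
import re

# Assumed line format, taken from the dmesg excerpt in this thread, e.g.:
#   [ 18.701352] mlx5_core: probe of 0000:41:00.0 failed with error -22
PROBE_FAIL = re.compile(
    r"mlx5_core: probe of (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"failed with error (?P<err>-?\d+)"
)


def failed_probes(dmesg_text):
    """Return (pci_address, error_code, errno_name) for each failed mlx5_core probe."""
    results = []
    for line in dmesg_text.splitlines():
        match = PROBE_FAIL.search(line)
        if match:
            code = int(match.group("err"))
            # Kernel error codes are negated errno values; look up the symbolic name.
            name = errno.errorcode.get(abs(code), "unknown")
            results.append((match.group("pci"), code, name))
    return results


# Hypothetical sample input, copied from the log lines quoted in this message.
SAMPLE = """\
[ 18.686204] mlx5_core 0000:41:00.0: init_one:1371:(pid 803): mlx5_load_one failed with error code -22
[ 18.701352] mlx5_core: probe of 0000:41:00.0 failed with error -22
[ 23.555071] mlx5_core: probe of 0000:41:00.1 failed with error -22
"""

if __name__ == "__main__":
    for pci, code, name in failed_probes(SAMPLE):
        print(f"{pci}: error {code} ({name})")
```

An empty result on a given kernel would suggest that both ports probed successfully there; the real input would be the saved output of dmesg from that boot.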
Thread overview: 5+ messages
2021-09-20 20:22 mlx5_core 5.10 stable series regression starting at 5.10.65 Patrick.Mclean
2021-09-21 6:31 ` Greg KH
2021-09-21 22:22 ` Patrick.Mclean [this message]
2021-09-22 6:21 ` Leon Romanovsky
2021-09-23 11:04 ` Greg KH