regressions.lists.linux.dev archive mirror
From: "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info>
To: Leon Romanovsky <leon@kernel.org>, Paul Moore <paul@paul-moore.com>
Cc: Linux regressions mailing list <regressions@lists.linux.dev>,
	Saeed Mahameed <saeed@kernel.org>, Shay Drory <shayd@nvidia.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	netdev@vger.kernel.org, selinux@vger.kernel.org
Subject: Re: Potential regression/bug in net/mlx5 driver
Date: Thu, 13 Apr 2023 15:49:30 +0200
Message-ID: <87264550-91eb-2d41-e3f3-c3a51425d7a4@leemhuis.info>
In-Reply-To: <20230410054605.GL182481@unreal>



On 10.04.23 07:46, Leon Romanovsky wrote:
> On Sun, Apr 09, 2023 at 07:50:34PM -0400, Paul Moore wrote:
>> On Sun, Apr 9, 2023 at 4:48 AM Linux regression tracking (Thorsten
>> Leemhuis) <regressions@leemhuis.info> wrote:
>>> On 30.03.23 03:27, Paul Moore wrote:
>>>> On Wed, Mar 29, 2023 at 6:20 PM Saeed Mahameed <saeed@kernel.org> wrote:
>>>>> On 28 Mar 19:08, Paul Moore wrote:
>>>>>>
>>>>>> Starting with the v6.3-rcX kernel releases I noticed that my
>>>>>> InfiniBand devices were no longer present under /sys/class/infiniband,
>>>>>> causing some of my automated testing to fail.  It took me a while to
>>>>>> find the time to bisect the issue, but I eventually identified the
>>>>>> problematic commit:
>>>>>>
>>>>>>  commit fe998a3c77b9f989a30a2a01fb00d3729a6d53a4
>>>>>>  Author: Shay Drory <shayd@nvidia.com>
>>>>>>  Date:   Wed Jun 29 11:38:21 2022 +0300
>>>>>>
>>>>>>   net/mlx5: Enable management PF initialization
>>>>>>
>>>>>>   Enable initialization of DPU Management PF, which is a new loopback PF
>>>>>>   designed for communication with BMC.
>>>>>>   For now Management PF doesn't support nor require most upper layer
>>>>>>   protocols so avoid them.
>>>>>>
>>>>>>   Signed-off-by: Shay Drory <shayd@nvidia.com>
>>>>>>   Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
>>>>>>   Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
>>>>>>   Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
>>>>>>
>>>>>> I'm not a mlx5 driver expert so I can't really offer much in the way
>>>>>> of a fix, but as a quick test I did remove the
>>>>>> 'mlx5_core_is_management_pf(...)' calls in mlx5/core/dev.c and
>>>>>> everything seemed to work okay on my test system (or rather the tests
>>>>>> ran without problem).
>>>>>>
>>>>>> If you need any additional information, or would like me to test a
>>>>>> patch, please let me know.
>>>>>
>>>>> Our team is looking into this, the current theory is that you have an old
>>>>> FW that doesn't have the correct capabilities set.
>>>>
>>>> That's very possible; I installed this card many years ago and haven't
>>>> updated the FW once.
>>>>
>>>>  I'm happy to update the FW (do you have a
>>>> pointer/how-to?), but it might be good to identify a fix first as I'm
>>>> guessing there will be others like me ...
>>>
>>> Nothing happened here for about ten days afaics (or was there progress
>>> and I just missed it?). That made me wonder: how sound is Paul's guess
>>> that there will be others that might run into this? If that's likely it
>>> afaics would be good to get this regression fixed before the release,
>>> which is just two or three weeks away.
>>>
>>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>>> --
>>> Everything you wanna know about Linux kernel regression tracking:
>>> https://linux-regtracking.leemhuis.info/about/#tldr
>>> If I did something stupid, please tell me, as explained on that page.
>>>
>>> #regzbot poke
>>
>> I haven't seen any updates from the mlx5 driver folks, although I may
>> not have been CC'd?
> 
> We are extremely slow these days due to a combination of holidays
> (Easter, Passover, Ramadan, spring break, etc.).

That's how it is sometimes, no worries. But well, rc7 is only three
days away and 6.3 thus might be out in 10 days already. Hence allow me
to ask: is it possible to fix this by reverting the culprit now (and
reapplying it later in fixed form)? If that's an option I'd say "go for
it", to ensure the revert makes it into rc7 and thus is tested for at
least one week before the final (or two, if Linus decides to do an rc8).

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

Thread overview: 24+ messages
2023-03-28 23:08 Potential regression/bug in net/mlx5 driver Paul Moore
2023-03-29 22:20 ` Saeed Mahameed
2023-03-30  1:27   ` Paul Moore
2023-04-09  8:48     ` Linux regression tracking (Thorsten Leemhuis)
2023-04-09 23:50       ` Paul Moore
2023-04-10  5:46         ` Leon Romanovsky
2023-04-13 13:49           ` Linux regression tracking (Thorsten Leemhuis) [this message]
2023-04-13 14:54           ` Jakub Kicinski
2023-04-13 15:19             ` Paul Moore
2023-04-13 21:12               ` Saeed Mahameed
2023-04-13 22:21                 ` Jakub Kicinski
2023-04-13 22:34                   ` Saeed Mahameed
2023-04-13 22:51                     ` Jakub Kicinski
2023-04-14  3:03                       ` Saeed Mahameed
2023-04-14  3:26                         ` Jakub Kicinski
2023-04-14 14:37                           ` Paul Moore
2023-04-14 22:20                           ` Saeed Mahameed
2023-04-15  0:34                             ` Jakub Kicinski
2023-04-15  4:40                               ` Saeed Mahameed
2023-04-17 15:38                                 ` Jakub Kicinski
2023-04-20  0:43                                   ` Saeed Mahameed
2023-04-20  0:46                                     ` Jakub Kicinski
2023-04-20  4:02                                       ` Saeed Mahameed
2023-03-31 13:10 ` Linux regression tracking #adding (Thorsten Leemhuis)
