stable.vger.kernel.org archive mirror
From: Sasha Levin <sashal@kernel.org>
To: Jack Wang <xjtuwjp@gmail.com>
Cc: Timo Rothenpieler <timo@rothenpieler.org>,
	gregkh@linuxfoundation.org, stable@vger.kernel.org,
	Eran Ben Elisha <eranbe@nvidia.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Jason Gunthorpe <jgg@ziepe.ca>
Subject: Re: Backport missing mlx5 fixes after 50b2412b7e7
Date: Sun, 22 Nov 2020 12:48:45 -0500
Message-ID: <20201122174845.GK643756@sasha-vm>
In-Reply-To: <CAD+HZHWy1dba7z0UcX3cofSgzQvFUcfRms+zC+RvJoqh3p5MoQ@mail.gmail.com>

On Fri, Nov 20, 2020 at 07:18:04AM +0100, Jack Wang wrote:
>On Wed, Nov 18, 2020 at 7:28 PM, Timo Rothenpieler <timo@rothenpieler.org> wrote:
>>
>> Hi,
>>
>> After 50b2412b7e7862c5af0cbf4b10d93bc5c712d021 was backported to stable
>> branches (I only tested 5.4), some serious issues started to arise.
>>
>> According to linux-rdma, the following two patches that need to go along
>> with 50b2412b7e are missing:
>>
>> > 1. 1d5558b1f0de net/mlx5: poll cmd EQ in case of command timeout
>> > 2. 410bd754cd73 net/mlx5: Add retry mechanism to the command entry ...
>>
>> I managed to apply those mostly cleanly after also applying two
>> dependencies.
>> So the complete list of needed commits for 5.4 is:
>>
>> 1. 3ed879965cc4 net/mlx5: Use async EQ setup cleanup helpers ...
>> 2. 1d5558b1f0de net/mlx5: poll cmd EQ in case of command timeout
>> 3. d43b7007dbd1 net/mlx5: Fix a race when moving command ...
>> 4. 410bd754cd73 net/mlx5: Add retry mechanism to the command entry ...
>>
>> With those 4 commits applied, the issue is fixed.
>> For reference, that's the output I get with 5.4.77:
>>
>> > Nov 17 01:12:58 store01 kernel: mlx5_ib: Mellanox Connect-IB Infiniband driver v5.0-0
>> > Nov 17 01:12:58 store01 kernel: mlx5_core 0000:01:00.0: cmd_work_handler:887:(pid 383): failed to allocate command entry
>> > Nov 17 01:12:58 store01 kernel: infiniband mlx5_0: reg_mr_callback:104:(pid 383): async reg mr failed. status -11
>> > Nov 17 01:12:58 store01 kernel: mlx5_core 0000:01:00.0: cmd_work_handler:887:(pid 383): failed to allocate command entry
>> > Nov 17 01:12:58 store01 kernel: mlx5_core 0000:01:00.0: mlx5e_create_mdev_resources:104:(pid 1): alloc td failed, -11
>> > Nov 17 01:12:58 store01 kernel: mlx5_0, 1: ipoib_intf_alloc failed -11
>>
>+cc Greg & Sasha
>Hi,
>
>We hit the same problem with mlx5. I've tested the four commits mentioned
>above and they work fine; please include them in a future 5.4 kernel.

Looks like Greg picked them up, thanks!

-- 
Thanks,
Sasha
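
For anyone carrying these fixes in their own 5.4-based tree, the four-commit
list quoted above can be turned into an ordered set of cherry-pick commands.
The following is only a sketch: it assumes a checkout of a 5.4.y branch in
which the mainline commit objects are already available, and the names
BACKPORT_ORDER and cherry_pick_commands are hypothetical, not part of any
kernel tooling. The subjects ending in "..." are truncated in the original
report and are left that way. The report says the patches applied "mostly
cleanly", so some manual conflict resolution may still be needed.

```python
# Commits in the order listed in the quoted report (abbreviated SHAs as given).
# Subjects ending in "..." are truncated in the source and kept as-is.
BACKPORT_ORDER = [
    ("3ed879965cc4", "net/mlx5: Use async EQ setup cleanup helpers ..."),
    ("1d5558b1f0de", "net/mlx5: poll cmd EQ in case of command timeout"),
    ("d43b7007dbd1", "net/mlx5: Fix a race when moving command ..."),
    ("410bd754cd73", "net/mlx5: Add retry mechanism to the command entry ..."),
]


def cherry_pick_commands(commits):
    """Return one `git cherry-pick -x` argument list per commit, in order.

    The -x flag records the original commit ID in the new commit message.
    Nothing is executed here; the caller decides how to run the commands.
    """
    return [["git", "cherry-pick", "-x", sha] for sha, _subject in commits]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for cmd in cherry_pick_commands(BACKPORT_ORDER):
        print(" ".join(cmd))
```

Printing the commands rather than running them keeps the sketch free of side
effects; each one can be reviewed and applied manually, stopping at the first
conflict.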

Thread overview: 4+ messages
2020-11-18 18:28 Backport missing mlx5 fixes after 50b2412b7e7 Timo Rothenpieler
2020-11-20  6:18 ` Jack Wang
2020-11-22 17:48   ` Sasha Levin [this message]
2020-11-20  8:37 ` Greg KH
