netdev.vger.kernel.org archive mirror
From: Cosmin Ratiu <cratiu@nvidia.com>
To: "taoliu828@163.com" <taoliu828@163.com>
Cc: Roi Dayan <roid@nvidia.com>, Paul Blakey <paulb@nvidia.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Vlad Buslov <vladbu@nvidia.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Dima Chumak <dchumak@nvidia.com>
Subject: Re: Report mlx5_core crash
Date: Tue, 27 Feb 2024 17:39:14 +0000
Message-ID: <6f659353e4fe3070aaf4cb3cab6dd388414744b9.camel@nvidia.com>
In-Reply-To: <ZcRGl758ek_at4Ha@liutao02-mac.local>

On Thu, 2024-02-08 at 11:12 +0800, Tao Liu wrote:
> Hi Cosmin,
> 
> Thanks for your reply.
> 
> It's hard to reproduce the crash directly. In our case, the rule forwards IP
> broadcast traffic to 5 VXLAN remotes, and the driver creates 6 mlx5_flow_rule,
> which include 5 mlx5_pkt_reformat and 1 counter.
> The crash triggers only when two *dr_action pointers in struct
> mlx5_pkt_reformat have the same lower 32 bits, which is determined by memory
> allocation.
> 
> Is it possible to do some fault injection in a unit test to reproduce it?

In the end, no complicated fault injection was needed. I just had to
pay proper attention to your awesome initial analysis and I've managed
to understand the problems.

I've also prepared fixes for both of them. The patches are under review
in our internal tree and should hopefully be on their way upstream
soon.

But from the stack traces you reported, I noticed you are running with
OFED. I will talk to my colleagues and let you know as soon as a new
build with the fixes included can be used to test.

Cosmin.
