linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Edwin Peer <edwin.peer@broadcom.com>
Cc: Ido Schimmel <idosch@idosch.org>,
	"David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Ido Schimmel <idosch@mellanox.com>,
	Jiri Pirko <jiri@mellanox.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	netdev <netdev@vger.kernel.org>,
	syzbot+93d5accfaefceedf43c1@syzkaller.appspotmail.com,
	Michael Chan <michael.chan@broadcom.com>
Subject: Re: [PATCH net-next] netdevsim: Register and unregister devlink traps on probe/remove device
Date: Wed, 27 Oct 2021 12:43:50 +0300	[thread overview]
Message-ID: <YXkfVuthd6G75L/S@unreal> (raw)
In-Reply-To: <CAKOOJTyrUUydu9aNJSB4S_5dfqjkc6Y-14up4-V+aNcQ7TWVdQ@mail.gmail.com>

On Wed, Oct 27, 2021 at 01:46:44AM -0700, Edwin Peer wrote:
> On Tue, Oct 26, 2021 at 11:43 PM Leon Romanovsky <leon@kernel.org> wrote:
> 
> > In our case, the eth driver is part of mlx5_core module, so at the
> > device creation phase that module is already loaded and driver/core
> > will try to autoprobe it.
> 
> > However, the last step is not always performed and controlled by the
> > userspace. Users can disable driver autoprobe and bind manually. This
> > is pretty standard practice in the SR-IOV or VFIO modes.
> 
> While you say the netdev will not necessarily be bound, that still
> sounds like the netdev will indeed be presented to user space before
> devlink_register() when it is auto-probed?

And it is ok, there is no binding between netdev and devlink. They are
independent. For example, IB cards have devlink through mlx5_core, but
without netdev at all.

> 
> > This is why devlink has monitor mode where you can see devlink device
> > addition and removal. It is user space job to check that device is
> > ready.
> 
> Isn't it more a question of what existing user space _does_ rather
> than what user space _should_ do?

No, because races of kernel vs. user space always existed and every time,
the answer was the same - it is user space job to avoid race. Kernel
needs to protect itself and we successfully did it here.

Latest example of such programming model can be seen in VFIO too:
https://lore.kernel.org/kvm/20211026090605.91646-1-yishaih@nvidia.com/T/#m89ef326da9405dcefe482bd5a6a54aff7e0b90bc
"Yes. If the userspace races ioctls they get a deserved mess.

This race exists no matter what we do, as soon as the unlock happens a
racing reset ioctl could run in during the system call exit path.

The purpose of the locking is to protect the kernel from hostile
userspace, not to allow userspace to execute concurrent ioctl's in a
sensible way."

> 
> > > This isn't about kernel API. This is precisely about existing user
> > > space that expects devlink to work immediately after the netdev
> > > appears.
> >
> > Can you please share open source project that has such assumption?
> 
> I'm no python expert, but it looks like
> https://github.com/openstack-charmers/mlnx-switchdev-mode/ might.

No, it is not.

> We've certainly had implicit user space assumptions trip over
> registration order before, hence the change we made in January last
> year to move devlink registration earlier. Granted, upon deeper
> analysis, that specific case pertained to phys port name via sysfs,
> which technically only needs port attrs via ndo_get_devlink_port, not
> devlink_register(). That said, I'm certainly not confident that there
> are no other existing users that might expect to be able to invoke
> devlink in ifup scripts.

It is not what your change did, you moved devlink ops registration logic
to be before register_netdev(), this is continue to be in upstream code.
We are registering ops in devlink_alloc() and that function stays to be
at the same place.

> 
> > > What do you suggest instead?
> >
> > Fix your test respect devlink notifications and don't ignore them.
> 
> That's not very helpful. The test case does what the user in the field
> did to break it. We can't assume users have always been using our APIs
> the way we intended.

There is no API here.

Thanks

> 
> Regards,
> Edwin Peer

      reply	other threads:[~2021-10-27  9:43 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-24  8:42 [PATCH net-next] netdevsim: Register and unregister devlink traps on probe/remove device Leon Romanovsky
2021-10-24  9:05 ` Ido Schimmel
2021-10-24  9:54   ` Leon Romanovsky
2021-10-24 10:48     ` Ido Schimmel
2021-10-25  5:34       ` Leon Romanovsky
2021-10-25  8:08         ` Ido Schimmel
2021-10-25 10:56           ` Leon Romanovsky
2021-10-26  6:51             ` Ido Schimmel
2021-10-26  7:18               ` Leon Romanovsky
2021-10-26 14:09                 ` Ido Schimmel
2021-10-26 16:14                   ` Leon Romanovsky
2021-10-26 19:02                     ` Jakub Kicinski
2021-10-26 19:30                       ` Leon Romanovsky
2021-10-26 19:56                         ` Jakub Kicinski
2021-10-27  5:56                           ` Leon Romanovsky
2021-10-27 14:17                             ` Jakub Kicinski
2021-10-27 15:17                               ` Leon Romanovsky
2021-10-27 19:15                                 ` Leon Romanovsky
2021-10-27 19:28                                   ` Jakub Kicinski
2021-10-25 18:24           ` Jakub Kicinski
2021-10-25 19:12             ` Leon Romanovsky
2021-10-25 23:19   ` Edwin Peer
2021-10-26  5:56     ` Leon Romanovsky
2021-10-26 17:34       ` Edwin Peer
2021-10-26 19:22         ` Leon Romanovsky
2021-10-26 20:03           ` Edwin Peer
2021-10-27  6:43             ` Leon Romanovsky
2021-10-27  8:46               ` Edwin Peer
2021-10-27  9:43                 ` Leon Romanovsky [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YXkfVuthd6G75L/S@unreal \
    --to=leon@kernel.org \
    --cc=davem@davemloft.net \
    --cc=edwin.peer@broadcom.com \
    --cc=idosch@idosch.org \
    --cc=idosch@mellanox.com \
    --cc=jiri@mellanox.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael.chan@broadcom.com \
    --cc=netdev@vger.kernel.org \
    --cc=syzbot+93d5accfaefceedf43c1@syzkaller.appspotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).