linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Bjorn Helgaas <helgaas@kernel.org>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	Keith Busch <kbusch@kernel.org>,
	Leon Romanovsky <leon@kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Jakub Kicinski <kuba@kernel.org>,
	linux-pci <linux-pci@vger.kernel.org>,
	linux-rdma@vger.kernel.org, Netdev <netdev@vger.kernel.org>,
	Don Dutile <ddutile@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	"David S . Miller" <davem@davemloft.net>
Subject: Re: [PATCH mlx5-next v7 0/4] Dynamically assign MSI-X vectors count
Date: Wed, 31 Mar 2021 17:03:45 +0200	[thread overview]
Message-ID: <YGSPUewD5J+F7ZRe@kroah.com> (raw)
In-Reply-To: <20210331121929.GX2710221@ziepe.ca>

On Wed, Mar 31, 2021 at 09:19:29AM -0300, Jason Gunthorpe wrote:
> On Wed, Mar 31, 2021 at 08:38:39AM +0200, Greg Kroah-Hartman wrote:
> > On Tue, Mar 30, 2021 at 07:43:41PM -0300, Jason Gunthorpe wrote:
> > > > With 0000:01:00.0/sriov/BB:DD.F/vf_msix_count, sriov/ will contain
> > > > 1 file and 1K subdirectories.
> > > 
> > > The smallest directory sizes is with the current patch since it
> > > re-uses the existing VF directory. Do we care about directory size at
> > > the sysfs level?
> > 
> > No, that should not matter.
> > 
> > The "issue" here is that you "broke" the device chain here by adding a
> > random kobject to the directory tree: "BB:DD.F"
> > 
> > Again, devices are allowed to have attributes associated with it to be
> > _ONE_ subdirectory level deep.
> > 
> > So, to use your path above, this is allowed:
> > 	0000:01:00.0/sriov/vf_msix_count
> > 
> > as these are sriov attributes for the 0000:01:00.0 device, but this is
> > not:
> > 	0000:01:00.0/sriov/BB:DD.F/vf_msix_count
> > as you "threw" a random kobject called BB:DD.F into the middle.
> >
> > If you want to have "BB:DD.F" in there, then it needs to be a real
> > struct device and _THEN_ it needs to point its parent to "0000:01:00.0",
> > another struct device, as "sriov" is NOT ANYTHING in the heirachy here
> > at all.
> 
> It isn't a struct device object at all though, it just organizing
> attributes.

That's the point, it really is not.  You are forced to create a real
object for that subdirectory, and by doing so, you are "breaking" the
driver/device model.  As is evident by userspace not knowing what is
going on here.

> > Does that help?  The rules are:
> > 	- Once you use a 'struct device', all subdirs below that device
> > 	  are either an attribute group for that device or a child
> > 	  device.
> > 	- A struct device can NOT have an attribute group as a parent,
> > 	  it can ONLY have another struct device as a parent.
> > 
> > If you break those rules, the kernel has the ability to get really
> > confused unless you are very careful, and userspace will be totally lost
> > as you can not do anything special there.
> 
> The kernel gets confused?

Putting a kobject as a child of a struct device can easily cause
confusion as that is NOT what you should be doing.  Especially if you
then try to add a device to be a child of that kobject.  Now the kernel
core can not walk all devices in the correct order and lots of other
problems that you are lucky you do not hit.

> I'm not sure I understand why userspace gets confused. I can guess
> udev has some issue, but everything else seems OK, it is just a path.

No, it is not a "path".

Again, here are the driver/device model rules:
	- once you have a "struct device", only "struct device" can be
	  children.
	- You are allowed to place attributes in subdirectories if you
	  want to make things cleaner.  Don't make me doubt giving that
	  type of permission to people by trying to abuse it by going
	  "lower" than one level.
	- If you have to represent something dynamic below a struct
	  device that is not an attribute, make it a real struct device.

And userspace "gets confused" because it thinks it can properly walk the
tree and get a record of all devices in the system.  When you put a
kobject in there just to make a new subdirectory, you break the
notification model and everything else.

Again, do not do that.

> > > > I'm dense and don't fully understand Greg's subdirectory comment.
> > > 
> > > I also don't know udev well enough. I've certainly seen drivers
> > > creating extra subdirectories using kobjects.
> > 
> > And those drivers are broken.  Please point them out to me and I will be
> > glad to go fix them.  Or tell their authors why they are broken :)
> 
> How do you fix them? It is uAPI at this point so we can't change the
> directory names. Can't make them struct devices (userspace would get
> confused if we add *more* sysfs files)

How would userspace get confused?  If anything it would suddenly "wake
up" and see these attributes properly.

> Grep for kobject_init_and_add() under drivers/ and I think you get a
> pretty good overview of the places.

Yes, lots of places where people are abusing things and it should not be
done.  Do not add to the mess by knowingly adding broken code please.

> Since it seems like kind of a big problem can we make this allowed
> somehow?

No, not at all.  Please do not do that.  I will look into the existing
users and try to see if I can fix them up.  Maybe start annoying people
by throwing warnings if you try to register a kobject as a child of a
device...

> > > > But it doesn't seem like that level of control would be in a udev rule
> > > > anyway.  A PF udev rule might *start* a program to manage MSI-X
> > > > vectors, but such a program should be able to deal with whatever
> > > > directory structure we want.
> > >
> > > Yes, I can't really see this being used from udev either. 
> > 
> > It doesn't matter if you think it could be used, it _will_ be used as
> > you are exposing this stuff to userspace.
> 
> Well, from what I understand, it wont be used because udev can't do
> three level deep attributes, and if that hasn't been a problem in that
> last 10 years for the existing places, it might not ever be needed in
> udev at all.

If userspace is not seeing these attributes then WHY CREATE THEM AT
ALL???

Seriously, what is needing to see these in sysfs if not the tools that
we have today to use sysfs?  Are you wanting to create new tools instead
to handle these new attributes?  Maybe just do not create them in the
first place?

> > > I assume there is also the usual race about triggering the uevent
> > > before the subdirectories are created, but we have the
> > > dev_set_uevent_suppress() thing now for that..
> > 
> > Unless you are "pci bus code" you shouldn't be using that :)
> 
> There are over 40 users now.

Let's not add more if you do not have to.  PCI has not needed it so far,
no need to rush to add it here if at all possible please.

thanks,

greg k-h

  reply	other threads:[~2021-03-31 15:04 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-01  7:55 [PATCH mlx5-next v7 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-03-01  7:55 ` [PATCH mlx5-next v7 1/4] PCI: Add a sysfs file to change the MSI-X table size of SR-IOV VFs Leon Romanovsky
2021-03-01  8:14   ` Greg Kroah-Hartman
2021-03-01  8:32     ` Leon Romanovsky
2021-03-01  8:37       ` Greg Kroah-Hartman
2021-03-01  8:53         ` Leon Romanovsky
2021-03-01  7:55 ` [PATCH mlx5-next v7 2/4] net/mlx5: Add dynamic MSI-X capabilities bits Leon Romanovsky
2021-03-01  7:55 ` [PATCH mlx5-next v7 3/4] net/mlx5: Dynamically assign MSI-X vectors count Leon Romanovsky
2021-03-01  7:55 ` [PATCH mlx5-next v7 4/4] net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks Leon Romanovsky
2021-03-07  8:11 ` [PATCH mlx5-next v7 0/4] Dynamically assign MSI-X vectors count Leon Romanovsky
2021-03-07 18:55 ` Alexander Duyck
2021-03-07 19:19   ` Leon Romanovsky
2021-03-08 16:33     ` Alexander Duyck
2021-03-08 19:20       ` Leon Romanovsky
2021-03-10 19:09   ` Bjorn Helgaas
2021-03-10 20:10     ` Leon Romanovsky
2021-03-10 20:21       ` Greg Kroah-Hartman
2021-03-11  8:37         ` Leon Romanovsky
2021-03-10 23:34     ` Alexander Duyck
2021-03-11 18:17       ` Bjorn Helgaas
2021-03-11 19:16         ` Keith Busch
2021-03-11 19:21           ` Leon Romanovsky
2021-03-11 20:22           ` Jason Gunthorpe
2021-03-11 20:50             ` Keith Busch
2021-03-11 21:44               ` Jason Gunthorpe
2021-03-25 17:21                 ` Bjorn Helgaas
2021-03-25 17:36                   ` Jason Gunthorpe
2021-03-25 18:20                     ` Bjorn Helgaas
2021-03-25 18:28                       ` Jason Gunthorpe
2021-03-26  6:44                         ` Leon Romanovsky
2021-03-26 16:00                           ` Alexander Duyck
2021-03-26 16:56                             ` Jason Gunthorpe
2021-03-26 17:08                             ` Bjorn Helgaas
2021-03-26 17:12                               ` Jason Gunthorpe
2021-03-27  6:00                                 ` Leon Romanovsky
2021-03-26 17:29                               ` Keith Busch
2021-03-26 17:31                                 ` Jason Gunthorpe
2021-03-26 18:50                               ` Alexander Duyck
2021-03-26 19:01                                 ` Jason Gunthorpe
2021-03-30  1:29                                   ` Bjorn Helgaas
2021-03-30 13:57                                     ` Jason Gunthorpe
2021-03-30 15:00                                       ` Bjorn Helgaas
2021-03-30 19:47                                         ` Jason Gunthorpe
2021-03-30 20:41                                           ` Bjorn Helgaas
2021-03-30 22:43                                             ` Jason Gunthorpe
2021-03-31  6:38                                               ` Greg Kroah-Hartman
2021-03-31 12:19                                                 ` Jason Gunthorpe
2021-03-31 15:03                                                   ` Greg Kroah-Hartman [this message]
2021-03-31 17:07                                                     ` Jason Gunthorpe
2021-03-31  4:08                                             ` Leon Romanovsky
2021-04-01  1:23                                               ` Bjorn Helgaas
2021-04-01 11:49                                                 ` Leon Romanovsky
2021-03-30 18:10                                     ` Keith Busch
2021-03-26 19:36                                 ` Bjorn Helgaas
2021-03-27 12:38                                   ` Greg Kroah-Hartman
2021-03-25 18:31                     ` Keith Busch
2021-03-25 18:36                       ` Jason Gunthorpe
2021-03-11 19:17         ` Leon Romanovsky
2021-03-11 19:37         ` Alexander Duyck
2021-03-11 19:51           ` Leon Romanovsky
2021-03-11 20:11             ` Alexander Duyck
2021-03-11 20:19           ` Jason Gunthorpe
2021-03-11 21:49             ` Alexander Duyck
2021-03-11 23:20               ` Jason Gunthorpe
2021-03-12  2:53                 ` Alexander Duyck
2021-03-12  6:32                   ` Leon Romanovsky
2021-03-12 16:59                     ` Alexander Duyck
2021-03-12 17:03                       ` Jason Gunthorpe
2021-03-12 18:34                         ` Leon Romanovsky
2021-03-12 18:41                       ` Leon Romanovsky
2021-03-12 13:00                   ` Jason Gunthorpe
2021-03-12 13:36                     ` Keith Busch
2021-03-11 20:31         ` Jason Gunthorpe
2021-03-10  5:58 ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YGSPUewD5J+F7ZRe@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.duyck@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=davem@davemloft.net \
    --cc=ddutile@redhat.com \
    --cc=helgaas@kernel.org \
    --cc=jgg@ziepe.ca \
    --cc=kbusch@kernel.org \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).