KVM Archive on lore.kernel.org
 help / color / Atom feed
* mdevctl: A shoestring mediated device management and persistence utility
@ 2019-05-23 23:20 Alex Williamson
  2019-05-24 10:11 ` Cornelia Huck
                   ` (3 more replies)
  0 siblings, 4 replies; 46+ messages in thread
From: Alex Williamson @ 2019-05-23 23:20 UTC (permalink / raw)
  To: kvm
  Cc: Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza, Cornelia Huck

Hi,

Currently mediated device management, much like SR-IOV VF management,
is largely left as an exercise for the user.  This is an attempt to
provide something and see where it goes.  I doubt we'll solve
everyone's needs on the first pass, but maybe we'll solve enough and
provide helpers for the rest.  Without further ado, I'll point to what
I have so far:

https://github.com/awilliam/mdevctl

This is inspired by driverctl, which is also a bash utility.  mdevctl
uses udev and systemd to record and recreate mdev devices for
persistence and provides a command line utility for querying, listing,
starting, stopping, adding, and removing mdev devices.  Currently, for
better or worse, it considers anything created to be persistent.  I can
imagine a global configuration option that might disable this and
perhaps an autostart flag per mdev device, such that mdevctl might
simply "know" about some mdevs but not attempt to create them
automatically.  Clearly command line usage help, man pages, and
packaging are lacking as well, release early, release often, plus this
is a discussion starter to see if perhaps this is sufficient to meet
some needs.

Originally I thought about making a utility to manage both mdev and
SR-IOV VFs all in one, but it seemed more natural to start here
(besides, I couldn't think of a good name for the combined utility).
If this seems useful, maybe I'll start on a vfctl for SR-IOV and we'll
see whether they have enough synergy to become one.

It would be really useful if s390 folks could help me understand
whether it's possible to glean all the information necessary to
recreate a ccw or ap mdev device from sysfs.  I expect the file where
we currently only store the mdev_type to evolve into something that
includes more information to facilitate more complicated devices.  For
now I make no claims to maintaining compatibility of recorded mdev
devices, it will absolutely change, but I didn't want to get bogged
down in making sure I don't accidentally source a root kit hidden in an
mdev config file.

I'm also curious how or if libvirt or openstack might use this.  If
nothing else, it makes libvirt hook scripts easier to write, especially
if we add an option not to autostart mdevs, or if users don't mind
persistent mdevs, maybe there's nothing more to do.

BTW, feel free to clean up by bash, I'm a brute force and ignorance
shell coder ;)  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-05-23 23:20 mdevctl: A shoestring mediated device management and persistence utility Alex Williamson
@ 2019-05-24 10:11 ` Cornelia Huck
  2019-05-24 14:43   ` Alex Williamson
  2019-06-07 16:06   ` Halil Pasic
  2019-06-13 16:17 ` Christophe de Dinechin
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 46+ messages in thread
From: Cornelia Huck @ 2019-05-24 10:11 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza

On Thu, 23 May 2019 17:20:01 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> Hi,
> 
> Currently mediated device management, much like SR-IOV VF management,
> is largely left as an exercise for the user.  This is an attempt to
> provide something and see where it goes.  I doubt we'll solve
> everyone's needs on the first pass, but maybe we'll solve enough and
> provide helpers for the rest.  Without further ado, I'll point to what
> I have so far:
> 
> https://github.com/awilliam/mdevctl
> 
> This is inspired by driverctl, which is also a bash utility.  mdevctl
> uses udev and systemd to record and recreate mdev devices for
> persistence and provides a command line utility for querying, listing,
> starting, stopping, adding, and removing mdev devices.  Currently, for
> better or worse, it considers anything created to be persistent.  I can
> imagine a global configuration option that might disable this and
> perhaps an autostart flag per mdev device, such that mdevctl might
> simply "know" about some mdevs but not attempt to create them
> automatically.  Clearly command line usage help, man pages, and
> packaging are lacking as well, release early, release often, plus this
> is a discussion starter to see if perhaps this is sufficient to meet
> some needs.
> 
> Originally I thought about making a utility to manage both mdev and
> SR-IOV VFs all in one, but it seemed more natural to start here
> (besides, I couldn't think of a good name for the combined utility).
> If this seems useful, maybe I'll start on a vfctl for SR-IOV and we'll
> see whether they have enough synergy to become one.
> 
> It would be really useful if s390 folks could help me understand
> whether it's possible to glean all the information necessary to
> recreate a ccw or ap mdev device from sysfs.  I expect the file where
> we currently only store the mdev_type to evolve into something that
> includes more information to facilitate more complicated devices.  For
> now I make no claims to maintaining compatibility of recorded mdev
> devices, it will absolutely change, but I didn't want to get bogged
> down in making sure I don't accidentally source a root kit hidden in an
> mdev config file.

I played a bit with it on my LPAR, and it is at least not obviously
broken with vfio-ccw :) I don't have any ap devices to play with,
though.

> 
> I'm also curious how or if libvirt or openstack might use this.  If
> nothing else, it makes libvirt hook scripts easier to write, especially
> if we add an option not to autostart mdevs, or if users don't mind
> persistent mdevs, maybe there's nothing more to do.
> 
> BTW, feel free to clean up by bash, I'm a brute force and ignorance
> shell coder ;)  

Not that I'm a good shell coder, but I sent you a pull req at least
adding a basic help text ;)

I have not yet looked at most of the code, though.

> Thanks,
> 
> Alex


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-05-24 10:11 ` Cornelia Huck
@ 2019-05-24 14:43   ` Alex Williamson
  2019-06-07 16:06   ` Halil Pasic
  1 sibling, 0 replies; 46+ messages in thread
From: Alex Williamson @ 2019-05-24 14:43 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza

On Fri, 24 May 2019 12:11:06 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Thu, 23 May 2019 17:20:01 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > Hi,
> > 
> > Currently mediated device management, much like SR-IOV VF management,
> > is largely left as an exercise for the user.  This is an attempt to
> > provide something and see where it goes.  I doubt we'll solve
> > everyone's needs on the first pass, but maybe we'll solve enough and
> > provide helpers for the rest.  Without further ado, I'll point to what
> > I have so far:
> > 
> > https://github.com/awilliam/mdevctl
> > 
> > This is inspired by driverctl, which is also a bash utility.  mdevctl
> > uses udev and systemd to record and recreate mdev devices for
> > persistence and provides a command line utility for querying, listing,
> > starting, stopping, adding, and removing mdev devices.  Currently, for
> > better or worse, it considers anything created to be persistent.  I can
> > imagine a global configuration option that might disable this and
> > perhaps an autostart flag per mdev device, such that mdevctl might
> > simply "know" about some mdevs but not attempt to create them
> > automatically.  Clearly command line usage help, man pages, and
> > packaging are lacking as well, release early, release often, plus this
> > is a discussion starter to see if perhaps this is sufficient to meet
> > some needs.
> > 
> > Originally I thought about making a utility to manage both mdev and
> > SR-IOV VFs all in one, but it seemed more natural to start here
> > (besides, I couldn't think of a good name for the combined utility).
> > If this seems useful, maybe I'll start on a vfctl for SR-IOV and we'll
> > see whether they have enough synergy to become one.
> > 
> > It would be really useful if s390 folks could help me understand
> > whether it's possible to glean all the information necessary to
> > recreate a ccw or ap mdev device from sysfs.  I expect the file where
> > we currently only store the mdev_type to evolve into something that
> > includes more information to facilitate more complicated devices.  For
> > now I make no claims to maintaining compatibility of recorded mdev
> > devices, it will absolutely change, but I didn't want to get bogged
> > down in making sure I don't accidentally source a root kit hidden in an
> > mdev config file.  
> 
> I played a bit with it on my LPAR, and it is at least not obviously
> broken with vfio-ccw :) I don't have any ap devices to play with,
> though.

Awesome, that's a good start.  Thanks for testing!


> > I'm also curious how or if libvirt or openstack might use this.  If
> > nothing else, it makes libvirt hook scripts easier to write, especially
> > if we add an option not to autostart mdevs, or if users don't mind
> > persistent mdevs, maybe there's nothing more to do.
> > 
> > BTW, feel free to clean up by bash, I'm a brute force and ignorance
> > shell coder ;)    
> 
> Not that I'm a good shell coder, but I sent you a pull req at least
> adding a basic help text ;)
> 
> I have not yet looked at most of the code, though.

Pulled, thanks again!  I'll continue to try to fill in some of the
gaps.  I also forgot to mention that the integration with systemd means
that mdevs that mdevctl knows about can be started and stopped with
systemctl, ie:

# systemctl {start|stop} mdev@$UUID.service

You'll see that sbin/mdevctl start/stop-mdev is effectively just an
alias to this.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-05-24 10:11 ` Cornelia Huck
  2019-05-24 14:43   ` Alex Williamson
@ 2019-06-07 16:06   ` Halil Pasic
  2019-06-11 19:45     ` Cornelia Huck
  1 sibling, 1 reply; 46+ messages in thread
From: Halil Pasic @ 2019-06-07 16:06 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Alex Williamson, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza

On Fri, 24 May 2019 12:11:06 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Thu, 23 May 2019 17:20:01 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > Hi,
> > 

[..]

> > 
> > It would be really useful if s390 folks could help me understand
> > whether it's possible to glean all the information necessary to
> > recreate a ccw or ap mdev device from sysfs.  I expect the file where
> > we currently only store the mdev_type to evolve into something that
> > includes more information to facilitate more complicated devices.  For
> > now I make no claims to maintaining compatibility of recorded mdev
> > devices, it will absolutely change, but I didn't want to get bogged
> > down in making sure I don't accidentally source a root kit hidden in an
> > mdev config file.
> 
> I played a bit with it on my LPAR, and it is at least not obviously
> broken with vfio-ccw :) I don't have any ap devices to play with,
> though.
> 

Sorry for being late...

I guess for vfio-ccw one needs to make sure that the ccw device is bound
to the vfio-ccw driver first, and only after that can one use  
create-mdev to create the mdev on top of the subchannel.

So to make this work persistently (survive a reboot) one would need to
take care of the subchannel getting bound to the right vfio_ccw driver
before mdevctl is called. Right?

BTW how does this concurrence situation between the drivers io_subchannel
and vfio_ccw work? Especially if both are build in?

> > 
> > I'm also curious how or if libvirt or openstack might use this.  If
> > nothing else, it makes libvirt hook scripts easier to write, especially
> > if we add an option not to autostart mdevs, or if users don't mind
> > persistent mdevs, maybe there's nothing more to do.
> > 

+1

@Alex: I'm curious what is the big management picture for non-auto looks
like.

Regards,
Halil

[..]


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-07 16:06   ` Halil Pasic
@ 2019-06-11 19:45     ` Cornelia Huck
  2019-06-11 20:28       ` Alex Williamson
  2019-06-12 15:07       ` Halil Pasic
  0 siblings, 2 replies; 46+ messages in thread
From: Cornelia Huck @ 2019-06-11 19:45 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Alex Williamson, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza

On Fri, 7 Jun 2019 18:06:30 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Fri, 24 May 2019 12:11:06 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Thu, 23 May 2019 17:20:01 -0600
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> >   
> > > Hi,
> > >   
> 
> [..]
> 
> > > 
> > > It would be really useful if s390 folks could help me understand
> > > whether it's possible to glean all the information necessary to
> > > recreate a ccw or ap mdev device from sysfs.  I expect the file where
> > > we currently only store the mdev_type to evolve into something that
> > > includes more information to facilitate more complicated devices.  For
> > > now I make no claims to maintaining compatibility of recorded mdev
> > > devices, it will absolutely change, but I didn't want to get bogged
> > > down in making sure I don't accidentally source a root kit hidden in an
> > > mdev config file.  
> > 
> > I played a bit with it on my LPAR, and it is at least not obviously
> > broken with vfio-ccw :) I don't have any ap devices to play with,
> > though.
> >   
> 
> Sorry for being late...
> 
> I guess for vfio-ccw one needs to make sure that the ccw device is bound
> to the vfio-ccw driver first, and only after that can one use  
> create-mdev to create the mdev on top of the subchannel.
> 
> So to make this work persistently (survive a reboot) one would need to
> take care of the subchannel getting bound to the right vfio_ccw driver
> before mdevctl is called. Right?
> 
> BTW how does this concurrence situation between the drivers io_subchannel
> and vfio_ccw work? Especially if both are build in?

If you have two drivers that match to the same device type, you'll
always have the issue that the driver that is first matched with the
device will bind to it and you have to do the unbind/rebind dance to
get it bound to the correct device driver. (I guess that this was the
basic motivation behind the ap bus default driver infrastructure,
right?) I think that in our case the io_subchannel driver will be
called first (alphabetical order and the fact that vfio-ccw will often
be a module). I'm not sure if it is within the scope of mdevctl to
ensure that the device is bound to the correct driver, or if it rather
should work with devices already bound to the correct driver only.
Maybe a separate udev-rules generator?

There's also the question where that automatic configuration should
stop. Should cio_ignore handling be part of it as well? [That's a
non-generic interface, of course. Tooling within s390-tools, maybe?]

> > > 
> > > I'm also curious how or if libvirt or openstack might use this.
> > > If nothing else, it makes libvirt hook scripts easier to write,
> > > especially if we add an option not to autostart mdevs, or if
> > > users don't mind persistent mdevs, maybe there's nothing more to
> > > do. 
> 
> +1
> 
> @Alex: I'm curious what is the big management picture for non-auto
> looks like.
> 
> Regards,
> Halil
> 
> [..]
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-11 19:45     ` Cornelia Huck
@ 2019-06-11 20:28       ` Alex Williamson
  2019-06-12  7:14         ` Cornelia Huck
  2019-06-12 15:07       ` Halil Pasic
  1 sibling, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-11 20:28 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Halil Pasic, kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety,
	Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza

On Tue, 11 Jun 2019 21:45:08 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri, 7 Jun 2019 18:06:30 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Fri, 24 May 2019 12:11:06 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> >   
> > > On Thu, 23 May 2019 17:20:01 -0600
> > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > >     
> > > > Hi,
> > > >     
> > 
> > [..]
> >   
> > > > 
> > > > It would be really useful if s390 folks could help me understand
> > > > whether it's possible to glean all the information necessary to
> > > > recreate a ccw or ap mdev device from sysfs.  I expect the file where
> > > > we currently only store the mdev_type to evolve into something that
> > > > includes more information to facilitate more complicated devices.  For
> > > > now I make no claims to maintaining compatibility of recorded mdev
> > > > devices, it will absolutely change, but I didn't want to get bogged
> > > > down in making sure I don't accidentally source a root kit hidden in an
> > > > mdev config file.    
> > > 
> > > I played a bit with it on my LPAR, and it is at least not obviously
> > > broken with vfio-ccw :) I don't have any ap devices to play with,
> > > though.
> > >     
> > 
> > Sorry for being late...
> > 
> > I guess for vfio-ccw one needs to make sure that the ccw device is bound
> > to the vfio-ccw driver first, and only after that can one use  
> > create-mdev to create the mdev on top of the subchannel.
> > 
> > So to make this work persistently (survive a reboot) one would need to
> > take care of the subchannel getting bound to the right vfio_ccw driver
> > before mdevctl is called. Right?
> > 
> > BTW how does this concurrence situation between the drivers io_subchannel
> > and vfio_ccw work? Especially if both are build in?  
> 
> If you have two drivers that match to the same device type, you'll
> always have the issue that the driver that is first matched with the
> device will bind to it and you have to do the unbind/rebind dance to
> get it bound to the correct device driver. (I guess that this was the
> basic motivation behind the ap bus default driver infrastructure,
> right?) I think that in our case the io_subchannel driver will be
> called first (alphabetical order and the fact that vfio-ccw will often
> be a module). I'm not sure if it is within the scope of mdevctl to
> ensure that the device is bound to the correct driver, or if it rather
> should work with devices already bound to the correct driver only.
> Maybe a separate udev-rules generator?

Getting a device bound to a specific driver is exactly the domain of
driverctl.  Implement the sysfs interfaces driverctl uses and see if it
works.  Driverctl defaults to PCI and knows some extra things about
PCI, but appears to be written to be generally bus agnostic.  Thanks,

Alex

> There's also the question where that automatic configuration should
> stop. Should cio_ignore handling be part of it as well? [That's a
> non-generic interface, of course. Tooling within s390-tools, maybe?]
> 
> > > > 
> > > > I'm also curious how or if libvirt or openstack might use this.
> > > > If nothing else, it makes libvirt hook scripts easier to write,
> > > > especially if we add an option not to autostart mdevs, or if
> > > > users don't mind persistent mdevs, maybe there's nothing more to
> > > > do.   
> > 
> > +1
> > 
> > @Alex: I'm curious what is the big management picture for non-auto
> > looks like.
> > 
> > Regards,
> > Halil
> > 
> > [..]
> >   
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-11 20:28       ` Alex Williamson
@ 2019-06-12  7:14         ` Cornelia Huck
  2019-06-12 15:54           ` Halil Pasic
  0 siblings, 1 reply; 46+ messages in thread
From: Cornelia Huck @ 2019-06-12  7:14 UTC (permalink / raw)
  To: Alex Williamson, Halil Pasic
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza

On Tue, 11 Jun 2019 14:28:22 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Tue, 11 Jun 2019 21:45:08 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri, 7 Jun 2019 18:06:30 +0200
> > Halil Pasic <pasic@linux.ibm.com> wrote:

> > > I guess for vfio-ccw one needs to make sure that the ccw device is bound
> > > to the vfio-ccw driver first, and only after that can one use  
> > > create-mdev to create the mdev on top of the subchannel.
> > > 
> > > So to make this work persistently (survive a reboot) one would need to
> > > take care of the subchannel getting bound to the right vfio_ccw driver
> > > before mdevctl is called. Right?
> > > 
> > > BTW how does this concurrence situation between the drivers io_subchannel
> > > and vfio_ccw work? Especially if both are build in?    
> > 
> > If you have two drivers that match to the same device type, you'll
> > always have the issue that the driver that is first matched with the
> > device will bind to it and you have to do the unbind/rebind dance to
> > get it bound to the correct device driver. (I guess that this was the
> > basic motivation behind the ap bus default driver infrastructure,
> > right?) I think that in our case the io_subchannel driver will be
> > called first (alphabetical order and the fact that vfio-ccw will often
> > be a module). I'm not sure if it is within the scope of mdevctl to
> > ensure that the device is bound to the correct driver, or if it rather
> > should work with devices already bound to the correct driver only.
> > Maybe a separate udev-rules generator?  
> 
> Getting a device bound to a specific driver is exactly the domain of
> driverctl.  Implement the sysfs interfaces driverctl uses and see if it
> works.  Driverctl defaults to PCI and knows some extra things about
> PCI, but appears to be written to be generally bus agnostic.  Thanks,
> 
> Alex

Ok, looked at driverctl. Extending this one for non-PCI seems like a
reasonable path. However, we would also need to extend any non-PCI
device type we want to support with a driver_override attribute like
you did for PCI in 782a985d7af26db39e86070d28f987cad2 -- so this is
only for newer kernels. Adding that attribute for subchannels looks
feasible at a glance, but I have not tried to actually do it :)

Halil, do you think that would make sense?

[This might also help with the lcs vs. ctc confusion on a certain 3088
cu model if this is added for ccw devices as well; but I'm not sure if
these are still out in the wild at all. Probably not worth the effort
for that.]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-11 19:45     ` Cornelia Huck
  2019-06-11 20:28       ` Alex Williamson
@ 2019-06-12 15:07       ` Halil Pasic
  1 sibling, 0 replies; 46+ messages in thread
From: Halil Pasic @ 2019-06-12 15:07 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Alex Williamson, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza

On Tue, 11 Jun 2019 21:45:08 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri, 7 Jun 2019 18:06:30 +0200
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > On Fri, 24 May 2019 12:11:06 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Thu, 23 May 2019 17:20:01 -0600
> > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > >   
> > > > Hi,
> > > >   
> > 
> > [..]
> > 
> > > > 
> > > > It would be really useful if s390 folks could help me understand
> > > > whether it's possible to glean all the information necessary to
> > > > recreate a ccw or ap mdev device from sysfs.  I expect the file where
> > > > we currently only store the mdev_type to evolve into something that
> > > > includes more information to facilitate more complicated devices.  For
> > > > now I make no claims to maintaining compatibility of recorded mdev
> > > > devices, it will absolutely change, but I didn't want to get bogged
> > > > down in making sure I don't accidentally source a root kit hidden in an
> > > > mdev config file.  
> > > 
> > > I played a bit with it on my LPAR, and it is at least not obviously
> > > broken with vfio-ccw :) I don't have any ap devices to play with,
> > > though.
> > >   
> > 
> > Sorry for being late...
> > 
> > I guess for vfio-ccw one needs to make sure that the ccw device is bound
> > to the vfio-ccw driver first, and only after that can one use  
> > create-mdev to create the mdev on top of the subchannel.
> > 
> > So to make this work persistently (survive a reboot) one would need to
> > take care of the subchannel getting bound to the right vfio_ccw driver
> > before mdevctl is called. Right?
> > 
> > BTW how does this concurrence situation between the drivers io_subchannel
> > and vfio_ccw work? Especially if both are build in?
> 
> If you have two drivers that match to the same device type, you'll
> always have the issue that the driver that is first matched with the
> device will bind to it and you have to do the unbind/rebind dance to
> get it bound to the correct device driver. (I guess that this was the
> basic motivation behind the ap bus default driver infrastructure,
> right?) 

Yes and no. The main idea behind the ap bus default driver infrastructure
is to make devices 'return' to the right driver (family) even if the
devices linux representation (kobj) gets destroyed and reconstructed
(e.g. due to loss of connectivity).

> I think that in our case the io_subchannel driver will be
> called first (alphabetical order and the fact that vfio-ccw will often
> be a module).

Did some more digging vfio_ccw seems to be device_initcall, while
io_subchannel subsys_initcall. I.e. we are safe there.

> I'm not sure if it is within the scope of mdevctl to
> ensure that the device is bound to the correct driver, or if it rather
> should work with devices already bound to the correct driver only.
> Maybe a separate udev-rules generator?
>

My point is, for persistent mdevctl should do it's magic after this
mechanism kicked in. Otherwise mdevctl will fail to accomplish its
purpose.

I wonder if that is the case with driverctl. From what I say both have
"""
DefaultDependencies=no
Before=basic.target
"""
in the @.service file...

I'm just trying to figure out how the end-to end solution looks like.

> There's also the question where that automatic configuration should
> stop. Should cio_ignore handling be part of it as well? [That's a
> non-generic interface, of course. Tooling within s390-tools, maybe?]
>

Isn't cio_ignore a /proc interface? I don't think mdevctl should be
concerned with cio_ignore.

Regards,
Halil


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-12  7:14         ` Cornelia Huck
@ 2019-06-12 15:54           ` Halil Pasic
  2019-06-13 10:02             ` Cornelia Huck
  0 siblings, 1 reply; 46+ messages in thread
From: Halil Pasic @ 2019-06-12 15:54 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Alex Williamson, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza

On Wed, 12 Jun 2019 09:14:39 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Tue, 11 Jun 2019 14:28:22 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Tue, 11 Jun 2019 21:45:08 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Fri, 7 Jun 2019 18:06:30 +0200
> > > Halil Pasic <pasic@linux.ibm.com> wrote:
> 
> > > > I guess for vfio-ccw one needs to make sure that the ccw device is bound
> > > > to the vfio-ccw driver first, and only after that can one use  
> > > > create-mdev to create the mdev on top of the subchannel.
> > > > 
> > > > So to make this work persistently (survive a reboot) one would need to
> > > > take care of the subchannel getting bound to the right vfio_ccw driver
> > > > before mdevctl is called. Right?
> > > > 
> > > > BTW how does this concurrence situation between the drivers io_subchannel
> > > > and vfio_ccw work? Especially if both are build in?    
> > > 
> > > If you have two drivers that match to the same device type, you'll
> > > always have the issue that the driver that is first matched with the
> > > device will bind to it and you have to do the unbind/rebind dance to
> > > get it bound to the correct device driver. (I guess that this was the
> > > basic motivation behind the ap bus default driver infrastructure,
> > > right?) I think that in our case the io_subchannel driver will be
> > > called first (alphabetical order and the fact that vfio-ccw will often
> > > be a module). I'm not sure if it is within the scope of mdevctl to
> > > ensure that the device is bound to the correct driver, or if it rather
> > > should work with devices already bound to the correct driver only.
> > > Maybe a separate udev-rules generator?  
> > 
> > Getting a device bound to a specific driver is exactly the domain of
> > driverctl.  Implement the sysfs interfaces driverctl uses and see if it
> > works.  Driverctl defaults to PCI and knows some extra things about
> > PCI, but appears to be written to be generally bus agnostic.  Thanks,
> > 
> > Alex
> 

@Alex: Thanks! I was not aware of driverctl.

> Ok, looked at driverctl. Extending this one for non-PCI seems like a
> reasonable path. However, we would also need to extend any non-PCI
> device type we want to support with a driver_override attribute like
> you did for PCI in 782a985d7af26db39e86070d28f987cad2 -- so this is
> only for newer kernels. Adding that attribute for subchannels looks
> feasible at a glance, but I have not tried to actually do it :)
> 
> Halil, do you think that would make sense?

Looks doable. Did not quite figure out the details yet, but it seems
that for PCI driver_override has more benefits than for cio (compared
to simple unbind/bind), as matching and probing seems to be more
elaborate for PCI. The benefit I see are
1) the ability to exclude the device form driver binding, and
2) having the same mechanism and thus consistent experience for pci and
cio.

What we IMHO should not do is make driver_override the override the
sch->st == id->type check.

Regards,
Halil

> 
> [This might also help with the lcs vs. ctc confusion on a certain 3088
> cu model if this is added for ccw devices as well; but I'm not sure if
> these are still out in the wild at all. Probably not worth the effort
> for that.]


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-12 15:54           ` Halil Pasic
@ 2019-06-13 10:02             ` Cornelia Huck
  0 siblings, 0 replies; 46+ messages in thread
From: Cornelia Huck @ 2019-06-13 10:02 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Alex Williamson, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza

On Wed, 12 Jun 2019 17:54:34 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Wed, 12 Jun 2019 09:14:39 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:

> > Ok, looked at driverctl. Extending this one for non-PCI seems like a
> > reasonable path. However, we would also need to extend any non-PCI
> > device type we want to support with a driver_override attribute like
> > you did for PCI in 782a985d7af26db39e86070d28f987cad2 -- so this is
> > only for newer kernels. Adding that attribute for subchannels looks
> > feasible at a glance, but I have not tried to actually do it :)
> > 
> > Halil, do you think that would make sense?  
> 
> Looks doable. Did not quite figure out the details yet, but it seems
> that for PCI driver_override has more benefits than for cio (compared
> to simple unbind/bind), as matching and probing seems to be more
> elaborate for PCI. The benefit I see are
> 1) the ability to exclude the device form driver binding, and
> 2) having the same mechanism and thus consistent experience for pci and
> cio.

Yes, we should provide the same mechanism, even if it is much simpler
for the css bus.

> 
> What we IMHO should not do is make driver_override the override the
> sch->st == id->type check.

Agreed. The number of possible ids is much lower on the css bus, and a
driver wanting to match to any device may simply specify all of them
(not that this looks very useful).

I'm currently playing with this change; will send out a patch when I
have it in reasonable shape.

> 
> Regards,
> Halil
> 
> > 
> > [This might also help with the lcs vs. ctc confusion on a certain 3088
> > cu model if this is added for ccw devices as well; but I'm not sure if
> > these are still out in the wild at all. Probably not worth the effort
> > for that.]  
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-05-23 23:20 mdevctl: A shoestring mediated device management and persistence utility Alex Williamson
  2019-05-24 10:11 ` Cornelia Huck
@ 2019-06-13 16:17 ` Christophe de Dinechin
  2019-06-13 16:35   ` Alex Williamson
  2019-06-17 14:00 ` Daniel P. Berrangé
  2019-06-25 22:52 ` Alex Williamson
  3 siblings, 1 reply; 46+ messages in thread
From: Christophe de Dinechin @ 2019-06-13 16:17 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	\"Daniel P. Berrangé\,
	Sylvain Bauza, Cornelia Huck



> On 24 May 2019, at 01:20, Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> Hi,
> 
> Currently mediated device management, much like SR-IOV VF management,
> is largely left as an exercise for the user.  This is an attempt to
> provide something and see where it goes.  I doubt we'll solve
> everyone's needs on the first pass, but maybe we'll solve enough and
> provide helpers for the rest.  Without further ado, I'll point to what
> I have so far:
> 
> https://github.com/awilliam/mdevctl

While it’s still early, what about :

	mdevctl create-mdev <parent-device> <mdev-type> [<mdev-uuid>]

where if the mdev-uuid is missing, you just run uuidgen within the script?

I sent a small PR in case you think it makes sense.


Thanks,
Christophe

> 
> This is inspired by driverctl, which is also a bash utility.  mdevctl
> uses udev and systemd to record and recreate mdev devices for
> persistence and provides a command line utility for querying, listing,
> starting, stopping, adding, and removing mdev devices.  Currently, for
> better or worse, it considers anything created to be persistent.  I can
> imagine a global configuration option that might disable this and
> perhaps an autostart flag per mdev device, such that mdevctl might
> simply "know" about some mdevs but not attempt to create them
> automatically.  Clearly command line usage help, man pages, and
> packaging are lacking as well, release early, release often, plus this
> is a discussion starter to see if perhaps this is sufficient to meet
> some needs.
> 
> Originally I thought about making a utility to manage both mdev and
> SR-IOV VFs all in one, but it seemed more natural to start here
> (besides, I couldn't think of a good name for the combined utility).
> If this seems useful, maybe I'll start on a vfctl for SR-IOV and we'll
> see whether they have enough synergy to become one.
> 
> It would be really useful if s390 folks could help me understand
> whether it's possible to glean all the information necessary to
> recreate a ccw or ap mdev device from sysfs.  I expect the file where
> we currently only store the mdev_type to evolve into something that
> includes more information to facilitate more complicated devices.  For
> now I make no claims to maintaining compatibility of recorded mdev
> devices, it will absolutely change, but I didn't want to get bogged
> down in making sure I don't accidentally source a root kit hidden in an
> mdev config file.
> 
> I'm also curious how or if libvirt or openstack might use this.  If
> nothing else, it makes libvirt hook scripts easier to write, especially
> if we add an option not to autostart mdevs, or if users don't mind
> persistent mdevs, maybe there's nothing more to do.
> 
> BTW, feel free to clean up by bash, I'm a brute force and ignorance
> shell coder ;)  Thanks,
> 
> Alex


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-13 16:17 ` Christophe de Dinechin
@ 2019-06-13 16:35   ` Alex Williamson
  2019-06-14  9:54     ` [libvirt] " Christophe de Dinechin
  0 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-13 16:35 UTC (permalink / raw)
  To: Christophe de Dinechin
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza, Cornelia Huck

On Thu, 13 Jun 2019 18:17:53 +0200
Christophe de Dinechin <cdupontd@redhat.com> wrote:

> > On 24 May 2019, at 01:20, Alex Williamson <alex.williamson@redhat.com> wrote:
> > 
> > Hi,
> > 
> > Currently mediated device management, much like SR-IOV VF management,
> > is largely left as an exercise for the user.  This is an attempt to
> > provide something and see where it goes.  I doubt we'll solve
> > everyone's needs on the first pass, but maybe we'll solve enough and
> > provide helpers for the rest.  Without further ado, I'll point to what
> > I have so far:
> > 
> > https://github.com/awilliam/mdevctl  
> 
> While it’s still early, what about :
> 
> 	mdevctl create-mdev <parent-device> <mdev-type> [<mdev-uuid>]
> 
> where if the mdev-uuid is missing, you just run uuidgen within the script?
> 
> I sent a small PR in case you think it makes sense.

It sounds racy.  If the user doesn't provide the UUID then they need to
guess that an mdev device with the same parent and type is theirs.  How
do you resolve two instances of this happening in parallel and both
coming to the same conclusion which is their device.  If a user wants
this sort of headache they can call mdevctl with `uuidgen` but I don't
think we should encourage it further.

BTW, I've moved the project to https://github.com/mdevctl/mdevctl, the
latest commit in the tree above makes that change, I've also updated
the description on my repo to point to the new location.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [libvirt] mdevctl: A shoestring mediated device management and persistence utility
  2019-06-13 16:35   ` Alex Williamson
@ 2019-06-14  9:54     ` " Christophe de Dinechin
  2019-06-14 14:23       ` Alex Williamson
  0 siblings, 1 reply; 46+ messages in thread
From: Christophe de Dinechin @ 2019-06-14  9:54 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Sylvain Bauza, Pavel Hrdina, kvm, Skultety, Libvirt Devel,
	Cornelia Huck, Kirti Wankhede, Erik Skultety



> On 13 Jun 2019, at 18:35, Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> On Thu, 13 Jun 2019 18:17:53 +0200
> Christophe de Dinechin <cdupontd@redhat.com> wrote:
> 
>>> On 24 May 2019, at 01:20, Alex Williamson <alex.williamson@redhat.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Currently mediated device management, much like SR-IOV VF management,
>>> is largely left as an exercise for the user.  This is an attempt to
>>> provide something and see where it goes.  I doubt we'll solve
>>> everyone's needs on the first pass, but maybe we'll solve enough and
>>> provide helpers for the rest.  Without further ado, I'll point to what
>>> I have so far:
>>> 
>>> https://github.com/awilliam/mdevctl  
>> 
>> While it’s still early, what about :
>> 
>> 	mdevctl create-mdev <parent-device> <mdev-type> [<mdev-uuid>]
>> 
>> where if the mdev-uuid is missing, you just run uuidgen within the script?
>> 
>> I sent a small PR in case you think it makes sense.
> 
> It sounds racy.  If the user doesn't provide the UUID then they need to
> guess that an mdev device with the same parent and type is theirs.

That is true irrespective of the usage, isn’t it? In other words, when you
invoke `mdevctl create-mdev`, you assert “I own that specific parent/type”.
At least, that’s how I read the way the script behaves today. Whether you
invoke uuidgen inside or outside the script does not change that assertion
(at least with today’s code).

>  How do you resolve two instances of this happening in parallel and both
> coming to the same conclusion which is their device.  If a user wants
> this sort of headache they can call mdevctl with `uuidgen` but I don't
> think we should encourage it further.

I agree there is a race, but if anything, having a usage where you don’t
pass the UUID on the command line is a step in the right direction.
It leaves the door open for the create-mdev script to do smarter things,
like deferring the allocation of the mdevs to an entity that has slightly
more knowledge of the global system state than uuidgen.

In other words, in my mind, `mdevctl create-mdev parent type` does not
imply “this will use uuidgen” but rather, if anything, implies “this will do the
right thing to prevent the race in the future, even if that’s more complex
than just calling uuidgen”.

However, I believe that this means we should reorder the args further.
I would suggest something like:

	mdevctl create-mdev <mdev-type> [<parent-device> [<mdev-uuid>]]

where

1 arg means you let mdevctl choose the parent device for you (future)
   (e.g. I want a VGPU of this type, I don’t really care where it comes from)
2 args mean you want that specific type/parent combination
3 args mean you assert you own that device

That also implies that mdevctl create-mdev should output what it allocated
so that some higher-level software can tell “OK, that’s the instance I got”.

> BTW, I've moved the project to https://github.com/mdevctl/mdevctl, the
> latest commit in the tree above makes that change, I've also updated
> the description on my repo to point to the new location.  Thanks,

Done.

> 
> Alex
> 
> --
> libvir-list mailing list
> libvir-list@redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [libvirt] mdevctl: A shoestring mediated device management and persistence utility
  2019-06-14  9:54     ` [libvirt] " Christophe de Dinechin
@ 2019-06-14 14:23       ` Alex Williamson
  2019-06-14 15:06         ` Christophe de Dinechin
  0 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-14 14:23 UTC (permalink / raw)
  To: Christophe de Dinechin
  Cc: Sylvain Bauza, Pavel Hrdina, kvm, Skultety, Libvirt Devel,
	Cornelia Huck, Kirti Wankhede

On Fri, 14 Jun 2019 11:54:42 +0200
Christophe de Dinechin <cdupontd@redhat.com> wrote:

> > On 13 Jun 2019, at 18:35, Alex Williamson <alex.williamson@redhat.com> wrote:
> > 
> > On Thu, 13 Jun 2019 18:17:53 +0200
> > Christophe de Dinechin <cdupontd@redhat.com> wrote:
> >   
> >>> On 24 May 2019, at 01:20, Alex Williamson <alex.williamson@redhat.com> wrote:
> >>> 
> >>> Hi,
> >>> 
> >>> Currently mediated device management, much like SR-IOV VF management,
> >>> is largely left as an exercise for the user.  This is an attempt to
> >>> provide something and see where it goes.  I doubt we'll solve
> >>> everyone's needs on the first pass, but maybe we'll solve enough and
> >>> provide helpers for the rest.  Without further ado, I'll point to what
> >>> I have so far:
> >>> 
> >>> https://github.com/awilliam/mdevctl    
> >> 
> >> While it’s still early, what about :
> >> 
> >> 	mdevctl create-mdev <parent-device> <mdev-type> [<mdev-uuid>]
> >> 
> >> where if the mdev-uuid is missing, you just run uuidgen within the script?
> >> 
> >> I sent a small PR in case you think it makes sense.  
> > 
> > It sounds racy.  If the user doesn't provide the UUID then they need to
> > guess that an mdev device with the same parent and type is theirs.  
> 
> That is true irrespective of the usage, isn’t it? In other words, when you
> invoke `mdevctl create-mdev`, you assert “I own that specific parent/type”.
> At least, that’s how I read the way the script behaves today. Whether you
> invoke uuidgen inside or outside the script does not change that assertion
> (at least with today’s code).

What gives you this impression?  Where is the parent/type ownership
implied?  The intended semantics are "try to create this type of device
under this parent".
 
> >  How do you resolve two instances of this happening in parallel and both
> > coming to the same conclusion which is their device.  If a user wants
> > this sort of headache they can call mdevctl with `uuidgen` but I don't
> > think we should encourage it further.  
> 
> I agree there is a race, but if anything, having a usage where you don’t
> pass the UUID on the command line is a step in the right direction.
> It leaves the door open for the create-mdev script to do smarter things,
> like deferring the allocation of the mdevs to an entity that has slightly
> more knowledge of the global system state than uuidgen.

A user might (likely) require a specific uuid to match their VM
configuration.  I can only think of very niche use cases where a user
doesn't care what uuid they get.

> In other words, in my mind, `mdevctl create-mdev parent type` does not
> imply “this will use uuidgen” but rather, if anything, implies “this will do the
> right thing to prevent the race in the future, even if that’s more complex
> than just calling uuidgen”.

What race are you trying to prevent, uuid collision?

> However, I believe that this means we should reorder the args further.
> I would suggest something like:
> 
> 	mdevctl create-mdev <mdev-type> [<parent-device> [<mdev-uuid>]]
> 
> where

Absolutely not, now you've required mdevctl to implement policy in mdev
placement.  mdevctl follows the unix standard, do one thing and do it
well.  If someone wants to layer placement policy on top of mdevctl,
great, but let's not impose that within mdevctl.

> 1 arg means you let mdevctl choose the parent device for you (future)
>    (e.g. I want a VGPU of this type, I don’t really care where it comes from)
> 2 args mean you want that specific type/parent combination
> 3 args mean you assert you own that device
> 
> That also implies that mdevctl create-mdev should output what it allocated
> so that some higher-level software can tell “OK, that’s the instance I got”.

I don't think we're aligned on what mdevctl is attempting to provide.
Maybe you're describing a layer you'd like to see above mdevctl?
Thanks,

Alex

> > BTW, I've moved the project to https://github.com/mdevctl/mdevctl, the
> > latest commit in the tree above makes that change, I've also updated
> > the description on my repo to point to the new location.  Thanks,  
> 
> Done.
> 
> > 
> > Alex
> > 
> > --
> > libvir-list mailing list
> > libvir-list@redhat.com
> > https://www.redhat.com/mailman/listinfo/libvir-list  
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [libvirt] mdevctl: A shoestring mediated device management and persistence utility
  2019-06-14 14:23       ` Alex Williamson
@ 2019-06-14 15:06         ` Christophe de Dinechin
  2019-06-14 16:04           ` Alex Williamson
  0 siblings, 1 reply; 46+ messages in thread
From: Christophe de Dinechin @ 2019-06-14 15:06 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Sylvain Bauza, Pavel Hrdina, kvm, Skultety, Libvirt Devel,
	Cornelia Huck, Kirti Wankhede



> On 14 Jun 2019, at 16:23, Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> On Fri, 14 Jun 2019 11:54:42 +0200
> Christophe de Dinechin <cdupontd@redhat.com> wrote:
> 
>> That is true irrespective of the usage, isn’t it? In other words, when you
>> invoke `mdevctl create-mdev`, you assert “I own that specific parent/type”.
>> At least, that’s how I read the way the script behaves today. Whether you
>> invoke uuidgen inside or outside the script does not change that assertion
>> (at least with today’s code).
> 
> What gives you this impression?

That your code does nothing to avoid any race today?

Maybe I was confused with the existing `uuidgen` example in you README,
but it looks to me like the usage model involves much more than just
create-mdev, and that any race that might exist is not in create-mdev itself
(or in uuidgen for that matter).

> Where is the parent/type ownership implied?

I did not imply it, but I read some concern about ownership
on your part in "they need to guess that an mdev device
with the same parent and type is *theirs*.” (emphasis mine)

I personally see no change on the “need to guess” implied
by the fact that you run uuidgen inside the script, so
that’s why I tried to guess what you meant.


> The intended semantics are
> "try to create this type of device under this parent”.

Agreed. Which is why I don’t see why trying to create
with some new UUID introduces any race (as long as
the script prints out that UUID, which I admit my patch
entirely failed to to)
> 

>>> How do you resolve two instances of this happening in parallel and both
>>> coming to the same conclusion which is their device.  If a user wants
>>> this sort of headache they can call mdevctl with `uuidgen` but I don't
>>> think we should encourage it further.  
>> 
>> I agree there is a race, but if anything, having a usage where you don’t
>> pass the UUID on the command line is a step in the right direction.
>> It leaves the door open for the create-mdev script to do smarter things,
>> like deferring the allocation of the mdevs to an entity that has slightly
>> more knowledge of the global system state than uuidgen.
> 
> A user might (likely) require a specific uuid to match their VM
> configuration.  I can only think of very niche use cases where a user
> doesn't care what uuid they get.

They do care. But I typically copy-paste my UUIDs, and then

1. copy-pasting at the end is always faster than between
the command and other arguments (3-args case). 

2. copy-pasting the output of the previous command is faster
than having one extra step where I need to copy the same thing twice
(2-args case).

So to me, if the script is intended to be used by humans, my
proposal makes it slightly more comfortable to use. Nothing more.

> 
>> In other words, in my mind, `mdevctl create-mdev parent type` does not
>> imply “this will use uuidgen” but rather, if anything, implies “this will do the
>> right thing to prevent the race in the future, even if that’s more complex
>> than just calling uuidgen”.
> 
> What race are you trying to prevent, uuid collision?

Of course not ;-)

I only added the part of the discussion below trying to figure out what
race you were seeing that was present only with my proposed changes.

I (apparently incorrectly) supposed that you had some kind of mdev
management within the script in mind. Obviously I misinterpreted.
That will teach me to guess when I don’t understand instead of just
ask…

> 
>> However, I believe that this means we should reorder the args further.
>> I would suggest something like:
>> 
>> 	mdevctl create-mdev <mdev-type> [<parent-device> [<mdev-uuid>]]
>> 
>> where
> 
> Absolutely not, now you've required mdevctl to implement policy in mdev
> placement.

No, I’m not requiring it. I’m leaving the door open if one day, say, we decide
to have libvirt tell us about the placement. That usage needs not go in right away,
I marked it as “(future)”.

Basically, all I’m saying is that since it’s early, we can reorder the
arguments so that the one you are most likely to change when you reuse
the command are the one that are last on the command-line, so that it
makes editing or copy-pasting easier. There isn’t more to it, and that’s
why I still do not see any new race introduced by that change.


>  mdevctl follows the unix standard, do one thing and do it
> well.  If someone wants to layer placement policy on top of mdevctl,
> great, but let's not impose that within mdevctl.

I’m not imposing anything (I believe). I was only trying to guess
where you saw things going that would imply there was a race with
my proposal that was not there without :-)

> 
>> 1 arg means you let mdevctl choose the parent device for you (future)
>>   (e.g. I want a VGPU of this type, I don’t really care where it comes from)
>> 2 args mean you want that specific type/parent combination
>> 3 args mean you assert you own that device
>> 
>> That also implies that mdevctl create-mdev should output what it allocated
>> so that some higher-level software can tell “OK, that’s the instance I got”.
> 
> I don't think we're aligned on what mdevctl is attempting to provide.
> Maybe you're describing a layer you'd like to see above mdevctl?
> Thanks,

No, again, I’m just trying to understand where you see a race.

Maybe instead of guessing, I should just ask: where is the race in
the two-args variant (assuming it prints the UUID it used) that does not
exist with the three-args variant?


Thanks
Christophe

> 
> Alex
> 
>>> BTW, I've moved the project to https://github.com/mdevctl/mdevctl, the
>>> latest commit in the tree above makes that change, I've also updated
>>> the description on my repo to point to the new location.  Thanks,  
>> 
>> Done.
>> 
>>> 
>>> Alex
>>> 
>>> --
>>> libvir-list mailing list
>>> libvir-list@redhat.com
>>> https://www.redhat.com/mailman/listinfo/libvir-list  


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [libvirt] mdevctl: A shoestring mediated device management and persistence utility
  2019-06-14 15:06         ` Christophe de Dinechin
@ 2019-06-14 16:04           ` Alex Williamson
  2019-06-17 16:03             ` Cornelia Huck
  0 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-14 16:04 UTC (permalink / raw)
  To: Christophe de Dinechin
  Cc: Sylvain Bauza, Pavel Hrdina, kvm, Skultety, Libvirt Devel,
	Cornelia Huck, Kirti Wankhede

On Fri, 14 Jun 2019 17:06:15 +0200
Christophe de Dinechin <cdupontd@redhat.com> wrote:

> > On 14 Jun 2019, at 16:23, Alex Williamson <alex.williamson@redhat.com> wrote:
> > 
> > On Fri, 14 Jun 2019 11:54:42 +0200
> > Christophe de Dinechin <cdupontd@redhat.com> wrote:
> >   
> >> That is true irrespective of the usage, isn’t it? In other words, when you
> >> invoke `mdevctl create-mdev`, you assert “I own that specific parent/type”.
> >> At least, that’s how I read the way the script behaves today. Whether you
> >> invoke uuidgen inside or outside the script does not change that assertion
> >> (at least with today’s code).  
> > 
> > What gives you this impression?  
> 
> That your code does nothing to avoid any race today?
> 
> Maybe I was confused with the existing `uuidgen` example in you README,
> but it looks to me like the usage model involves much more than just
> create-mdev, and that any race that might exist is not in create-mdev itself
> (or in uuidgen for that matter).

I believe I mentioned this was an early release, error handling is
lacking.  Still, I think the races are minimal, they largely involve
uuid collisions.  Separate users can create mdevs under the same parent
concurrently, they would need to have a uuid collision to conflict.
Otherwise there's the resource issue on the parent, but that's left of
the kernel to manage.  If an mdev fails to be created at the kernel,
mdevctl should unwind, but we're not going to pretend that we have a
lock on the parent's sysfs mdev interfaces.

> > Where is the parent/type ownership implied?  
> 
> I did not imply it, but I read some concern about ownership
> on your part in "they need to guess that an mdev device
> with the same parent and type is *theirs*.” (emphasis mine)
> 
> I personally see no change on the “need to guess” implied
> by the fact that you run uuidgen inside the script, so
> that’s why I tried to guess what you meant.

As I noted in the reply to the pull request, putting `uuidgen` inline
was probably a bad example.  However, the difference is that the user
has imposed the race on themselves if they invoke mdevctl like this,
they've provided a uuid but they didn't record what it is.  This is the
user's problem.  Pushing uuid selection into mdevctl makes it mdevctl's
problem because the interface is fundamentally broken.

> > The intended semantics are
> > "try to create this type of device under this parent”.  
> 
> Agreed. Which is why I don’t see why trying to create
> with some new UUID introduces any race (as long as
> the script prints out that UUID, which I admit my patch
> entirely failed to to)

And that's the piece that makes it fundamentally broken.  Beyond that,
it seems unnecessary.  I don't see this as the primary invocation of
mdevctl and the functionality it adds is trivially accomplished in a
wrapper, so what's the value?

> >>> How do you resolve two instances of this happening in parallel and both
> >>> coming to the same conclusion which is their device.  If a user wants
> >>> this sort of headache they can call mdevctl with `uuidgen` but I don't
> >>> think we should encourage it further.    
> >> 
> >> I agree there is a race, but if anything, having a usage where you don’t
> >> pass the UUID on the command line is a step in the right direction.
> >> It leaves the door open for the create-mdev script to do smarter things,
> >> like deferring the allocation of the mdevs to an entity that has slightly
> >> more knowledge of the global system state than uuidgen.  
> > 
> > A user might (likely) require a specific uuid to match their VM
> > configuration.  I can only think of very niche use cases where a user
> > doesn't care what uuid they get.  
> 
> They do care. But I typically copy-paste my UUIDs, and then
> 
> 1. copy-pasting at the end is always faster than between
> the command and other arguments (3-args case). 
> 
> 2. copy-pasting the output of the previous command is faster
> than having one extra step where I need to copy the same thing twice
> (2-args case).
> 
> So to me, if the script is intended to be used by humans, my
> proposal makes it slightly more comfortable to use. Nothing more.

This is your preference, but I wouldn't call it universal.  Specifying
the uuid last seems backwards to me, we're creating an object so let's
first name that object.  We then specify where that object should be
created and what type it has.  This seems very logical to me, besides,
it's also the exact same order we use when listing mdevs :P

Clearly there's personal preference here, so let's not arbitrarily pick
a different preference.  If copy/paste order is more important to you
then submit a patch to give mdevctl real argument processing so you can
specify --uuid, --parent, --type in whatever order you want.

> >> In other words, in my mind, `mdevctl create-mdev parent type` does not
> >> imply “this will use uuidgen” but rather, if anything, implies “this will do the
> >> right thing to prevent the race in the future, even if that’s more complex
> >> than just calling uuidgen”.  
> > 
> > What race are you trying to prevent, uuid collision?  
> 
> Of course not ;-)
> 
> I only added the part of the discussion below trying to figure out what
> race you were seeing that was present only with my proposed changes.
> 
> I (apparently incorrectly) supposed that you had some kind of mdev
> management within the script in mind. Obviously I misinterpreted.
> That will teach me to guess when I don’t understand instead of just
> ask…

The management mdevctl provides is persistence, interrogation, and
interaction with mdevs and parent devices.  Like every other tool, I'm
going to defer policy related decisions to someone else.  Picking a
uuid is policy.  Picking a parent device is policy.

> >> However, I believe that this means we should reorder the args further.
> >> I would suggest something like:
> >> 
> >> 	mdevctl create-mdev <mdev-type> [<parent-device> [<mdev-uuid>]]
> >> 
> >> where  
> > 
> > Absolutely not, now you've required mdevctl to implement policy in mdev
> > placement.  
> 
> No, I’m not requiring it. I’m leaving the door open if one day, say, we decide
> to have libvirt tell us about the placement. That usage needs not go in right away,
> I marked it as “(future)”.
> 
> Basically, all I’m saying is that since it’s early, we can reorder the
> arguments so that the one you are most likely to change when you reuse
> the command are the one that are last on the command-line, so that it
> makes editing or copy-pasting easier. There isn’t more to it, and that’s
> why I still do not see any new race introduced by that change.

This is just compounding a shortcut made for this initial version,
clearly the solution is to allow arbitrary option ordering, not to pick
a different ordering for a perceived copy/paste efficiency

> >  mdevctl follows the unix standard, do one thing and do it
> > well.  If someone wants to layer placement policy on top of mdevctl,
> > great, but let's not impose that within mdevctl.  
> 
> I’m not imposing anything (I believe). I was only trying to guess
> where you saw things going that would imply there was a race with
> my proposal that was not there without :-)
> 
> >   
> >> 1 arg means you let mdevctl choose the parent device for you (future)
> >>   (e.g. I want a VGPU of this type, I don’t really care where it comes from)
> >> 2 args mean you want that specific type/parent combination
> >> 3 args mean you assert you own that device
> >> 
> >> That also implies that mdevctl create-mdev should output what it allocated
> >> so that some higher-level software can tell “OK, that’s the instance I got”.  
> > 
> > I don't think we're aligned on what mdevctl is attempting to provide.
> > Maybe you're describing a layer you'd like to see above mdevctl?
> > Thanks,  
> 
> No, again, I’m just trying to understand where you see a race.
> 
> Maybe instead of guessing, I should just ask: where is the race in
> the two-args variant (assuming it prints the UUID it used) that does not
> exist with the three-args variant?

Currently users deterministically know the uuid of the mdev they
create, they specify it.  They're not racing other users that
potentially created the same type on the same parent to guess which
mdev might be theirs, potentially picking the wrong one, potentially
picking the same as their counterpart.  Returning the uuid solves this
race, but as you noted, implies a much broader uuid allocation scheme
and policy which I'm not interested in providing in mdevctl until you
can convince me otherwise.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-05-23 23:20 mdevctl: A shoestring mediated device management and persistence utility Alex Williamson
  2019-05-24 10:11 ` Cornelia Huck
  2019-06-13 16:17 ` Christophe de Dinechin
@ 2019-06-17 14:00 ` Daniel P. Berrangé
  2019-06-17 14:54   ` Alex Williamson
  2019-06-25 22:52 ` Alex Williamson
  3 siblings, 1 reply; 46+ messages in thread
From: Daniel P. Berrangé @ 2019-06-17 14:00 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Sylvain Bauza, Cornelia Huck

On Thu, May 23, 2019 at 05:20:01PM -0600, Alex Williamson wrote:
> Hi,
> 
> Currently mediated device management, much like SR-IOV VF management,
> is largely left as an exercise for the user.  This is an attempt to
> provide something and see where it goes.  I doubt we'll solve
> everyone's needs on the first pass, but maybe we'll solve enough and
> provide helpers for the rest.  Without further ado, I'll point to what
> I have so far:
> 
> https://github.com/awilliam/mdevctl
> 
> This is inspired by driverctl, which is also a bash utility.  mdevctl
> uses udev and systemd to record and recreate mdev devices for
> persistence and provides a command line utility for querying, listing,
> starting, stopping, adding, and removing mdev devices.  Currently, for
> better or worse, it considers anything created to be persistent.  I can
> imagine a global configuration option that might disable this and
> perhaps an autostart flag per mdev device, such that mdevctl might
> simply "know" about some mdevs but not attempt to create them
> automatically.  Clearly command line usage help, man pages, and
> packaging are lacking as well, release early, release often, plus this
> is a discussion starter to see if perhaps this is sufficient to meet
> some needs.

I think from libvirt's POV, we would *not* want devices to be made
unconditionally persistent. We usually wish to expose a choice to
applications whether to have resources be transient or persistent.

So from that POV, a global config option to turn off persistence
is not workable either. We would want control per-device, with
autostart control per device too.

I would simply get rid of the udev rule that magically persists
stuff. Any person/tool using sysfs right now expects devices to
be transient. If they want to have persistence they can stop using
sysfs & use higher level tools directly.

> Originally I thought about making a utility to manage both mdev and
> SR-IOV VFs all in one, but it seemed more natural to start here
> (besides, I couldn't think of a good name for the combined utility).
> If this seems useful, maybe I'll start on a vfctl for SR-IOV and we'll
> see whether they have enough synergy to become one.

[snip]

> I'm also curious how or if libvirt or openstack might use this.  If
> nothing else, it makes libvirt hook scripts easier to write, especially
> if we add an option not to autostart mdevs, or if users don't mind
> persistent mdevs, maybe there's nothing more to do.

We currently have an API for creating host devices in libvirt which
we use for NPIV devices only, which is where we'd like to put mdev
creation support.  This API is for creating transient devices
though, so we don't want anything created this way to magically
become persistent.

For persistence we'd create a new API in libvirt allowing you to
define & undefine the persistent config for a devices, and another
set of APIs to create/delete from the persistent config.

As a general rule, libvirt would prefer to use an API rather than
spawning external commands, but can live with either.

There's also the question of systemd integration - so far everything
in libvirt still works on non-systemd based distros, but this new
tool looks like it requires systemd.  Personally I'm not too bothered
by this but others might be more concerned.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-17 14:00 ` Daniel P. Berrangé
@ 2019-06-17 14:54   ` Alex Williamson
  2019-06-17 15:10     ` Daniel P. Berrangé
  0 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-17 14:54 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Sylvain Bauza, Cornelia Huck

On Mon, 17 Jun 2019 15:00:00 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Thu, May 23, 2019 at 05:20:01PM -0600, Alex Williamson wrote:
> > Hi,
> > 
> > Currently mediated device management, much like SR-IOV VF management,
> > is largely left as an exercise for the user.  This is an attempt to
> > provide something and see where it goes.  I doubt we'll solve
> > everyone's needs on the first pass, but maybe we'll solve enough and
> > provide helpers for the rest.  Without further ado, I'll point to what
> > I have so far:
> > 
> > https://github.com/awilliam/mdevctl
> > 
> > This is inspired by driverctl, which is also a bash utility.  mdevctl
> > uses udev and systemd to record and recreate mdev devices for
> > persistence and provides a command line utility for querying, listing,
> > starting, stopping, adding, and removing mdev devices.  Currently, for
> > better or worse, it considers anything created to be persistent.  I can
> > imagine a global configuration option that might disable this and
> > perhaps an autostart flag per mdev device, such that mdevctl might
> > simply "know" about some mdevs but not attempt to create them
> > automatically.  Clearly command line usage help, man pages, and
> > packaging are lacking as well, release early, release often, plus this
> > is a discussion starter to see if perhaps this is sufficient to meet
> > some needs.  
> 
> I think from libvirt's POV, we would *not* want devices to be made
> unconditionally persistent. We usually wish to expose a choice to
> applications whether to have resources be transient or persistent.
> 
> So from that POV, a global config option to turn off persistence
> is not workable either. We would want control per-device, with
> autostart control per device too.

The code has progressed somewhat in the past 3+ weeks, we still persist
all devices, but the start-up mode can be selected per device or with a
global default mode.  Devices configured with 'auto' start-up
automatically while 'manual' devices are simply known and available to
be started.  I imagine we could add a 'transient' mode where we purge
the information about the device when it is removed or the next time
the parent device is added.
 
> I would simply get rid of the udev rule that magically persists
> stuff. Any person/tool using sysfs right now expects devices to
> be transient. If they want to have persistence they can stop using
> sysfs & use higher level tools directly.

I think it's an interesting feature, but it's easy enough to control
via a global option in sysconfig with the default off if it's seen as
overstepping.

> > Originally I thought about making a utility to manage both mdev and
> > SR-IOV VFs all in one, but it seemed more natural to start here
> > (besides, I couldn't think of a good name for the combined utility).
> > If this seems useful, maybe I'll start on a vfctl for SR-IOV and we'll
> > see whether they have enough synergy to become one.  
> 
> [snip]
> 
> > I'm also curious how or if libvirt or openstack might use this.  If
> > nothing else, it makes libvirt hook scripts easier to write, especially
> > if we add an option not to autostart mdevs, or if users don't mind
> > persistent mdevs, maybe there's nothing more to do.  
> 
> We currently have an API for creating host devices in libvirt which
> we use for NPIV devices only, which is where we'd like to put mdev
> creation support.  This API is for creating transient devices
> though, so we don't want anything created this way to magically
> become persistent.
> 
> For persistence we'd create a new API in libvirt allowing you to
> define & undefine the persistent config for a devices, and another
> set of APIs to create/delete from the persistent config.
> 
> As a general rule, libvirt would prefer to use an API rather than
> spawning external commands, but can live with either.
> 
> There's also the question of systemd integration - so far everything
> in libvirt still works on non-systemd based distros, but this new
> tool looks like it requires systemd.  Personally I'm not too bothered
> by this but others might be more concerned.

Yes, Pavel brought up this issue offline as well and it needs more
consideration.  The systemd support still needs work as well, I've
discovered it gets very confused when the mdev device is removed
outside of mdevctl, but I haven't yet been able to concoct a BindsTo=
line that can handle the hyphens in the uuid device name.  I'd say
mdevctl is not intentionally systemd specific, it's simply a byproduct
of the systems it was developed on.  Also, if libvirt were to focus
only on transient devices, then startup via systemctl doesn't make
sense, which probably means stopping via systemctl would also be unused
by libvirt.  So I think this means we just need to make systemd an
optional feature of mdevctl (or drop it) and if libvirt doesn't use it,
that's fine.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-17 14:54   ` Alex Williamson
@ 2019-06-17 15:10     ` Daniel P. Berrangé
  2019-06-17 17:05       ` Alex Williamson
  0 siblings, 1 reply; 46+ messages in thread
From: Daniel P. Berrangé @ 2019-06-17 15:10 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Sylvain Bauza, Cornelia Huck

On Mon, Jun 17, 2019 at 08:54:38AM -0600, Alex Williamson wrote:
> On Mon, 17 Jun 2019 15:00:00 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Thu, May 23, 2019 at 05:20:01PM -0600, Alex Williamson wrote:
> > > Hi,
> > > 
> > > Currently mediated device management, much like SR-IOV VF management,
> > > is largely left as an exercise for the user.  This is an attempt to
> > > provide something and see where it goes.  I doubt we'll solve
> > > everyone's needs on the first pass, but maybe we'll solve enough and
> > > provide helpers for the rest.  Without further ado, I'll point to what
> > > I have so far:
> > > 
> > > https://github.com/awilliam/mdevctl
> > > 
> > > This is inspired by driverctl, which is also a bash utility.  mdevctl
> > > uses udev and systemd to record and recreate mdev devices for
> > > persistence and provides a command line utility for querying, listing,
> > > starting, stopping, adding, and removing mdev devices.  Currently, for
> > > better or worse, it considers anything created to be persistent.  I can
> > > imagine a global configuration option that might disable this and
> > > perhaps an autostart flag per mdev device, such that mdevctl might
> > > simply "know" about some mdevs but not attempt to create them
> > > automatically.  Clearly command line usage help, man pages, and
> > > packaging are lacking as well, release early, release often, plus this
> > > is a discussion starter to see if perhaps this is sufficient to meet
> > > some needs.  
> > 
> > I think from libvirt's POV, we would *not* want devices to be made
> > unconditionally persistent. We usually wish to expose a choice to
> > applications whether to have resources be transient or persistent.
> > 
> > So from that POV, a global config option to turn off persistence
> > is not workable either. We would want control per-device, with
> > autostart control per device too.
> 
> The code has progressed somewhat in the past 3+ weeks, we still persist
> all devices, but the start-up mode can be selected per device or with a
> global default mode.  Devices configured with 'auto' start-up
> automatically while 'manual' devices are simply known and available to
> be started.  I imagine we could add a 'transient' mode where we purge
> the information about the device when it is removed or the next time
> the parent device is added.

Having a pesistent config written out & then purged later is still
problematic. If the host crashes, nothing will purge the config file,
so it will become a persistent device. Also when listing devices we
want to be able to report whether it is persistent or transient. The
obvious way todo that is to simply look if a config file exists or
not.

> > I would simply get rid of the udev rule that magically persists
> > stuff. Any person/tool using sysfs right now expects devices to
> > be transient. If they want to have persistence they can stop using
> > sysfs & use higher level tools directly.
> 
> I think it's an interesting feature, but it's easy enough to control
> via a global option in sysconfig with the default off if it's seen as
> overstepping.

A global option is really not desirable, as it means that the behaviour
of the system that libvirt sees can silently change at any time. IMHO
this udev hook is  intermixing the two layers in the stack - keep the
low level sysfs layer completely separate from the higher level mgmt
concepts provided by this mdevctrl.

> > > Originally I thought about making a utility to manage both mdev and
> > > SR-IOV VFs all in one, but it seemed more natural to start here
> > > (besides, I couldn't think of a good name for the combined utility).
> > > If this seems useful, maybe I'll start on a vfctl for SR-IOV and we'll
> > > see whether they have enough synergy to become one.  
> > 
> > [snip]
> > 
> > > I'm also curious how or if libvirt or openstack might use this.  If
> > > nothing else, it makes libvirt hook scripts easier to write, especially
> > > if we add an option not to autostart mdevs, or if users don't mind
> > > persistent mdevs, maybe there's nothing more to do.  
> > 
> > We currently have an API for creating host devices in libvirt which
> > we use for NPIV devices only, which is where we'd like to put mdev
> > creation support.  This API is for creating transient devices
> > though, so we don't want anything created this way to magically
> > become persistent.
> > 
> > For persistence we'd create a new API in libvirt allowing you to
> > define & undefine the persistent config for a devices, and another
> > set of APIs to create/delete from the persistent config.
> > 
> > As a general rule, libvirt would prefer to use an API rather than
> > spawning external commands, but can live with either.

Thinking some more, the key tasks that libvirt needs to deal
with are

 1. Define a persistent config (Must not create any actual device)
 2. Undefine a persistent config (Must not delete any actual device)
 3. Create a device
 4. Delete a device
 5. Get list of all persistent configs
 6. Enable autostart of a device
 7. Disable autostart of a device

For 1, 2, 5, 6 & 7 libvirt doesn't really need a command line tool. As
long as there is a specification for the config file syntax and location
we can directly read/write/stat files. This would be much more efficient
and reliable for libvirt than spawning commands & parsing the output or
exit status.

For 3 & 4 we'd desire the command to accept a config file, either as a
file on disk for persistent devices, or inline on stdin for transient
devices.

> > There's also the question of systemd integration - so far everything
> > in libvirt still works on non-systemd based distros, but this new
> > tool looks like it requires systemd.  Personally I'm not too bothered
> > by this but others might be more concerned.
> 
> Yes, Pavel brought up this issue offline as well and it needs more
> consideration.  The systemd support still needs work as well, I've
> discovered it gets very confused when the mdev device is removed
> outside of mdevctl, but I haven't yet been able to concoct a BindsTo=
> line that can handle the hyphens in the uuid device name.  I'd say
> mdevctl is not intentionally systemd specific, it's simply a byproduct
> of the systems it was developed on.  Also, if libvirt were to focus
> only on transient devices, then startup via systemctl doesn't make
> sense, which probably means stopping via systemctl would also be unused
> by libvirt.  So I think this means we just need to make systemd an
> optional feature of mdevctl (or drop it) and if libvirt doesn't use it,
> that's fine.  Thanks,

I actually think the systemd integration is good, we just need to
make sure things are still doable without it. This could be as simple
as having an initscript that just iteates over the configs creating
each one at startup.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [libvirt] mdevctl: A shoestring mediated device management and persistence utility
  2019-06-14 16:04           ` Alex Williamson
@ 2019-06-17 16:03             ` Cornelia Huck
  0 siblings, 0 replies; 46+ messages in thread
From: Cornelia Huck @ 2019-06-17 16:03 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Christophe de Dinechin, Sylvain Bauza, Pavel Hrdina, kvm,
	Skultety, Libvirt Devel, Kirti Wankhede

On Fri, 14 Jun 2019 10:04:18 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Fri, 14 Jun 2019 17:06:15 +0200
> Christophe de Dinechin <cdupontd@redhat.com> wrote:
> 
> > > On 14 Jun 2019, at 16:23, Alex Williamson <alex.williamson@redhat.com> wrote:
> > > 
> > > On Fri, 14 Jun 2019 11:54:42 +0200
> > > Christophe de Dinechin <cdupontd@redhat.com> wrote:

> > > Where is the parent/type ownership implied?    
> > 
> > I did not imply it, but I read some concern about ownership
> > on your part in "they need to guess that an mdev device
> > with the same parent and type is *theirs*.” (emphasis mine)
> > 
> > I personally see no change on the “need to guess” implied
> > by the fact that you run uuidgen inside the script, so
> > that’s why I tried to guess what you meant.  
> 
> As I noted in the reply to the pull request, putting `uuidgen` inline
> was probably a bad example. 

FWIW, I just sent a pull req to get rid of that inline `uuidgen` in the
example.

> However, the difference is that the user
> has imposed the race on themselves if they invoke mdevctl like this,
> they've provided a uuid but they didn't record what it is.  This is the
> user's problem.  Pushing uuid selection into mdevctl makes it mdevctl's
> problem because the interface is fundamentally broken.
> 
> > > The intended semantics are
> > > "try to create this type of device under this parent”.    
> > 
> > Agreed. Which is why I don’t see why trying to create
> > with some new UUID introduces any race (as long as
> > the script prints out that UUID, which I admit my patch
> > entirely failed to to)  
> 
> And that's the piece that makes it fundamentally broken.  Beyond that,
> it seems unnecessary.  I don't see this as the primary invocation of
> mdevctl and the functionality it adds is trivially accomplished in a
> wrapper, so what's the value?
> 
> > >>> How do you resolve two instances of this happening in parallel and both
> > >>> coming to the same conclusion which is their device.  If a user wants
> > >>> this sort of headache they can call mdevctl with `uuidgen` but I don't
> > >>> think we should encourage it further.      
> > >> 
> > >> I agree there is a race, but if anything, having a usage where you don’t
> > >> pass the UUID on the command line is a step in the right direction.
> > >> It leaves the door open for the create-mdev script to do smarter things,
> > >> like deferring the allocation of the mdevs to an entity that has slightly
> > >> more knowledge of the global system state than uuidgen.    
> > > 
> > > A user might (likely) require a specific uuid to match their VM
> > > configuration.  I can only think of very niche use cases where a user
> > > doesn't care what uuid they get.    
> > 
> > They do care. But I typically copy-paste my UUIDs, and then
> > 
> > 1. copy-pasting at the end is always faster than between
> > the command and other arguments (3-args case). 
> > 
> > 2. copy-pasting the output of the previous command is faster
> > than having one extra step where I need to copy the same thing twice
> > (2-args case).
> > 
> > So to me, if the script is intended to be used by humans, my
> > proposal makes it slightly more comfortable to use. Nothing more.  
> 
> This is your preference, but I wouldn't call it universal.  Specifying
> the uuid last seems backwards to me, we're creating an object so let's
> first name that object.  We then specify where that object should be
> created and what type it has.  This seems very logical to me, besides,
> it's also the exact same order we use when listing mdevs :P
> 
> Clearly there's personal preference here, so let's not arbitrarily pick
> a different preference.  If copy/paste order is more important to you
> then submit a patch to give mdevctl real argument processing so you can
> specify --uuid, --parent, --type in whatever order you want.

I agree that these are personal preferences :) Real argument processing
makes sense, however.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-17 15:10     ` Daniel P. Berrangé
@ 2019-06-17 17:05       ` Alex Williamson
  2019-06-18  8:44         ` Daniel P. Berrangé
  2019-06-18 11:01         ` Cornelia Huck
  0 siblings, 2 replies; 46+ messages in thread
From: Alex Williamson @ 2019-06-17 17:05 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Sylvain Bauza, Cornelia Huck

On Mon, 17 Jun 2019 16:10:30 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Jun 17, 2019 at 08:54:38AM -0600, Alex Williamson wrote:
> > On Mon, 17 Jun 2019 15:00:00 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Thu, May 23, 2019 at 05:20:01PM -0600, Alex Williamson wrote:  
> > > > Hi,
> > > > 
> > > > Currently mediated device management, much like SR-IOV VF management,
> > > > is largely left as an exercise for the user.  This is an attempt to
> > > > provide something and see where it goes.  I doubt we'll solve
> > > > everyone's needs on the first pass, but maybe we'll solve enough and
> > > > provide helpers for the rest.  Without further ado, I'll point to what
> > > > I have so far:
> > > > 
> > > > https://github.com/awilliam/mdevctl
> > > > 
> > > > This is inspired by driverctl, which is also a bash utility.  mdevctl
> > > > uses udev and systemd to record and recreate mdev devices for
> > > > persistence and provides a command line utility for querying, listing,
> > > > starting, stopping, adding, and removing mdev devices.  Currently, for
> > > > better or worse, it considers anything created to be persistent.  I can
> > > > imagine a global configuration option that might disable this and
> > > > perhaps an autostart flag per mdev device, such that mdevctl might
> > > > simply "know" about some mdevs but not attempt to create them
> > > > automatically.  Clearly command line usage help, man pages, and
> > > > packaging are lacking as well, release early, release often, plus this
> > > > is a discussion starter to see if perhaps this is sufficient to meet
> > > > some needs.    
> > > 
> > > I think from libvirt's POV, we would *not* want devices to be made
> > > unconditionally persistent. We usually wish to expose a choice to
> > > applications whether to have resources be transient or persistent.
> > > 
> > > So from that POV, a global config option to turn off persistence
> > > is not workable either. We would want control per-device, with
> > > autostart control per device too.  
> > 
> > The code has progressed somewhat in the past 3+ weeks, we still persist
> > all devices, but the start-up mode can be selected per device or with a
> > global default mode.  Devices configured with 'auto' start-up
> > automatically while 'manual' devices are simply known and available to
> > be started.  I imagine we could add a 'transient' mode where we purge
> > the information about the device when it is removed or the next time
> > the parent device is added.  
> 
> Having a pesistent config written out & then purged later is still
> problematic. If the host crashes, nothing will purge the config file,
> so it will become a persistent device. Also when listing devices we
> want to be able to report whether it is persistent or transient. The
> obvious way todo that is to simply look if a config file exists or
> not.

I was thinking that the config file would identify the device as
transient, therefore if the system crashed we'd have the opportunity to
purge those entries on the next boot as we're processing the entries
for that parent device.  Clearly it has yet to be implemented, but I
expect there are some advantages to tracking devices via a transient
config entry or else we're constantly re-discovering foreign mdevs.

> > > I would simply get rid of the udev rule that magically persists
> > > stuff. Any person/tool using sysfs right now expects devices to
> > > be transient. If they want to have persistence they can stop using
> > > sysfs & use higher level tools directly.  
> > 
> > I think it's an interesting feature, but it's easy enough to control
> > via a global option in sysconfig with the default off if it's seen as
> > overstepping.  
> 
> A global option is really not desirable, as it means that the behaviour
> of the system that libvirt sees can silently change at any time. IMHO
> this udev hook is  intermixing the two layers in the stack - keep the
> low level sysfs layer completely separate from the higher level mgmt
> concepts provided by this mdevctrl.

It seems like it just means that libvirt needs to be explicit such that
it doesn't rely on a global preference, ex. always using a --transient
option.

> > > > Originally I thought about making a utility to manage both mdev and
> > > > SR-IOV VFs all in one, but it seemed more natural to start here
> > > > (besides, I couldn't think of a good name for the combined utility).
> > > > If this seems useful, maybe I'll start on a vfctl for SR-IOV and we'll
> > > > see whether they have enough synergy to become one.    
> > > 
> > > [snip]
> > >   
> > > > I'm also curious how or if libvirt or openstack might use this.  If
> > > > nothing else, it makes libvirt hook scripts easier to write, especially
> > > > if we add an option not to autostart mdevs, or if users don't mind
> > > > persistent mdevs, maybe there's nothing more to do.    
> > > 
> > > We currently have an API for creating host devices in libvirt which
> > > we use for NPIV devices only, which is where we'd like to put mdev
> > > creation support.  This API is for creating transient devices
> > > though, so we don't want anything created this way to magically
> > > become persistent.
> > > 
> > > For persistence we'd create a new API in libvirt allowing you to
> > > define & undefine the persistent config for a devices, and another
> > > set of APIs to create/delete from the persistent config.
> > > 
> > > As a general rule, libvirt would prefer to use an API rather than
> > > spawning external commands, but can live with either.  
> 
> Thinking some more, the key tasks that libvirt needs to deal
> with are
> 
>  1. Define a persistent config (Must not create any actual device)
>  2. Undefine a persistent config (Must not delete any actual device)
>  3. Create a device
>  4. Delete a device
>  5. Get list of all persistent configs
>  6. Enable autostart of a device
>  7. Disable autostart of a device
> 
> For 1, 2, 5, 6 & 7 libvirt doesn't really need a command line tool. As
> long as there is a specification for the config file syntax and location
> we can directly read/write/stat files. This would be much more efficient
> and reliable for libvirt than spawning commands & parsing the output or
> exit status.

True, but if the roles were reversed, would libvirt be as eager to have
another tool writing into a config directory that it manages?  It's
relatively harmless now, but it might set a bad precedent and lock us
into config file formats that could otherwise be managed once at
package upgrade time.

> For 3 & 4 we'd desire the command to accept a config file, either as a
> file on disk for persistent devices, or inline on stdin for transient
> devices.
> 
> > > There's also the question of systemd integration - so far everything
> > > in libvirt still works on non-systemd based distros, but this new
> > > tool looks like it requires systemd.  Personally I'm not too bothered
> > > by this but others might be more concerned.  
> > 
> > Yes, Pavel brought up this issue offline as well and it needs more
> > consideration.  The systemd support still needs work as well, I've
> > discovered it gets very confused when the mdev device is removed
> > outside of mdevctl, but I haven't yet been able to concoct a BindsTo=
> > line that can handle the hyphens in the uuid device name.  I'd say
> > mdevctl is not intentionally systemd specific, it's simply a byproduct
> > of the systems it was developed on.  Also, if libvirt were to focus
> > only on transient devices, then startup via systemctl doesn't make
> > sense, which probably means stopping via systemctl would also be unused
> > by libvirt.  So I think this means we just need to make systemd an
> > optional feature of mdevctl (or drop it) and if libvirt doesn't use it,
> > that's fine.  Thanks,  
> 
> I actually think the systemd integration is good, we just need to
> make sure things are still doable without it. This could be as simple
> as having an initscript that just iteates over the configs creating
> each one at startup.

Ok, thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-17 17:05       ` Alex Williamson
@ 2019-06-18  8:44         ` Daniel P. Berrangé
  2019-06-18 11:01         ` Cornelia Huck
  1 sibling, 0 replies; 46+ messages in thread
From: Daniel P. Berrangé @ 2019-06-18  8:44 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Sylvain Bauza, Cornelia Huck

On Mon, Jun 17, 2019 at 11:05:17AM -0600, Alex Williamson wrote:
> On Mon, 17 Jun 2019 16:10:30 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Mon, Jun 17, 2019 at 08:54:38AM -0600, Alex Williamson wrote:
> > > On Mon, 17 Jun 2019 15:00:00 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >   
> > > > On Thu, May 23, 2019 at 05:20:01PM -0600, Alex Williamson wrote:  
> > > > > Hi,
> > > > > 
> > > > > Currently mediated device management, much like SR-IOV VF management,
> > > > > is largely left as an exercise for the user.  This is an attempt to
> > > > > provide something and see where it goes.  I doubt we'll solve
> > > > > everyone's needs on the first pass, but maybe we'll solve enough and
> > > > > provide helpers for the rest.  Without further ado, I'll point to what
> > > > > I have so far:
> > > > > 
> > > > > https://github.com/awilliam/mdevctl
> > > > > 
> > > > > This is inspired by driverctl, which is also a bash utility.  mdevctl
> > > > > uses udev and systemd to record and recreate mdev devices for
> > > > > persistence and provides a command line utility for querying, listing,
> > > > > starting, stopping, adding, and removing mdev devices.  Currently, for
> > > > > better or worse, it considers anything created to be persistent.  I can
> > > > > imagine a global configuration option that might disable this and
> > > > > perhaps an autostart flag per mdev device, such that mdevctl might
> > > > > simply "know" about some mdevs but not attempt to create them
> > > > > automatically.  Clearly command line usage help, man pages, and
> > > > > packaging are lacking as well, release early, release often, plus this
> > > > > is a discussion starter to see if perhaps this is sufficient to meet
> > > > > some needs.    
> > > > 
> > > > I think from libvirt's POV, we would *not* want devices to be made
> > > > unconditionally persistent. We usually wish to expose a choice to
> > > > applications whether to have resources be transient or persistent.
> > > > 
> > > > So from that POV, a global config option to turn off persistence
> > > > is not workable either. We would want control per-device, with
> > > > autostart control per device too.  
> > > 
> > > The code has progressed somewhat in the past 3+ weeks, we still persist
> > > all devices, but the start-up mode can be selected per device or with a
> > > global default mode.  Devices configured with 'auto' start-up
> > > automatically while 'manual' devices are simply known and available to
> > > be started.  I imagine we could add a 'transient' mode where we purge
> > > the information about the device when it is removed or the next time
> > > the parent device is added.  
> > 
> > Having a pesistent config written out & then purged later is still
> > problematic. If the host crashes, nothing will purge the config file,
> > so it will become a persistent device. Also when listing devices we
> > want to be able to report whether it is persistent or transient. The
> > obvious way todo that is to simply look if a config file exists or
> > not.
> 
> I was thinking that the config file would identify the device as
> transient, therefore if the system crashed we'd have the opportunity to
> purge those entries on the next boot as we're processing the entries
> for that parent device.  Clearly it has yet to be implemented, but I
> expect there are some advantages to tracking devices via a transient
> config entry or else we're constantly re-discovering foreign mdevs.
> 
> > > > I would simply get rid of the udev rule that magically persists
> > > > stuff. Any person/tool using sysfs right now expects devices to
> > > > be transient. If they want to have persistence they can stop using
> > > > sysfs & use higher level tools directly.  
> > > 
> > > I think it's an interesting feature, but it's easy enough to control
> > > via a global option in sysconfig with the default off if it's seen as
> > > overstepping.  
> > 
> > A global option is really not desirable, as it means that the behaviour
> > of the system that libvirt sees can silently change at any time. IMHO
> > this udev hook is  intermixing the two layers in the stack - keep the
> > low level sysfs layer completely separate from the higher level mgmt
> > concepts provided by this mdevctrl.
> 
> It seems like it just means that libvirt needs to be explicit such that
> it doesn't rely on a global preference, ex. always using a --transient
> option.
> 
> > > > > Originally I thought about making a utility to manage both mdev and
> > > > > SR-IOV VFs all in one, but it seemed more natural to start here
> > > > > (besides, I couldn't think of a good name for the combined utility).
> > > > > If this seems useful, maybe I'll start on a vfctl for SR-IOV and we'll
> > > > > see whether they have enough synergy to become one.    
> > > > 
> > > > [snip]
> > > >   
> > > > > I'm also curious how or if libvirt or openstack might use this.  If
> > > > > nothing else, it makes libvirt hook scripts easier to write, especially
> > > > > if we add an option not to autostart mdevs, or if users don't mind
> > > > > persistent mdevs, maybe there's nothing more to do.    
> > > > 
> > > > We currently have an API for creating host devices in libvirt which
> > > > we use for NPIV devices only, which is where we'd like to put mdev
> > > > creation support.  This API is for creating transient devices
> > > > though, so we don't want anything created this way to magically
> > > > become persistent.
> > > > 
> > > > For persistence we'd create a new API in libvirt allowing you to
> > > > define & undefine the persistent config for a devices, and another
> > > > set of APIs to create/delete from the persistent config.
> > > > 
> > > > As a general rule, libvirt would prefer to use an API rather than
> > > > spawning external commands, but can live with either.  
> > 
> > Thinking some more, the key tasks that libvirt needs to deal
> > with are
> > 
> >  1. Define a persistent config (Must not create any actual device)
> >  2. Undefine a persistent config (Must not delete any actual device)
> >  3. Create a device
> >  4. Delete a device
> >  5. Get list of all persistent configs
> >  6. Enable autostart of a device
> >  7. Disable autostart of a device
> > 
> > For 1, 2, 5, 6 & 7 libvirt doesn't really need a command line tool. As
> > long as there is a specification for the config file syntax and location
> > we can directly read/write/stat files. This would be much more efficient
> > and reliable for libvirt than spawning commands & parsing the output or
> > exit status.
> 
> True, but if the roles were reversed, would libvirt be as eager to have
> another tool writing into a config directory that it manages?  It's
> relatively harmless now, but it might set a bad precedent and lock us
> into config file formats that could otherwise be managed once at
> package upgrade time.

Yes, it is a tradeoff - if we want to hide the format, but avoid spawning
many commands to access this data, then we'll need a library API to
integrate with it, not merely a shell script.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-17 17:05       ` Alex Williamson
  2019-06-18  8:44         ` Daniel P. Berrangé
@ 2019-06-18 11:01         ` Cornelia Huck
       [not found]           ` <CALOCmukPWiXiM+mN0hCTvSwfdHy5UdERU8WnvOXiBrMQ9tH3VA@mail.gmail.com>
  1 sibling, 1 reply; 46+ messages in thread
From: Cornelia Huck @ 2019-06-18 11:01 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Daniel P. Berrangé,
	kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Sylvain Bauza

On Mon, 17 Jun 2019 11:05:17 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Mon, 17 Jun 2019 16:10:30 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Mon, Jun 17, 2019 at 08:54:38AM -0600, Alex Williamson wrote:  
> > > On Mon, 17 Jun 2019 15:00:00 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >     
> > > > On Thu, May 23, 2019 at 05:20:01PM -0600, Alex Williamson wrote:    
> > > > > Hi,
> > > > > 
> > > > > Currently mediated device management, much like SR-IOV VF management,
> > > > > is largely left as an exercise for the user.  This is an attempt to
> > > > > provide something and see where it goes.  I doubt we'll solve
> > > > > everyone's needs on the first pass, but maybe we'll solve enough and
> > > > > provide helpers for the rest.  Without further ado, I'll point to what
> > > > > I have so far:
> > > > > 
> > > > > https://github.com/awilliam/mdevctl
> > > > > 
> > > > > This is inspired by driverctl, which is also a bash utility.  mdevctl
> > > > > uses udev and systemd to record and recreate mdev devices for
> > > > > persistence and provides a command line utility for querying, listing,
> > > > > starting, stopping, adding, and removing mdev devices.  Currently, for
> > > > > better or worse, it considers anything created to be persistent.  I can
> > > > > imagine a global configuration option that might disable this and
> > > > > perhaps an autostart flag per mdev device, such that mdevctl might
> > > > > simply "know" about some mdevs but not attempt to create them
> > > > > automatically.  Clearly command line usage help, man pages, and
> > > > > packaging are lacking as well, release early, release often, plus this
> > > > > is a discussion starter to see if perhaps this is sufficient to meet
> > > > > some needs.      
> > > > 
> > > > I think from libvirt's POV, we would *not* want devices to be made
> > > > unconditionally persistent. We usually wish to expose a choice to
> > > > applications whether to have resources be transient or persistent.
> > > > 
> > > > So from that POV, a global config option to turn off persistence
> > > > is not workable either. We would want control per-device, with
> > > > autostart control per device too.    
> > > 
> > > The code has progressed somewhat in the past 3+ weeks, we still persist
> > > all devices, but the start-up mode can be selected per device or with a
> > > global default mode.  Devices configured with 'auto' start-up
> > > automatically while 'manual' devices are simply known and available to
> > > be started.  I imagine we could add a 'transient' mode where we purge
> > > the information about the device when it is removed or the next time
> > > the parent device is added.    
> > 
> > Having a pesistent config written out & then purged later is still
> > problematic. If the host crashes, nothing will purge the config file,
> > so it will become a persistent device. Also when listing devices we
> > want to be able to report whether it is persistent or transient. The
> > obvious way todo that is to simply look if a config file exists or
> > not.  
> 
> I was thinking that the config file would identify the device as
> transient, therefore if the system crashed we'd have the opportunity to
> purge those entries on the next boot as we're processing the entries
> for that parent device.  Clearly it has yet to be implemented, but I
> expect there are some advantages to tracking devices via a transient
> config entry or else we're constantly re-discovering foreign mdevs.

I think we need to reach consensus about the actual scope of the
mdevctl tool.

- Is it supposed to be responsible for managing *all* mdev devices in
  the system, or is it more supposed to be a convenience helper for
  users/software wanting to manage mdevs?
- Do we want mdevctl to manage config files for individual mdevs, or
  are they supposed to be in a common format that can also be managed
  by e.g. libvirt?
- Should mdevctl be a stand-alone tool, provide library functions, or
  both? Related: should it keep any internal state that is not written
  to disk? (I think that also plays into the transient vs. persistent
  question.)

My personal opinion is that mdevctl should be able to tolerate mdevs
being configured by other means, but probably should not try to impose
its own configuration if it detects that (unless explicitly asked to do
so). Not sure how feasible that goal is.

A well-defined config file format is probably a win, even if it only
ends up being used by mdevctl itself.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
       [not found]           ` <CALOCmukPWiXiM+mN0hCTvSwfdHy5UdERU8WnvOXiBrMQ9tH3VA@mail.gmail.com>
@ 2019-06-18 22:12             ` Alex Williamson
  2019-06-19  7:28               ` Daniel P. Berrangé
       [not found]               ` <CALOCmu=6Xmw-_-SVXujCEcgPY2CQiBQKgfUMJ45WnZ_9XORyUw@mail.gmail.com>
  0 siblings, 2 replies; 46+ messages in thread
From: Alex Williamson @ 2019-06-18 22:12 UTC (permalink / raw)
  To: Sylvain Bauza
  Cc: Cornelia Huck, Daniel P. Berrangé,
	kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina

On Tue, 18 Jun 2019 14:48:11 +0200
Sylvain Bauza <sbauza@redhat.com> wrote:

> On Tue, Jun 18, 2019 at 1:01 PM Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Mon, 17 Jun 2019 11:05:17 -0600
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> >  
> > > On Mon, 17 Jun 2019 16:10:30 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >  
> > > > On Mon, Jun 17, 2019 at 08:54:38AM -0600, Alex Williamson wrote:  
> > > > > On Mon, 17 Jun 2019 15:00:00 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >  
> > > > > > On Thu, May 23, 2019 at 05:20:01PM -0600, Alex Williamson wrote:  
> >  
> > > > > > > Hi,
> > > > > > >
> > > > > > > Currently mediated device management, much like SR-IOV VF  
> > management,  
> > > > > > > is largely left as an exercise for the user.  This is an attempt  
> > to  
> > > > > > > provide something and see where it goes.  I doubt we'll solve
> > > > > > > everyone's needs on the first pass, but maybe we'll solve enough  
> > and  
> > > > > > > provide helpers for the rest.  Without further ado, I'll point  
> > to what  
> > > > > > > I have so far:
> > > > > > >
> > > > > > > https://github.com/awilliam/mdevctl
> > > > > > >
> > > > > > > This is inspired by driverctl, which is also a bash utility.  
> > mdevctl  
> > > > > > > uses udev and systemd to record and recreate mdev devices for
> > > > > > > persistence and provides a command line utility for querying,  
> > listing,  
> > > > > > > starting, stopping, adding, and removing mdev devices.  
> > Currently, for  
> > > > > > > better or worse, it considers anything created to be  
> > persistent.  I can  
> > > > > > > imagine a global configuration option that might disable this and
> > > > > > > perhaps an autostart flag per mdev device, such that mdevctl  
> > might  
> > > > > > > simply "know" about some mdevs but not attempt to create them
> > > > > > > automatically.  Clearly command line usage help, man pages, and
> > > > > > > packaging are lacking as well, release early, release often,  
> > plus this  
> > > > > > > is a discussion starter to see if perhaps this is sufficient to  
> > meet  
> > > > > > > some needs.  
> > > > > >
> > > > > > I think from libvirt's POV, we would *not* want devices to be made
> > > > > > unconditionally persistent. We usually wish to expose a choice to
> > > > > > applications whether to have resources be transient or persistent.
> > > > > >
> > > > > > So from that POV, a global config option to turn off persistence
> > > > > > is not workable either. We would want control per-device, with
> > > > > > autostart control per device too.  
> > > > >
> > > > > The code has progressed somewhat in the past 3+ weeks, we still  
> > persist  
> > > > > all devices, but the start-up mode can be selected per device or  
> > with a  
> > > > > global default mode.  Devices configured with 'auto' start-up
> > > > > automatically while 'manual' devices are simply known and available  
> > to  
> > > > > be started.  I imagine we could add a 'transient' mode where we purge
> > > > > the information about the device when it is removed or the next time
> > > > > the parent device is added.  
> > > >
> > > > Having a pesistent config written out & then purged later is still
> > > > problematic. If the host crashes, nothing will purge the config file,
> > > > so it will become a persistent device. Also when listing devices we
> > > > want to be able to report whether it is persistent or transient. The
> > > > obvious way todo that is to simply look if a config file exists or
> > > > not.  
> > >
> > > I was thinking that the config file would identify the device as
> > > transient, therefore if the system crashed we'd have the opportunity to
> > > purge those entries on the next boot as we're processing the entries
> > > for that parent device.  Clearly it has yet to be implemented, but I
> > > expect there are some advantages to tracking devices via a transient
> > > config entry or else we're constantly re-discovering foreign mdevs.  
> >
> > I think we need to reach consensus about the actual scope of the
> > mdevctl tool.
> >
> >  
> Thanks Cornelia, my thoughts:
> 
> - Is it supposed to be responsible for managing *all* mdev devices in
> >   the system, or is it more supposed to be a convenience helper for
> >   users/software wanting to manage mdevs?
> >  
> 
> The latter. If an operator (or some software) wants to create mdevs by not
> using mdevctl (and rather directly calling the sysfs), I think it's OK.
> That said, mdevs created by mdevctl would be supported by systemctl, while
> the others not but I think it's okay.

I agree (sort of), and I'm hearing that we should drop any sort of
automatic persistence of mdevs created outside of mdevctl.  The problem
comes when we try to draw the line between unmanaged and manged
devices.  For instance, if we have a command to list mdevs it would
feel incomplete if it didn't list all mdevs both those managed by
mdevctl and those created elsewhere.  For managed devices, I expect
we'll also have commands that allow the mode of the device to be
switched between transient, saved, and persistent.  Should a user then
be allowed to promote an unmanaged device to one of these modes via the
same command?  Should they be allowed to stop an unmanaged device
through driverctl?  Through systemctl?  These all seem like reasonable
things to do, so what then is the difference between transient and
unmanaged mdev and is mdevctl therefore managing all mdevs, not just
those it has created?

> - Do we want mdevctl to manage config files for individual mdevs, or
> >   are they supposed to be in a common format that can also be managed
> >   by e.g. libvirt?
> >  
> 
> Unless I misunderstand, I think mdevctl just helps to create mdevs for
> being used by guests created either by libvirt or QEMU or even others.
> How a guest would allocate a mdev (ie. saying "I'll use this specific mdev
> UUID") is IMHO not something for mdevctl.

Right, mdevctl isn't concerned with how a specific mdev is used, but I
think what Connie is after is more the proposal from Daniel where
libvirt can essentially manage mdevctl config files itself and then
only invoke mdevctl for the dirty work of creating and deleting
devices.  In fact, assuming systemd, libvirt could avoid direct
interaction with mdevctl entirely, instead using systemctl device units
to start and stop the mdevs.  Maybe where that proposal takes a turn is
when we again consider non-systemd hosts, where maybe mdevctl needs to
write out an init script per mdev and libvirt injecting itself into
manipulation of the config files would either need to perform the same
or fall back to mdevctl.  Unfortunately there seems to be an ultimatum
to either condone external config file manipulation or expand the scope
of the project into becoming a library.

> - Should mdevctl be a stand-alone tool, provide library functions, or
> >   both? Related: should it keep any internal state that is not written
> >   to disk? (I think that also plays into the transient vs. persistent
> >   question.)

I don't think we want an mdevctld, if that's what you mean by internal
state not written to disk.  I think we ideally want all state in the
mdev config files or discerned through sysfs.  How we handle
non-systemd hosts may throw a wrench in that though since currently the
systemd integration relies on a template to support arbitrary mdevs and
I'm not sure how to replicate that in other init services.  If we need
to dynamically manage per mdev init files in addition to config files,
we're not so self contained.

> FWIW, I'd love using mdevctl for OpenStack (Nova) just at least for
> creating persisted mdevs (ie. mdevs that would be recreated after rebooting
> using systemctl). That's the real use case I need.
> Whether libvirt would internally support mdevctl would be nice but that's
> not really something Nova needs, so I leave others providing their own
> thoughts.
> 
> 
> My personal opinion is that mdevctl should be able to tolerate mdevs
> > being configured by other means, but probably should not try to impose
> > its own configuration if it detects that (unless explicitly asked to do
> > so). Not sure how feasible that goal is.
> >
> > That's what I misunderstand : in order to have a guest using a vGPU, you  
> need to do two things :
> 1/ create the mdev
> 2/ allocate this created dev to a specific guest config
> 
> Of course, we could imagine a way to have both steps to be done directly by
> libvirt, but from my opinion, mdevctl is really helping 1/ and not 2/.

Yep, we also don't want to presume libvirt is the only consumer here.
mdevctl should also support other VM management tools, users who write
their own management scripts, and even non-VM related use cases.

> > A well-defined config file format is probably a win, even if it only
> > ends up being used by mdevctl itself.

Yes, regardless of whether others touch them, conversion scripts on
upgrade should be avoided.  Do we need something beyond a key=value
file?  So far we're only storing the mdev type and startup mode, but
vfio-ap clearly needs more, apparently key=value1,value2,... type
representation.  Still, I think I'd prefer simple over jumping to xml
or json or yaml.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-18 22:12             ` Alex Williamson
@ 2019-06-19  7:28               ` Daniel P. Berrangé
  2019-06-19  9:46                 ` Cornelia Huck
       [not found]               ` <CALOCmu=6Xmw-_-SVXujCEcgPY2CQiBQKgfUMJ45WnZ_9XORyUw@mail.gmail.com>
  1 sibling, 1 reply; 46+ messages in thread
From: Daniel P. Berrangé @ 2019-06-19  7:28 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Sylvain Bauza, Cornelia Huck, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina

On Tue, Jun 18, 2019 at 04:12:10PM -0600, Alex Williamson wrote:
> On Tue, 18 Jun 2019 14:48:11 +0200
> Sylvain Bauza <sbauza@redhat.com> wrote:
> 
> > On Tue, Jun 18, 2019 at 1:01 PM Cornelia Huck <cohuck@redhat.com> wrote:
> > 
> > > On Mon, 17 Jun 2019 11:05:17 -0600
> > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > >  
> > > > On Mon, 17 Jun 2019 16:10:30 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >  
> > > > > On Mon, Jun 17, 2019 at 08:54:38AM -0600, Alex Williamson wrote:  
> > > > > > On Mon, 17 Jun 2019 15:00:00 +0100
> > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >  
> > > > > > > On Thu, May 23, 2019 at 05:20:01PM -0600, Alex Williamson wrote:  
> > >  
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Currently mediated device management, much like SR-IOV VF  
> > > management,  
> > > > > > > > is largely left as an exercise for the user.  This is an attempt  
> > > to  
> > > > > > > > provide something and see where it goes.  I doubt we'll solve
> > > > > > > > everyone's needs on the first pass, but maybe we'll solve enough  
> > > and  
> > > > > > > > provide helpers for the rest.  Without further ado, I'll point  
> > > to what  
> > > > > > > > I have so far:
> > > > > > > >
> > > > > > > > https://github.com/awilliam/mdevctl
> > > > > > > >
> > > > > > > > This is inspired by driverctl, which is also a bash utility.  
> > > mdevctl  
> > > > > > > > uses udev and systemd to record and recreate mdev devices for
> > > > > > > > persistence and provides a command line utility for querying,  
> > > listing,  
> > > > > > > > starting, stopping, adding, and removing mdev devices.  
> > > Currently, for  
> > > > > > > > better or worse, it considers anything created to be  
> > > persistent.  I can  
> > > > > > > > imagine a global configuration option that might disable this and
> > > > > > > > perhaps an autostart flag per mdev device, such that mdevctl  
> > > might  
> > > > > > > > simply "know" about some mdevs but not attempt to create them
> > > > > > > > automatically.  Clearly command line usage help, man pages, and
> > > > > > > > packaging are lacking as well, release early, release often,  
> > > plus this  
> > > > > > > > is a discussion starter to see if perhaps this is sufficient to  
> > > meet  
> > > > > > > > some needs.  
> > > > > > >
> > > > > > > I think from libvirt's POV, we would *not* want devices to be made
> > > > > > > unconditionally persistent. We usually wish to expose a choice to
> > > > > > > applications whether to have resources be transient or persistent.
> > > > > > >
> > > > > > > So from that POV, a global config option to turn off persistence
> > > > > > > is not workable either. We would want control per-device, with
> > > > > > > autostart control per device too.  
> > > > > >
> > > > > > The code has progressed somewhat in the past 3+ weeks, we still  
> > > persist  
> > > > > > all devices, but the start-up mode can be selected per device or  
> > > with a  
> > > > > > global default mode.  Devices configured with 'auto' start-up
> > > > > > automatically while 'manual' devices are simply known and available  
> > > to  
> > > > > > be started.  I imagine we could add a 'transient' mode where we purge
> > > > > > the information about the device when it is removed or the next time
> > > > > > the parent device is added.  
> > > > >
> > > > > Having a pesistent config written out & then purged later is still
> > > > > problematic. If the host crashes, nothing will purge the config file,
> > > > > so it will become a persistent device. Also when listing devices we
> > > > > want to be able to report whether it is persistent or transient. The
> > > > > obvious way todo that is to simply look if a config file exists or
> > > > > not.  
> > > >
> > > > I was thinking that the config file would identify the device as
> > > > transient, therefore if the system crashed we'd have the opportunity to
> > > > purge those entries on the next boot as we're processing the entries
> > > > for that parent device.  Clearly it has yet to be implemented, but I
> > > > expect there are some advantages to tracking devices via a transient
> > > > config entry or else we're constantly re-discovering foreign mdevs.  
> > >
> > > I think we need to reach consensus about the actual scope of the
> > > mdevctl tool.
> > >
> > >  
> > Thanks Cornelia, my thoughts:
> > 
> > - Is it supposed to be responsible for managing *all* mdev devices in
> > >   the system, or is it more supposed to be a convenience helper for
> > >   users/software wanting to manage mdevs?
> > >  
> > 
> > The latter. If an operator (or some software) wants to create mdevs by not
> > using mdevctl (and rather directly calling the sysfs), I think it's OK.
> > That said, mdevs created by mdevctl would be supported by systemctl, while
> > the others not but I think it's okay.
> 
> I agree (sort of), and I'm hearing that we should drop any sort of
> automatic persistence of mdevs created outside of mdevctl.  The problem
> comes when we try to draw the line between unmanaged and manged
> devices.  For instance, if we have a command to list mdevs it would
> feel incomplete if it didn't list all mdevs both those managed by
> mdevctl and those created elsewhere.  For managed devices, I expect
> we'll also have commands that allow the mode of the device to be
> switched between transient, saved, and persistent.  Should a user then
> be allowed to promote an unmanaged device to one of these modes via the
> same command?  Should they be allowed to stop an unmanaged device
> through driverctl?  Through systemctl?  These all seem like reasonable
> things to do, so what then is the difference between transient and
> unmanaged mdev and is mdevctl therefore managing all mdevs, not just
> those it has created?

To my mind there shouldn't really need to be a difference between
transient mdevs created by mdevctrl and mdevs created by an user
directly using sysfs. Both are mdevs on the running system with
no config file that you have to enumerate by looking at sysfs.
This ties back to my belief that we shouldn't need to have any
config on disk for a transient mdev, just discover them all
dynamically when required.
 
> > - Do we want mdevctl to manage config files for individual mdevs, or
> > >   are they supposed to be in a common format that can also be managed
> > >   by e.g. libvirt?
> > >  
> > 
> > Unless I misunderstand, I think mdevctl just helps to create mdevs for
> > being used by guests created either by libvirt or QEMU or even others.
> > How a guest would allocate a mdev (ie. saying "I'll use this specific mdev
> > UUID") is IMHO not something for mdevctl.
> 
> Right, mdevctl isn't concerned with how a specific mdev is used, but I
> think what Connie is after is more the proposal from Daniel where
> libvirt can essentially manage mdevctl config files itself and then
> only invoke mdevctl for the dirty work of creating and deleting
> devices.  In fact, assuming systemd, libvirt could avoid direct
> interaction with mdevctl entirely, instead using systemctl device units
> to start and stop the mdevs.  Maybe where that proposal takes a turn is
> when we again consider non-systemd hosts, where maybe mdevctl needs to
> write out an init script per mdev and libvirt injecting itself into
> manipulation of the config files would either need to perform the same
> or fall back to mdevctl.  Unfortunately there seems to be an ultimatum
> to either condone external config file manipulation or expand the scope
> of the project into becoming a library.

Is mdevctl really tackling a problem that is complex enough that we
will gain significantly by keeping the config files private and forcing
use of a CLI or Library to access them ?  The amount of information
that we need to store per mdev looks pretty small, with minimal
compound structure. To me it feels like we can easily define a standard
config format without suffering any serious long term pain, as the chances
we'd need to radically change it look minimal.

> > - Should mdevctl be a stand-alone tool, provide library functions, or
> > >   both? Related: should it keep any internal state that is not written
> > >   to disk? (I think that also plays into the transient vs. persistent
> > >   question.)
> 
> I don't think we want an mdevctld, if that's what you mean by internal
> state not written to disk.  I think we ideally want all state in the
> mdev config files or discerned through sysfs.  How we handle
> non-systemd hosts may throw a wrench in that though since currently the
> systemd integration relies on a template to support arbitrary mdevs and
> I'm not sure how to replicate that in other init services.  If we need
> to dynamically manage per mdev init files in addition to config files,
> we're not so self contained.

The most important part of the init script integration is just the bulk
creation of mdevs on startup. I think this could be handled on non-systemd
hosts via a fairly dumb init script that does this something approximating
this:

    for dev in `mdevctl list`
    do
        mdevctrl get-autostart $dev
	test $? = 0 && mdevctrl start $dev
    done

ie, iterate over all configs. If the config is marked to autostart,
then start it.

> > FWIW, I'd love using mdevctl for OpenStack (Nova) just at least for
> > creating persisted mdevs (ie. mdevs that would be recreated after rebooting
> > using systemctl). That's the real use case I need.
> > Whether libvirt would internally support mdevctl would be nice but that's
> > not really something Nova needs, so I leave others providing their own
> > thoughts.
> > 
> > 
> > My personal opinion is that mdevctl should be able to tolerate mdevs
> > > being configured by other means, but probably should not try to impose
> > > its own configuration if it detects that (unless explicitly asked to do
> > > so). Not sure how feasible that goal is.
> > >
> > > That's what I misunderstand : in order to have a guest using a vGPU, you  
> > need to do two things :
> > 1/ create the mdev
> > 2/ allocate this created dev to a specific guest config
> > 
> > Of course, we could imagine a way to have both steps to be done directly by
> > libvirt, but from my opinion, mdevctl is really helping 1/ and not 2/.
> 
> Yep, we also don't want to presume libvirt is the only consumer here.
> mdevctl should also support other VM management tools, users who write
> their own management scripts, and even non-VM related use cases.
> 
> > > A well-defined config file format is probably a win, even if it only
> > > ends up being used by mdevctl itself.
> 
> Yes, regardless of whether others touch them, conversion scripts on
> upgrade should be avoided.  Do we need something beyond a key=value
> file?  So far we're only storing the mdev type and startup mode, but
> vfio-ap clearly needs more, apparently key=value1,value2,... type
> representation.  Still, I think I'd prefer simple over jumping to xml
> or json or yaml.  Thanks,

For libvirt our preference would be something we can easily support
without having to write new parsers. I'm not going to suggest XML,
but JSON is probably our highest preference. If not then a simple
flat file with one line of   key="value"  per setting is something
we already parse for /etc/libvirt/libvirtd.conf file. For slightly
more structure the .ini style file is also good. That's basically
just flat  key=value  pairs, but with [section] headers so you can
represent some level of structured data.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-19  7:28               ` Daniel P. Berrangé
@ 2019-06-19  9:46                 ` Cornelia Huck
  2019-06-19 18:46                   ` Alex Williamson
  0 siblings, 1 reply; 46+ messages in thread
From: Cornelia Huck @ 2019-06-19  9:46 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Alex Williamson, Sylvain Bauza, kvm, Libvirt Devel,
	Kirti Wankhede, Erik Skultety, Pavel Hrdina

On Wed, 19 Jun 2019 08:28:02 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Tue, Jun 18, 2019 at 04:12:10PM -0600, Alex Williamson wrote:
> > On Tue, 18 Jun 2019 14:48:11 +0200
> > Sylvain Bauza <sbauza@redhat.com> wrote:
> >   
> > > On Tue, Jun 18, 2019 at 1:01 PM Cornelia Huck <cohuck@redhat.com> wrote:

> > > > I think we need to reach consensus about the actual scope of the
> > > > mdevctl tool.
> > > >
> > > >    
> > > Thanks Cornelia, my thoughts:
> > > 
> > > - Is it supposed to be responsible for managing *all* mdev devices in  
> > > >   the system, or is it more supposed to be a convenience helper for
> > > >   users/software wanting to manage mdevs?
> > > >    
> > > 
> > > The latter. If an operator (or some software) wants to create mdevs by not
> > > using mdevctl (and rather directly calling the sysfs), I think it's OK.
> > > That said, mdevs created by mdevctl would be supported by systemctl, while
> > > the others not but I think it's okay.  
> > 
> > I agree (sort of), and I'm hearing that we should drop any sort of
> > automatic persistence of mdevs created outside of mdevctl.  The problem
> > comes when we try to draw the line between unmanaged and manged
> > devices.  For instance, if we have a command to list mdevs it would
> > feel incomplete if it didn't list all mdevs both those managed by
> > mdevctl and those created elsewhere.  For managed devices, I expect
> > we'll also have commands that allow the mode of the device to be
> > switched between transient, saved, and persistent.  Should a user then

Hm, what's the difference between 'saved' and 'persistent'? That
'saved' devices are not necessarily present?

> > be allowed to promote an unmanaged device to one of these modes via the
> > same command?  Should they be allowed to stop an unmanaged device
> > through driverctl?  Through systemctl?  These all seem like reasonable
> > things to do, so what then is the difference between transient and
> > unmanaged mdev and is mdevctl therefore managing all mdevs, not just
> > those it has created?  
> 
> To my mind there shouldn't really need to be a difference between
> transient mdevs created by mdevctrl and mdevs created by an user
> directly using sysfs. Both are mdevs on the running system with
> no config file that you have to enumerate by looking at sysfs.
> This ties back to my belief that we shouldn't need to have any
> config on disk for a transient mdev, just discover them all
> dynamically when required.

So mdevctl can potentially interact with any mdev device on the system,
it just has to be instructed by a user or software to do so? I think we
can work with that.

>  
> > > - Do we want mdevctl to manage config files for individual mdevs, or  
> > > >   are they supposed to be in a common format that can also be managed
> > > >   by e.g. libvirt?
> > > >    
> > > 
> > > Unless I misunderstand, I think mdevctl just helps to create mdevs for
> > > being used by guests created either by libvirt or QEMU or even others.
> > > How a guest would allocate a mdev (ie. saying "I'll use this specific mdev
> > > UUID") is IMHO not something for mdevctl.  
> > 
> > Right, mdevctl isn't concerned with how a specific mdev is used, but I
> > think what Connie is after is more the proposal from Daniel where
> > libvirt can essentially manage mdevctl config files itself and then
> > only invoke mdevctl for the dirty work of creating and deleting
> > devices.  In fact, assuming systemd, libvirt could avoid direct
> > interaction with mdevctl entirely, instead using systemctl device units
> > to start and stop the mdevs.  Maybe where that proposal takes a turn is
> > when we again consider non-systemd hosts, where maybe mdevctl needs to
> > write out an init script per mdev and libvirt injecting itself into
> > manipulation of the config files would either need to perform the same
> > or fall back to mdevctl.  Unfortunately there seems to be an ultimatum
> > to either condone external config file manipulation or expand the scope
> > of the project into becoming a library.  
> 
> Is mdevctl really tackling a problem that is complex enough that we
> will gain significantly by keeping the config files private and forcing
> use of a CLI or Library to access them ?  The amount of information
> that we need to store per mdev looks pretty small, with minimal
> compound structure. To me it feels like we can easily define a standard
> config format without suffering any serious long term pain, as the chances
> we'd need to radically change it look minimal.

We probably can get some consensus on a file format pretty quickly, I
guess; and I agree that there's probably not enough magic in there to
make future changes painful. So yes, let's try to agree on a format
that various entities can consume.

> 
> > > - Should mdevctl be a stand-alone tool, provide library functions, or  
> > > >   both? Related: should it keep any internal state that is not written
> > > >   to disk? (I think that also plays into the transient vs. persistent
> > > >   question.)  
> > 
> > I don't think we want an mdevctld, if that's what you mean by internal
> > state not written to disk.  I think we ideally want all state in the
> > mdev config files or discerned through sysfs.  How we handle
> > non-systemd hosts may throw a wrench in that though since currently the
> > systemd integration relies on a template to support arbitrary mdevs and
> > I'm not sure how to replicate that in other init services.  If we need
> > to dynamically manage per mdev init files in addition to config files,
> > we're not so self contained.  
> 
> The most important part of the init script integration is just the bulk
> creation of mdevs on startup. I think this could be handled on non-systemd
> hosts via a fairly dumb init script that does this something approximating
> this:
> 
>     for dev in `mdevctl list`
>     do
>         mdevctrl get-autostart $dev
> 	test $? = 0 && mdevctrl start $dev
>     done
> 
> ie, iterate over all configs. If the config is marked to autostart,
> then start it.

How likely is a non-systemd system to not have udev? Can we rely on it
for dynamic management? (Any system will likely have _some_ kind of
uevent-consuming handlers, I guess, unless it is a very specialized
system where mdevctl is unlikely to be used at all.)

> 
> > > FWIW, I'd love using mdevctl for OpenStack (Nova) just at least for
> > > creating persisted mdevs (ie. mdevs that would be recreated after rebooting
> > > using systemctl). That's the real use case I need.
> > > Whether libvirt would internally support mdevctl would be nice but that's
> > > not really something Nova needs, so I leave others providing their own
> > > thoughts.
> > > 
> > > 
> > > My personal opinion is that mdevctl should be able to tolerate mdevs  
> > > > being configured by other means, but probably should not try to impose
> > > > its own configuration if it detects that (unless explicitly asked to do
> > > > so). Not sure how feasible that goal is.
> > > >
> > > > That's what I misunderstand : in order to have a guest using a vGPU, you    
> > > need to do two things :
> > > 1/ create the mdev
> > > 2/ allocate this created dev to a specific guest config
> > > 
> > > Of course, we could imagine a way to have both steps to be done directly by
> > > libvirt, but from my opinion, mdevctl is really helping 1/ and not 2/.  

I don't think mdevctl should care about mdev-to-guest assignments; that
should be done by libvirt, OpenStack, whatever.

> > 
> > Yep, we also don't want to presume libvirt is the only consumer here.
> > mdevctl should also support other VM management tools, users who write
> > their own management scripts, and even non-VM related use cases.
> >   
> > > > A well-defined config file format is probably a win, even if it only
> > > > ends up being used by mdevctl itself.  
> > 
> > Yes, regardless of whether others touch them, conversion scripts on
> > upgrade should be avoided.  Do we need something beyond a key=value
> > file?  So far we're only storing the mdev type and startup mode, but
> > vfio-ap clearly needs more, apparently key=value1,value2,... type
> > representation.  Still, I think I'd prefer simple over jumping to xml
> > or json or yaml.  Thanks,  
> 
> For libvirt our preference would be something we can easily support
> without having to write new parsers. I'm not going to suggest XML,
> but JSON is probably our highest preference. If not then a simple
> flat file with one line of   key="value"  per setting is something
> we already parse for /etc/libvirt/libvirtd.conf file. For slightly
> more structure the .ini style file is also good. That's basically
> just flat  key=value  pairs, but with [section] headers so you can
> represent some level of structured data.

It would be nice if a simple .ini style would cover all things, but I'm
not sure whether vfio-ap isn't already a bit awkward to support with
that. I don't mind JSON much, but would like to avoid XML :)

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
       [not found]               ` <CALOCmu=6Xmw-_-SVXujCEcgPY2CQiBQKgfUMJ45WnZ_9XORyUw@mail.gmail.com>
@ 2019-06-19  9:57                 ` Cornelia Huck
  2019-06-19 19:53                 ` Alex Williamson
  1 sibling, 0 replies; 46+ messages in thread
From: Cornelia Huck @ 2019-06-19  9:57 UTC (permalink / raw)
  To: Sylvain Bauza
  Cc: Alex Williamson, Daniel P. Berrangé,
	kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina

On Wed, 19 Jun 2019 11:04:15 +0200
Sylvain Bauza <sbauza@redhat.com> wrote:

> On Wed, Jun 19, 2019 at 12:27 AM Alex Williamson <alex.williamson@redhat.com>
> wrote:
> 
> > On Tue, 18 Jun 2019 14:48:11 +0200
> > Sylvain Bauza <sbauza@redhat.com> wrote:
> >  
> > > On Tue, Jun 18, 2019 at 1:01 PM Cornelia Huck <cohuck@redhat.com> wrote:

> > > > I think we need to reach consensus about the actual scope of the
> > > > mdevctl tool.
> > > >
> > > >  
> > > Thanks Cornelia, my thoughts:
> > >
> > > - Is it supposed to be responsible for managing *all* mdev devices in  
> > > >   the system, or is it more supposed to be a convenience helper for
> > > >   users/software wanting to manage mdevs?
> > > >  
> > >
> > > The latter. If an operator (or some software) wants to create mdevs by  
> > not  
> > > using mdevctl (and rather directly calling the sysfs), I think it's OK.
> > > That said, mdevs created by mdevctl would be supported by systemctl,  
> > while  
> > > the others not but I think it's okay.  
> >
> > I agree (sort of), and I'm hearing that we should drop any sort of
> > automatic persistence of mdevs created outside of mdevctl.  The problem
> > comes when we try to draw the line between unmanaged and manged
> > devices.  For instance, if we have a command to list mdevs it would
> > feel incomplete if it didn't list all mdevs both those managed by
> > mdevctl and those created elsewhere.  For managed devices, I expect
> > we'll also have commands that allow the mode of the device to be
> > switched between transient, saved, and persistent.  Should a user then
> > be allowed to promote an unmanaged device to one of these modes via the
> > same command?  Should they be allowed to stop an unmanaged device
> > through driverctl?  Through systemctl?  These all seem like reasonable
> > things to do, so what then is the difference between transient and
> > unmanaged mdev and is mdevctl therefore managing all mdevs, not just
> > those it has created?
> >
> >  
> Well, IMHO, mdevs created by mdevctl could all be persisted or transient
> just by adding an option when calling mdevctl, like :
> "mdevctl create-mdev [--transient] <uuid> <pci_id> <type>" where default
> would be persisting the mdev.

This sounds useful; the caller can avoid fiddling with sysfs entries
directly, while not committing to a permanent configuration.

> 
> For mdevs *not* created by mdevctl, then a usecase could be "I'd like to
> ask mdevctl to manage mdevs I already created" and if so, a mdevctl command
> like :
> "mdevctl manage-mdev [--transient] <mdev_uuid>"

What kind of 'managing' would this actually enable? If we rely on
mdevctl working with sysfs directly for transient devices, I can't
really think of anything...

> 
> Of course, that would mean that when you list mdevs by "mdev list-all" you
> wouldn't get mdevs managed by mdevctl.
> Thoughts ?
> 
> > - Do we want mdevctl to manage config files for individual mdevs, or  
> > > >   are they supposed to be in a common format that can also be managed
> > > >   by e.g. libvirt?
> > > >  
> > >
> > > Unless I misunderstand, I think mdevctl just helps to create mdevs for
> > > being used by guests created either by libvirt or QEMU or even others.
> > > How a guest would allocate a mdev (ie. saying "I'll use this specific  
> > mdev  
> > > UUID") is IMHO not something for mdevctl.  
> >
> > Right, mdevctl isn't concerned with how a specific mdev is used, but I
> > think what Connie is after is more the proposal from Daniel where
> > libvirt can essentially manage mdevctl config files itself and then
> > only invoke mdevctl for the dirty work of creating and deleting
> > devices.  In fact, assuming systemd, libvirt could avoid direct
> > interaction with mdevctl entirely, instead using systemctl device units
> > to start and stop the mdevs.  Maybe where that proposal takes a turn is
> > when we again consider non-systemd hosts, where maybe mdevctl needs to
> > write out an init script per mdev and libvirt injecting itself into
> > manipulation of the config files would either need to perform the same
> > or fall back to mdevctl.  Unfortunately there seems to be an ultimatum
> > to either condone external config file manipulation or expand the scope
> > of the project into becoming a library.
> >
> >  
> Well, like I said, I think it's maybe another user case : just using
> libvirt when you want a guest having vGPUs and then libvirt would create
> mdevs (so users wouln't need to know at that).
> That said, for the moment, I think we don't really need it so maybe a new
> RFE once we at least have mdevctl packaged and supported by RHEL ?

If we allow config file handling directly, libvirt could start using it
even without mdevctl present? (Not sure if that makes sense.)

> 
> 
> > - Should mdevctl be a stand-alone tool, provide library functions, or  
> > > >   both? Related: should it keep any internal state that is not written
> > > >   to disk? (I think that also plays into the transient vs. persistent
> > > >   question.)  
> >
> > I don't think we want an mdevctld, if that's what you mean by internal

Yeah, mdevctld--.

> > state not written to disk.  I think we ideally want all state in the
> > mdev config files or discerned through sysfs.  How we handle
> > non-systemd hosts may throw a wrench in that though since currently the
> > systemd integration relies on a template to support arbitrary mdevs and
> > I'm not sure how to replicate that in other init services.  If we need
> > to dynamically manage per mdev init files in addition to config files,
> > we're not so self contained.
> >  
> > > FWIW, I'd love using mdevctl for OpenStack (Nova) just at least for
> > > creating persisted mdevs (ie. mdevs that would be recreated after  
> > rebooting  
> > > using systemctl). That's the real use case I need.
> > > Whether libvirt would internally support mdevctl would be nice but that's
> > > not really something Nova needs, so I leave others providing their own
> > > thoughts.
> > >
> > >
> > > My personal opinion is that mdevctl should be able to tolerate mdevs  
> > > > being configured by other means, but probably should not try to impose
> > > > its own configuration if it detects that (unless explicitly asked to do
> > > > so). Not sure how feasible that goal is.
> > > >
> > > > That's what I misunderstand : in order to have a guest using a vGPU,  
> > you  
> > > need to do two things :
> > > 1/ create the mdev
> > > 2/ allocate this created dev to a specific guest config
> > >
> > > Of course, we could imagine a way to have both steps to be done directly  
> > by  
> > > libvirt, but from my opinion, mdevctl is really helping 1/ and not 2/.  
> >
> > Yep, we also don't want to presume libvirt is the only consumer here.
> > mdevctl should also support other VM management tools, users who write
> > their own management scripts, and even non-VM related use cases.
> >
> >  
> Oh yes, please don't premuse mdevctl is only needed by libvirt.
> FWIW, once mdevctl is supported by RHEL, I'd love to use it for OpenStack
> Nova at least because I want to persist the mdevs.
> At the moment, Nova just creates mdevs directly by sysfs and look the
> existing onces up by sysfs, but upstream community in Nova thinks the
> mission statement is not about managing mdevs so we don't really want to
> add in Nova some kind of DB persistence just for mdevs.

So, Nova would basically poke mdevctl, but not interact with the config
files directly? Or am I misunderstanding?

> 
> > > A well-defined config file format is probably a win, even if it only  
> > > > ends up being used by mdevctl itself.  
> >
> > Yes, regardless of whether others touch them, conversion scripts on
> > upgrade should be avoided.  Do we need something beyond a key=value
> > file?  So far we're only storing the mdev type and startup mode, but
> > vfio-ap clearly needs more, apparently key=value1,value2,... type
> > representation.  Still, I think I'd prefer simple over jumping to xml
> > or json or yaml.  Thanks,
> >
> >  
> Heh, in OpenStack discussing about a file format is possibly one of the
> longest arguments we already have, so I leave others to say their own
> opinions but FWIW, as we use Python we tend to prefer YAML files. I don't
> care about the format tho, just take the most convenient for libvirt I'd
> say.

Aww, and here I was looking forward to a nice file format discussion...

More seriously, as I said in my other reply, .ini style would be good,
but using JSON probably gives us more flexibility in the long run.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-19  9:46                 ` Cornelia Huck
@ 2019-06-19 18:46                   ` Alex Williamson
  2019-06-20  8:24                     ` Daniel P. Berrangé
  0 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-19 18:46 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Daniel P. Berrangé,
	Sylvain Bauza, kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety,
	Pavel Hrdina

On Wed, 19 Jun 2019 11:46:59 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 19 Jun 2019 08:28:02 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Tue, Jun 18, 2019 at 04:12:10PM -0600, Alex Williamson wrote:  
> > > On Tue, 18 Jun 2019 14:48:11 +0200
> > > Sylvain Bauza <sbauza@redhat.com> wrote:
> > >     
> > > > On Tue, Jun 18, 2019 at 1:01 PM Cornelia Huck <cohuck@redhat.com> wrote:  
> 
> > > > > I think we need to reach consensus about the actual scope of the
> > > > > mdevctl tool.
> > > > >
> > > > >      
> > > > Thanks Cornelia, my thoughts:
> > > > 
> > > > - Is it supposed to be responsible for managing *all* mdev devices in    
> > > > >   the system, or is it more supposed to be a convenience helper for
> > > > >   users/software wanting to manage mdevs?
> > > > >      
> > > > 
> > > > The latter. If an operator (or some software) wants to create mdevs by not
> > > > using mdevctl (and rather directly calling the sysfs), I think it's OK.
> > > > That said, mdevs created by mdevctl would be supported by systemctl, while
> > > > the others not but I think it's okay.    
> > > 
> > > I agree (sort of), and I'm hearing that we should drop any sort of
> > > automatic persistence of mdevs created outside of mdevctl.  The problem
> > > comes when we try to draw the line between unmanaged and manged
> > > devices.  For instance, if we have a command to list mdevs it would
> > > feel incomplete if it didn't list all mdevs both those managed by
> > > mdevctl and those created elsewhere.  For managed devices, I expect
> > > we'll also have commands that allow the mode of the device to be
> > > switched between transient, saved, and persistent.  Should a user then  
> 
> Hm, what's the difference between 'saved' and 'persistent'? That
> 'saved' devices are not necessarily present?

It seems like we're coming up with the following classes:

1) transient
  a) mdevctl created
  b) foreign
2) defined
  a) automatic start-up
  b) manual start-up

I was using persistent for 2b), but that's probably not a good name
because devices can still be stopped, so they're not really
persistently available even in this class.

mdevctl today only has defined devices, transient needs to be
implemented, which should lead to a conclusion on whether 1a) and 1b)
are really the same.

> > > be allowed to promote an unmanaged device to one of these modes via the
> > > same command?  Should they be allowed to stop an unmanaged device
> > > through driverctl?  Through systemctl?  These all seem like reasonable
> > > things to do, so what then is the difference between transient and
> > > unmanaged mdev and is mdevctl therefore managing all mdevs, not just
> > > those it has created?    
> > 
> > To my mind there shouldn't really need to be a difference between
> > transient mdevs created by mdevctrl and mdevs created by an user
> > directly using sysfs. Both are mdevs on the running system with
> > no config file that you have to enumerate by looking at sysfs.
> > This ties back to my belief that we shouldn't need to have any
> > config on disk for a transient mdev, just discover them all
> > dynamically when required.  
> 
> So mdevctl can potentially interact with any mdev device on the system,
> it just has to be instructed by a user or software to do so? I think we
> can work with that.

Some TBDs around systemd/init support for transient devices and how
transient devices can be promoted to defined.  For instance if a
vfio-ap device requires matrix programming after instantiation, can we
glean that programming from sysfs or is there metadata irrecoverably
lost if no config file is created for a transient device?  This would
also imply that a 1b) foreign device could not be promoted to 2x)
defined device.

> > > > - Do we want mdevctl to manage config files for individual mdevs, or    
> > > > >   are they supposed to be in a common format that can also be managed
> > > > >   by e.g. libvirt?
> > > > >      
> > > > 
> > > > Unless I misunderstand, I think mdevctl just helps to create mdevs for
> > > > being used by guests created either by libvirt or QEMU or even others.
> > > > How a guest would allocate a mdev (ie. saying "I'll use this specific mdev
> > > > UUID") is IMHO not something for mdevctl.    
> > > 
> > > Right, mdevctl isn't concerned with how a specific mdev is used, but I
> > > think what Connie is after is more the proposal from Daniel where
> > > libvirt can essentially manage mdevctl config files itself and then
> > > only invoke mdevctl for the dirty work of creating and deleting
> > > devices.  In fact, assuming systemd, libvirt could avoid direct
> > > interaction with mdevctl entirely, instead using systemctl device units
> > > to start and stop the mdevs.  Maybe where that proposal takes a turn is
> > > when we again consider non-systemd hosts, where maybe mdevctl needs to
> > > write out an init script per mdev and libvirt injecting itself into
> > > manipulation of the config files would either need to perform the same
> > > or fall back to mdevctl.  Unfortunately there seems to be an ultimatum
> > > to either condone external config file manipulation or expand the scope
> > > of the project into becoming a library.    
> > 
> > Is mdevctl really tackling a problem that is complex enough that we
> > will gain significantly by keeping the config files private and forcing
> > use of a CLI or Library to access them ?  The amount of information
> > that we need to store per mdev looks pretty small, with minimal
> > compound structure. To me it feels like we can easily define a standard
> > config format without suffering any serious long term pain, as the chances
> > we'd need to radically change it look minimal.  
> 
> We probably can get some consensus on a file format pretty quickly, I
> guess; and I agree that there's probably not enough magic in there to
> make future changes painful. So yes, let's try to agree on a format
> that various entities can consume.
> 
> >   
> > > > - Should mdevctl be a stand-alone tool, provide library functions, or    
> > > > >   both? Related: should it keep any internal state that is not written
> > > > >   to disk? (I think that also plays into the transient vs. persistent
> > > > >   question.)    
> > > 
> > > I don't think we want an mdevctld, if that's what you mean by internal
> > > state not written to disk.  I think we ideally want all state in the
> > > mdev config files or discerned through sysfs.  How we handle
> > > non-systemd hosts may throw a wrench in that though since currently the
> > > systemd integration relies on a template to support arbitrary mdevs and
> > > I'm not sure how to replicate that in other init services.  If we need
> > > to dynamically manage per mdev init files in addition to config files,
> > > we're not so self contained.    
> > 
> > The most important part of the init script integration is just the bulk
> > creation of mdevs on startup. I think this could be handled on non-systemd
> > hosts via a fairly dumb init script that does this something approximating
> > this:
> > 
> >     for dev in `mdevctl list`
> >     do
> >         mdevctrl get-autostart $dev
> > 	test $? = 0 && mdevctrl start $dev
> >     done
> > 
> > ie, iterate over all configs. If the config is marked to autostart,
> > then start it.  
> 
> How likely is a non-systemd system to not have udev? Can we rely on it
> for dynamic management? (Any system will likely have _some_ kind of
> uevent-consuming handlers, I guess, unless it is a very specialized
> system where mdevctl is unlikely to be used at all.)

Yep, I'd be uncomfortable with anything that only processed parent
devices once at boot time, we need to expect mdev parent devices (or at
least their support of mdev within that parent driver) to be dynamic.
Churning through devices at a fixed point in boot also only provides the
persistence, not the $INITSYSTEM based management where there seem to
be multiple affirmations that systemctl base management is a good
thing.  How much do we *really* care about non-systemd hosts?  Or
maybe, how much do we really care about feature parity for non-systemd
hosts and could we shrug off anything beyond basic persistence and
mdevctl-based management?

> > > > FWIW, I'd love using mdevctl for OpenStack (Nova) just at least for
> > > > creating persisted mdevs (ie. mdevs that would be recreated after rebooting
> > > > using systemctl). That's the real use case I need.
> > > > Whether libvirt would internally support mdevctl would be nice but that's
> > > > not really something Nova needs, so I leave others providing their own
> > > > thoughts.
> > > > 
> > > > 
> > > > My personal opinion is that mdevctl should be able to tolerate mdevs    
> > > > > being configured by other means, but probably should not try to impose
> > > > > its own configuration if it detects that (unless explicitly asked to do
> > > > > so). Not sure how feasible that goal is.
> > > > >
> > > > > That's what I misunderstand : in order to have a guest using a vGPU, you      
> > > > need to do two things :
> > > > 1/ create the mdev
> > > > 2/ allocate this created dev to a specific guest config
> > > > 
> > > > Of course, we could imagine a way to have both steps to be done directly by
> > > > libvirt, but from my opinion, mdevctl is really helping 1/ and not 2/.    
> 
> I don't think mdevctl should care about mdev-to-guest assignments; that
> should be done by libvirt, OpenStack, whatever.
> 
> > > 
> > > Yep, we also don't want to presume libvirt is the only consumer here.
> > > mdevctl should also support other VM management tools, users who write
> > > their own management scripts, and even non-VM related use cases.
> > >     
> > > > > A well-defined config file format is probably a win, even if it only
> > > > > ends up being used by mdevctl itself.    
> > > 
> > > Yes, regardless of whether others touch them, conversion scripts on
> > > upgrade should be avoided.  Do we need something beyond a key=value
> > > file?  So far we're only storing the mdev type and startup mode, but
> > > vfio-ap clearly needs more, apparently key=value1,value2,... type
> > > representation.  Still, I think I'd prefer simple over jumping to xml
> > > or json or yaml.  Thanks,    
> > 
> > For libvirt our preference would be something we can easily support
> > without having to write new parsers. I'm not going to suggest XML,
> > but JSON is probably our highest preference. If not then a simple
> > flat file with one line of   key="value"  per setting is something
> > we already parse for /etc/libvirt/libvirtd.conf file. For slightly
> > more structure the .ini style file is also good. That's basically
> > just flat  key=value  pairs, but with [section] headers so you can
> > represent some level of structured data.  
> 
> It would be nice if a simple .ini style would cover all things, but I'm
> not sure whether vfio-ap isn't already a bit awkward to support with
> that. I don't mind JSON much, but would like to avoid XML :)

I think the vfio-ap issue is simply that we have multiple values for
the same key, if we choose our key to be sysfs attributes, but we can
either redefine our key or the value or the config format to resolve
this.  For instance we can add a sequence number to the key such that
we maintain a 1:1/key:value and interpret the key accordingly.  We can
also escape the value into an array and apply each value in sequence
to the key.  Does something in JSON provide an obviously better
solution to either of these?  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
       [not found]               ` <CALOCmu=6Xmw-_-SVXujCEcgPY2CQiBQKgfUMJ45WnZ_9XORyUw@mail.gmail.com>
  2019-06-19  9:57                 ` Cornelia Huck
@ 2019-06-19 19:53                 ` Alex Williamson
  1 sibling, 0 replies; 46+ messages in thread
From: Alex Williamson @ 2019-06-19 19:53 UTC (permalink / raw)
  To: Sylvain Bauza
  Cc: Cornelia Huck, Daniel P. Berrangé,
	kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina

On Wed, 19 Jun 2019 11:04:15 +0200
Sylvain Bauza <sbauza@redhat.com> wrote:

> On Wed, Jun 19, 2019 at 12:27 AM Alex Williamson <alex.williamson@redhat.com>
> wrote:
> 
> > On Tue, 18 Jun 2019 14:48:11 +0200
> > Sylvain Bauza <sbauza@redhat.com> wrote:
> >  
> > > On Tue, Jun 18, 2019 at 1:01 PM Cornelia Huck <cohuck@redhat.com> wrote:
> > >  
> > > > On Mon, 17 Jun 2019 11:05:17 -0600
> > > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > > >  
> > > > > On Mon, 17 Jun 2019 16:10:30 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >  
> > > > > > On Mon, Jun 17, 2019 at 08:54:38AM -0600, Alex Williamson wrote:  
> > > > > > > On Mon, 17 Jun 2019 15:00:00 +0100
> > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > >  
> > > > > > > > On Thu, May 23, 2019 at 05:20:01PM -0600, Alex Williamson  
> > wrote:  
> > > >  
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > Currently mediated device management, much like SR-IOV VF  
> > > > management,  
> > > > > > > > > is largely left as an exercise for the user.  This is an  
> > attempt  
> > > > to  
> > > > > > > > > provide something and see where it goes.  I doubt we'll solve
> > > > > > > > > everyone's needs on the first pass, but maybe we'll solve  
> > enough  
> > > > and  
> > > > > > > > > provide helpers for the rest.  Without further ado, I'll  
> > point  
> > > > to what  
> > > > > > > > > I have so far:
> > > > > > > > >
> > > > > > > > > https://github.com/awilliam/mdevctl
> > > > > > > > >
> > > > > > > > > This is inspired by driverctl, which is also a bash  
> > utility.  
> > > > mdevctl  
> > > > > > > > > uses udev and systemd to record and recreate mdev devices for
> > > > > > > > > persistence and provides a command line utility for  
> > querying,  
> > > > listing,  
> > > > > > > > > starting, stopping, adding, and removing mdev devices.  
> > > > Currently, for  
> > > > > > > > > better or worse, it considers anything created to be  
> > > > persistent.  I can  
> > > > > > > > > imagine a global configuration option that might disable  
> > this and  
> > > > > > > > > perhaps an autostart flag per mdev device, such that  
> > mdevctl  
> > > > might  
> > > > > > > > > simply "know" about some mdevs but not attempt to create them
> > > > > > > > > automatically.  Clearly command line usage help, man pages,  
> > and  
> > > > > > > > > packaging are lacking as well, release early, release  
> > often,  
> > > > plus this  
> > > > > > > > > is a discussion starter to see if perhaps this is sufficient  
> > to  
> > > > meet  
> > > > > > > > > some needs.  
> > > > > > > >
> > > > > > > > I think from libvirt's POV, we would *not* want devices to be  
> > made  
> > > > > > > > unconditionally persistent. We usually wish to expose a choice  
> > to  
> > > > > > > > applications whether to have resources be transient or  
> > persistent.  
> > > > > > > >
> > > > > > > > So from that POV, a global config option to turn off  
> > persistence  
> > > > > > > > is not workable either. We would want control per-device, with
> > > > > > > > autostart control per device too.  
> > > > > > >
> > > > > > > The code has progressed somewhat in the past 3+ weeks, we still  
> > > > persist  
> > > > > > > all devices, but the start-up mode can be selected per device  
> > or  
> > > > with a  
> > > > > > > global default mode.  Devices configured with 'auto' start-up
> > > > > > > automatically while 'manual' devices are simply known and  
> > available  
> > > > to  
> > > > > > > be started.  I imagine we could add a 'transient' mode where we  
> > purge  
> > > > > > > the information about the device when it is removed or the next  
> > time  
> > > > > > > the parent device is added.  
> > > > > >
> > > > > > Having a pesistent config written out & then purged later is still
> > > > > > problematic. If the host crashes, nothing will purge the config  
> > file,  
> > > > > > so it will become a persistent device. Also when listing devices we
> > > > > > want to be able to report whether it is persistent or transient.  
> > The  
> > > > > > obvious way todo that is to simply look if a config file exists or
> > > > > > not.  
> > > > >
> > > > > I was thinking that the config file would identify the device as
> > > > > transient, therefore if the system crashed we'd have the opportunity  
> > to  
> > > > > purge those entries on the next boot as we're processing the entries
> > > > > for that parent device.  Clearly it has yet to be implemented, but I
> > > > > expect there are some advantages to tracking devices via a transient
> > > > > config entry or else we're constantly re-discovering foreign mdevs.  
> > > >
> > > > I think we need to reach consensus about the actual scope of the
> > > > mdevctl tool.
> > > >
> > > >  
> > > Thanks Cornelia, my thoughts:
> > >
> > > - Is it supposed to be responsible for managing *all* mdev devices in  
> > > >   the system, or is it more supposed to be a convenience helper for
> > > >   users/software wanting to manage mdevs?
> > > >  
> > >
> > > The latter. If an operator (or some software) wants to create mdevs by  
> > not  
> > > using mdevctl (and rather directly calling the sysfs), I think it's OK.
> > > That said, mdevs created by mdevctl would be supported by systemctl,  
> > while  
> > > the others not but I think it's okay.  
> >
> > I agree (sort of), and I'm hearing that we should drop any sort of
> > automatic persistence of mdevs created outside of mdevctl.  The problem
> > comes when we try to draw the line between unmanaged and manged
> > devices.  For instance, if we have a command to list mdevs it would
> > feel incomplete if it didn't list all mdevs both those managed by
> > mdevctl and those created elsewhere.  For managed devices, I expect
> > we'll also have commands that allow the mode of the device to be
> > switched between transient, saved, and persistent.  Should a user then
> > be allowed to promote an unmanaged device to one of these modes via the
> > same command?  Should they be allowed to stop an unmanaged device
> > through driverctl?  Through systemctl?  These all seem like reasonable
> > things to do, so what then is the difference between transient and
> > unmanaged mdev and is mdevctl therefore managing all mdevs, not just
> > those it has created?
> >
> >  
> Well, IMHO, mdevs created by mdevctl could all be persisted or transient
> just by adding an option when calling mdevctl, like :
> "mdevctl create-mdev [--transient] <uuid> <pci_id> <type>" where default
> would be persisting the mdev.
> 
> For mdevs *not* created by mdevctl, then a usecase could be "I'd like to
> ask mdevctl to manage mdevs I already created" and if so, a mdevctl command
> like :
> "mdevctl manage-mdev [--transient] <mdev_uuid>"
> 
> Of course, that would mean that when you list mdevs by "mdev list-all" you
> wouldn't get mdevs managed by mdevctl.
> Thoughts ?

Is there a missing 'not' in the previous sentence ("...wouldn't get
mdevs *not* managed by mdevctl") or are you suggesting list-all is
actually more like a list-foreign, or maybe list-unmanaged?  I think we
want to provide an interface for a user to see all mdev devices,
transient/{un}managed and defined so that they can make sense of
available instances when we list the types.  Imagine an NVIDIA GRID
environment which only supports heterogeneous mdev types per parent
where unmanaged mdev instances exist.  The available instances fields
when listing the types might show none available to create, but the mdev
listing also shows none that have been created.  That's confusing.  So
we need a way to list all mdevs, and you're even including a way to
promote an unmanaged mdev to managed, so I think we're always managing
all mdevs to some extent.  If we take Daniels suggestion that managed
transient devices should have no on-disk config, then what does the
following command actually do:

# mdevctl manage-mdev --transient <mdev_uuid>

That would imply there's state that's not in a config file that
differentiates this mdev from one created outside of mdevctl.  So all
signs to me are pointing that there is not a clear separation of
managed vs unmanaged devices.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-19 18:46                   ` Alex Williamson
@ 2019-06-20  8:24                     ` Daniel P. Berrangé
  0 siblings, 0 replies; 46+ messages in thread
From: Daniel P. Berrangé @ 2019-06-20  8:24 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Cornelia Huck, Sylvain Bauza, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina

On Wed, Jun 19, 2019 at 12:46:33PM -0600, Alex Williamson wrote:
> On Wed, 19 Jun 2019 11:46:59 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Wed, 19 Jun 2019 08:28:02 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > > On Tue, Jun 18, 2019 at 04:12:10PM -0600, Alex Williamson wrote:  
> > > > On Tue, 18 Jun 2019 14:48:11 +0200
> > > > Sylvain Bauza <sbauza@redhat.com> wrote:
> > > >     
> > > > > On Tue, Jun 18, 2019 at 1:01 PM Cornelia Huck <cohuck@redhat.com> wrote:  
> > 
> > > > > > I think we need to reach consensus about the actual scope of the
> > > > > > mdevctl tool.
> > > > > >
> > > > > >      
> > > > > Thanks Cornelia, my thoughts:
> > > > > 
> > > > > - Is it supposed to be responsible for managing *all* mdev devices in    
> > > > > >   the system, or is it more supposed to be a convenience helper for
> > > > > >   users/software wanting to manage mdevs?
> > > > > >      
> > > > > 
> > > > > The latter. If an operator (or some software) wants to create mdevs by not
> > > > > using mdevctl (and rather directly calling the sysfs), I think it's OK.
> > > > > That said, mdevs created by mdevctl would be supported by systemctl, while
> > > > > the others not but I think it's okay.    
> > > > 
> > > > I agree (sort of), and I'm hearing that we should drop any sort of
> > > > automatic persistence of mdevs created outside of mdevctl.  The problem
> > > > comes when we try to draw the line between unmanaged and manged
> > > > devices.  For instance, if we have a command to list mdevs it would
> > > > feel incomplete if it didn't list all mdevs both those managed by
> > > > mdevctl and those created elsewhere.  For managed devices, I expect
> > > > we'll also have commands that allow the mode of the device to be
> > > > switched between transient, saved, and persistent.  Should a user then  
> > 
> > Hm, what's the difference between 'saved' and 'persistent'? That
> > 'saved' devices are not necessarily present?
> 
> It seems like we're coming up with the following classes:
> 
> 1) transient
>   a) mdevctl created
>   b) foreign
> 2) defined
>   a) automatic start-up
>   b) manual start-up
> 
> I was using persistent for 2b), but that's probably not a good name
> because devices can still be stopped, so they're not really
> persistently available even in this class.

NB, for terminology  when libvirt calls something "persistent" it just
means that there's a configuration file recorded on disk, thus when you
stop the thing, you can still query its config & restart it from that
same config later. 

The best solution for libvirt would be to cope with all 4 of those
classes. 1b is the least important for us, so not the end of the
world if it was missing.

> > > To my mind there shouldn't really need to be a difference between
> > > transient mdevs created by mdevctrl and mdevs created by an user
> > > directly using sysfs. Both are mdevs on the running system with
> > > no config file that you have to enumerate by looking at sysfs.
> > > This ties back to my belief that we shouldn't need to have any
> > > config on disk for a transient mdev, just discover them all
> > > dynamically when required.  
> > 
> > So mdevctl can potentially interact with any mdev device on the system,
> > it just has to be instructed by a user or software to do so? I think we
> > can work with that.
> 
> Some TBDs around systemd/init support for transient devices and how
> transient devices can be promoted to defined.  For instance if a
> vfio-ap device requires matrix programming after instantiation, can we
> glean that programming from sysfs or is there metadata irrecoverably
> lost if no config file is created for a transient device?  This would
> also imply that a 1b) foreign device could not be promoted to 2x)
> defined device.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-05-23 23:20 mdevctl: A shoestring mediated device management and persistence utility Alex Williamson
                   ` (2 preceding siblings ...)
  2019-06-17 14:00 ` Daniel P. Berrangé
@ 2019-06-25 22:52 ` Alex Williamson
  2019-06-26  9:58   ` Cornelia Huck
  3 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-25 22:52 UTC (permalink / raw)
  To: kvm
  Cc: Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza, Cornelia Huck, Christophe de Dinechin

Hi,

Based on the discussions we've had, I've rewritten the bulk of
mdevctl.  I think it largely does everything we want now, modulo
devices that will need some sort of 1:N values per key for
configuration in the config file versus the 1:1 key:value setup we
currently have (so don't consider the format final just yet).

We now support "transient" devices and there's no distinction or
difference in handling of such devices whether they're created by
mdevctl or externally.  All devices will also have systemd management,
though systemd is no longer required, it's used if available.  The
instance name used for systemd device units has also changed to allow
us to use BindsTo= such that services are not only created, but are
also removed if the device is removed.  Unfortunately it's not a simple
UUID via the systemd route any longer.

Since the original posting, the project has moved from my personal
github to here:

https://github.com/mdevctl/mdevctl

Please see the README there for overview of the new commands and
example of their usage.  There is no attempt to maintain backwards
compatibility with previous versions, this tool is in its infancy.
Also since the original posting, RPM packaging is included, so simply
run 'make rpm' and install the resulting package.

Highlights of this version include proper argument parsing via getopts
so that options can be provided in any order.  I'm still using the
format 'mdevctl {command} [options]' but now it's consistent that all
the options come after the command, in any order.  I think this is
relatively consistent with a variety of other tools.

Devices are no longer automatically persisted, we handle them as
transient, but we also can promote them to persistent through the
'define' command.  The define, undefine, and modify commands all
operate only on the config file, so that we can define separate from
creating.  When promoting from a transient to defined device, we can
use the existing device to create the config.  Both the type and the
startup of a device can be modified in the config, without affecting
the running device.

Starting an mdev device no longer relies exclusively on a saved config,
the device can be fully specified via options to create a transient
device.  Specifying only a uuid is also sufficient for a defined
device.  Some consideration has also been given to uuid collisions.
The mdev interface in the kernel prevents multiple mdevs with the same
uuid running concurrently, but mdevctl allows mdevs to be defined with
the same uuid under separate parent devices.  Some options therefore
allow both a uuid and parent to be specified and require this if the
uuid alone is ambiguous.  Clearly starting two such devices at the same
time will fail and is left to higher level tools to manage, just like
the ability to define more devices than there are available instances on
the host system.

The stop and list commands are largely the same ideas as previous
though the semantics are completely different.  Listing running devices
now notes which are defined versus transient.  Perhaps it might also be
useful when listing defined devices to note which are running.

The sbin/libexec split of mdevctl has been squashed.  There are some
commands in the script that are currently only intended to be used from
udev or systemd, these are simply excluded from the help.  It's
possible we may want to promote the start-parent-mdevs command out of
this class, but the rest are specifically systemd helpers.

I'll include the current help test message below for further semantic
details, but please have a look-see, or better yet give it a try.
Thanks,

Alex

PS - I'm looking at adding udev change events when a device registers
or unregisters with the mdev core, which should help us know when to
trigger creation of persistent, auto started devices.  That support is
included here with the MDEV_STATE="registered|unregistered" environment
values.  Particularly, kvmgt now supports dynamic loading an unloading,
so as long as the enable_gvt=1 option is provided to the i915 driver
mdev support can come and go independent of the parent device.  The
change uevents are necessary to trigger on that, so I'd appreciate any
feedback on those as well.  Until then, the persistence of mdevctl
really depends on mdev support on the parent device being _completely_
setup prior to processing the udev rules.

# mdevctl
Usage: mdevctl {COMMAND} [options...]

Available commands:
define		Define a config for an mdev device.  Options:
	<-u|--uuid=UUID> [<-p|--parent=PARENT> <-t|--type=TYPE>] [-a|--auto]
		If the device specified by the UUID currently exists, parent
		and type may be omitted to use the existing values. The auto
		option marks the device to start on parent availability.
		Running devices are unaffected by this command.
undefine	Undefine, or remove a config for an mdev device.  Options:
	<-u|--uuid=UUID> [-p|--parent=PARENT]
		If a UUID exists for multiple parents, all will be removed
		unless a parent is specified.  Running devices are unaffected
		by this command.
modify		Modify the config for a defined mdev device.  Options:
	<-u|--uuid=UUID> [-p|--parent=PARENT] [-t|--type=TYPE] \
	[[-a|--auto]|[-m|--manual]]
		The parent option further identifies a UUID if it is not
		unique, the parent for a device cannot be modified via this
		command, undefine and re-define should be used instead.  The
		mdev type and startup mode can be modified.  Running devices
		are unaffected by this command.
start		Start an mdev device.  Options:
	<-u|--uuid=UUID> [-p|--parent=PARENT] [-t|--type=TYPE]
		If the UUID is previously defined and unique, the UUID is
		sufficient to start the device (UUIDs may not collide between
		running devices).  If a UUID is used in multiple defined
		configs, the parent device is necessary to identify the config.
		Specifying UUID, PARENT, and TYPE allows devices to be started
		regardless of a previously defined config (ie. transient mdevs).
stop		Stop an mdev device.  Options:
	<-u|--uuid=UUID>
list		List mdev devices.  Options:
	[-d|--defined]|[-t|--types]
		With no options, information about the currently running mdev
		devices is provided.  Specifing DEFINED lists the configuration
		of defined devices, regardless of their running state.
		Specifying TYPES lists the mdev types provided by the currently
		registered mdev parent devices on the system.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-25 22:52 ` Alex Williamson
@ 2019-06-26  9:58   ` Cornelia Huck
  2019-06-26 14:37     ` Alex Williamson
  0 siblings, 1 reply; 46+ messages in thread
From: Cornelia Huck @ 2019-06-26  9:58 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Matthew Rosato

On Tue, 25 Jun 2019 16:52:51 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> Hi,
> 
> Based on the discussions we've had, I've rewritten the bulk of
> mdevctl.  I think it largely does everything we want now, modulo
> devices that will need some sort of 1:N values per key for
> configuration in the config file versus the 1:1 key:value setup we
> currently have (so don't consider the format final just yet).

We might want to factor out that config format handling while we're
trying to finalize it.

cc:ing Matt for his awareness. I'm currently not quite sure how to
handle those vfio-ap "write several values to an attribute one at a
time" requirements. Maybe 1:N key:value is the way to go; maybe we
need/want JSON or something like that.

> 
> We now support "transient" devices and there's no distinction or
> difference in handling of such devices whether they're created by
> mdevctl or externally.  All devices will also have systemd management,
> though systemd is no longer required, it's used if available.  The
> instance name used for systemd device units has also changed to allow
> us to use BindsTo= such that services are not only created, but are
> also removed if the device is removed.  Unfortunately it's not a simple
> UUID via the systemd route any longer.

That's a bit unfortunate; however, making it workable without systemd
certainly is a good thing :)

> 
> Since the original posting, the project has moved from my personal
> github to here:
> 
> https://github.com/mdevctl/mdevctl
> 
> Please see the README there for overview of the new commands and
> example of their usage.  There is no attempt to maintain backwards
> compatibility with previous versions, this tool is in its infancy.
> Also since the original posting, RPM packaging is included, so simply
> run 'make rpm' and install the resulting package.

Nice.

> 
> Highlights of this version include proper argument parsing via getopts
> so that options can be provided in any order.  I'm still using the
> format 'mdevctl {command} [options]' but now it's consistent that all
> the options come after the command, in any order.  I think this is
> relatively consistent with a variety of other tools.

Parsing via getops is also very nice.

> 
> Devices are no longer automatically persisted, we handle them as
> transient, but we also can promote them to persistent through the
> 'define' command.  The define, undefine, and modify commands all
> operate only on the config file, so that we can define separate from
> creating.  When promoting from a transient to defined device, we can
> use the existing device to create the config.  Both the type and the
> startup of a device can be modified in the config, without affecting
> the running device.
> 
> Starting an mdev device no longer relies exclusively on a saved config,
> the device can be fully specified via options to create a transient
> device.  Specifying only a uuid is also sufficient for a defined
> device.  Some consideration has also been given to uuid collisions.
> The mdev interface in the kernel prevents multiple mdevs with the same
> uuid running concurrently, but mdevctl allows mdevs to be defined with
> the same uuid under separate parent devices.  Some options therefore
> allow both a uuid and parent to be specified and require this if the
> uuid alone is ambiguous.  Clearly starting two such devices at the same
> time will fail and is left to higher level tools to manage, just like
> the ability to define more devices than there are available instances on
> the host system.

I still have to look into the details of this.

> 
> The stop and list commands are largely the same ideas as previous
> though the semantics are completely different.  Listing running devices
> now notes which are defined versus transient.  Perhaps it might also be
> useful when listing defined devices to note which are running.

Yes, I think it would be useful.

> 
> The sbin/libexec split of mdevctl has been squashed.  There are some
> commands in the script that are currently only intended to be used from
> udev or systemd, these are simply excluded from the help.  It's
> possible we may want to promote the start-parent-mdevs command out of
> this class, but the rest are specifically systemd helpers.
> 
> I'll include the current help test message below for further semantic
> details, but please have a look-see, or better yet give it a try.

Had a quick look, sent two pull requests with some minor tweaks. Still
have to do a proper review, and actually try the thing on an s390.

> Thanks,
> 
> Alex
> 
> PS - I'm looking at adding udev change events when a device registers
> or unregisters with the mdev core, which should help us know when to
> trigger creation of persistent, auto started devices.  That support is
> included here with the MDEV_STATE="registered|unregistered" environment
> values.  Particularly, kvmgt now supports dynamic loading an unloading,
> so as long as the enable_gvt=1 option is provided to the i915 driver
> mdev support can come and go independent of the parent device.  The
> change uevents are necessary to trigger on that, so I'd appreciate any
> feedback on those as well.  Until then, the persistence of mdevctl
> really depends on mdev support on the parent device being _completely_
> setup prior to processing the udev rules.

I'll need to think about this. Do you have some preliminary patches
somewhere?

> 
> # mdevctl
> Usage: mdevctl {COMMAND} [options...]
> 
> Available commands:
> define		Define a config for an mdev device.  Options:
> 	<-u|--uuid=UUID> [<-p|--parent=PARENT> <-t|--type=TYPE>] [-a|--auto]
> 		If the device specified by the UUID currently exists, parent
> 		and type may be omitted to use the existing values. The auto
> 		option marks the device to start on parent availability.
> 		Running devices are unaffected by this command.
> undefine	Undefine, or remove a config for an mdev device.  Options:
> 	<-u|--uuid=UUID> [-p|--parent=PARENT]
> 		If a UUID exists for multiple parents, all will be removed
> 		unless a parent is specified.  Running devices are unaffected
> 		by this command.
> modify		Modify the config for a defined mdev device.  Options:
> 	<-u|--uuid=UUID> [-p|--parent=PARENT] [-t|--type=TYPE] \
> 	[[-a|--auto]|[-m|--manual]]
> 		The parent option further identifies a UUID if it is not
> 		unique, the parent for a device cannot be modified via this
> 		command, undefine and re-define should be used instead.  The
> 		mdev type and startup mode can be modified.  Running devices
> 		are unaffected by this command.
> start		Start an mdev device.  Options:
> 	<-u|--uuid=UUID> [-p|--parent=PARENT] [-t|--type=TYPE]
> 		If the UUID is previously defined and unique, the UUID is
> 		sufficient to start the device (UUIDs may not collide between
> 		running devices).  If a UUID is used in multiple defined
> 		configs, the parent device is necessary to identify the config.
> 		Specifying UUID, PARENT, and TYPE allows devices to be started
> 		regardless of a previously defined config (ie. transient mdevs).
> stop		Stop an mdev device.  Options:
> 	<-u|--uuid=UUID>
> list		List mdev devices.  Options:
> 	[-d|--defined]|[-t|--types]
> 		With no options, information about the currently running mdev
> 		devices is provided.  Specifing DEFINED lists the configuration
> 		of defined devices, regardless of their running state.
> 		Specifying TYPES lists the mdev types provided by the currently
> 		registered mdev parent devices on the system.


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-26  9:58   ` Cornelia Huck
@ 2019-06-26 14:37     ` Alex Williamson
  2019-06-27  1:53       ` Alex Williamson
  0 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-26 14:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Matthew Rosato

On Wed, 26 Jun 2019 11:58:06 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Tue, 25 Jun 2019 16:52:51 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > Hi,
> > 
> > Based on the discussions we've had, I've rewritten the bulk of
> > mdevctl.  I think it largely does everything we want now, modulo
> > devices that will need some sort of 1:N values per key for
> > configuration in the config file versus the 1:1 key:value setup we
> > currently have (so don't consider the format final just yet).  
> 
> We might want to factor out that config format handling while we're
> trying to finalize it.
> 
> cc:ing Matt for his awareness. I'm currently not quite sure how to
> handle those vfio-ap "write several values to an attribute one at a
> time" requirements. Maybe 1:N key:value is the way to go; maybe we
> need/want JSON or something like that.

Maybe we should just do JSON for future flexibility.  I assume there
are lots of helpers that should make it easy even from a bash script.
I'll look at that next.

> > We now support "transient" devices and there's no distinction or
> > difference in handling of such devices whether they're created by
> > mdevctl or externally.  All devices will also have systemd management,
> > though systemd is no longer required, it's used if available.  The
> > instance name used for systemd device units has also changed to allow
> > us to use BindsTo= such that services are not only created, but are
> > also removed if the device is removed.  Unfortunately it's not a simple
> > UUID via the systemd route any longer.  
> 
> That's a bit unfortunate; however, making it workable without systemd
> certainly is a good thing :)

The "decoder ring" is simply that the instance value takes the systemd
device path of the mdev device itself.  The mdev device is named by the
uuid and is created under the parent device, so we just need to get the
realpath of the parent, append the uuid, and encode it with
systemd-escape.  It's not magic, but it's  not trivial on the command
line either.  We could add a command to mdevctl to print this, but it
doesn't make much sense to call into mdevctl for that and not simply
control the device via mdevctl.

> > Since the original posting, the project has moved from my personal
> > github to here:
> > 
> > https://github.com/mdevctl/mdevctl
> > 
> > Please see the README there for overview of the new commands and
> > example of their usage.  There is no attempt to maintain backwards
> > compatibility with previous versions, this tool is in its infancy.
> > Also since the original posting, RPM packaging is included, so simply
> > run 'make rpm' and install the resulting package.  
> 
> Nice.
> 
> > 
> > Highlights of this version include proper argument parsing via getopts
> > so that options can be provided in any order.  I'm still using the
> > format 'mdevctl {command} [options]' but now it's consistent that all
> > the options come after the command, in any order.  I think this is
> > relatively consistent with a variety of other tools.  
> 
> Parsing via getops is also very nice.
> 
> > 
> > Devices are no longer automatically persisted, we handle them as
> > transient, but we also can promote them to persistent through the
> > 'define' command.  The define, undefine, and modify commands all
> > operate only on the config file, so that we can define separate from
> > creating.  When promoting from a transient to defined device, we can
> > use the existing device to create the config.  Both the type and the
> > startup of a device can be modified in the config, without affecting
> > the running device.
> > 
> > Starting an mdev device no longer relies exclusively on a saved config,
> > the device can be fully specified via options to create a transient
> > device.  Specifying only a uuid is also sufficient for a defined
> > device.  Some consideration has also been given to uuid collisions.
> > The mdev interface in the kernel prevents multiple mdevs with the same
> > uuid running concurrently, but mdevctl allows mdevs to be defined with
> > the same uuid under separate parent devices.  Some options therefore
> > allow both a uuid and parent to be specified and require this if the
> > uuid alone is ambiguous.  Clearly starting two such devices at the same
> > time will fail and is left to higher level tools to manage, just like
> > the ability to define more devices than there are available instances on
> > the host system.  
> 
> I still have to look into the details of this.
> 
> > 
> > The stop and list commands are largely the same ideas as previous
> > though the semantics are completely different.  Listing running devices
> > now notes which are defined versus transient.  Perhaps it might also be
> > useful when listing defined devices to note which are running.  
> 
> Yes, I think it would be useful.
> 
> > 
> > The sbin/libexec split of mdevctl has been squashed.  There are some
> > commands in the script that are currently only intended to be used from
> > udev or systemd, these are simply excluded from the help.  It's
> > possible we may want to promote the start-parent-mdevs command out of
> > this class, but the rest are specifically systemd helpers.
> > 
> > I'll include the current help test message below for further semantic
> > details, but please have a look-see, or better yet give it a try.  
> 
> Had a quick look, sent two pull requests with some minor tweaks. Still
> have to do a proper review, and actually try the thing on an s390.

Pull requests merged, thanks!

> > 
> > PS - I'm looking at adding udev change events when a device registers
> > or unregisters with the mdev core, which should help us know when to
> > trigger creation of persistent, auto started devices.  That support is
> > included here with the MDEV_STATE="registered|unregistered" environment
> > values.  Particularly, kvmgt now supports dynamic loading an unloading,
> > so as long as the enable_gvt=1 option is provided to the i915 driver
> > mdev support can come and go independent of the parent device.  The
> > change uevents are necessary to trigger on that, so I'd appreciate any
> > feedback on those as well.  Until then, the persistence of mdevctl
> > really depends on mdev support on the parent device being _completely_
> > setup prior to processing the udev rules.  
> 
> I'll need to think about this. Do you have some preliminary patches
> somewhere?

I posted what I'm working with, you're cc'd, but for the benefit of the
rest of the list: https://lkml.org/lkml/2019/6/26/574
Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-26 14:37     ` Alex Williamson
@ 2019-06-27  1:53       ` Alex Williamson
  2019-06-27 12:26         ` Cornelia Huck
  0 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-27  1:53 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Matthew Rosato

On Wed, 26 Jun 2019 08:37:20 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Wed, 26 Jun 2019 11:58:06 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Tue, 25 Jun 2019 16:52:51 -0600
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> >   
> > > Hi,
> > > 
> > > Based on the discussions we've had, I've rewritten the bulk of
> > > mdevctl.  I think it largely does everything we want now, modulo
> > > devices that will need some sort of 1:N values per key for
> > > configuration in the config file versus the 1:1 key:value setup we
> > > currently have (so don't consider the format final just yet).    
> > 
> > We might want to factor out that config format handling while we're
> > trying to finalize it.
> > 
> > cc:ing Matt for his awareness. I'm currently not quite sure how to
> > handle those vfio-ap "write several values to an attribute one at a
> > time" requirements. Maybe 1:N key:value is the way to go; maybe we
> > need/want JSON or something like that.  
> 
> Maybe we should just do JSON for future flexibility.  I assume there
> are lots of helpers that should make it easy even from a bash script.
> I'll look at that next.

Done.  Throw away any old mdev config files, we use JSON now.  The per
mdev config now looks like this:

{
  "mdev_type": "i915-GVTg_V4_8",
  "start": "auto"
}

My expectation, and what I've already pre-enabled support in set_key
and get_key functions, is that we'd use arrays for values, so we might
have:

  "new_key": ["value1", "value2"]

set_key will automatically convert a comma separated list of values
into such an array, so I'm thinking this would be specified by the user
as:

# mdevctl modify -u UUID --key=new_key --value=value1,value2

We should think about whether ordering is important and maybe
incorporate that into key naming conventions or come up with some
syntax for specifying startup blocks.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-27  1:53       ` Alex Williamson
@ 2019-06-27 12:26         ` Cornelia Huck
  2019-06-27 15:00           ` Matthew Rosato
  0 siblings, 1 reply; 46+ messages in thread
From: Cornelia Huck @ 2019-06-27 12:26 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Matthew Rosato

On Wed, 26 Jun 2019 19:53:50 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Wed, 26 Jun 2019 08:37:20 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Wed, 26 Jun 2019 11:58:06 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:
> >   
> > > On Tue, 25 Jun 2019 16:52:51 -0600
> > > Alex Williamson <alex.williamson@redhat.com> wrote:
> > >     
> > > > Hi,
> > > > 
> > > > Based on the discussions we've had, I've rewritten the bulk of
> > > > mdevctl.  I think it largely does everything we want now, modulo
> > > > devices that will need some sort of 1:N values per key for
> > > > configuration in the config file versus the 1:1 key:value setup we
> > > > currently have (so don't consider the format final just yet).      
> > > 
> > > We might want to factor out that config format handling while we're
> > > trying to finalize it.
> > > 
> > > cc:ing Matt for his awareness. I'm currently not quite sure how to
> > > handle those vfio-ap "write several values to an attribute one at a
> > > time" requirements. Maybe 1:N key:value is the way to go; maybe we
> > > need/want JSON or something like that.    
> > 
> > Maybe we should just do JSON for future flexibility.  I assume there
> > are lots of helpers that should make it easy even from a bash script.
> > I'll look at that next.  
> 
> Done.  Throw away any old mdev config files, we use JSON now. 

The code changes look quite straightforward, thanks.

> The per
> mdev config now looks like this:
> 
> {
>   "mdev_type": "i915-GVTg_V4_8",
>   "start": "auto"
> }
> 
> My expectation, and what I've already pre-enabled support in set_key
> and get_key functions, is that we'd use arrays for values, so we might
> have:
> 
>   "new_key": ["value1", "value2"]
> 
> set_key will automatically convert a comma separated list of values
> into such an array, so I'm thinking this would be specified by the user
> as:
> 
> # mdevctl modify -u UUID --key=new_key --value=value1,value2

Looks sensible.

For vfio-ap, we'd probably end up with something like the following:

{
  "mdev_type": "vfio_ap-passthrough",
  "start": "auto",
  "assign_adapter": ["5", "6"],
  "assign_domain": ["4", "0xab"]
}

(following the Guest1 example in the kernel documentation)

<As an aside, what should happen if e.g "assign_adapter" is set to
["6", "7"]? Remove 5, add 7? Remove all values, then set the new ones?
Similar for deleting the "assign_adapter" key. We have an
"unassign_adapter" attribute, but this is not something we can infer
automatically; we need to know that we're dealing with an vfio-ap
matrix device...>

> 
> We should think about whether ordering is important and maybe
> incorporate that into key naming conventions or come up with some
> syntax for specifying startup blocks.  Thanks,
> 
> Alex

Hm...

{
  "foo": "1",
  "bar": "42",
  "baz": {
    "depends": ["foo", "bar"],
    "value": "plahh"
  }
}

Something like that?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-27 12:26         ` Cornelia Huck
@ 2019-06-27 15:00           ` Matthew Rosato
  2019-06-27 15:38             ` Alex Williamson
  0 siblings, 1 reply; 46+ messages in thread
From: Matthew Rosato @ 2019-06-27 15:00 UTC (permalink / raw)
  To: Cornelia Huck, Alex Williamson
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On 6/27/19 8:26 AM, Cornelia Huck wrote:
> On Wed, 26 Jun 2019 19:53:50 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
>> On Wed, 26 Jun 2019 08:37:20 -0600
>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>
>>> On Wed, 26 Jun 2019 11:58:06 +0200
>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>   
>>>> On Tue, 25 Jun 2019 16:52:51 -0600
>>>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>>>     
>>>>> Hi,
>>>>>
>>>>> Based on the discussions we've had, I've rewritten the bulk of
>>>>> mdevctl.  I think it largely does everything we want now, modulo
>>>>> devices that will need some sort of 1:N values per key for
>>>>> configuration in the config file versus the 1:1 key:value setup we
>>>>> currently have (so don't consider the format final just yet).      
>>>>
>>>> We might want to factor out that config format handling while we're
>>>> trying to finalize it.
>>>>
>>>> cc:ing Matt for his awareness. I'm currently not quite sure how to
>>>> handle those vfio-ap "write several values to an attribute one at a
>>>> time" requirements. Maybe 1:N key:value is the way to go; maybe we
>>>> need/want JSON or something like that.    
>>>
>>> Maybe we should just do JSON for future flexibility.  I assume there
>>> are lots of helpers that should make it easy even from a bash script.
>>> I'll look at that next.  
>>
>> Done.  Throw away any old mdev config files, we use JSON now. 
> 
> The code changes look quite straightforward, thanks.
> 
>> The per
>> mdev config now looks like this:
>>
>> {
>>   "mdev_type": "i915-GVTg_V4_8",
>>   "start": "auto"
>> }
>>
>> My expectation, and what I've already pre-enabled support in set_key
>> and get_key functions, is that we'd use arrays for values, so we might
>> have:
>>
>>   "new_key": ["value1", "value2"]
>>
>> set_key will automatically convert a comma separated list of values
>> into such an array, so I'm thinking this would be specified by the user
>> as:
>>
>> # mdevctl modify -u UUID --key=new_key --value=value1,value2
> 
> Looks sensible.
> 
> For vfio-ap, we'd probably end up with something like the following:
> 
> {
>   "mdev_type": "vfio_ap-passthrough",
>   "start": "auto",
>   "assign_adapter": ["5", "6"],
>   "assign_domain": ["4", "0xab"]
> }
> 
> (following the Guest1 example in the kernel documentation)
> 
> <As an aside, what should happen if e.g "assign_adapter" is set to
> ["6", "7"]? Remove 5, add 7? Remove all values, then set the new ones?

IMO remove 5, add 7 would make the most sense.  I'm not sure that doing
an unassign of all adapters (effectively removing all APQNs) followed by
an assign of the new ones would work nicely with Tony's vfio-ap dynamic
configuration patches.

> Similar for deleting the "assign_adapter" key. We have an
> "unassign_adapter" attribute, but this is not something we can infer
> automatically; we need to know that we're dealing with an vfio-ap
> matrix device...>
> 
>>
>> We should think about whether ordering is important and maybe
>> incorporate that into key naming conventions or come up with some
>> syntax for specifying startup blocks.  Thanks,
>>
>> Alex
> 
> Hm...
> 
> {
>   "foo": "1",
>   "bar": "42",
>   "baz": {
>     "depends": ["foo", "bar"],
>     "value": "plahh"
>   }
> }
> 
> Something like that?
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-27 15:00           ` Matthew Rosato
@ 2019-06-27 15:38             ` Alex Williamson
  2019-06-27 16:13               ` Matthew Rosato
  2019-06-27 21:15               ` Alex Williamson
  0 siblings, 2 replies; 46+ messages in thread
From: Alex Williamson @ 2019-06-27 15:38 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: Cornelia Huck, kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety,
	Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On Thu, 27 Jun 2019 11:00:31 -0400
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> On 6/27/19 8:26 AM, Cornelia Huck wrote:
> > On Wed, 26 Jun 2019 19:53:50 -0600
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> >   
> >> On Wed, 26 Jun 2019 08:37:20 -0600
> >> Alex Williamson <alex.williamson@redhat.com> wrote:
> >>  
> >>> On Wed, 26 Jun 2019 11:58:06 +0200
> >>> Cornelia Huck <cohuck@redhat.com> wrote:
> >>>     
> >>>> On Tue, 25 Jun 2019 16:52:51 -0600
> >>>> Alex Williamson <alex.williamson@redhat.com> wrote:
> >>>>       
> >>>>> Hi,
> >>>>>
> >>>>> Based on the discussions we've had, I've rewritten the bulk of
> >>>>> mdevctl.  I think it largely does everything we want now, modulo
> >>>>> devices that will need some sort of 1:N values per key for
> >>>>> configuration in the config file versus the 1:1 key:value setup we
> >>>>> currently have (so don't consider the format final just yet).        
> >>>>
> >>>> We might want to factor out that config format handling while we're
> >>>> trying to finalize it.
> >>>>
> >>>> cc:ing Matt for his awareness. I'm currently not quite sure how to
> >>>> handle those vfio-ap "write several values to an attribute one at a
> >>>> time" requirements. Maybe 1:N key:value is the way to go; maybe we
> >>>> need/want JSON or something like that.      
> >>>
> >>> Maybe we should just do JSON for future flexibility.  I assume there
> >>> are lots of helpers that should make it easy even from a bash script.
> >>> I'll look at that next.    
> >>
> >> Done.  Throw away any old mdev config files, we use JSON now.   
> > 
> > The code changes look quite straightforward, thanks.
> >   
> >> The per
> >> mdev config now looks like this:
> >>
> >> {
> >>   "mdev_type": "i915-GVTg_V4_8",
> >>   "start": "auto"
> >> }
> >>
> >> My expectation, and what I've already pre-enabled support in set_key
> >> and get_key functions, is that we'd use arrays for values, so we might
> >> have:
> >>
> >>   "new_key": ["value1", "value2"]
> >>
> >> set_key will automatically convert a comma separated list of values
> >> into such an array, so I'm thinking this would be specified by the user
> >> as:
> >>
> >> # mdevctl modify -u UUID --key=new_key --value=value1,value2  
> > 
> > Looks sensible.
> > 
> > For vfio-ap, we'd probably end up with something like the following:
> > 
> > {
> >   "mdev_type": "vfio_ap-passthrough",
> >   "start": "auto",
> >   "assign_adapter": ["5", "6"],
> >   "assign_domain": ["4", "0xab"]
> > }
> > 
> > (following the Guest1 example in the kernel documentation)
> > 
> > <As an aside, what should happen if e.g "assign_adapter" is set to
> > ["6", "7"]? Remove 5, add 7? Remove all values, then set the new ones?  
> 
> IMO remove 5, add 7 would make the most sense.  I'm not sure that doing
> an unassign of all adapters (effectively removing all APQNs) followed by
> an assign of the new ones would work nicely with Tony's vfio-ap dynamic
> configuration patches.

Are we conflating operating on the config file versus operating on the
device?  I was thinking that setting a new key value replaces the
existing key, because anything else adds unnecessary complication to
the code and command line.  So in the above example, if the user
specified:

  mdevctl modify -u UUID --key=assign_adapter --value=6,7

The new value is simply ["6", "7"].  This would take effect the next
time the device is started.  We haven't yet considered how to change
running devices, but I think the semantics we have since the respin of
mdevctl separate saved config vs running devices in order to generalize
the support of transient devices.

> > Similar for deleting the "assign_adapter" key. We have an
> > "unassign_adapter" attribute, but this is not something we can infer
> > automatically; we need to know that we're dealing with an vfio-ap
> > matrix device...>
> >   
> >>
> >> We should think about whether ordering is important and maybe
> >> incorporate that into key naming conventions or come up with some
> >> syntax for specifying startup blocks.  Thanks,
> >>
> >> Alex  
> > 
> > Hm...
> > 
> > {
> >   "foo": "1",
> >   "bar": "42",
> >   "baz": {
> >     "depends": ["foo", "bar"],
> >     "value": "plahh"
> >   }
> > }
> > 
> > Something like that?

I'm not sure yet.  I think we need to look at what's feasible (and
easy) with jq.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-27 15:38             ` Alex Williamson
@ 2019-06-27 16:13               ` Matthew Rosato
  2019-06-27 21:15               ` Alex Williamson
  1 sibling, 0 replies; 46+ messages in thread
From: Matthew Rosato @ 2019-06-27 16:13 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Cornelia Huck, kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety,
	Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On 6/27/19 11:38 AM, Alex Williamson wrote:
> On Thu, 27 Jun 2019 11:00:31 -0400
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>> On 6/27/19 8:26 AM, Cornelia Huck wrote:
>>> On Wed, 26 Jun 2019 19:53:50 -0600
>>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>>   
>>>> On Wed, 26 Jun 2019 08:37:20 -0600
>>>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>>>  
>>>>> On Wed, 26 Jun 2019 11:58:06 +0200
>>>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>>>     
>>>>>> On Tue, 25 Jun 2019 16:52:51 -0600
>>>>>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>>>>>       
>>>>>>> Hi,
>>>>>>>
>>>>>>> Based on the discussions we've had, I've rewritten the bulk of
>>>>>>> mdevctl.  I think it largely does everything we want now, modulo
>>>>>>> devices that will need some sort of 1:N values per key for
>>>>>>> configuration in the config file versus the 1:1 key:value setup we
>>>>>>> currently have (so don't consider the format final just yet).        
>>>>>>
>>>>>> We might want to factor out that config format handling while we're
>>>>>> trying to finalize it.
>>>>>>
>>>>>> cc:ing Matt for his awareness. I'm currently not quite sure how to
>>>>>> handle those vfio-ap "write several values to an attribute one at a
>>>>>> time" requirements. Maybe 1:N key:value is the way to go; maybe we
>>>>>> need/want JSON or something like that.      
>>>>>
>>>>> Maybe we should just do JSON for future flexibility.  I assume there
>>>>> are lots of helpers that should make it easy even from a bash script.
>>>>> I'll look at that next.    
>>>>
>>>> Done.  Throw away any old mdev config files, we use JSON now.   
>>>
>>> The code changes look quite straightforward, thanks.
>>>   
>>>> The per
>>>> mdev config now looks like this:
>>>>
>>>> {
>>>>   "mdev_type": "i915-GVTg_V4_8",
>>>>   "start": "auto"
>>>> }
>>>>
>>>> My expectation, and what I've already pre-enabled support in set_key
>>>> and get_key functions, is that we'd use arrays for values, so we might
>>>> have:
>>>>
>>>>   "new_key": ["value1", "value2"]
>>>>
>>>> set_key will automatically convert a comma separated list of values
>>>> into such an array, so I'm thinking this would be specified by the user
>>>> as:
>>>>
>>>> # mdevctl modify -u UUID --key=new_key --value=value1,value2  
>>>
>>> Looks sensible.
>>>
>>> For vfio-ap, we'd probably end up with something like the following:
>>>
>>> {
>>>   "mdev_type": "vfio_ap-passthrough",
>>>   "start": "auto",
>>>   "assign_adapter": ["5", "6"],
>>>   "assign_domain": ["4", "0xab"]
>>> }
>>>
>>> (following the Guest1 example in the kernel documentation)
>>>
>>> <As an aside, what should happen if e.g "assign_adapter" is set to
>>> ["6", "7"]? Remove 5, add 7? Remove all values, then set the new ones?  
>>
>> IMO remove 5, add 7 would make the most sense.  I'm not sure that doing
>> an unassign of all adapters (effectively removing all APQNs) followed by
>> an assign of the new ones would work nicely with Tony's vfio-ap dynamic
>> configuration patches.
> 
> Are we conflating operating on the config file versus operating on the
> device?  I was thinking that setting a new key value replaces the
> existing key, because anything else adds unnecessary complication to
> the code and command line.  So in the above example, if the user
> specified:
> 
>   mdevctl modify -u UUID --key=assign_adapter --value=6,7
> 
> The new value is simply ["6", "7"].  This would take effect the next
> time the device is started.  We haven't yet considered how to change
> running devices, but I think the semantics we have since the respin of
> mdevctl separate saved config vs running devices in order to generalize
> the support of transient devices.

Yeah, my comment was aimed specifically at changes to a running device.
 When considering only the config file I agree: the new key value can
just replace the existing key value.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-27 15:38             ` Alex Williamson
  2019-06-27 16:13               ` Matthew Rosato
@ 2019-06-27 21:15               ` Alex Williamson
  2019-06-28  1:57                 ` Alex Williamson
  1 sibling, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-27 21:15 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Matthew Rosato, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On Thu, 27 Jun 2019 09:38:32 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:
> > On 6/27/19 8:26 AM, Cornelia Huck wrote:  
> > > 
> > > {
> > >   "foo": "1",
> > >   "bar": "42",
> > >   "baz": {
> > >     "depends": ["foo", "bar"],
> > >     "value": "plahh"
> > >   }
> > > }
> > > 
> > > Something like that?  
> 
> I'm not sure yet.  I think we need to look at what's feasible (and
> easy) with jq.  Thanks,

I think it's not too much trouble to remove and insert into arrays, so
what if we were to define the config as:

{
  "mdev_type":"vendor-type",
  "start":"auto",
  "attrs": [
      {"attrX":["Xvalue1","Xvalue2"]},
      {"dir/attrY": "Yvalue1"},
      {"attrX": "Xvalue3"}
    ]
}

"attr" here would define sysfs attributes under the device.  The array
would be processed in order, so in the above example we'd do the
following:

 1. echo Xvalue1 > attrX
 2. echo Xvalue2 > attrX
 3. echo Yvalue1 > dir/attrY
 4. echo Xvalue3 > attrX

When starting the device mdevctl would simply walk the array, if the
attribute key exists write the value(s).  If a write fails or the
attribute doesn't exist, remove the device and report error.

I think it's easiest with jq to manipulate arrays by removing and
inserting by index.  Also if we end up with something like above, it's
ambiguous if we reference the "attrX" key.  So perhaps we add the
following options to the modify command:

--addattr=ATTRIBUTE --delattr --index=INDEX --value=VALUE1[,VALUE2]

We could handle it like a stack, so if --index is not supplied, add to
the end or remove from the end.  If --index is provided, delete that
index or add the attribute at that index.  So if you had the above and
wanted to remove Xvalue1 but keep the ordering, you'd do:

--delattr --index=0
--addattr --index=0 --value=Xvalue2

Which should results in:

  "attrs": [
      {"attrX": "Xvalue2"},
      {"dir/attrY": "Yvalue1"},
      {"attrX": "Xvalue3"}
    ]

If we want to modify a running device, I'm thinking we probably want a
new command and options --attr=ATTRIBUTE --value=VALUE might suffice.

Do we need to support something like this for the 'start' command or
should we leave that for simple devices and require a sequence of:

# mdevctl define ...
# mdevctl modify --addattr...
...
# mdevctl start
# mdevctl undefine

This is effectively the long way to get a transient device.  Otherwise
we'd need to figure out how to have --attr --value appear multiple
times on the start command line.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-27 21:15               ` Alex Williamson
@ 2019-06-28  1:57                 ` Alex Williamson
  2019-06-28  9:06                   ` Cornelia Huck
  0 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-28  1:57 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Matthew Rosato, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On Thu, 27 Jun 2019 15:15:02 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Thu, 27 Jun 2019 09:38:32 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> > > On 6/27/19 8:26 AM, Cornelia Huck wrote:    
> > > > 
> > > > {
> > > >   "foo": "1",
> > > >   "bar": "42",
> > > >   "baz": {
> > > >     "depends": ["foo", "bar"],
> > > >     "value": "plahh"
> > > >   }
> > > > }
> > > > 
> > > > Something like that?    
> > 
> > I'm not sure yet.  I think we need to look at what's feasible (and
> > easy) with jq.  Thanks,  
> 
> I think it's not too much trouble to remove and insert into arrays, so
> what if we were to define the config as:
> 
> {
>   "mdev_type":"vendor-type",
>   "start":"auto",
>   "attrs": [
>       {"attrX":["Xvalue1","Xvalue2"]},
>       {"dir/attrY": "Yvalue1"},
>       {"attrX": "Xvalue3"}
>     ]
> }
> 
> "attr" here would define sysfs attributes under the device.  The array
> would be processed in order, so in the above example we'd do the
> following:
> 
>  1. echo Xvalue1 > attrX
>  2. echo Xvalue2 > attrX
>  3. echo Yvalue1 > dir/attrY
>  4. echo Xvalue3 > attrX
> 
> When starting the device mdevctl would simply walk the array, if the
> attribute key exists write the value(s).  If a write fails or the
> attribute doesn't exist, remove the device and report error.
> 
> I think it's easiest with jq to manipulate arrays by removing and
> inserting by index.  Also if we end up with something like above, it's
> ambiguous if we reference the "attrX" key.  So perhaps we add the
> following options to the modify command:
> 
> --addattr=ATTRIBUTE --delattr --index=INDEX --value=VALUE1[,VALUE2]
> 
> We could handle it like a stack, so if --index is not supplied, add to
> the end or remove from the end.  If --index is provided, delete that
> index or add the attribute at that index.  So if you had the above and
> wanted to remove Xvalue1 but keep the ordering, you'd do:
> 
> --delattr --index=0
> --addattr --index=0 --value=Xvalue2
> 
> Which should results in:
> 
>   "attrs": [
>       {"attrX": "Xvalue2"},
>       {"dir/attrY": "Yvalue1"},
>       {"attrX": "Xvalue3"}
>     ]
> 
> If we want to modify a running device, I'm thinking we probably want a
> new command and options --attr=ATTRIBUTE --value=VALUE might suffice.
> 
> Do we need to support something like this for the 'start' command or
> should we leave that for simple devices and require a sequence of:
> 
> # mdevctl define ...
> # mdevctl modify --addattr...
> ...
> # mdevctl start
> # mdevctl undefine
> 
> This is effectively the long way to get a transient device.  Otherwise
> we'd need to figure out how to have --attr --value appear multiple
> times on the start command line.  Thanks,

This is now implemented, and yes you can specify '--addattr remove
--value 1' and mdevctl will immediately remove the device after it's
created (more power to the admin).  Listing defined devices also lists
any attributes defined for easy inspection.  It is also possible to
override the conversion of comma separated values into an array by
encoding and escaping the comma.  It's a little cumbersome, but
possible in case a driver isn't fully on board with the one attribute,
one value rule of sysfs.  Does this work for vfio-ap?  I also still
need to check if this allows an NVIDIA vGPU mdev to be configured such
that the framerate limiter can be automatically controlled.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-28  1:57                 ` Alex Williamson
@ 2019-06-28  9:06                   ` Cornelia Huck
  2019-06-28 14:01                     ` Matthew Rosato
  2019-06-28 17:05                     ` Alex Williamson
  0 siblings, 2 replies; 46+ messages in thread
From: Cornelia Huck @ 2019-06-28  9:06 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Matthew Rosato, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On Thu, 27 Jun 2019 19:57:04 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Thu, 27 Jun 2019 15:15:02 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Thu, 27 Jun 2019 09:38:32 -0600
> > Alex Williamson <alex.williamson@redhat.com> wrote:  
> > > > On 6/27/19 8:26 AM, Cornelia Huck wrote:      
> > > > > 
> > > > > {
> > > > >   "foo": "1",
> > > > >   "bar": "42",
> > > > >   "baz": {
> > > > >     "depends": ["foo", "bar"],
> > > > >     "value": "plahh"
> > > > >   }
> > > > > }
> > > > > 
> > > > > Something like that?      
> > > 
> > > I'm not sure yet.  I think we need to look at what's feasible (and
> > > easy) with jq.  Thanks,    
> > 
> > I think it's not too much trouble to remove and insert into arrays, so
> > what if we were to define the config as:
> > 
> > {
> >   "mdev_type":"vendor-type",
> >   "start":"auto",
> >   "attrs": [
> >       {"attrX":["Xvalue1","Xvalue2"]},
> >       {"dir/attrY": "Yvalue1"},
> >       {"attrX": "Xvalue3"}
> >     ]
> > }
> > 
> > "attr" here would define sysfs attributes under the device.  The array
> > would be processed in order, so in the above example we'd do the
> > following:
> > 
> >  1. echo Xvalue1 > attrX
> >  2. echo Xvalue2 > attrX
> >  3. echo Yvalue1 > dir/attrY
> >  4. echo Xvalue3 > attrX
> > 
> > When starting the device mdevctl would simply walk the array, if the
> > attribute key exists write the value(s).  If a write fails or the
> > attribute doesn't exist, remove the device and report error.

Yes, I think it makes sense to fail the startup of a device where we
cannot set all attributes to the requested values.

> > 
> > I think it's easiest with jq to manipulate arrays by removing and
> > inserting by index.  Also if we end up with something like above, it's
> > ambiguous if we reference the "attrX" key.  So perhaps we add the
> > following options to the modify command:
> > 
> > --addattr=ATTRIBUTE --delattr --index=INDEX --value=VALUE1[,VALUE2]
> > 
> > We could handle it like a stack, so if --index is not supplied, add to
> > the end or remove from the end.  If --index is provided, delete that
> > index or add the attribute at that index.  So if you had the above and
> > wanted to remove Xvalue1 but keep the ordering, you'd do:
> > 
> > --delattr --index=0
> > --addattr --index=0 --value=Xvalue2
> > 
> > Which should results in:
> > 
> >   "attrs": [
> >       {"attrX": "Xvalue2"},
> >       {"dir/attrY": "Yvalue1"},
> >       {"attrX": "Xvalue3"}
> >     ]

Modifying by index looks reasonable; I just sent a pull request to
print the index of an attribute out as well, so it is easier to specify
the right attribute to modify.

> > 
> > If we want to modify a running device, I'm thinking we probably want a
> > new command and options --attr=ATTRIBUTE --value=VALUE might suffice.
> > 
> > Do we need to support something like this for the 'start' command or
> > should we leave that for simple devices and require a sequence of:
> > 
> > # mdevctl define ...
> > # mdevctl modify --addattr...
> > ...
> > # mdevctl start
> > # mdevctl undefine
> > 
> > This is effectively the long way to get a transient device.  Otherwise
> > we'd need to figure out how to have --attr --value appear multiple
> > times on the start command line.  Thanks,  

What do you think of a way to specify JSON for the attributes directly
on the command line? Or would it be better to just edit the config
files directly?

> 
> This is now implemented, and yes you can specify '--addattr remove
> --value 1' and mdevctl will immediately remove the device after it's
> created (more power to the admin).  Listing defined devices also lists

Fun ;)

> any attributes defined for easy inspection.  It is also possible to
> override the conversion of comma separated values into an array by
> encoding and escaping the comma.  It's a little cumbersome, but
> possible in case a driver isn't fully on board with the one attribute,
> one value rule of sysfs.  Does this work for vfio-ap?  I also still

I do not have ap devices to actually test this with; but defining a
device and adding attributes seems to work.

> need to check if this allows an NVIDIA vGPU mdev to be configured such
> that the framerate limiter can be automatically controlled.  Thanks,
> 
> Alex


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-28  9:06                   ` Cornelia Huck
@ 2019-06-28 14:01                     ` Matthew Rosato
  2019-06-28 17:05                     ` Alex Williamson
  1 sibling, 0 replies; 46+ messages in thread
From: Matthew Rosato @ 2019-06-28 14:01 UTC (permalink / raw)
  To: Cornelia Huck, Alex Williamson
  Cc: kvm, Libvirt Devel, Kirti Wankhede, Erik Skultety, Pavel Hrdina,
	Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On 6/28/19 5:06 AM, Cornelia Huck wrote:
> On Thu, 27 Jun 2019 19:57:04 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
>> On Thu, 27 Jun 2019 15:15:02 -0600
>> Alex Williamson <alex.williamson@redhat.com> wrote:
>>
>>> On Thu, 27 Jun 2019 09:38:32 -0600
>>> Alex Williamson <alex.williamson@redhat.com> wrote:  
>>>>> On 6/27/19 8:26 AM, Cornelia Huck wrote:      
>>>>>>
>>>>>> {
>>>>>>   "foo": "1",
>>>>>>   "bar": "42",
>>>>>>   "baz": {
>>>>>>     "depends": ["foo", "bar"],
>>>>>>     "value": "plahh"
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>> Something like that?      
>>>>
>>>> I'm not sure yet.  I think we need to look at what's feasible (and
>>>> easy) with jq.  Thanks,    
>>>
>>> I think it's not too much trouble to remove and insert into arrays, so
>>> what if we were to define the config as:
>>>
>>> {
>>>   "mdev_type":"vendor-type",
>>>   "start":"auto",
>>>   "attrs": [
>>>       {"attrX":["Xvalue1","Xvalue2"]},
>>>       {"dir/attrY": "Yvalue1"},
>>>       {"attrX": "Xvalue3"}
>>>     ]
>>> }
>>>
>>> "attr" here would define sysfs attributes under the device.  The array
>>> would be processed in order, so in the above example we'd do the
>>> following:
>>>
>>>  1. echo Xvalue1 > attrX
>>>  2. echo Xvalue2 > attrX
>>>  3. echo Yvalue1 > dir/attrY
>>>  4. echo Xvalue3 > attrX
>>>
>>> When starting the device mdevctl would simply walk the array, if the
>>> attribute key exists write the value(s).  If a write fails or the
>>> attribute doesn't exist, remove the device and report error.
> 
> Yes, I think it makes sense to fail the startup of a device where we
> cannot set all attributes to the requested values.
> 
>>>
>>> I think it's easiest with jq to manipulate arrays by removing and
>>> inserting by index.  Also if we end up with something like above, it's
>>> ambiguous if we reference the "attrX" key.  So perhaps we add the
>>> following options to the modify command:
>>>
>>> --addattr=ATTRIBUTE --delattr --index=INDEX --value=VALUE1[,VALUE2]
>>>
>>> We could handle it like a stack, so if --index is not supplied, add to
>>> the end or remove from the end.  If --index is provided, delete that
>>> index or add the attribute at that index.  So if you had the above and
>>> wanted to remove Xvalue1 but keep the ordering, you'd do:
>>>
>>> --delattr --index=0
>>> --addattr --index=0 --value=Xvalue2
>>>
>>> Which should results in:
>>>
>>>   "attrs": [
>>>       {"attrX": "Xvalue2"},
>>>       {"dir/attrY": "Yvalue1"},
>>>       {"attrX": "Xvalue3"}
>>>     ]
> 
> Modifying by index looks reasonable; I just sent a pull request to
> print the index of an attribute out as well, so it is easier to specify
> the right attribute to modify.
> 
>>>
>>> If we want to modify a running device, I'm thinking we probably want a
>>> new command and options --attr=ATTRIBUTE --value=VALUE might suffice.
>>>
>>> Do we need to support something like this for the 'start' command or
>>> should we leave that for simple devices and require a sequence of:
>>>
>>> # mdevctl define ...
>>> # mdevctl modify --addattr...
>>> ...
>>> # mdevctl start
>>> # mdevctl undefine
>>>
>>> This is effectively the long way to get a transient device.  Otherwise
>>> we'd need to figure out how to have --attr --value appear multiple
>>> times on the start command line.  Thanks,  
> 
> What do you think of a way to specify JSON for the attributes directly
> on the command line? Or would it be better to just edit the config
> files directly?
> 
>>
>> This is now implemented, and yes you can specify '--addattr remove
>> --value 1' and mdevctl will immediately remove the device after it's
>> created (more power to the admin).  Listing defined devices also lists
> 
> Fun ;)
> 
>> any attributes defined for easy inspection.  It is also possible to
>> override the conversion of comma separated values into an array by
>> encoding and escaping the comma.  It's a little cumbersome, but
>> possible in case a driver isn't fully on board with the one attribute,
>> one value rule of sysfs.  Does this work for vfio-ap?  I also still
> 
> I do not have ap devices to actually test this with; but defining a
> device and adding attributes seems to work.
> 

I pulled and did a quick test with vfio-ap, it's working.  I was able to
define, modify with the appropriate attributes and start, resulting in a
correctly-configured device.



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-28  9:06                   ` Cornelia Huck
  2019-06-28 14:01                     ` Matthew Rosato
@ 2019-06-28 17:05                     ` Alex Williamson
  2019-07-01  8:20                       ` Cornelia Huck
  1 sibling, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-06-28 17:05 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Matthew Rosato, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On Fri, 28 Jun 2019 11:06:48 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Thu, 27 Jun 2019 19:57:04 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Thu, 27 Jun 2019 15:15:02 -0600
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> >   
> > > On Thu, 27 Jun 2019 09:38:32 -0600
> > > Alex Williamson <alex.williamson@redhat.com> wrote:    
> > > > > On 6/27/19 8:26 AM, Cornelia Huck wrote:        
> > > > > > 
> > > > > > {
> > > > > >   "foo": "1",
> > > > > >   "bar": "42",
> > > > > >   "baz": {
> > > > > >     "depends": ["foo", "bar"],
> > > > > >     "value": "plahh"
> > > > > >   }
> > > > > > }
> > > > > > 
> > > > > > Something like that?        
> > > > 
> > > > I'm not sure yet.  I think we need to look at what's feasible (and
> > > > easy) with jq.  Thanks,      
> > > 
> > > I think it's not too much trouble to remove and insert into arrays, so
> > > what if we were to define the config as:
> > > 
> > > {
> > >   "mdev_type":"vendor-type",
> > >   "start":"auto",
> > >   "attrs": [
> > >       {"attrX":["Xvalue1","Xvalue2"]},
> > >       {"dir/attrY": "Yvalue1"},
> > >       {"attrX": "Xvalue3"}
> > >     ]
> > > }
> > > 
> > > "attr" here would define sysfs attributes under the device.  The array
> > > would be processed in order, so in the above example we'd do the
> > > following:
> > > 
> > >  1. echo Xvalue1 > attrX
> > >  2. echo Xvalue2 > attrX
> > >  3. echo Yvalue1 > dir/attrY
> > >  4. echo Xvalue3 > attrX
> > > 
> > > When starting the device mdevctl would simply walk the array, if the
> > > attribute key exists write the value(s).  If a write fails or the
> > > attribute doesn't exist, remove the device and report error.  
> 
> Yes, I think it makes sense to fail the startup of a device where we
> cannot set all attributes to the requested values.
> 
> > > 
> > > I think it's easiest with jq to manipulate arrays by removing and
> > > inserting by index.  Also if we end up with something like above, it's
> > > ambiguous if we reference the "attrX" key.  So perhaps we add the
> > > following options to the modify command:
> > > 
> > > --addattr=ATTRIBUTE --delattr --index=INDEX --value=VALUE1[,VALUE2]
> > > 
> > > We could handle it like a stack, so if --index is not supplied, add to
> > > the end or remove from the end.  If --index is provided, delete that
> > > index or add the attribute at that index.  So if you had the above and
> > > wanted to remove Xvalue1 but keep the ordering, you'd do:
> > > 
> > > --delattr --index=0
> > > --addattr --index=0 --value=Xvalue2
> > > 
> > > Which should results in:
> > > 
> > >   "attrs": [
> > >       {"attrX": "Xvalue2"},
> > >       {"dir/attrY": "Yvalue1"},
> > >       {"attrX": "Xvalue3"}
> > >     ]  
> 
> Modifying by index looks reasonable; I just sent a pull request to
> print the index of an attribute out as well, so it is easier to specify
> the right attribute to modify.

Pulled, I had initially separated the per line and interpreted them,
but it felt too verbose, so I went the full other direction or putting
them on a single line and using the compact json representation.  Maybe
this is a reasonable compromise.

> > > If we want to modify a running device, I'm thinking we probably want a
> > > new command and options --attr=ATTRIBUTE --value=VALUE might suffice.
> > > 
> > > Do we need to support something like this for the 'start' command or
> > > should we leave that for simple devices and require a sequence of:
> > > 
> > > # mdevctl define ...
> > > # mdevctl modify --addattr...
> > > ...
> > > # mdevctl start
> > > # mdevctl undefine
> > > 
> > > This is effectively the long way to get a transient device.  Otherwise
> > > we'd need to figure out how to have --attr --value appear multiple
> > > times on the start command line.  Thanks,    
> 
> What do you think of a way to specify JSON for the attributes directly
> on the command line? Or would it be better to just edit the config
> files directly?

Supplying json on the command like seems difficult, even doing so with
with jq requires escaping quotes.  It's not a very friendly
experience.  Maybe something more like how virsh allows snippets of xml
to be included, we could use jq to validate a json snippet provided
as a file and add it to the attributes... of course if we need to allow
libvirt to modify the json config files directly, the user could do
that as well.  Is there a use case you're thinking of?  Maybe we could
augment the 'list' command to take a --uuid and --dumpjson option and
the 'define' command to accept a --jsonfile.  Maybe the 'start' command
could accept the same, so a transient device could define attributes
w/o excessive command line options.  Thanks,

Alex

> > This is now implemented, and yes you can specify '--addattr remove
> > --value 1' and mdevctl will immediately remove the device after it's
> > created (more power to the admin).  Listing defined devices also lists  
> 
> Fun ;)
> 
> > any attributes defined for easy inspection.  It is also possible to
> > override the conversion of comma separated values into an array by
> > encoding and escaping the comma.  It's a little cumbersome, but
> > possible in case a driver isn't fully on board with the one attribute,
> > one value rule of sysfs.  Does this work for vfio-ap?  I also still  
> 
> I do not have ap devices to actually test this with; but defining a
> device and adding attributes seems to work.
> 
> > need to check if this allows an NVIDIA vGPU mdev to be configured such
> > that the framerate limiter can be automatically controlled.  Thanks,
> > 
> > Alex  
> 


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-06-28 17:05                     ` Alex Williamson
@ 2019-07-01  8:20                       ` Cornelia Huck
  2019-07-01 14:40                         ` Alex Williamson
  0 siblings, 1 reply; 46+ messages in thread
From: Cornelia Huck @ 2019-07-01  8:20 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Matthew Rosato, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On Fri, 28 Jun 2019 11:05:46 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Fri, 28 Jun 2019 11:06:48 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:

> > What do you think of a way to specify JSON for the attributes directly
> > on the command line? Or would it be better to just edit the config
> > files directly?  
> 
> Supplying json on the command like seems difficult, even doing so with
> with jq requires escaping quotes.  It's not a very friendly
> experience.  Maybe something more like how virsh allows snippets of xml
> to be included, we could use jq to validate a json snippet provided
> as a file and add it to the attributes... of course if we need to allow
> libvirt to modify the json config files directly, the user could do
> that as well.  Is there a use case you're thinking of?  Maybe we could
> augment the 'list' command to take a --uuid and --dumpjson option and
> the 'define' command to accept a --jsonfile.  Maybe the 'start' command
> could accept the same, so a transient device could define attributes
> w/o excessive command line options.  Thanks,
> 
> Alex

I was mostly thinking about complex configurations where writing a JSON
config would be simpler than adding a lot of command line options.
Something like dumping a JSON file and allowing to refer to a JSON file
as you suggested could be useful; but then, those very complex use
cases are probably already covered by editing the config file directly.
Not sure if it is worth the effort; maybe just leave it as it is for
now.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-07-01  8:20                       ` Cornelia Huck
@ 2019-07-01 14:40                         ` Alex Williamson
  2019-07-01 17:13                           ` Cornelia Huck
  0 siblings, 1 reply; 46+ messages in thread
From: Alex Williamson @ 2019-07-01 14:40 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Matthew Rosato, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On Mon, 1 Jul 2019 10:20:43 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Fri, 28 Jun 2019 11:05:46 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > On Fri, 28 Jun 2019 11:06:48 +0200
> > Cornelia Huck <cohuck@redhat.com> wrote:  
> 
> > > What do you think of a way to specify JSON for the attributes directly
> > > on the command line? Or would it be better to just edit the config
> > > files directly?    
> > 
> > Supplying json on the command like seems difficult, even doing so with
> > with jq requires escaping quotes.  It's not a very friendly
> > experience.  Maybe something more like how virsh allows snippets of xml
> > to be included, we could use jq to validate a json snippet provided
> > as a file and add it to the attributes... of course if we need to allow
> > libvirt to modify the json config files directly, the user could do
> > that as well.  Is there a use case you're thinking of?  Maybe we could
> > augment the 'list' command to take a --uuid and --dumpjson option and
> > the 'define' command to accept a --jsonfile.  Maybe the 'start' command
> > could accept the same, so a transient device could define attributes
> > w/o excessive command line options.  Thanks,
> > 
> > Alex  
> 
> I was mostly thinking about complex configurations where writing a JSON
> config would be simpler than adding a lot of command line options.
> Something like dumping a JSON file and allowing to refer to a JSON file
> as you suggested could be useful; but then, those very complex use
> cases are probably already covered by editing the config file directly.
> Not sure if it is worth the effort; maybe just leave it as it is for
> now.

Well, I already did it.  It seems useful for creating transient devices
with attribute specifications.  If it's too ugly we can drop it.
Thanks,

Alex

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: mdevctl: A shoestring mediated device management and persistence utility
  2019-07-01 14:40                         ` Alex Williamson
@ 2019-07-01 17:13                           ` Cornelia Huck
  0 siblings, 0 replies; 46+ messages in thread
From: Cornelia Huck @ 2019-07-01 17:13 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Matthew Rosato, kvm, Libvirt Devel, Kirti Wankhede,
	Erik Skultety, Pavel Hrdina, Daniel P. Berrangé,
	Sylvain Bauza, Christophe de Dinechin, Tony Krowiak

On Mon, 1 Jul 2019 08:40:51 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Mon, 1 Jul 2019 10:20:43 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Fri, 28 Jun 2019 11:05:46 -0600
> > Alex Williamson <alex.williamson@redhat.com> wrote:
> >   
> > > On Fri, 28 Jun 2019 11:06:48 +0200
> > > Cornelia Huck <cohuck@redhat.com> wrote:    
> >   
> > > > What do you think of a way to specify JSON for the attributes directly
> > > > on the command line? Or would it be better to just edit the config
> > > > files directly?      
> > > 
> > > Supplying json on the command like seems difficult, even doing so with
> > > with jq requires escaping quotes.  It's not a very friendly
> > > experience.  Maybe something more like how virsh allows snippets of xml
> > > to be included, we could use jq to validate a json snippet provided
> > > as a file and add it to the attributes... of course if we need to allow
> > > libvirt to modify the json config files directly, the user could do
> > > that as well.  Is there a use case you're thinking of?  Maybe we could
> > > augment the 'list' command to take a --uuid and --dumpjson option and
> > > the 'define' command to accept a --jsonfile.  Maybe the 'start' command
> > > could accept the same, so a transient device could define attributes
> > > w/o excessive command line options.  Thanks,
> > > 
> > > Alex    
> > 
> > I was mostly thinking about complex configurations where writing a JSON
> > config would be simpler than adding a lot of command line options.
> > Something like dumping a JSON file and allowing to refer to a JSON file
> > as you suggested could be useful; but then, those very complex use
> > cases are probably already covered by editing the config file directly.
> > Not sure if it is worth the effort; maybe just leave it as it is for
> > now.  
> 
> Well, I already did it.  It seems useful for creating transient devices
> with attribute specifications.  If it's too ugly we can drop it.

I should probably look at the repository before I reply :)

Anyway, this doesn't look too ugly to me; but I think it would benefit
from some usage examples (which I just sent you a pull request for :)

> Thanks,
> 
> Alex


^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, back to index

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-23 23:20 mdevctl: A shoestring mediated device management and persistence utility Alex Williamson
2019-05-24 10:11 ` Cornelia Huck
2019-05-24 14:43   ` Alex Williamson
2019-06-07 16:06   ` Halil Pasic
2019-06-11 19:45     ` Cornelia Huck
2019-06-11 20:28       ` Alex Williamson
2019-06-12  7:14         ` Cornelia Huck
2019-06-12 15:54           ` Halil Pasic
2019-06-13 10:02             ` Cornelia Huck
2019-06-12 15:07       ` Halil Pasic
2019-06-13 16:17 ` Christophe de Dinechin
2019-06-13 16:35   ` Alex Williamson
2019-06-14  9:54     ` [libvirt] " Christophe de Dinechin
2019-06-14 14:23       ` Alex Williamson
2019-06-14 15:06         ` Christophe de Dinechin
2019-06-14 16:04           ` Alex Williamson
2019-06-17 16:03             ` Cornelia Huck
2019-06-17 14:00 ` Daniel P. Berrangé
2019-06-17 14:54   ` Alex Williamson
2019-06-17 15:10     ` Daniel P. Berrangé
2019-06-17 17:05       ` Alex Williamson
2019-06-18  8:44         ` Daniel P. Berrangé
2019-06-18 11:01         ` Cornelia Huck
     [not found]           ` <CALOCmukPWiXiM+mN0hCTvSwfdHy5UdERU8WnvOXiBrMQ9tH3VA@mail.gmail.com>
2019-06-18 22:12             ` Alex Williamson
2019-06-19  7:28               ` Daniel P. Berrangé
2019-06-19  9:46                 ` Cornelia Huck
2019-06-19 18:46                   ` Alex Williamson
2019-06-20  8:24                     ` Daniel P. Berrangé
     [not found]               ` <CALOCmu=6Xmw-_-SVXujCEcgPY2CQiBQKgfUMJ45WnZ_9XORyUw@mail.gmail.com>
2019-06-19  9:57                 ` Cornelia Huck
2019-06-19 19:53                 ` Alex Williamson
2019-06-25 22:52 ` Alex Williamson
2019-06-26  9:58   ` Cornelia Huck
2019-06-26 14:37     ` Alex Williamson
2019-06-27  1:53       ` Alex Williamson
2019-06-27 12:26         ` Cornelia Huck
2019-06-27 15:00           ` Matthew Rosato
2019-06-27 15:38             ` Alex Williamson
2019-06-27 16:13               ` Matthew Rosato
2019-06-27 21:15               ` Alex Williamson
2019-06-28  1:57                 ` Alex Williamson
2019-06-28  9:06                   ` Cornelia Huck
2019-06-28 14:01                     ` Matthew Rosato
2019-06-28 17:05                     ` Alex Williamson
2019-07-01  8:20                       ` Cornelia Huck
2019-07-01 14:40                         ` Alex Williamson
2019-07-01 17:13                           ` Cornelia Huck

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org kvm@archiver.kernel.org
	public-inbox-index kvm


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/ public-inbox