KVM Archive on lore.kernel.org
 help / color / Atom feed
From: Cornelia Huck <cohuck@redhat.com>
To: Sylvain Bauza <sbauza@redhat.com>
Cc: "Alex Williamson" <alex.williamson@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Libvirt Devel" <libvir-list@redhat.com>,
	"Kirti Wankhede" <kwankhede@nvidia.com>,
	"Erik Skultety" <eskultet@redhat.com>,
	"Pavel Hrdina" <phrdina@redhat.com>
Subject: Re: mdevctl: A shoestring mediated device management and persistence utility
Date: Wed, 19 Jun 2019 11:57:34 +0200
Message-ID: <20190619115734.59b3b6bb.cohuck@redhat.com> (raw)
In-Reply-To: <CALOCmu=6Xmw-_-SVXujCEcgPY2CQiBQKgfUMJ45WnZ_9XORyUw@mail.gmail.com>

On Wed, 19 Jun 2019 11:04:15 +0200
Sylvain Bauza <sbauza@redhat.com> wrote:

> On Wed, Jun 19, 2019 at 12:27 AM Alex Williamson <alex.williamson@redhat.com>
> wrote:
> 
> > On Tue, 18 Jun 2019 14:48:11 +0200
> > Sylvain Bauza <sbauza@redhat.com> wrote:
> >  
> > > On Tue, Jun 18, 2019 at 1:01 PM Cornelia Huck <cohuck@redhat.com> wrote:

> > > > I think we need to reach consensus about the actual scope of the
> > > > mdevctl tool.
> > > >
> > > >  
> > > Thanks Cornelia, my thoughts:
> > >
> > > - Is it supposed to be responsible for managing *all* mdev devices in  
> > > >   the system, or is it more supposed to be a convenience helper for
> > > >   users/software wanting to manage mdevs?
> > > >  
> > >
> > > The latter. If an operator (or some software) wants to create mdevs by  
> > not  
> > > using mdevctl (and rather directly calling the sysfs), I think it's OK.
> > > That said, mdevs created by mdevctl would be supported by systemctl,  
> > while  
> > > the others not but I think it's okay.  
> >
> > I agree (sort of), and I'm hearing that we should drop any sort of
> > automatic persistence of mdevs created outside of mdevctl.  The problem
> > comes when we try to draw the line between unmanaged and manged
> > devices.  For instance, if we have a command to list mdevs it would
> > feel incomplete if it didn't list all mdevs both those managed by
> > mdevctl and those created elsewhere.  For managed devices, I expect
> > we'll also have commands that allow the mode of the device to be
> > switched between transient, saved, and persistent.  Should a user then
> > be allowed to promote an unmanaged device to one of these modes via the
> > same command?  Should they be allowed to stop an unmanaged device
> > through driverctl?  Through systemctl?  These all seem like reasonable
> > things to do, so what then is the difference between transient and
> > unmanaged mdev and is mdevctl therefore managing all mdevs, not just
> > those it has created?
> >
> >  
> Well, IMHO, mdevs created by mdevctl could all be persisted or transient
> just by adding an option when calling mdevctl, like :
> "mdevctl create-mdev [--transient] <uuid> <pci_id> <type>" where default
> would be persisting the mdev.

This sounds useful; the caller can avoid fiddling with sysfs entries
directly, while not committing to a permanent configuration.

> 
> For mdevs *not* created by mdevctl, then a usecase could be "I'd like to
> ask mdevctl to manage mdevs I already created" and if so, a mdevctl command
> like :
> "mdevctl manage-mdev [--transient] <mdev_uuid>"

What kind of 'managing' would this actually enable? If we rely on
mdevctl working with sysfs directly for transient devices, I can't
really think of anything...

> 
> Of course, that would mean that when you list mdevs by "mdev list-all" you
> wouldn't get mdevs managed by mdevctl.
> Thoughts ?
> 
> > - Do we want mdevctl to manage config files for individual mdevs, or  
> > > >   are they supposed to be in a common format that can also be managed
> > > >   by e.g. libvirt?
> > > >  
> > >
> > > Unless I misunderstand, I think mdevctl just helps to create mdevs for
> > > being used by guests created either by libvirt or QEMU or even others.
> > > How a guest would allocate a mdev (ie. saying "I'll use this specific  
> > mdev  
> > > UUID") is IMHO not something for mdevctl.  
> >
> > Right, mdevctl isn't concerned with how a specific mdev is used, but I
> > think what Connie is after is more the proposal from Daniel where
> > libvirt can essentially manage mdevctl config files itself and then
> > only invoke mdevctl for the dirty work of creating and deleting
> > devices.  In fact, assuming systemd, libvirt could avoid direct
> > interaction with mdevctl entirely, instead using systemctl device units
> > to start and stop the mdevs.  Maybe where that proposal takes a turn is
> > when we again consider non-systemd hosts, where maybe mdevctl needs to
> > write out an init script per mdev and libvirt injecting itself into
> > manipulation of the config files would either need to perform the same
> > or fall back to mdevctl.  Unfortunately there seems to be an ultimatum
> > to either condone external config file manipulation or expand the scope
> > of the project into becoming a library.
> >
> >  
> Well, like I said, I think it's maybe another user case : just using
> libvirt when you want a guest having vGPUs and then libvirt would create
> mdevs (so users wouln't need to know at that).
> That said, for the moment, I think we don't really need it so maybe a new
> RFE once we at least have mdevctl packaged and supported by RHEL ?

If we allow config file handling directly, libvirt could start using it
even without mdevctl present? (Not sure if that makes sense.)

> 
> 
> > - Should mdevctl be a stand-alone tool, provide library functions, or  
> > > >   both? Related: should it keep any internal state that is not written
> > > >   to disk? (I think that also plays into the transient vs. persistent
> > > >   question.)  
> >
> > I don't think we want an mdevctld, if that's what you mean by internal

Yeah, mdevctld--.

> > state not written to disk.  I think we ideally want all state in the
> > mdev config files or discerned through sysfs.  How we handle
> > non-systemd hosts may throw a wrench in that though since currently the
> > systemd integration relies on a template to support arbitrary mdevs and
> > I'm not sure how to replicate that in other init services.  If we need
> > to dynamically manage per mdev init files in addition to config files,
> > we're not so self contained.
> >  
> > > FWIW, I'd love using mdevctl for OpenStack (Nova) just at least for
> > > creating persisted mdevs (ie. mdevs that would be recreated after  
> > rebooting  
> > > using systemctl). That's the real use case I need.
> > > Whether libvirt would internally support mdevctl would be nice but that's
> > > not really something Nova needs, so I leave others providing their own
> > > thoughts.
> > >
> > >
> > > My personal opinion is that mdevctl should be able to tolerate mdevs  
> > > > being configured by other means, but probably should not try to impose
> > > > its own configuration if it detects that (unless explicitly asked to do
> > > > so). Not sure how feasible that goal is.
> > > >
> > > > That's what I misunderstand : in order to have a guest using a vGPU,  
> > you  
> > > need to do two things :
> > > 1/ create the mdev
> > > 2/ allocate this created dev to a specific guest config
> > >
> > > Of course, we could imagine a way to have both steps to be done directly  
> > by  
> > > libvirt, but from my opinion, mdevctl is really helping 1/ and not 2/.  
> >
> > Yep, we also don't want to presume libvirt is the only consumer here.
> > mdevctl should also support other VM management tools, users who write
> > their own management scripts, and even non-VM related use cases.
> >
> >  
> Oh yes, please don't premuse mdevctl is only needed by libvirt.
> FWIW, once mdevctl is supported by RHEL, I'd love to use it for OpenStack
> Nova at least because I want to persist the mdevs.
> At the moment, Nova just creates mdevs directly by sysfs and look the
> existing onces up by sysfs, but upstream community in Nova thinks the
> mission statement is not about managing mdevs so we don't really want to
> add in Nova some kind of DB persistence just for mdevs.

So, Nova would basically poke mdevctl, but not interact with the config
files directly? Or am I misunderstanding?

> 
> > > A well-defined config file format is probably a win, even if it only  
> > > > ends up being used by mdevctl itself.  
> >
> > Yes, regardless of whether others touch them, conversion scripts on
> > upgrade should be avoided.  Do we need something beyond a key=value
> > file?  So far we're only storing the mdev type and startup mode, but
> > vfio-ap clearly needs more, apparently key=value1,value2,... type
> > representation.  Still, I think I'd prefer simple over jumping to xml
> > or json or yaml.  Thanks,
> >
> >  
> Heh, in OpenStack discussing about a file format is possibly one of the
> longest arguments we already have, so I leave others to say their own
> opinions but FWIW, as we use Python we tend to prefer YAML files. I don't
> care about the format tho, just take the most convenient for libvirt I'd
> say.

Aww, and here I was looking forward to a nice file format discussion...

More seriously, as I said in my other reply, .ini style would be good,
but using JSON probably gives us more flexibility in the long run.

  parent reply index

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-23 23:20 Alex Williamson
2019-05-24 10:11 ` Cornelia Huck
2019-05-24 14:43   ` Alex Williamson
2019-06-07 16:06   ` Halil Pasic
2019-06-11 19:45     ` Cornelia Huck
2019-06-11 20:28       ` Alex Williamson
2019-06-12  7:14         ` Cornelia Huck
2019-06-12 15:54           ` Halil Pasic
2019-06-13 10:02             ` Cornelia Huck
2019-06-12 15:07       ` Halil Pasic
2019-06-13 16:17 ` Christophe de Dinechin
2019-06-13 16:35   ` Alex Williamson
2019-06-14  9:54     ` [libvirt] " Christophe de Dinechin
2019-06-14 14:23       ` Alex Williamson
2019-06-14 15:06         ` Christophe de Dinechin
2019-06-14 16:04           ` Alex Williamson
2019-06-17 16:03             ` Cornelia Huck
2019-06-17 14:00 ` Daniel P. Berrangé
2019-06-17 14:54   ` Alex Williamson
2019-06-17 15:10     ` Daniel P. Berrangé
2019-06-17 17:05       ` Alex Williamson
2019-06-18  8:44         ` Daniel P. Berrangé
2019-06-18 11:01         ` Cornelia Huck
     [not found]           ` <CALOCmukPWiXiM+mN0hCTvSwfdHy5UdERU8WnvOXiBrMQ9tH3VA@mail.gmail.com>
2019-06-18 22:12             ` Alex Williamson
2019-06-19  7:28               ` Daniel P. Berrangé
2019-06-19  9:46                 ` Cornelia Huck
2019-06-19 18:46                   ` Alex Williamson
2019-06-20  8:24                     ` Daniel P. Berrangé
     [not found]               ` <CALOCmu=6Xmw-_-SVXujCEcgPY2CQiBQKgfUMJ45WnZ_9XORyUw@mail.gmail.com>
2019-06-19  9:57                 ` Cornelia Huck [this message]
2019-06-19 19:53                 ` Alex Williamson
2019-06-25 22:52 ` Alex Williamson
2019-06-26  9:58   ` Cornelia Huck
2019-06-26 14:37     ` Alex Williamson
2019-06-27  1:53       ` Alex Williamson
2019-06-27 12:26         ` Cornelia Huck
2019-06-27 15:00           ` Matthew Rosato
2019-06-27 15:38             ` Alex Williamson
2019-06-27 16:13               ` Matthew Rosato
2019-06-27 21:15               ` Alex Williamson
2019-06-28  1:57                 ` Alex Williamson
2019-06-28  9:06                   ` Cornelia Huck
2019-06-28 14:01                     ` Matthew Rosato
2019-06-28 17:05                     ` Alex Williamson
2019-07-01  8:20                       ` Cornelia Huck
2019-07-01 14:40                         ` Alex Williamson
2019-07-01 17:13                           ` Cornelia Huck

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190619115734.59b3b6bb.cohuck@redhat.com \
    --to=cohuck@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=berrange@redhat.com \
    --cc=eskultet@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=libvir-list@redhat.com \
    --cc=phrdina@redhat.com \
    --cc=sbauza@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org kvm@archiver.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/ public-inbox