All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <dgibson@redhat.com>
To: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Julia Suvorova <jusual@redhat.com>,
	qemu devel list <qemu-devel@nongnu.org>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH] pci: Refuse to hotplug PCI Devices when the Guest OS is not ready
Date: Mon, 26 Oct 2020 17:35:48 +1100	[thread overview]
Message-ID: <20201026173548.113ce1fd@yekko.fritz.box> (raw)
In-Reply-To: <CAC_L=vUh8LU5c8c00OhnbEiW7EzZFWKU61vOdub7c11wDMXeRg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5291 bytes --]

On Fri, 23 Oct 2020 09:47:14 +0300
Marcel Apfelbaum <marcel.apfelbaum@gmail.com> wrote:

> Hi David,
> 
> On Fri, Oct 23, 2020 at 6:49 AM David Gibson <dgibson@redhat.com> wrote:
> 
> > On Thu, 22 Oct 2020 11:01:04 -0400
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >  
> > > On Thu, Oct 22, 2020 at 05:50:51PM +0300, Marcel Apfelbaum wrote:
> > >  [...]
> > >
> > > Right. After detecting just failing unconditionally it a bit too
> > > simplistic IMHO.  
> >
> > There's also another factor here, which I thought I'd mentioned
> > already, but looks like I didn't: I think we're still missing some
> > details in what's going on.
> >
> > The premise for this patch is that plugging while the indicator is in
> > transition state is allowed to fail in any way on the guest side.  I
> > don't think that's a reasonable interpretation, because it's unworkable
> > for physical hotplug.  If the indicator starts blinking while you're in
> > the middle of shoving a card in, you'd be in trouble.
> >
> > So, what I'm assuming here is that while "don't plug while blinking" is
> > the instruction for the operator to obey as best they can, on the guest
> > side the rule has to be "start blinking, wait a while and by the time
> > you leave blinking state again, you can be confident any plugs or
> > unplugs have completed".  Obviously still racy in the strict computer
> > science sense, but about the best you can do with slow humans in the
> > mix.
> >
> > So, qemu should of course endeavour to follow that rule as though it
> > was a human operator on a physical machine and not plug when the
> > indicator is blinking.  *But* the qemu plug will in practice be fast
> > enough that if we're hitting real problems here, it suggests the guest
> > is still doing something wrong.
> >  
> 
> I personally think there is a little bit of over-engineering here.
> Let's start with the spec:
> 
>     Power Indicator Blinking
>     A blinking Power Indicator indicates that the slot is powering up or
> powering down and that
>     insertion or removal of the adapter is not permitted.

Right, but the weird bit here is that IIUC, the kernel during general
probe is switching the indicator from off -> blinking -> off without it
ever going to "on" state.  And it seems to do the power on and presence
check with the indicator still in blinking state.  This is different
from the normal sequence on a hotplug:
	press button
	indicator goes to blinking
	...wait...
	indicator goes to full on
	slot powers on
	presence detect

Or am I missing something?

> What exactly is an interpretation here?
> As you stated, the races are theoretical, the whole point of the indicator
> is to let the operator know he can't plug the device just yet.
> 
> I understand it would be more user friendly if the QEMU would wait
> internally for the
> blinking to end, but the whole point of the indicator is to let the
> operator (human or machine)
> know they can't plug the device at a specific time.
> Should QEMU take the responsibility of the operator? Is it even correct?

I don't think there's an unambiguous correct answer here.  But I think
it's reasonable to interpret a general "device_add" as "instruct the
virtual operator to plug in the card ASAP" as easily as "shove the
virtual card in right now".  device_add already covers a bunch of
different pluggable interfaces, so I don't think precisely aligning to
pci-e semantics is really a priority.

> Even if we would want such a feature, how is it related to this patch?
> The patch simply refuses to start a hotplug operation when it knows it will
> not succeed.

I don't think I was clear.  There are two separate and unrelated things
I'm bringing up here.  The first is that having the device_add fail for
transitory reasons that the management layer doesn't really care about
is really bad UX.

But the second point I'm raising here is that "don't plug when blinking"
is not, strictly speaking an implementable strategy for a human
operator.  That makes me conclude that the idea here is plugs started
at basically the same time as the blinking starts (which could be a
little before or a little after, humans being fuzzy) should actually be
acceptable, even if they finish after the indicator is blinking.  It's
not clear to me that the kernel's current behaviour actually permits
that.

> Another way that would make sense to me would be  is a new QEMU interface
> other than
> "add_device", let's say "adding_device_allowed", that would return true if
> the hotplug is allowed
> at this point of time. (I am aware of the theoretical races)
> 
> The above will at least mimic the mechanics of the pyhs world.  The
> operator looks at the indicator,
> the management software checks if adding the device is allowed.
> Since it is a corner case I would prefer the device_add to fail rather than
> introducing a new interface,
> but that's just me.

Oh, no, that's not what I'm suggesting.  Adding an "is allowed" command
is even worse.  It makes the management's task *more* complex, which
making any possible race here even wider.

-- 
David Gibson <dgibson@redhat.com>
Principal Software Engineer, Virtualization, Red Hat

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2020-10-26  6:36 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-22 11:40 [PATCH] pci: Refuse to hotplug PCI Devices when the Guest OS is not ready Marcel Apfelbaum
2020-10-22 12:06 ` Michael S. Tsirkin
2020-10-22 12:56   ` David Gibson
2020-10-22 13:15     ` Michael S. Tsirkin
2020-10-23  3:30       ` David Gibson
2020-10-22 13:55     ` Marcel Apfelbaum
2020-10-22 14:01       ` Michael S. Tsirkin
2020-10-22 14:10         ` Marcel Apfelbaum
2020-10-22 14:32           ` Michael S. Tsirkin
2020-10-22 14:50             ` Marcel Apfelbaum
2020-10-22 15:01               ` Michael S. Tsirkin
2020-10-23  3:49                 ` David Gibson
2020-10-23  6:47                   ` Marcel Apfelbaum
2020-10-23 15:54                     ` Michael S. Tsirkin
2020-10-23 17:27                       ` Igor Mammedov
2020-10-26  6:38                         ` David Gibson
2020-10-26  9:17                         ` Peter Krempa
2020-10-26  6:35                     ` David Gibson [this message]
2020-10-23  6:26                 ` Marcel Apfelbaum
2020-10-26  6:45                   ` David Gibson
2020-10-27 11:26                     ` Michael S. Tsirkin
2020-10-27 12:54                       ` Igor Mammedov
2020-10-27 13:02                         ` Michael S. Tsirkin
2020-10-28  3:34                           ` David Gibson
2020-10-28  3:31                         ` David Gibson
2020-10-28 15:39                           ` Igor Mammedov
2020-10-28 17:49                             ` Michael S. Tsirkin
2020-10-27 11:30                   ` Michael S. Tsirkin
2020-10-23  3:31       ` David Gibson
2020-11-11 12:35 ` Michael S. Tsirkin
2020-11-15 16:48   ` Marcel Apfelbaum
2020-11-11 16:09 ` Roman Kagan
2020-11-15 16:43   ` Marcel Apfelbaum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201026173548.113ce1fd@yekko.fritz.box \
    --to=dgibson@redhat.com \
    --cc=jusual@redhat.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.