Re: [Qemu-devel] [PATCH v2 1/5] msix_init: assert programming error

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Cao jin <caoj.fnst@cn.fujitsu.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 1/5] msix_init: assert programming error
Date: Tue, 4 Oct 2016 12:19:10 +0100
Message-ID: <20161004111909.GC2161@work-vm>
In-Reply-To: <87shsch30a.fsf@dusky.pond.sub.org>

* Markus Armbruster (armbru@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> 
> > * Markus Armbruster (armbru@redhat.com) wrote:
> >> Alex Williamson <alex.williamson@redhat.com> writes:
> >> 
> >> > On Thu, 29 Sep 2016 15:11:27 +0200
> >> > Markus Armbruster <armbru@redhat.com> wrote:
> >> >
> >> >> Alex Williamson <alex.williamson@redhat.com> writes:
> >> >> 
> >> >> > On Tue, 13 Sep 2016 08:16:20 +0200
> >> >> > Markus Armbruster <armbru@redhat.com> wrote:
> >> >> >  
> >> >> >> Cc: Alex for device assignment expertise.
> >> >> >> 
> >> >> >> Cao jin <caoj.fnst@cn.fujitsu.com> writes:
> >> >> >>   
> >> >> >> > On 09/12/2016 09:29 PM, Markus Armbruster wrote:    
> >> >> >> >> Cao jin <caoj.fnst@cn.fujitsu.com> writes:
> >> >> >> >>    
> >> >> >> >>> The input parameters are used for creating the msix capable device, so
> >> >> >> >>> they must obey the PCI spec; otherwise, it is a programming error.
> >> >> >> >>
> >> >> >> >> True when the parameters come from a device model attempting to
> >> >> >> >> define a PCI device violating the spec.  But what if the parameters come
> >> >> >> >> from an actual PCI device violating the spec, via device assignment?    
> >> >> >> >
> >> >> >> > Before the patch, on invalid param, the vfio behaviour is:
> >> >> >> >   error_report("vfio: msix_init failed");
> >> >> >> >   then, device create fail.
> >> >> >> >
> >> >> >> > After the patch, its behaviour is:
> >> >> >> >   asserted.
> >> >> >> >
> >> >> >> > Do you mean we should still report some useful info to user on invalid
> >> >> >> > params?    
> >> >> >> 
> >> >> >> In the normal case, asking msix_init() to create MSI-X that are out of
> >> >> >> spec is a programming error: the code that does it is broken and needs
> >> >> >> fixing.
> >> >> >> 
> >> >> >> Device assignment might be the exception: there, the parameters for
> >> >> >> msix_init() come from the assigned device, not the program.  If they
> >> >> >> violate the spec, the device is broken.  This wouldn't be a programming
> >> >> >> error.  Alex, can this happen?
> >> >> >> 
> >> >> >> If yes, we may want to handle it by failing device assignment.  
> >> >> >
> >> >> >
> >> >> > Generally, I think the entire premise of these sorts of patches is
> >> >> > flawed.  We take a working error path that allows a driver to robustly
> >> >> > abort on unexpected data and turn it into a time bomb.  Often the
> >> >> > excuse for this is that "error handling is hard".  Tough.  Now a
> >> >> > hot-add of a device that triggers this changes from a simple failure to
> >> >> > a denial of service event.  Furthermore, we base that time bomb on our
> >> >> > interpretation of the spec, which we can only validate against in-tree
> >> >> > devices.
> >> >> >
> >> >> > We have actually had assigned devices that fail the sanity test here,
> >> >> > there's a quirk in vfio_msix_early_setup() for a Chelsio device with
> >> >> > this bug.  Do we really want users experiencing aborts when a simple
> >> >> > device initialization failure is sufficient?
> >> >> >
> >> >> > Generally abort code paths like this cause me to do my own sanity
> >> >> > testing, which is really poor practice since we should have that sanity
> >> >> > testing in the common code.  Thanks,  
> >> >> 
> >> >> I prefer to assert on programming error, because 1. it does double duty
> >> >> as documentation, 2. error handling of impossible conditions is commonly
> >> >> wrong, and 3. assertion failures have a much better chance to get the
> >> >> program fixed.  Even when presence of a working error path kills 2., the
> >> >> other two make me stick to assertions.
> >> >
> >> > So we're looking at:
> >> >
> >> >> -    if (nentries < 1 || nentries > PCI_MSIX_FLAGS_QSIZE + 1) {
> >> >> -        return -EINVAL;
> >> >> -    }
> >> >
> >> > vs
> >> >
> >> >> +    assert(nentries >= 1 && nentries <= PCI_MSIX_FLAGS_QSIZE + 1);
> >> >
> >> > How do you argue that one of these provides better self documentation
> >> > than the other?
> >> 
> >> The first one says "this can happen, and when it does, the function
> >> fails cleanly."  For a genuine programming error, this is in part
> >> misleading.
> >> 
> >> The second one says "I assert this can't happen.  We'd be toast if I was
> >> wrong."
> >> 
> >> > The assert may have a better chance of getting fixed, but it's because
> >> > the existence of the assert itself exposes a vulnerability in the code.
> >> > Which would you rather have in production, a VMM that crashes on the
> >> > slightest deviance from the input it expects or one that simply errors
> >> > the faulting code path and continues?
> >> 
> >> Invalid input to a program should never be treated as programming error.
> >> 
> >> > Error handling is hard, which is why we need to look at it as a
> >> > collection of smaller problems.  We return an error at a leaf function
> >> > and let callers of that function decide how to handle it.  If some of
> >> > those callers don't want to deal with error handling, abort there, we
> >> > can come back to them later, but let the code paths that do want proper
> >> > error handling to continue.  If we add aborts into the leaf function,
> >> > then any calling path that wants to be robust against an error needs to
> >> > fully sanitize the input itself, at which point we have different
> >> > drivers sanitizing in different ways, all building up walls to protect
> >> > themselves from the time bombs in these leaf functions.  It's crazy.
> >> 
> >> It depends on the kind of error in the leaf function.
> >> 
> >> I suspect we're talking past each other because we got different kinds
> >> of errors in mind.
> >> 
> >> Programming is impossible without things like preconditions,
> >> postconditions, invariants.
> >> 
> >> If a section of code is entered when its precondition doesn't hold,
> >> we're toast.  This is the archetypical programming error.
> >> 
> >> If it can actually happen, the program is incorrect, and needs fixing.
> >> 
> >> Checking preconditions is often (but not always) practical.  In my
> >> opinion, checking is good practice, and the proper way to check is
> >> assert().  Makes the incorrect program fail before it can do further
> >> damage, and helps with finding the programming error.
> >> 
> >> A precondition is part of the contract between a function and its
> >> users.  A strong precondition can make the function's job easier, but
> >> that's no use if the resulting function is inconvenient to use.  On the
> >> other hand, complicating the function to get a weaker precondition
> >> nobody actually needs is just as dumb.
> >> 
> >> Returning an error is *not* checking preconditions.  Remember, if the
> >> precondition doesn't hold, we're toast.  If we're toast when we return
> >> an error, we're clearly doing it wrong.
> >> 
> >> You are arguing for weaker preconditions.  I'm not actually disagreeing
> >> with you!  I'm merely expressing my opinion that checking preconditions
> >> with assert() is a good idea.
> >
> > I have a fairly strong dislike for asserts in qemu, and although I'm not
> > always consistent, my reasoning is mainly to do with asserts once a guest
> > is running.
> >
> > Let's imagine you have a happily running guest and then you try and do
> > something new and complex (e.g. hotplug a vfio-device); now let's say that
> > new thing has something very broken about it, do you really want the previously
> > running guest to die?
> 
> If a precondition doesn't hold, we're toast.  The best we can do is
> crash before we mess up things further.
> 
> A problematic condition we can safely recover from can be made an error
> condition.
> 
> I think the crux of our misunderstandings (I hesitate to call it an
> argument) is confusing recoverable error conditions with violated
> preconditions.  We all agree (violently, perhaps) that assert() is not
> an acceptable error handling mechanism.

I think perhaps part of the problem may be trying to place all types of screwups
into only two categories: 'errors' and 'violations of preconditions'.
Consider some cases:
   a) The user tries to specify an out-of-range value to a setting;
      an error, probably not fatal (except if it was on the command line)

   b) An inconsistency is found in the MMU state
      violation of precondition, fatal.

   c) A host device used for passthrough does something which according
      to the USB/PCI/SCSI specs is illegal
      violation of precondition - but you probably don't want that
      to be fatal.

   d) An inconsistency is found in a specific device emulation
      violation of precondition - but I might not want that to be fatal.

I think we agree on (a) and (b), disagree on (d), and I think this
case might be (c).

> > My view is it can very much depend on how broken you think the
> > world is; you've got to remember that crashing at this point
> > is going to lose the user a VM, and that could mean losing
> > data - so at that point you have to make a decision about whether
> > your lack of confidence in the state of the VM due to the failed
> > precondition is worse than your knowledge that the VM is going to fail.
> >
> > Perhaps giving the user an error and disabling the device lets
> > the admin gracefully shut down the VM and walk away with all
> > their data intact.
> 
> This is risky business unless you can prove the problematic condition is
> safely isolated.  To elaborate on your device example: say some logic
> error in device emulation code put the device instance in some broken
> state.  If you detect that before the device could mess up anything
> else, fencing the device is safe.  But if device state is borked because
> some other code overran an array, continuing risks making things worse.
> Crashing the guest is bad.  Letting it first overwrite good data with
> bad data is worse.
> 
> Sadly, such proof is hardly ever possible in unrestricted C.  So we're
> down to probabilities and tradeoffs.

Agreed.

> I'd reject a claim that once the guest is running the tradeoffs *always*
> favour trying to hobble on.

Agreed; this is the difference between my case (b) and (d).
My preference is to fail the device in question if it's not a core device;
that way if it's a disk you can't write any more to it to mess its contents
up further, and you won't read bad data from it - that's about as much
isolation as you're going to get.
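
A minimal sketch of what "fail the device" could look like, purely for
illustration: struct example_dev, example_dev_fail() and example_dev_write()
are hypothetical names made up for this mail, not QEMU code, and the toy ring
indices stand in for whatever state the real device keeps.  Once an
inconsistency is seen, the device is fenced and every later request is
refused, so nothing more is written through it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical device state; not a QEMU structure. */
struct example_dev {
    const char *name;
    bool failed;                 /* set once, never cleared */
    unsigned head, tail, size;   /* toy ring indices */
};

/* Issue the scary warning once and fence the device. */
static void example_dev_fail(struct example_dev *d, const char *why)
{
    if (!d->failed) {
        fprintf(stderr, "WARNING: %s: %s; device disabled\n", d->name, why);
        d->failed = true;
    }
}

/* Returns 0 on success, -EIO if the device is (or has just been) fenced. */
static int example_dev_write(struct example_dev *d)
{
    if (d->failed) {
        return -EIO;             /* refuse all further I/O */
    }
    if (d->head >= d->size || d->tail >= d->size) {
        example_dev_fail(d, "ring index out of range");
        return -EIO;
    }
    d->head = (d->head + 1) % d->size;
    return 0;
}
```

This only isolates the device itself; it says nothing about whether some
other code corrupted the state in the first place.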

However, some of it is also down to our expectations of the stability of the
code in question - if the inconsistency is in some code that you know
is complex, probably with untested cases, and which isn't core to the VM
continuing (e.g. outgoing migration or hotplugging a host device) then
I believe it's OK to issue a scary warning, disable/error the device
in question and hobble on.

I'd say it's OK to argue that a piece of core code should be heavily
isolated from the bits you think are still a bit touchy - so it's
reasonable to me to have an assert in some core code (b) as long
as the (c) and (d) cases can be kept from triggering it, by coding
them defensively enough to error out before that assert could be
hit.  But then again someone might worry they just can't
deal with all the types of screwup (c) might present.
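
Roughly, that layering might be sketched as below.  The example_* functions
and EXAMPLE_MSIX_MAX_ENTRIES are made-up names for illustration (the limit
is assumed to stand in for PCI_MSIX_FLAGS_QSIZE + 1); this is not the actual
msix_init() or vfio code.  The core helper keeps its assert, and the
passthrough caller checks the device-supplied value and errors out first.

```c
#include <assert.h>
#include <errno.h>

/* Assumed limit, standing in for PCI_MSIX_FLAGS_QSIZE + 1. */
#define EXAMPLE_MSIX_MAX_ENTRIES 2048

/*
 * Core helper used by emulated devices: a bad count here means our own
 * code is broken, so it asserts.
 */
static int example_msix_setup(unsigned nentries)
{
    assert(nentries >= 1 && nentries <= EXAMPLE_MSIX_MAX_ENTRIES);
    return 0;
}

/*
 * Passthrough path: the count comes from host hardware, so it is treated
 * as untrusted input and rejected before the assert can be reached.
 */
static int example_assigned_msix_setup(unsigned nentries_from_device)
{
    if (nentries_from_device < 1 ||
        nentries_from_device > EXAMPLE_MSIX_MAX_ENTRIES) {
        return -EINVAL;          /* fail device setup, keep the guest alive */
    }
    return example_msix_setup(nentries_from_device);
}
```

The cost is the one Alex pointed out: each such caller ends up carrying its
own copy of the range check.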

> If you want a less bleak isolation and recovery story, check out Erlang.
> Note that its "let it crash" philosophy is very much in accordance with
> my views on what can safely be done after detecting a programming error
> / violated precondition.
> 
> > So I wouldn't argue for weaker preconditions, just what the
> > result is if the precondition fails.
> 
> I respectfully disagree with your use of the concept "precondition".

I generally avoid using the word precondition; it's too formal for my
liking given the level we're programming at and the lack of any formal
defs.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK