Re: [RFC PATCH] vfio: Update/Clarify migration uAPI, add NDMA state

From: Jason Gunthorpe <jgg@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"farman@linux.ibm.com" <farman@linux.ibm.com>,
	"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
	"pasic@linux.ibm.com" <pasic@linux.ibm.com>,
	"Lu, Baolu" <baolu.lu@intel.com>
Subject: Re: [RFC PATCH] vfio: Update/Clarify migration uAPI, add NDMA state
Date: Fri, 7 Jan 2022 13:23:24 -0400
Message-ID: <20220107172324.GV2328285@nvidia.com>
In-Reply-To: <BN9PR11MB5276177829EE5ED89AAD82398C4D9@BN9PR11MB5276.namprd11.prod.outlook.com>

On Fri, Jan 07, 2022 at 02:01:55AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Friday, January 7, 2022 8:30 AM
> > 
> > On Fri, Jan 07, 2022 at 12:00:13AM +0000, Tian, Kevin wrote:
> > > > Devices that are poorly designed here will have very long migration
> > > > downtime latencies and people simply won't want to use them.
> > >
> > > Different usages have different latency requirements. Do we just want
> > > people to decide whether to manage state for a device by
> > > measurement?
> > 
> > It doesn't seem unreasonable to allow userspace to set a max timer for
> > NDMA for SLA purposes on devices that have unbounded NDMA times. It
> > would probably be some new optional ioctl for devices that can
> > implement it.
> 
> Yes, that's my point.
> 
> > 
> > However, this basically gives up on the idea that a VM can be migrated
> > as any migration can time out and fail under this philosophy. I think
> > that is still very poor.
> > 
> > Optional migration really can't be a sane path forward.
> > 
> 
> How is it different from the scenario where the guest generates a very
> high dirty rate so the precopy phase can never converge to a pre-defined
> threshold, and the migration is then aborted after a certain timeout?

The hypervisor can halt the VCPU and put a stop to this and complete
the migration.

There is a difference between optional migration under an SLA and
mandatory migration with no SLA - I think both must be supported to be
sane.
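
To make the distinction concrete, here is a minimal toy model in C. It is
only a sketch: the names (mig_policy, mig_result, migrate) and the
arithmetic are hypothetical and are not part of the VFIO uAPI, QEMU, or
the patch under discussion. It models only guest memory dirtying. Under
the optional policy a non-converging precopy is aborted; under the
mandatory policy the vCPU is halted, nothing dirties memory any more, and
the final copy always finishes, at the cost of exceeding the budget.

```c
#include <stdbool.h>

/* Hypothetical policy names, for illustration only. */
enum mig_policy {
	MIG_OPTIONAL_SLA,	/* abort if the downtime budget would be exceeded */
	MIG_MANDATORY,		/* must complete; the budget is ignored */
};

struct mig_result {
	bool completed;
	/* Time the final stop-and-copy took, or would have taken if aborted. */
	unsigned long stop_copy_ms;
};

static struct mig_result migrate(enum mig_policy policy,
				 unsigned long dirty_pages,
				 unsigned long pages_per_ms,
				 unsigned long dirtied_per_ms,
				 unsigned int max_rounds,
				 unsigned long sla_ms)
{
	struct mig_result res = { false, 0 };
	unsigned int round;

	/* Precopy: the guest keeps running and re-dirties pages while we copy. */
	for (round = 0; round < max_rounds; round++) {
		unsigned long copy_ms = dirty_pages / pages_per_ms;

		if (copy_ms <= sla_ms)
			break;	/* the remaining set fits the downtime budget */

		/* Pages dirtied by the guest during this copy round. */
		dirty_pages = copy_ms * dirtied_per_ms;
	}

	res.stop_copy_ms = dirty_pages / pages_per_ms;

	/* Optional migration: give up, the guest keeps running on the source. */
	if (policy == MIG_OPTIONAL_SLA && res.stop_copy_ms > sla_ms)
		return res;

	/*
	 * Otherwise halt the vCPU. With the guest stopped the dirty set is
	 * fixed, so the final copy is bounded and the migration completes.
	 */
	res.completed = true;
	return res;
}
```

With a guest that dirties pages as fast as the link copies them, e.g.
migrate(MIG_OPTIONAL_SLA, 100000, 100, 100, 30, 300), precopy never
converges and the optional policy gives up, while MIG_MANDATORY with the
same arguments completes with a stop-and-copy longer than the budget. The
model has no device in it; it only shows why halting the vCPU gives the
CPU side a guaranteed way to finish.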

> IMHO live migration is always a try-and-fail flavor. A previous migration
> failure doesn't prevent the orchestration stack from retrying at a later
> point.

An operator might need to migrate a VM in an emergency, with no
possibility of failure, for instance because something is wrong with
the base HW. In that case the SLA is ignored and the migration must
complete.

IMHO it is completely wrong to view migration as optional; that is a
terrible standard to design HW to.

Jason