kvm.vger.kernel.org archive mirror
From: "Tian, Kevin" <kevin.tian@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"farman@linux.ibm.com" <farman@linux.ibm.com>,
	"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
	"pasic@linux.ibm.com" <pasic@linux.ibm.com>,
	"Lu, Baolu" <baolu.lu@intel.com>
Subject: RE: [RFC PATCH] vfio: Update/Clarify migration uAPI, add NDMA state
Date: Thu, 6 Jan 2022 06:32:57 +0000	[thread overview]
Message-ID: <BN9PR11MB5276E5F4C19FB368414500368C4C9@BN9PR11MB5276.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20220105124533.GP2328285@nvidia.com>

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, January 5, 2022 8:46 PM
> 
> On Wed, Jan 05, 2022 at 01:59:31AM +0000, Tian, Kevin wrote:
> 
> > > This will block the hypervisor from ever migrating the VM in a very
> > > poor way - it will just hang in the middle of a migration request.
> >
> > it's poor but 'hang' won't happen. PCI spec defines completion timeout
> > for ATS translation request. If timeout the device will abort the in-fly
> > request and report error back to software.
> 
> The PRI time outs have to be long enough to handle swap back from
> disk, so 'hang' will be a fair amount of time..

This reminds me of one interesting point.

Putting PRI aside, the time to drain in-flight requests is undefined. It
depends on how many pending requests must be waited for before the
draining command completes on the device. This is IP specific (e.g. whether
it supports preemption) and also guest specific (e.g. whether it's actively
submitting workload).

So even without hostile attempts, the draining time may exceed what a
user tolerates in live migration.

This suggests that a software timeout mechanism might be necessary
when transitioning to the NDMA state, with the timeout value optionally
configurable by the user. If the timeout expires, fail the state
transition request.
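
To illustrate the idea only, here is a minimal userspace sketch of such a
bounded drain. Nothing in it is existing VFIO uAPI or driver code: the
function ndma_enter_with_timeout(), the callback types, and the fake device
and clock helpers are all hypothetical names made up for this example, and a
real driver would likely sleep or wait on an interrupt instead of polling.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback: returns true once the device has no DMA in flight. */
typedef bool (*drain_done_fn)(void *dev);
/* Hypothetical monotonic clock, in milliseconds. */
typedef uint64_t (*clock_ms_fn)(void *clk);

/*
 * Poll the device until it reports that draining is complete, or until
 * timeout_ms has elapsed. Returns 0 if the (hypothetical) NDMA transition
 * may proceed, or -ETIMEDOUT so the caller can fail the state transition.
 */
static int ndma_enter_with_timeout(drain_done_fn done, void *dev,
                                   clock_ms_fn now, void *clk,
                                   uint64_t timeout_ms)
{
    assert(done && now);

    uint64_t start = now(clk);

    for (;;) {
        if (done(dev))
            return 0;
        if (now(clk) - start >= timeout_ms)
            return -ETIMEDOUT;
    }
}

/* Simulated device for illustration: each poll retires one pending request. */
struct fake_dev {
    unsigned int pending;
};

static bool fake_drain_done(void *p)
{
    struct fake_dev *d = p;

    if (d->pending > 0)
        d->pending--;
    return d->pending == 0;
}

/* Simulated clock for illustration: advances by 1 ms on every read. */
struct fake_clock {
    uint64_t ms;
};

static uint64_t fake_now(void *p)
{
    struct fake_clock *c = p;

    return c->ms++;
}
```

With the simulated helpers, a device with 3 pending requests and a 10 ms
budget completes the transition, while a device with 100 pending requests
and a 5 ms budget gets -ETIMEDOUT, which is where the state transition
request would be failed back to the user.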

And once such a mechanism is in place, PRI is automatically covered, as it
is just one implicit reason why the draining time may increase.

> 
> > > Regardless of the complaints of the IP designers, this is a very poor
> > > direction.
> > >
> > > Progress in the hypervisor should never be contingent on a guest VM.
> > >
> >
> > Whether the said DOS is a real concern and how severe it is are usage
> > specific things. Why would we want to hardcode such restriction on
> > an uAPI? Just give the choice to the admin (as long as this restriction is
> > clearly communicated to userspace clearly)...
> 
> IMHO it is not just DOS, PRI can become dependent on IO which requires
> DMA to complete.
> 
> You could quickly get yourself into a deadlock situation where the
> hypervisor has disabled DMA activities of other devices and the vPRI
> simply cannot be completed.

How is that related to PRI, which is only about address translation?

Instead, the above is a general p2p problem for any draining operation. How
to solve it needs to be defined clearly for this NDMA state (which I suppose
is being discussed between you and Alex; I still need time to catch
up).

> 
> I just don't see how this scheme is generally workable without a lot
> of limitations.
> 
> While I do agree we should support the HW that exists, we should
> recognize this is not a long term workable design and treat it as
> such.
> 

Definitely agree with this point. We software people should continue
influencing IP designers toward a long-term software-friendly design,
and also accept the fact that it takes time... 😊

Thanks
Kevin 

