From: Max Gurtovoy <mgurtovoy@nvidia.com>
To: "Rao, Lei" <lei.rao@intel.com>, Christoph Hellwig <hch@lst.de>,
	Jason Gunthorpe <jgg@ziepe.ca>
Cc: kbusch@kernel.org, axboe@fb.com, kch@nvidia.com,
	sagi@grimberg.me, alex.williamson@redhat.com, cohuck@redhat.com,
	yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com,
	kevin.tian@intel.com, mjrosato@linux.ibm.com,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	kvm@vger.kernel.org, eddie.dong@intel.com, yadong.li@intel.com,
	yi.l.liu@intel.com, Konrad.wilk@oracle.com,
	stephen@eideticom.com, hang.yuan@intel.com,
	Oren Duer <oren@nvidia.com>
Subject: Re: [RFC PATCH 5/5] nvme-vfio: Add a document for the NVMe device
Date: Sun, 11 Dec 2022 16:51:02 +0200	[thread overview]
Message-ID: <d4aeda5c-d7bb-4427-5157-fb7530dfd1fb@nvidia.com> (raw)
In-Reply-To: <cf88c2ec-bdd6-1df3-6c77-64a17dc3eb86@intel.com>


On 12/11/2022 3:21 PM, Rao, Lei wrote:
>
>
> On 12/11/2022 8:05 PM, Max Gurtovoy wrote:
>>
>> On 12/6/2022 5:01 PM, Christoph Hellwig wrote:
>>> On Tue, Dec 06, 2022 at 10:48:22AM -0400, Jason Gunthorpe wrote:
>>>> Sadly, in Linux we don't have an SR-IOV VF lifecycle model that is
>>>> of any use.
>>> Beware: the secondary function might just as well be a physical
>>> function.  In fact, one of the major customers for "smart"
>>> multifunction NVMe devices prefers multi-PF devices over SR-IOV VFs
>>> (and all the symmetric dual-ported devices are multi-PF as well).
>>>
>>> So this isn't really about a VF lifecycle, but about how to manage
>>> live migration, especially on the receive/restore side.  Restoring
>>> the entire controller state is extremely invasive and can't be done
>>> on a controller that is live in any classic sense.  In fact, a lot
>>> of the state is subsystem-wide, so without some kind of
>>> virtualization of the subsystem it is impossible to actually
>>> restore the state.
>>
>> Oh, great!
>>
>> I read this subsystem virtualization proposal of yours after I had
>> sent my own proposal for subsystem virtualization in the patch 1/5
>> thread. I take that convergence as a sign that this is the right way
>> to go. Let's continue brainstorming this idea; I think it can become
>> the standard way to migrate NVMe controllers.
>>
>>>
>>> To cycle back to the hardware posted here, I'm really confused
>>> about how it actually has any chance of working, and no one has
>>> even tried to explain how it is supposed to work.
>>
>> I guess a vendor-specific implementation can assume some of the
>> things we are discussing now as part of making this a standard.
>
> Yes, as I wrote in the cover letter, this is a reference
> implementation meant to start a discussion and help drive
> standardization efforts, but the series works well for the Intel IPU
> NVMe device. As Jason said, there are two use cases: shared medium
> and local medium. I think live migration of a local medium is
> complicated because of the large amount of user data that needs to be
> migrated, and I don't have a good idea for dealing with that
> situation. But with the Intel IPU NVMe device, each VF can connect to
> remote storage via the NVMF protocol to achieve storage offloading.
> This is the shared-medium case: we don't need to migrate the user
> data, which significantly simplifies live migration.

I don't think medium migration should be part of the spec. We can
specify that it is out of scope.

The whole idea of live migration is to achieve a short downtime, and I
don't think we can guarantee a short downtime if we need to copy a few
terabytes over the network. If the medium copy alone takes many
seconds, there is no point in a live migration that targets a few
milliseconds of downtime; just do a regular VM migration.
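
To put rough numbers on that (a back-of-the-envelope sketch; the 2 TB
volume and the 100 Gb/s link below are assumed values, purely for
illustration):

#include <stdio.h>

int main(void)
{
	double volume_bytes = 2e12;        /* assumed 2 TB namespace */
	double link_bits_per_sec = 100e9;  /* assumed 100 Gb/s link  */
	double seconds = volume_bytes * 8.0 / link_bits_per_sec;

	/* ~160 seconds even at ideal line rate, before any protocol
	 * overhead or link contention.
	 */
	printf("best-case medium copy: %.0f seconds\n", seconds);
	return 0;
}

That is roughly 160 seconds in the best case, about five orders of
magnitude above a millisecond-scale downtime budget, so the medium copy
cannot possibly fit in the downtime window.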

>
> The series tries to solve live migration for the shared-medium case.
> It still lacks dirty page tracking and P2P support; we are developing
> those features as well.
>
> About the NVMe device state: as described in my document, the VF
> state includes the VF CSR registers, the state of every I/O queue
> pair, and the AdminQ state. During the implementation, I found that
> the device state data is small per VF, so I decided to use the admin
> queue of the primary controller to send the live migration commands
> that save and restore the VF state, as mlx5 does.

I think and hope we all agree that the AdminQ of the controlling NVMe 
function will be used to migrate the controlled NVMe function.
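
Something along the lines of the sketch below, where the controlling
function's AdminQ carries save/restore commands for a controlled
function. The opcode values and the state layout are hypothetical,
made up purely to illustrate the flow; the real encoding is exactly
what the standardization discussion needs to settle.

/* Hypothetical sketch only: admin commands on the controlling
 * (parent) function that save/restore a controlled function's state.
 * Opcode values and the state layout below are invented for
 * illustration and are not from any spec or shipping device.
 */
#include <stdint.h>

#define NVME_ADMIN_MIG_SAVE_STATE	0xc0	/* hypothetical opcode */
#define NVME_ADMIN_MIG_LOAD_STATE	0xc1	/* hypothetical opcode */

/* Per-queue-pair state, roughly matching the categories Lei lists. */
struct nvme_mig_queue_state {
	uint16_t qid;
	uint16_t sq_head;
	uint16_t sq_tail;
	uint16_t cq_head;
	uint64_t sq_base;	/* submission queue base (guest PA) */
	uint64_t cq_base;	/* completion queue base (guest PA) */
};

/* The state blob a SAVE_STATE command would return for one VF:
 * controller registers, AdminQ state, then every I/O queue pair.
 */
struct nvme_mig_function_state {
	uint32_t cc;		/* controller configuration */
	uint32_t csts;		/* controller status */
	uint32_t aqa;		/* admin queue attributes */
	uint64_t asq;		/* admin SQ base */
	uint64_t acq;		/* admin CQ base */
	uint16_t nr_io_queues;	/* number of queue-pair entries below */
	struct nvme_mig_queue_state queues[];
};

The point being that a blob like this stays small per VF, which is
what makes carrying it over the controlling function's AdminQ
attractive in the first place.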

Which document are you referring to?

>
> Thanks,
> Lei
>

Thread overview: 65+ messages
2022-12-06  5:58 [RFC PATCH 0/5] Add new VFIO PCI driver for NVMe devices Lei Rao
2022-12-06  5:58 ` [RFC PATCH 1/5] nvme-pci: add function nvme_submit_vf_cmd to issue admin commands for VF driver Lei Rao
2022-12-06  6:19   ` Christoph Hellwig
2022-12-06 13:44     ` Jason Gunthorpe
2022-12-06 13:51       ` Keith Busch
2022-12-06 14:27         ` Jason Gunthorpe
2022-12-06 13:58       ` Christoph Hellwig
2022-12-06 15:22         ` Jason Gunthorpe
2022-12-06 15:38           ` Christoph Hellwig
2022-12-06 15:51             ` Jason Gunthorpe
2022-12-06 16:55               ` Christoph Hellwig
2022-12-06 19:15                 ` Jason Gunthorpe
2022-12-07  2:30                   ` Max Gurtovoy
2022-12-07  7:58                     ` Christoph Hellwig
2022-12-09  2:11                       ` Tian, Kevin
2022-12-12  7:41                         ` Christoph Hellwig
2022-12-07  7:54                   ` Christoph Hellwig
2022-12-07 10:59                     ` Max Gurtovoy
2022-12-07 13:46                       ` Christoph Hellwig
2022-12-07 14:50                         ` Max Gurtovoy
2022-12-07 16:35                           ` Christoph Hellwig
2022-12-07 13:34                     ` Jason Gunthorpe
2022-12-07 13:52                       ` Christoph Hellwig
2022-12-07 15:07                         ` Jason Gunthorpe
2022-12-07 16:38                           ` Christoph Hellwig
2022-12-07 17:31                             ` Jason Gunthorpe
2022-12-07 18:33                               ` Christoph Hellwig
2022-12-07 20:08                                 ` Jason Gunthorpe
2022-12-09  2:50                                   ` Tian, Kevin
2022-12-09 18:56                                     ` Dong, Eddie
2022-12-11 11:39                                   ` Max Gurtovoy
2022-12-12  7:55                                     ` Christoph Hellwig
2022-12-12 14:49                                       ` Max Gurtovoy
2022-12-12  7:50                                   ` Christoph Hellwig
2022-12-13 14:01                                     ` Jason Gunthorpe
2022-12-13 16:08                                       ` Christoph Hellwig
2022-12-13 17:49                                         ` Jason Gunthorpe
2022-12-06  5:58 ` [RFC PATCH 2/5] nvme-vfio: add new vfio-pci driver for NVMe device Lei Rao
2022-12-06  5:58 ` [RFC PATCH 3/5] nvme-vfio: enable the function of VFIO live migration Lei Rao
2023-01-19 10:21   ` Max Gurtovoy
2023-02-09  9:09     ` Rao, Lei
2022-12-06  5:58 ` [RFC PATCH 4/5] nvme-vfio: check if the hardware supports " Lei Rao
2022-12-06 13:47   ` Keith Busch
2022-12-06  5:58 ` [RFC PATCH 5/5] nvme-vfio: Add a document for the NVMe device Lei Rao
2022-12-06  6:26   ` Christoph Hellwig
2022-12-06 13:05     ` Jason Gunthorpe
2022-12-06 13:09       ` Christoph Hellwig
2022-12-06 13:52         ` Jason Gunthorpe
2022-12-06 14:00           ` Christoph Hellwig
2022-12-06 14:20             ` Jason Gunthorpe
2022-12-06 14:31               ` Christoph Hellwig
2022-12-06 14:48                 ` Jason Gunthorpe
2022-12-06 15:01                   ` Christoph Hellwig
2022-12-06 15:28                     ` Jason Gunthorpe
2022-12-06 15:35                       ` Christoph Hellwig
2022-12-06 18:00                         ` Dong, Eddie
2022-12-12  7:57                           ` Christoph Hellwig
2022-12-11 12:05                     ` Max Gurtovoy
2022-12-11 13:21                       ` Rao, Lei
2022-12-11 14:51                         ` Max Gurtovoy [this message]
2022-12-12  1:20                           ` Rao, Lei
2022-12-12  8:09                           ` Christoph Hellwig
2022-12-09  2:05         ` Tian, Kevin
2022-12-09 16:53           ` Li, Yadong
2022-12-12  8:11             ` Christoph Hellwig
