netdev.vger.kernel.org archive mirror
From: Alexander Duyck <alexander.duyck@gmail.com>
To: Lan Tianyu <tianyu.lan@intel.com>,
	bhelgaas@google.com, carolyn.wyborny@intel.com,
	donald.c.skidmore@intel.com, eddie.dong@intel.com,
	nrupal.jani@intel.com, yang.z.zhang@intel.com, agraf@suse.de,
	kvm@vger.kernel.org, pbonzini@redhat.com, qemu-devel@nongnu.org,
	emil.s.tantilov@intel.com, intel-wired-lan@lists.osuosl.org,
	jeffrey.t.kirsher@intel.com, jesse.brandeburg@intel.com,
	john.ronciak@intel.com, linux-kernel@vger.kernel.org,
	linux-pci@vger.kernel.org, matthew.vick@intel.com,
	mitch.a.williams@intel.com, netdev@vger.kernel.org,
	shannon.nelson@intel.com
Subject: Re: [RFC Patch 00/12] IXGBE: Add live migration support for SRIOV NIC
Date: Wed, 28 Oct 2015 23:58:47 -0700
Message-ID: <5631C3A7.2070900@gmail.com>
In-Reply-To: <5631B8CA.9040805@intel.com>

On 10/28/2015 11:12 PM, Lan Tianyu wrote:
> On 2015-10-26 23:03, Alexander Duyck wrote:
>> No.  I think you are missing the fact that there are 256 descriptors per
>> page.  As such if you dirty just 1 you will be pulling in 255 more, of
>> which you may or may not have pulled in the receive buffer for.
>>
>> So for example if you have the descriptor ring size set to 256 then that
>> means you are going to get whatever the descriptor ring has since you
>> will be marking the entire ring dirty with every packet processed,
>> however you cannot guarantee that you are going to get all of the
>> receive buffers unless you go through and flush the entire ring prior to
>> migrating.
>
> Yes, that will be a problem. How about adding a tag to each Rx buffer and
> checking the tag when delivering the Rx buffer to the stack? If the tag has
> been overwritten, this means the packet data has been migrated.

Then you have to come up with a pattern that you can guarantee is the 
tag and not part of the packet data.  That isn't going to be something 
that is easy to do.  It would also have a serious performance impact on 
the VF.
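
To make the ambiguity concrete, here is a minimal user-space sketch. It is
not ixgbevf code: RX_TAG, rx_buf_arm(), rx_buf_looks_unwritten() and
fake_dma_write() are hypothetical names, and the 8-byte tag value is an
arbitrary assumption. Any fixed pattern has the same weakness, because
nothing stops a received frame from starting with those exact bytes.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel; the specific bytes do not matter. */
static const uint8_t RX_TAG[8] = {
	0xDE, 0xAD, 0xBE, 0xEF, 0xCA, 0xFE, 0xF0, 0x0D
};

/* Driver side: stamp the tag before handing the buffer to the device. */
static void rx_buf_arm(uint8_t *buf)
{
	memcpy(buf, RX_TAG, sizeof(RX_TAG));
}

/* Driver side: the buffer "looks unwritten" while the tag is intact. */
static bool rx_buf_looks_unwritten(const uint8_t *buf)
{
	return memcmp(buf, RX_TAG, sizeof(RX_TAG)) == 0;
}

/* Stand-in for the device DMA-writing a received frame. */
static void fake_dma_write(uint8_t *buf, const uint8_t *frame, size_t len)
{
	memcpy(buf, frame, len);
}
```

If a frame whose first eight bytes equal RX_TAG is written into an armed
buffer, rx_buf_looks_unwritten() still returns true, so the check reports
"no data" for a buffer that holds a packet. The sketch also shows where the
cost comes from: every buffer takes an extra write when it is armed and an
extra read when it is checked.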

>> This is why I have said you will need to do something to force the rings
>> to be flushed such as initiating a PM suspend prior to migrating.  You
>> need to do something to stop the DMA and flush the remaining Rx buffers
>> if you want to have any hope of being able to migrate the Rx in a
>> consistent state.  Beyond that the only other thing you have to worry
>> about are the Rx buffers that have already been handed off to the
>> stack.  However those should be handled if you do a suspend and somehow
>> flag pages as dirty when they are unmapped from the DMA.
>>
>> - Alex
> This will be simple and may be our first version to enable migration. But
> we still hope to find a way to avoid disabling DMA before stopping the VCPU,
> in order to decrease service downtime.

You have to stop the Rx DMA at some point anyway.  It is the only means 
to guarantee that the device stops updating buffers and descriptors so 
that you will have a consistent state.
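
The ordering can be sketched with a toy model. This is an illustration
under stated assumptions, not driver code: the toy_* names, the descriptor
layout and the TOY_STAT_DD bit are all hypothetical, and the "device" is
just a function that refuses to write once DMA is disabled.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_SIZE 256
#define TOY_STAT_DD   0x1u	/* hypothetical "descriptor done" bit */

struct toy_rx_desc {
	uint32_t status;
	uint16_t len;
};

struct toy_rx_ring {
	struct toy_rx_desc desc[TOY_RING_SIZE];
	unsigned int next_to_clean;
	bool dma_enabled;
};

/* Model of the device writing back a descriptor; a no-op once DMA is off. */
static bool toy_device_write(struct toy_rx_ring *r, unsigned int idx,
			     uint16_t len)
{
	if (!r->dma_enabled)
		return false;
	r->desc[idx % TOY_RING_SIZE].status |= TOY_STAT_DD;
	r->desc[idx % TOY_RING_SIZE].len = len;
	return true;
}

/* Consume every completed descriptor; returns how many were processed. */
static unsigned int toy_rx_drain(struct toy_rx_ring *r)
{
	unsigned int n = 0;

	while (n < TOY_RING_SIZE &&
	       (r->desc[r->next_to_clean].status & TOY_STAT_DD)) {
		r->desc[r->next_to_clean].status = 0;
		r->next_to_clean = (r->next_to_clean + 1) % TOY_RING_SIZE;
		n++;
	}
	return n;
}

/* Stop DMA first, then drain: nothing can land behind the drain. */
static unsigned int toy_rx_quiesce(struct toy_rx_ring *r)
{
	r->dma_enabled = false;
	return toy_rx_drain(r);
}
```

Once toy_rx_quiesce() returns, a second toy_rx_drain() finds nothing, which
is the property the snapshot needs. Draining before disabling DMA gives no
such guarantee, since the device could complete another descriptor between
the two steps.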

Your code was having to do a bunch of shuffling in order to get things
set up so that you could bring the interface back up.  I would argue
that it may actually be faster, at least on the bring-up, to just drop the
old rings and start over, since that greatly reduces the complexity and the
amount of device-related data that has to be moved.
