From: Dan Williams <dan.j.williams@intel.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Qemu Developers <qemu-devel@nongnu.org>,
	Eduardo Habkost <ehabkost@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	Juan Quintela <quintela@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 7/8] migration/ram: ensure write persistence on loading compressed pages to PMEM
Date: Wed, 7 Feb 2018 14:43:31 -0800	[thread overview]
Message-ID: <CAPcyv4idQNgGV0EyRuCenGSU2_OCZS50Z3LHO_ExMTakL-n1ig@mail.gmail.com> (raw)
In-Reply-To: <20180207183717.GW2665@work-vm>

On Wed, Feb 7, 2018 at 10:37 AM, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * Dan Williams (dan.j.williams@intel.com) wrote:
>> On Wed, Feb 7, 2018 at 10:08 AM, Dr. David Alan Gilbert
>> <dgilbert@redhat.com> wrote:
>> > * Dan Williams (dan.j.williams@intel.com) wrote:
>> >> On Wed, Feb 7, 2018 at 5:24 AM, Dr. David Alan Gilbert
>> >> <dgilbert@redhat.com> wrote:
>> >> > * Haozhong Zhang (haozhong.zhang@intel.com) wrote:
>> >> >> On 02/07/18 13:03 +0000, Dr. David Alan Gilbert wrote:
>> >> >> > * Haozhong Zhang (haozhong.zhang@intel.com) wrote:
>> >> >> > > On 02/07/18 11:54 +0000, Dr. David Alan Gilbert wrote:
>> >> >> > > > * Haozhong Zhang (haozhong.zhang@intel.com) wrote:
>> >> >> > > > > When loading a compressed page to persistent memory, flush CPU cache
>> >> >> > > > > after the data is decompressed. Combined with a call to pmem_drain()
>> >> >> > > > > at the end of memory loading, we can guarantee those compressed pages
>> >> >> > > > > are persistently loaded to PMEM.
>> >> >> > > >
>> >> >> > > > Can you explain why this can use the flush and doesn't need the special
>> >> >> > > > memset?
>> >> >> > >
>> >> >> > > The best approach to ensure write persistence is to perform all
>> >> >> > > writes to pmem via libpmem, e.g., pmem_memcpy_nodrain() +
>> >> >> > > pmem_drain(). However, the write to pmem in this case is performed
>> >> >> > > by uncompress(), which is implemented outside of QEMU and libpmem.
>> >> >> > > It may or may not use libpmem, and that is not under QEMU's
>> >> >> > > control. Therefore, we have to use the less optimal approach:
>> >> >> > > flush the cache for every pmem address that uncompress() may have
>> >> >> > > written (e.g., via memcpy() and/or memset() inside uncompress()),
>> >> >> > > i.e., pmem_flush() + pmem_drain() in QEMU.
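
(A minimal sketch of that flush-on-load approach, assuming zlib's
uncompress() and libpmem's pmem_flush()/pmem_drain(); the helper and
variable names are illustrative, not the actual QEMU patch:)

#include <zlib.h>
#include <libpmem.h>

/* "host" is the PMEM-backed guest page of length "page_size" (names are
 * illustrative).  uncompress() writes through the CPU cache, so the
 * dirty lines are flushed explicitly afterwards. */
static int load_compressed_page_to_pmem(void *host, size_t page_size,
                                        const void *comp, size_t comp_len)
{
    uLongf dest_len = page_size;
    int ret = uncompress((Bytef *)host, &dest_len,
                         (const Bytef *)comp, (uLong)comp_len);

    /* flush every cache line uncompress() may have written */
    pmem_flush(host, page_size);
    return ret;
}

/* once, after the whole RAM stream has been loaded */
static void ram_load_complete_sketch(void)
{
    pmem_drain();   /* wait for the flushed stores to become persistent */
}
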
>> >> >> >
>> >> >> > In what way is it less optimal?
>> >> >> > If that's a legal thing to do, then why not just do a pmem_flush +
>> >> >> > pmem_drain right at the end of the ram loading and leave all the rest of
>> >> >> > the code untouched?
>> >> >>
>> >> >> For example, the implementation of pmem_memcpy_nodrain() prefers to
>> >> >> write pmem with movnt instructions (no flush needed) when those
>> >> >> instructions are available, and falls back to memcpy() + flush when
>> >> >> they are not, so I suppose the latter is the less optimal path.
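
(Roughly, as a conceptual sketch; cpu_has_movnt() and memcpy_movnt() are
hypothetical stand-ins for libpmem's internal CPUID check and
non-temporal copy loop:)

#include <string.h>
#include <libpmem.h>

extern int cpu_has_movnt(void);                          /* hypothetical */
extern void memcpy_movnt(void *, const void *, size_t);  /* hypothetical */

void *pmem_memcpy_nodrain_sketch(void *pmemdest, const void *src, size_t len)
{
    if (cpu_has_movnt()) {
        /* non-temporal stores bypass the cache: nothing to flush */
        memcpy_movnt(pmemdest, src, len);
    } else {
        /* fallback: cached copy plus an explicit line-by-line flush */
        memcpy(pmemdest, src, len);
        pmem_flush(pmemdest, len);
    }
    return pmemdest;   /* either way the caller still needs pmem_drain() */
}
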
>> >> >
>> >> > But if you use normal memcpy calls to copy a few GB of RAM in an
>> >> > incoming migrate and then do a single flush at the end, isn't that
>> >> > better?
>> >>
>> >> Not really, because now you've needlessly polluted the cache and are
>> >> spending CPU looping over the cachelines that could have been bypassed
>> >> with movnt.
>> >
>> > What's different in the pmem case?   Isn't what you've said true in the
>> > normal migrate case as well?
>> >
>>
>> In the normal migrate case the memory is volatile, so once the copy is
>> globally visible you're done. In the pmem case the migration is not
>> complete until the persistent state is synchronized.
>>
>> Note, I'm talking in generalities because I don't know the deeper
>> details of the migrate flow.
>
> On the receive side of a migrate, during a normal precopy migrate
> (without xbzrle) nothing is going to be reading that RAM until
> the whole RAM load has completed anyway - so we're not benefiting from
> it being in the caches either.
>
> In the pmem case, again since nothing is going to be reading from that
> RAM until the end anyway, why bother flushing as we go as opposed to at
> the end?

Flushing at the end implies a large loop over the cache lines at the end
of the transfer, because the x86 ISA only exposes a line-by-line flush
to unprivileged code, rather than a full cache flush like the kernel can
do with wbinvd. So it is better to flush as we go than to incur the
overhead of that loop at the end. I.e., I'm assuming it is more
efficient to use 'movnt' in the first instance and not worry about the
flush loop at all.
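
To make the shape of that loop concrete, here is a minimal sketch
(assuming clflushopt is available; the cache-line size and intrinsics are
illustrative, not QEMU code):

#include <stddef.h>
#include <stdint.h>
#include <immintrin.h>

#define CACHE_LINE 64   /* illustrative; real code queries the CPU */

static void flush_range(const void *addr, size_t len)
{
    uintptr_t p   = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    /* one clflushopt per 64-byte line: flushing a few GB of guest RAM
     * at the end means tens of millions of iterations */
    for (; p < end; p += CACHE_LINE) {
        _mm_clflushopt((void *)p);
    }
    _mm_sfence();   /* order the flushes, like pmem_drain() */
}

With movnt used as the data is copied, this whole pass goes away.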

Thread overview: 34+ messages
2018-02-07  7:33 [Qemu-devel] [PATCH v2 0/8] nvdimm: guarantee persistence of QEMU writes to persistent memory Haozhong Zhang
2018-02-07  7:33 ` [Qemu-devel] [PATCH v2 1/8] memory, exec: switch file ram allocation functions to 'flags' parameters Haozhong Zhang
2018-02-07  7:33 ` [Qemu-devel] [PATCH v2 2/8] hostmem-file: add the 'pmem' option Haozhong Zhang
2018-02-07  7:33 ` [Qemu-devel] [PATCH v2 3/8] configure: add libpmem support Haozhong Zhang
2018-02-07  7:33 ` [Qemu-devel] [PATCH v2 4/8] mem/nvdimm: ensure write persistence to PMEM in label emulation Haozhong Zhang
2018-02-09 14:27   ` Stefan Hajnoczi
2018-02-09 14:57     ` Haozhong Zhang
2018-02-12 13:55       ` Stefan Hajnoczi
2018-02-07  7:33 ` [Qemu-devel] [PATCH v2 5/8] migration/ram: ensure write persistence on loading zero pages to PMEM Haozhong Zhang
2018-02-07 10:17   ` Pankaj Gupta
2018-02-07 11:18     ` Haozhong Zhang
2018-02-07 11:30       ` Pankaj Gupta
2018-02-07 11:38   ` Dr. David Alan Gilbert
2018-02-07 11:52     ` Haozhong Zhang
2018-02-07 12:51       ` Haozhong Zhang
2018-02-07 12:59         ` Dr. David Alan Gilbert
2018-02-07 14:10         ` Pankaj Gupta
2018-02-07 12:56       ` Dr. David Alan Gilbert
2018-02-07  7:33 ` [Qemu-devel] [PATCH v2 6/8] migration/ram: ensure write persistence on loading normal " Haozhong Zhang
2018-02-07 11:49   ` Dr. David Alan Gilbert
2018-02-07 12:02     ` Haozhong Zhang
2018-02-07  7:33 ` [Qemu-devel] [PATCH v2 7/8] migration/ram: ensure write persistence on loading compressed " Haozhong Zhang
2018-02-07 11:54   ` Dr. David Alan Gilbert
2018-02-07 12:15     ` Haozhong Zhang
2018-02-07 13:03       ` Dr. David Alan Gilbert
2018-02-07 13:20         ` Haozhong Zhang
2018-02-07 13:24           ` Dr. David Alan Gilbert
2018-02-07 18:05             ` Dan Williams
2018-02-07 18:08               ` Dr. David Alan Gilbert
2018-02-07 18:31                 ` Dan Williams
2018-02-07 18:37                   ` Dr. David Alan Gilbert
2018-02-07 22:43                     ` Dan Williams [this message]
2018-02-07  7:33 ` [Qemu-devel] [PATCH v2 8/8] migration/ram: ensure write persistence on loading xbzrle " Haozhong Zhang
2018-02-07 13:08   ` Dr. David Alan Gilbert
