From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 7 Feb 2018 12:56:24 +0000
From: "Dr. David Alan Gilbert"
Message-ID: <20180207125623.GF2665@work-vm>
References: <20180207073331.14158-1-haozhong.zhang@intel.com>
 <20180207073331.14158-6-haozhong.zhang@intel.com>
 <20180207113841.GB2665@work-vm>
 <20180207115207.qeqld4v3hl246qu4@hz-desktop>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180207115207.qeqld4v3hl246qu4@hz-desktop>
Subject: Re: [Qemu-devel] [PATCH v2 5/8] migration/ram: ensure write persistence on loading zero pages to PMEM
To: qemu-devel@nongnu.org, Eduardo Habkost, Igor Mammedov, Paolo Bonzini, mst@redhat.com, Xiao Guangrong, Juan Quintela, Stefan Hajnoczi, Dan Williams

* Haozhong Zhang (haozhong.zhang@intel.com) wrote:
> On 02/07/18 11:38 +0000, Dr. David Alan Gilbert wrote:
> > * Haozhong Zhang (haozhong.zhang@intel.com) wrote:
> > > When loading a zero page, check whether it will be loaded to
> > > persistent memory. If yes, load it by the libpmem function
> > > pmem_memset_nodrain(). Combined with a call to pmem_drain() at the
> > > end of RAM loading, we can guarantee all those zero pages are
> > > persistently loaded.
> >
> > I'm surprised pmem is this invasive, to be honest; I hadn't expected
> > the need for special memcpy's etc everywhere. We're bound to miss some.
> > I assume there's separate kernel work needed to make postcopy work;
> > certainly the patches here don't seem to touch the postcopy path.
>
> This link at
> https://wiki.qemu.org/Features/PostCopyLiveMigration#Conflicts shows
> that postcopy with memory-backend-file requires kernel support. Can
> you point me to the details of the required kernel support, so that I
> can understand what would be needed for NVDIMM postcopy?

I can't, but ask Andrea Arcangeli ( aarcange@redhat.com ); he wrote the
userfault kernel code.
Note that we have a mechanism for atomically placing a page into memory,
so that might also need modifications for pmem; again check with Andrea.

Dave

> > > Depending on the host HW/SW configurations, pmem_drain() can be
> > > "sfence". Therefore, we do not call pmem_drain() after each
> > > pmem_memset_nodrain(), or use pmem_memset_persist() (equally
> > > pmem_memset_nodrain() + pmem_drain()), in order to avoid unnecessary
> > > overhead.
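[Editor's aside: the drain-batching pattern the commit message above describes can be sketched as a small stand-alone C program. This is an illustration, not QEMU code; the two pmem_* stubs mirror the patch's own CONFIG_LIBPMEM fallbacks, while PAGE_SIZE, NPAGES, and load_zero_pages() are invented for the example. With the real libpmem, pmem_memset_nodrain() issues (possibly non-temporal) stores without an ordering fence, and the single trailing pmem_drain() can reduce to one sfence, per the commit message.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fallback stubs mirroring the patch's include/qemu/pmem.h when
 * CONFIG_LIBPMEM is absent: pmem_memset_nodrain() degrades to plain
 * memset(), pmem_drain() to a no-op. */
static inline void *pmem_memset_nodrain(void *pmemdest, int c, size_t len)
{
    return memset(pmemdest, c, len);
}

static inline void pmem_drain(void)
{
}

#define PAGE_SIZE 4096
#define NPAGES    8

/* Batching pattern: issue all the nodrain memsets first, then pay for
 * exactly one drain at the end, instead of one drain per page. */
static void load_zero_pages(uint8_t *pmem_base, int npages)
{
    for (int i = 0; i < npages; i++) {
        pmem_memset_nodrain(pmem_base + (size_t)i * PAGE_SIZE, 0, PAGE_SIZE);
    }
    pmem_drain(); /* single ordering point covering all pages above */
}
```

The cost argument is that the drain is a global ordering operation, so its price does not grow with the number of pages written since the last drain.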
> > >
> > > Signed-off-by: Haozhong Zhang
> > > ---
> > >  include/qemu/pmem.h |  9 +++++++++
> > >  migration/ram.c     | 34 +++++++++++++++++++++++++++++-----
> > >  2 files changed, 38 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/include/qemu/pmem.h b/include/qemu/pmem.h
> > > index 9017596ff0..861d8ecc21 100644
> > > --- a/include/qemu/pmem.h
> > > +++ b/include/qemu/pmem.h
> > > @@ -26,6 +26,15 @@ pmem_memcpy_persist(void *pmemdest, const void *src, size_t len)
> > >      return memcpy(pmemdest, src, len);
> > >  }
> > >
> > > +static inline void *pmem_memset_nodrain(void *pmemdest, int c, size_t len)
> > > +{
> > > +    return memset(pmemdest, c, len);
> > > +}
> > > +
> > > +static inline void pmem_drain(void)
> > > +{
> > > +}
> > > +
> > >  #endif /* CONFIG_LIBPMEM */
> > >
> > >  #endif /* !QEMU_PMEM_H */
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index cb1950f3eb..5a0e503818 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -49,6 +49,7 @@
> > >  #include "qemu/rcu_queue.h"
> > >  #include "migration/colo.h"
> > >  #include "migration/block.h"
> > > +#include "qemu/pmem.h"
> > >
> > >  /***********************************************************/
> > >  /* ram save/restore */
> > > @@ -2467,6 +2468,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
> > >      return block->host + offset;
> > >  }
> > >
> > > +static void ram_handle_compressed_common(void *host, uint8_t ch, uint64_t size,
> > > +                                         bool is_pmem)
> >
> > I don't think there's any advantage of splitting out this _common
> > routine; let's just add the parameter to ram_handle_compressed.
> >
> > > +{
> > > +    if (!ch && is_zero_range(host, size)) {
> > > +        return;
> > > +    }
> > > +
> > > +    if (!is_pmem) {
> > > +        memset(host, ch, size);
> > > +    } else {
> > > +        pmem_memset_nodrain(host, ch, size);
> > > +    }
> >
> > I'm wondering if it would be easier to pass in a memsetfunc ptr and call
> > that (defaulting to memset if it's NULL).
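[Editor's aside: the memsetfunc suggestion above could look roughly like the following sketch. Everything here is hypothetical illustration, not the actual QEMU code: the memsetfunc_t typedef, the reshaped ram_handle_compressed() signature, and is_zero_range_stub() (a naive stand-in for QEMU's optimized is_zero_range()) are all invented for the example.]

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* A memset-compatible callback type; NULL means "use plain memset". */
typedef void *(*memsetfunc_t)(void *s, int c, size_t n);

/* Naive stand-in for QEMU's is_zero_range(); the real one is optimized. */
static int is_zero_range_stub(const uint8_t *p, uint64_t size)
{
    for (uint64_t i = 0; i < size; i++) {
        if (p[i]) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical reshape of ram_handle_compressed() along the suggestion:
 * one function taking an optional memset-like callback, so a pmem
 * block can pass pmem_memset_nodrain while ordinary RAM passes NULL. */
static void ram_handle_compressed(void *host, uint8_t ch, uint64_t size,
                                  memsetfunc_t setfn)
{
    if (ch == 0 && is_zero_range_stub(host, size)) {
        return; /* page is already zero; nothing to write */
    }
    if (setfn == NULL) {
        setfn = memset; /* default path for ordinary RAM */
    }
    setfn(host, ch, size);
}
```

This keeps one code path for the zero-page check while letting each memory backend supply its own store primitive, which is the extensibility point raised in the reply below.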
>
> Yes, it would be more extensible if we have other special memory
> devices in the future.
>
> Thanks,
> Haozhong
>
> >
> > > +}
> > > +
> > >  /**
> > >   * ram_handle_compressed: handle the zero page case
> > >   *
> > > @@ -2479,9 +2494,7 @@ static inline void *host_from_ram_block_offset(RAMBlock *block,
> > >   */
> > >  void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
> > >  {
> > > -    if (ch != 0 || !is_zero_range(host, size)) {
> > > -        memset(host, ch, size);
> > > -    }
> > > +    return ram_handle_compressed_common(host, ch, size, false);
> > >  }
> > >
> > >  static void *do_data_decompress(void *opaque)
> > > @@ -2823,6 +2836,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >      bool postcopy_running = postcopy_is_running();
> > >      /* ADVISE is earlier, it shows the source has the postcopy capability on */
> > >      bool postcopy_advised = postcopy_is_advised();
> > > +    bool need_pmem_drain = false;
> > >
> > >      seq_iter++;
> > >
> > > @@ -2848,6 +2862,8 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >          ram_addr_t addr, total_ram_bytes;
> > >          void *host = NULL;
> > >          uint8_t ch;
> > > +        RAMBlock *block = NULL;
> > > +        bool is_pmem = false;
> > >
> > >          addr = qemu_get_be64(f);
> > >          flags = addr & ~TARGET_PAGE_MASK;
> > > @@ -2864,7 +2880,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >
> > >          if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE |
> > >                       RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) {
> > > -            RAMBlock *block = ram_block_from_stream(f, flags);
> > > +            block = ram_block_from_stream(f, flags);
> > >
> > >              host = host_from_ram_block_offset(block, addr);
> > >              if (!host) {
> > > @@ -2874,6 +2890,9 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >              }
> > >              ramblock_recv_bitmap_set(block, host);
> > >              trace_ram_load_loop(block->idstr, (uint64_t)addr, flags, host);
> > > +
> > > +            is_pmem = ramblock_is_pmem(block);
> > > +            need_pmem_drain = need_pmem_drain || is_pmem;
> > >          }
> > >
> > >          switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
> > > @@ -2927,7 +2946,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >
> > >          case RAM_SAVE_FLAG_ZERO:
> > >              ch = qemu_get_byte(f);
> > > -            ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
> > > +            ram_handle_compressed_common(host, ch, TARGET_PAGE_SIZE, is_pmem);
> > >              break;
> > >
> > >          case RAM_SAVE_FLAG_PAGE:
> > > @@ -2970,6 +2989,11 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
> > >      }
> > >
> > >      wait_for_decompress_done();
> > > +
> > > +    if (need_pmem_drain) {
> > > +        pmem_drain();
> > > +    }
> > > +
> > >      rcu_read_unlock();
> > >      trace_ram_load_complete(ret, seq_iter);
> > >      return ret;
> > > --
> > > 2.14.1
> >
> > Dave
> >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK