From mboxrd@z Thu Jan 1 00:00:00 1970
Sender: pmem@googlegroups.com
MIME-Version: 1.0
From: Yigal Korman
Date: Wed, 27 Jun 2018 17:02:20 +0300
Subject: Re: [PATCH] x86: optimize memcpy_flushcache
Content-Type: text/plain; charset="UTF-8"
To: Dan Williams
Cc: Mikulas Patocka, Mike Snitzer, Ingo Molnar,
 device-mapper development, linux-nvdimm, X86 ML, pmem

On Wed, Jun 27, 2018 at 4:03 PM, Dan Williams wrote:
> On Wed, Jun 27, 2018 at 4:23 AM, Yigal Korman wrote:
>> Hi,
>> I'm a bit late on this but I have a question about the original patch -
>> I thought that in order for movnt (movntil, movntiq) to push the data
>> into the persistency domain (ADR),
>> one must work with a length that is a multiple of the cacheline size,
>> otherwise the write-combine buffers remain partially
>> filled and you need to commit them with a fence (sfence) - which ruins
>> the whole performance gain you got here.
>> Am I wrong? Are the write-combine buffers part of the ADR domain
>> or something?
>
> The intent is to allow a batch of memcpy_flushcache() calls followed
> by a single sfence. Specifying a multiple of a cacheline size does not
> necessarily help as sfence is still needed to make sure that the movnt
> result has reached the ADR-safe domain.

Oh, right, I see that dm-writecache calls writecache_commit_flushed(),
which in turn calls wmb().
I keep confusing *_nocache (e.g. copy_user_nocache), which includes an
sfence, with *_flushcache (e.g. memcpy_flushcache), which doesn't.
Thanks for clearing that up.
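
The batching pattern discussed above (several non-temporal copies, then one
fence) can be sketched in userspace C with SSE2 intrinsics. This is only an
illustration, not the kernel code: copy_nt_nofence(), commit_flushed() and
write_batch() are hypothetical names standing in for memcpy_flushcache() and
the wmb() in the commit path. It assumes x86-64, an 8-byte-aligned
destination and a length that is a multiple of 8. It runs on ordinary DRAM,
so it shows the ordering of stores and fence, not actual persistence.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si64, _mm_sfence (x86-64 only) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for memcpy_flushcache(): copies with
 * non-temporal 8-byte stores (movnti) and issues NO fence of its own.
 * Assumes dst is 8-byte aligned and len is a multiple of 8.
 */
static void copy_nt_nofence(void *dst, const void *src, size_t len)
{
    long long *d = dst;
    const unsigned char *s = src;

    assert(((uintptr_t)dst & 7) == 0 && (len & 7) == 0);
    for (size_t i = 0; i < len; i += 8) {
        long long v;
        memcpy(&v, s + i, sizeof(v));   /* plain load; src may be unaligned */
        _mm_stream_si64(d + i / 8, v);  /* non-temporal store, bypasses the cache */
    }
}

/* Hypothetical commit step: a single sfence orders all earlier NT stores. */
static void commit_flushed(void)
{
    _mm_sfence();
}

/* Batch several copies, then fence once, as described in the thread. */
static void write_batch(void *dst[], const void *src[], const size_t len[],
                        size_t n)
{
    for (size_t i = 0; i < n; i++)
        copy_nt_nofence(dst[i], src[i], len[i]);
    commit_flushed();
}
```

The point of the split is that the fence cost is paid once per batch rather
than once per copy; a caller that skips commit_flushed() has no guarantee
that the write-combining buffers have drained.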